When the team first started using Scala and Apache Thrift we didn't really think much about testing.  As a [mostly] .NET shop, the learning curve for a new language, toolchain, and a myriad of new libraries was quite overwhelming. Instead of really thinking about testing we plowed ahead and created an elaborate PoC for a business need.  Once we validated our technology choice and that the business need in of itself was sound, we started seriously considering the bigger picture such as recommended design patterns, deployment, a production implementation, and of course, testing!

Our project has a Thrift client and Data Transfer Object (DTO) model generated via twitter's scrooge (a Thrift code generator written in Scala).  For testing I needed a set of test data, but for obvious reasons I did not want to have to connect to the Thrift server in order to run my test suite.  In order to get the data I needed I had a few options.  I could manually new up DTO's manually and one at a time, or I could capture some real data and massage it for my purposes.  As your typical lazy developer, I chose option 2.  This raised the question about how does one actually capture data being sent through a Thrift service?  Well, clearly it's doing a bunch of serialization & deserialization on its own otherwise it would not be a very good choice for transmission of data structures!  Unfortunately, actually figuring out how to manually serialize a scrooge generated DTO was a bit of a challenge and I couldn't find much information on how it was done.

However, since pretty much all of our Thrift code is generated and sitting in the project we have some clues on how to go about the process.  After about an hour in the Scala REPL I was able to serialize a DTO into JSON and vice versa. Below is a helper class I developed to serialize DTO's to and from disk.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
import java.io.{ FileOutputStream, FileInputStream }
import org.apache.thrift.transport.TIOStreamTransport
import org.apache.thrift.protocol.TJSONProtocol
import com.twitter.scrooge.{ ThriftStructCodec, ThriftStruct }
 
object ThriftSerializer {
    def serialize(thriftStruct : ThriftStruct, path : String) {
        val fileOutputStream = new FileOutputStream(path)
        try {
            val thriftTransport = new TIOStreamTransport(fileOutputStream)
            val jsonProtocol = new TJSONProtocol(thriftTransport)
            thriftStruct.write(jsonProtocol)
        } finally {
            fileOutputStream.close
        }
    }
 
    def deserialize[T <: ThriftStruct](thriftStructCodec : ThriftStructCodec[T], path : String): T = {
        val fileInputStream = new FileInputStream(path)
        try {
            val jsonProtocolFactory = new TJSONProtocol.Factory
            val jsonProtocol = jsonProtocolFactory.getProtocol(new TIOStreamTransport(fileInputStream))
            thriftStructCodec.decode(jsonProtocol)
        } finally {
            fileInputStream.close
        }
    }
}

When using TJSONProtocol as the data format you will notice that although the output is JSON, it's not in the most human readable form. In order for Thrift to support versioning of DTO's, the fields are represented by the integer of the field as defined in your Thrift file. Below are a few models I've defined in Thrift that I will use in examples for the rest of this post.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
typedef i64 DateTime

struct Bar {
1: i32 x
2: i32 y
}

struct Foo {
1: string styleName,
2: DateTime created,
3: list<Bar> bars
}

...

A more conventional Thrift JSON serializer does exist, the TSimpleJSONProtocol, but it only supports write operations. This is an example of the JSON you can expect out of the TJSONProtocol serializer.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
{"1":{"str":"gangnam"},"2":{"i64":1234567890},"3":{"lst":["rec",2,{"1":{"i32":1},"2":{"i32":2}},{"1":{"i32":3},"2":{"i32":4}}]}}
 
// Here's the same JSON pretty printed.  Beware of pretty printing the JSON and attempting to load it with
// ThriftSerializerand defined above.  It will not work.  Thrift does not recognize the extra whitespace
// formatting.
 
{
    "1" : {
        "str" : "gangnam"
    },
    "2" : {
        "i64" : 1234567890
    },
    "3" : {
        "lst" : ["rec", 2, {
                "1" : {
                    "i32" : 1
                },
                "2" : {
                    "i32" : 2
                }
            }, {
                "1" : {
                    "i32" : 3
                },
                "2" : {
                    "i32" : 4
                }
            }
        ]
    }
}

So, not the most readable, but it's something you can work with. It should be noted that it would be a fairly straightforward exercise to extend the TJSONProtocol to write field names along with the field ID as defined by the Protocol abstraction in the Thrift documentation. I'll save this for a future experiment.

Now to use our test data in an actual test.

I chose to structure our web project using a Service Orientated Architecture with dependency injection. Scrooge genereated a trait that serves as a definition of the contract for the service. I implemented this trait in an adapter I wrote to encapsulate calls to the service. Missing from the code below is a companion object that instantiates the MyThriftApi client using twitter's finagle library. If you're familiar with scrooge generated clients you'll notice that I'm overriding the asynchronous nature of the API call (getFoo(name).get). I did this so that I can define all my asynchronous calls at once further up my stack. As Clamatius pointed out in the comments, overriding the Futures capability of my Thrift client isn't a best practice despite the reasoning mentioned above. Therefore, I've updated the following code samples to make use of the FutureIface that scrooge generated.

1
2
3
4
5
6
7
package data
 
import data.dtos._
 
class MyApiAdapter extends MyThriftApi.FutureIface { 
  def getFoo(`styleName`: String): Future[Foo] = MyApiAdapter.client.getFoo(styleName)
}

My service will inject a MyApiAdapter instance which will give me the opportunity to mix-in something to intercept the actual call to the Thrift client.

1
2
3
4
5
6
7
8
class FooService(myApiAdapter: MyApiAdapter) {
  def DoKungFoo(styleName: String):Pain = {
    val fooFuture = myApiAdapter.getFoo(styleName)
    // ... 
    // do some kung foo via the aforementioned style bring the pain!
    pain
  }
}

Finally, a test for FooService which will mock out the MyApiAdapter and return a deserialized Foo object from the file system. I'm using specs2 for tests, which appears to be the test framework of choice for the Scala elite. Note the use of the cake pattern when passing in the MyApiAdapter to the service. This will force the MyApiAdapter to use the implementation defined in the TestDataReader trait.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
package tests.unit
 
import data.MyApiAdapter
import data.dtos.Foo
import akka.dispatch.Future
import org.specs2.mutable._
 
object FooServiceSpec extends Specification {
   
  // a mix-in for MyApiAdapter
  trait TestDataReader extends MyApiAdapter {
    override def getFoo(`styleName` : String): Future[Foo] = Future{ThriftSerializer.deserialize(Foo, "foo.json")}
  }
 
  "FooService" should {
    "Bring the pain" in {
      val service = new FooService(new MyApiAdapter with TestDataReader)
      val pain = service.getFoo(styleName = "gangnam")
      // ...
      // assert that we brought the pain sufficiently
    }
  }
}

As a noob to Scala, Thrift, and even specs2 I would appreciate any feedback on this subject matter. Please feel free to leave a comment.