Technology Change: .NET to Scala
In our careers as software developers we're frequently pigeonholed into one particular language and framework. Oftentimes these specializations go even further and before you know it you'll find yourself being the front end guy, the data layer girl, or in my case: the person who gets saddled with anything that has to do with deployment and devops work in general :). This is a story about how Empathica as an organization transformed itself from being a Microsoft .NET shop to a Scala one. It didn't happen overnight, and this is definitely not a how-to guide for achieving similar change somewhere else. If you're looking for a side-by-side comparison of Scala and C#/.NET, you won't find it here. This is just an account of how the change itself unfolded. Since a transformation of this kind is thus far unique in my career, I wanted to share it with the community and hopefully provide some wisdom for other teams deciding to take a similar plunge into Scala or some other open source technology.
The stage
Empathica was almost exclusively a Microsoft shop as late as the end of summer 2012. There was some legacy infrastructure in Java, but it had remained largely untouched since the company formed in the early 2000s. For all intents and purposes, Empathica embraced using Microsoft technologies throughout the entire stack. Everything was written in C#, all servers ran on Windows Server, and the OLTP and OLAP databases were SQL Server and Analysis Services. It wasn't until Simon Palmer joined the company as CTO that anyone in the development department gave the technology stack a hard look. After all, once you're fully invested in Microsoft it can be quite difficult to break away.
As an aside I would like to mention that in recent years Microsoft has made great strides in supporting the OSS community. They've open sourced large parts of the CLR, open sourced big projects like Entity Framework and ASP.NET MVC, and have even licensed their portable class libraries for use on platforms other than Windows. There are lots of new Microsoft OSS initiatives, as indicated by their OpenTech division. However, I think it will still be some time (if ever) before we see widespread support for and adoption of Microsoft's work in the larger open source community.
When Simon came onboard it represented a sea change for Empathica's development department. At a higher level it changed the company from being service oriented to product based. The development process was switched nearly immediately to XP. Consultants were brought in from Berteig Consulting to help with the transformation. Internal political red tape was cut and development teams were empowered to do their own project planning, track their own progress, and self-organize with as little direct supervision from the management team as possible. Members of the development team were encouraged to challenge their existing beliefs about how to build software.
EDIT 25/10/13: I want to elaborate a bit about when our new CTO came onboard. At the time, Empathica's products were stagnant, we were beginning to lose big clients, and something needed to be done. Simon was brought on to give the company more effective technical direction because up until then there were a few people wearing CTO-type hats, but not one voice in particular. Simon brought a wealth of software development leadership to the organization. He's incredibly tech savvy, a serial entrepreneur, and has a great deal of experience applying Agile practices and principles to large development teams. One of the first changes Simon made was to bring Agile to Empathica. It's my opinion that that change is what set the stage for developers to try new technologies like Scala.
As usually happens with change, it didn't resonate with everybody. A lot of the existing development staff weren't too keen on what was happening to the department. There was a slow exodus of technical staff who left to seek new opportunities. This gave the management team the opportunity to hire a new team of smart and passionate developers, product managers, and user experience people who were on board with the new direction. It was around this time, in July 2011, that I was fortunate to be offered a developer position at Empathica.
The new team was challenged to reimagine our space (Customer Experience Management) by developing a new line of products. XP gave us lots of flexibility to operate as we saw fit. Teams were flat; the most junior developer had as much of a voice during an iteration planning session as a veteran of the craft. I had more input into the development process and product direction at Empathica than I did as a Team Lead and Development Manager in my past positions. I credit this open and collaborative culture as the main reason that we were able to try new technologies like Scala in the first place.
Why Scala?
Now that the stage is set, I'll elaborate on what exactly got the ball rolling to try Scala. Before every development project we have a "technology discussion". We're always open to trying new things, but for the most part new projects chose a base technology stack of C#, ASP.NET MVC, etc. We kicked off a number of new projects around August of 2012. One of these projects turned into a Scala project that I'll affectionately refer to as BeetleJuice for the remainder of this post. The kickoff for BeetleJuice started off as a "Hey, wouldn't it be cool if we had something like this to compete with competitor X?" In fact, BeetleJuice's initial requirements were so scarce we didn't even know if it would be something we would release to our clients. It was an experiment; a grandiose software spike into an area of the business we had never fully explored before. This gave us an enormous amount of liberty to try a new language and framework because:
- The project wasn't overly complex
- As with most software spikes there was an implicit permission to fail and to try new things
- At the end of the day it didn't really matter what language or framework we chose for our application server because of its simple nature
It was then that I challenged the rest of the team to honestly consider other technologies. "This is our chance to try something new. Let's take it," I said. Personally, I had had some experience with Python and Django, but I was also interested in Ruby and the Rails framework, so I proposed both. Another colleague tossed Node.js into the ring. And finally, one of our newest hires at the time, Steven Skelton, proposed Scala.
I didn't know much about Scala. I gave myself the weekend to read up on it and I quickly appreciated its elegance and concise syntax. I recognized that because it is JVM based, we would implicitly gain awesome platform interoperability and access to the JVM's OSS libraries. Scala is statically typed, which seemed refreshing to me given the mainstream's infatuation with dynamic languages like Ruby, Python, and JavaScript over the past 10 years. It has a reputation for being a great language for writing highly performant concurrent systems. We were sold, and in the next iteration the BeetleJuice team decided to give Scala a try.
The experiment
We created a Play! 2 web project. It was an easy transition from ASP.NET MVC from a framework perspective. We started developing BeetleJuice and it wasn't long before we knew that we could deliver a functioning Scala web application with a minimal hit on productivity.
Once we began demonstrating our progress at weekly iteration demos it became clear that BeetleJuice was evolving out of its experimentation phase and into something that had real business value. When this became apparent, the management team decided to take a closer look at our technology choice and started asking tough questions like "How do we deploy this in a production environment?" and "Will this framework and language be around next year?" and inevitably "Am I going to be able to find people to support this technology in 1, 3, and 5 years?" I started investigating the maturity of Play! and to my delight found that it had a great list of endorsements from tech companies around the world (LinkedIn, Klout, and The Guardian, to name a few). The Play! 2 framework was still new when we first began using it, but its version 1 had been available to the community for some time. From Play! we soon learned more about the Typesafe stack and services. Given Typesafe's inclusion of Play! in its stack and the successful history of the project itself, we came to the conclusion that we had picked a winner.
It was also becoming clear that Scala was being heavily adopted in the industry based on data from job sites like Indeed.com, Typesafe's impressive client list, the number of native Scala OSS projects available, and more anecdotal information like the number of recruiters contacting me on LinkedIn! This chart on Indeed.com shows that in early 2013 demand for developers with Scala experience outpaced, for the first time, the demand for those with Clojure experience or experience in any other JVM language other than Java itself.
EDIT 26/10/13: I felt it important to only compare 2nd generation JVM languages. IMO it's not fair to include the Java language itself because it dwarfs adoption of 2nd gen languages due to its omnipresence in the industry for over 20 years.
Simon and some of our other technical management began dabbling with learning Scala themselves. I remember Simon telling me that the power of the language is obvious, but that it would be easy to misuse it and get yourself in trouble. Why, of course. With great power comes great responsibility. This reminded me of that old programmer humour post "How to Shoot Yourself in the Foot in Any Programming Language" by Mike Walker.
Scala: you stare at your foot for 3 days without any sleep, you then figrue [sic] out how to shoot yourself in the foot with one line of code… recursively.
Granted, this addition came from a commenter about 7 years after the original post, but I think the author described Scala perfectly within the context of the meme!
Legacy integration
Integration with the rest of our Microsoft platform and products was a concern from the start. It was time to start investing in backend APIs, not only to integrate with our .NET projects, but also because of a separate long term goal to abstract away our persistence layer so we could change it in the future.
Our API initiative began with serving only the data needs of BeetleJuice. We started a spike on different API technologies. The first was a Tomcat based web service that let you express queries in a domain specific language built with ANTLR. You could craft HTTP requests that contained projections, filters, and joins (like SQL) and it would return a JSON representation of the data. It worked, but it was quite complicated to use and the queries looked a whole lot like our data access queries themselves.
Our second API implementation was written by Steven and based on Apache Thrift and Twitter's Finagle. The API endpoints had methods you could call in much the same way as a traditional RPC service. Thrift/Finagle gave us the capability to generate native C# and Scala code so we didn't need to manually write our own data transfer objects, clients, and services. Finagle has a ton of functionality out of the box such as easy horizontal scaling with ZooKeeper, excellent stats using Ostrich, concurrency made easy with Twitter's library of thread pooling logic, and an easy way to instantiate servers and client connections. Thrift/Finagle is an excellent choice for writing APIs. Steven has blogged about Thrift and Finagle extensively and even gave a short talk at a local Toronto Scala Meetup. You can find his slides here.
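To give a feel for the RPC style described above, here is a minimal sketch of the shape such a service contract might take in Scala. It is not our actual API: the names SurveyResponse, ResponseService, and InMemoryResponseService are hypothetical, the case class and trait are written by hand where Thrift would generate them from an IDL file, and scala.concurrent.Future stands in for Finagle's own Future type so the example runs with nothing but the standard library.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical data transfer object; with Thrift this would be generated from an IDL struct.
final case class SurveyResponse(id: Long, locationId: Long, score: Int)

// Hypothetical service contract; with Thrift this would be generated from an IDL service.
// Every method returns a Future so callers never block on the data tier.
trait ResponseService {
  def getResponses(locationId: Long): Future[Seq[SurveyResponse]]
  def averageScore(locationId: Long): Future[Option[Double]]
}

// In-memory implementation standing in for a real persistence layer.
final class InMemoryResponseService(data: Seq[SurveyResponse]) extends ResponseService {
  def getResponses(locationId: Long): Future[Seq[SurveyResponse]] =
    Future(data.filter(_.locationId == locationId))

  def averageScore(locationId: Long): Future[Option[Double]] =
    getResponses(locationId).map { rs =>
      if (rs.isEmpty) None else Some(rs.map(_.score).sum.toDouble / rs.size)
    }
}

object ApiSketch {
  // Client code depends only on the trait, not on how or where the data is stored.
  def demoAverage(locationId: Long): Option[Double] = {
    val service: ResponseService = new InMemoryResponseService(Seq(
      SurveyResponse(1L, 10L, 8),
      SurveyResponse(2L, 10L, 6),
      SurveyResponse(3L, 20L, 9)
    ))
    // Blocking with Await is only for the demo; real callers would compose the Future.
    Await.result(service.averageScore(locationId), 2.seconds)
  }
}
```

The point of the design is that callers see only the contract. Swapping the in-memory class for one backed by a database, or by a remote Thrift client, would not change any calling code.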
Once we had a Thrift API ready and in production we were able to get BeetleJuice and our other .NET projects to connect to it and start consuming data.
A moment of doubt
BeetleJuice and our new API were proving to be a success. It wasn't long before a large new project, Grail, was on the table and a new technology discussion began. This discussion was different from past ones because we now had some actual Scala experience on the team. We even hired top Scala talent such as Katrin Shechtman (who runs a cool startup on the side called Becipe that runs on a Scala, Play!, Mongo, and Heroku stack). I'll admit that I had my reservations about the language and I thought it was my responsibility to counter some of the Scala evangelism that had been growing on the team and play devil's advocate.
My main argument against Scala was a common complaint about the tooling. The compiler is slow, the IDEs are buggy and lacking in features, and the preferred build and dependency management framework, SBT, has a tremendous learning curve. I was seriously questioning whether or not the rest of the developers on our team were going to get onboard with Scala knowing that using it would mean taking a productivity hit at first. I made my arguments, but ultimately it was decided it was worth the risk. I was relieved that management and my developer colleagues were onboard despite the drawbacks. It was at this time that I decided to make it a priority to really embrace Scala and find new ways to cope with development practices and a toolchain that differed from what I was used to in .NET.
An endorsement of Open Source technology
After Grail had been humming along for a few weeks, Simon made an official announcement to the development department that endorsed the use of Scala, Linux, and other open source tools and technologies. This was a critical moment in our organization because up until then C#, .NET, and the Microsoft stack were our default choices. This was Simon's announcement outlining our long term technical strategy.
To: Empathica Developers
Subject: Technical Strategy
Folks,
I seem to have set the Scala cat among the C# pigeons yesterday, so I wanted to lay out my view of our long-term technical strategy in the hope that it will allay some fears and clear up any ambiguity that I may have caused. And I’d like to apologise if anyone felt disenfranchised, or as though there were important decisions being made without their input.
First, some principles:
- We make a point of recruiting the brightest technical people we can find and our expectation is that you will be comfortable working in whatever technology is required for the job at hand. I think you all fit into this category and I hope that you feel the same way, but if you are working for a long time in a single technology a rut can look a lot like a groove.
- The job at hand will require different technologies depending on the need, and we will not be bound to any single technology. A technology choice will be made appropriately for the needs as we see them. The choice of technology is one of the hardest because it happens early and is somewhat irreversible. It also has many dimensions, some of which are hard to discern at the start of a project.
- We have to balance our agile rapidity against good architecture. It is one of the core criticisms of agile, and particularly XP, that you too easily ignore architecture as you ricochet from one user driven feature to the next. This point is particularly relevant to us now as we are embarking on two significant architectural changes, one being a core data model change, the other inserting an API layer.
- There’s never a good time to make a large architectural shift and you will always be balancing tactical commitments against long-term aspirations, as we are right now. Furthermore you never get given the time to do it. A data API is one of the key missing components in our platform. What we have done with the [API] is an excellent example of how to start to insert a formal tier in the architecture, and what we are contemplating for [Grail] and [MultiPass] access is a good natural extension. The success of any future revolution in our data tier will be contingent on us having done the [API] work well.
- A profusion of service endpoints has a somewhat bad architectural smell. You should have good reasons to fragment your APIs into many pieces and expedience is rarely a good reason. A single endpoint may not be the answer either, but there should be a good architectural reason to split it apart.
With these in mind here is what I would like to see us move towards in the medium to long term:
- Minimal dependence on Microsoft, both O/S and data stores
- Increased infrastructure in “the cloud”
- Horizontal scalability as a fundamental architectural principle
As a consequence this probably means:
- Less C#/.Net, more Scala/Java/Open Source and
- Less SQLServer/OLAP Services, more NoSQL/Columnar Storage/Mongo/Vertica
- Less Windows, more Linux
- Less cabinets/pizza boxes, more AWS/Private cloud
Taking a leaf from the agile manifesto wording, I mean: “while I see value in the things on the left, I see more long-term value in the things on the right”.
So, no hard decision has been made about our future languages or technologies, and there are no rules that all new development must happen in Scala. Instead I want us all to align on the principles and then make the right decisions as we balance our short-term needs against our long-term aspirations.
If you have any further questions please feel free to come and find me. I am stuck at home today because the four snowmen of the snowpocalypse arrived in Oakville this morning, but I’ll gladly take a phone call.
Simon Palmer
Chief Technology Officer
Empathica Inc
Dealing with technology change
Not everybody was pleased with the transition to Scala. Giving up years of experience with one particular language and framework just to be a "beginner" with a new one can be a tough pill for some people to swallow. Most people inherently don't like change. For a programmer, changing technologies can have a profound impact on your productivity and in some cases your mental stability! In our industry you have to fight to overcome these innate reactions.
In the technology world there are no excuses for people who don't adapt. Software development doesn't have a deep body of knowledge like some engineering practices do (I know calling software development an engineering discipline is a slippery slope, but please indulge me). For example, the know-how and best practices for building a bridge have been known for thousands of years. What we consider software engineering has been around for perhaps 50 to 80 years, depending on how liberally you define its beginning. It could be decades or even centuries before we reach some kind of steady state of technological development. Until that time people in the software industry need to learn to adapt and embrace what's new or be left behind.
A good carpenter finds the right tool for the job. This analogy easily extends to software, especially when the tools and jobs themselves change at such an astonishing pace! Compiled, scripted, imperative, or functional. Pick something the community subscribes to and get the job done. Haven't used it yet? Try it out! Those who don't strive to learn new languages and technologies are in for a rude awakening when the years of experience they have with Blub don't count for anything with employers.
A data renaissance through APIs
As Grail was being developed, we needed to extend our Scala Thrift/Finagle API. It was decided that when new projects had data needs, those needs would be put behind our API. This began a data renaissance of sorts; not only did we put Grail's needs behind an API, but as we identified overlapping needs of our existing products, we retrofitted them to make use of the API as well. This had a snowball effect of establishing APIs for the remainder of our legacy projects that were still under active development.
A significant portion of our development team was refactoring projects to use Thrift while at the same time establishing data contracts for APIs. For a number of months we checked off boxes on our API wishlist. An incredible amount of code was written, refactored, rewritten, or simply not needed any more. Our product middle tiers shrank down to a fraction of their former sizes. Introducing an API layer forced us to establish clear lines between our apps and their data. This separation of concerns was a huge boon to the distributed architecture of our entire infrastructure and in line with the sentiments Simon expressed in his technical strategy.
We started thinking a lot about concurrency and distributed systems. We started investigating new data access technologies to standardize across our API layer such as Slick for RDBMS and Couchbase for caching. We had already been using Mongo for parts of our infrastructure, but we also started planning an overhaul of the rest of our RDBMS infrastructure from SQL Server to cheaper and, ironically, more scalable solutions like Vertica (a columnar relational database) and NoSQL databases. All of these decisions were made that much easier because of the introduction of our API layer and the myriad of library choices available in the Scala and Java OSS ecosystem.
Training and education
A lot of Empathica developers were interested in using Scala. Unfortunately, other than those who had been working on BeetleJuice, Grail, and our API, there weren't a ton of people with experience in the language. We entertained the thought of bringing in instructors from Typesafe or their Canadian Scala training partner, tindr.co (more on them later), but in the beginning we looked to Coursera and their growing catalogue of Massive Open Online Courses (MOOCs).
Martin Odersky, the inventor of Scala and a key figure in the Java community, provided video lectures for the Functional Programming Principles in Scala course on Coursera. This course has been wildly popular and is regarded as one of the most successful software programming MOOCs ever.
We had more than 50,000 registered students— an unfathomably large number in the context of traditional teaching. While large, that number doesn’t tell the whole story; as is typical for a MOOC, a statistical majority of those students participate no further beyond watching a couple of videos to find out what the course is about. Of the 50,000, about 21,000 students participated in the interactive in-video quizzes that are part of the lectures, and a remarkable 18,000 unique students attempted at least one programming assignment. A whopping 9,593 students successfully completed the course and earned a certificate of completion— that’s an incredible 20% of students, which blows the average 10% rate of completion for MOOCs out of the water.
- Functional Programming Principles in Scala: Impressions and Statistics By Heather Miller and Martin Odersky
With the blessing of the management team we decided to allocate one day a week for the length of the course to watching video lectures and working on weekly assignments. On Thursday morning we would meet together in a room, discuss last week's assignment and watch the lectures for the week. We often paused and attempted to solve in-lecture exercises and to discuss the concepts being taught.
Some of us had doubts about how useful a functional programming course would be in day-to-day work at Empathica, but with the explosion of LISP dialects in recent years and the need for side-effect free code that can run concurrently there's been a huge demand in the industry to apply these skills to new systems. The course exercises do involve a lot of algorithms and basic data structures, but the course is also an excellent introduction to Scala features such as pattern matching, for comprehensions, Scala's collection types, and much more. I would strongly recommend this course to anyone interested in learning Scala, functional programming, or both!
As I mentioned earlier, there are many vendors that offer a variety of Scala training options. tindr.co not only runs training courses based on Typesafe curriculum, but they've also introduced a new program called the Scala Developer Factory, a rigorous training course that promises to deliver skilled and productive Scala developers! I've kept a line of communication open with Mike and Eric at tindr.co to discuss possible future on-site training options for the developers at Empathica.
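For readers who haven't seen these features, here is a small sketch of the three just mentioned. It is not taken from the course material; the Expr type and the function names are made up for illustration.

```scala
object FpSketch {
  // A small algebraic data type for arithmetic expressions.
  sealed trait Expr
  final case class Num(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr
  final case class Mul(left: Expr, right: Expr) extends Expr

  // Pattern matching: each case deconstructs one shape of Expr.
  def eval(expr: Expr): Int = expr match {
    case Num(v)    => v
    case Add(l, r) => eval(l) + eval(r)
    case Mul(l, r) => eval(l) * eval(r)
  }

  // For comprehension: two generators and a guard, yielding ordered pairs
  // (a, b) with a <= b whose sum equals the target.
  def pairsSummingTo(n: Int, target: Int): Seq[(Int, Int)] =
    for {
      a <- 1 to n
      b <- a to n
      if a + b == target
    } yield (a, b)

  // Collection combinators: group words by length, then count each group.
  def countByLength(words: List[String]): Map[Int, Int] =
    words.groupBy(_.length).map { case (len, ws) => len -> ws.size }
}
```

None of these functions mutates anything, which is the side-effect free style the course pushes you toward.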
Embracing the Scala community
Toronto has an active Scala community. Several meetups are planned each year and are usually hosted at software shops that have adopted Scala into their organizations. The Toronto Scala Meetup has run meetups for a few years. Chris Dinn has often organized these events. Speakers from the community are invited to present on any Scala related subject they like. Representatives from Typesafe and tindr.co often show up as well.
Katrin proposed the idea that we host a meetup at Empathica's downtown office. At the time, our office was a dingy top floor on Peter St. atop an infamous night club in Toronto's entertainment district called "Time". Plans to host the meetup were shelved until we set up shop in our new digs on Spadina Ave. With renewed vigor, Katrin began preparing a schedule. Our RSVP list grew. We ran the event with 3 speakers and representatives from both Typesafe and tindr.co. Empathica provided the space plus refreshments and the folks at tindr.co were kind enough to buy pizzas for a group of roughly 50 people. It was a huge success and the social gathering after the talks allowed for lots of networking and prompted many interesting discussions. I can safely say on behalf of the devs at Empathica that we're psyched to continue to attend, contribute to, and host these events in the future.
Reflection
There are now several dev teams at Empathica working on Scala projects. Tool chains are being developed. Coding standards are being established and enforced. Libraries are being standardized across our projects. It's starting to return to a comfortable pace of software development for everyone.
That's it. I'm known for being verbose so I apologize for the lengthy read, but I wanted to get this story out in its entirety if for no other reason than for Empathica's own posterity. It wasn't an easy transition. Looking back, there are things I probably would have done differently, but I think with the information available we did a pretty good job. I hope that for those of you thinking about a technology change, our experiences help you come up with a plan to bring about that change. I'm sure there are also some people reading who have gone through such a change and I welcome your input in the comments, whether good or bad.
If I were involved in a significant technology change again I would definitely take a different tack both in its proposition and implementation. Some things I would do differently and some things I would not. I've compiled a summary of what I think are the most important matters to address.
- Consensus building. Get the whole development team's input and suggestions. Be as accommodating as you can, but don't expect unanimity. It's important to get input from everyone you work with on a daily basis because they will all be affected by the decision.
- Start with something small that has the potential to grow into something big.
- Budget for training in your iteration/sprint/project plan. Some people prefer learning on their own by reading a book, some like the classroom environment, and some just want to start writing code. See who falls into what group and try to accommodate them all.
- Ask hiring prospects how they would feel if they worked with a different technology than one they have experience with.
- Include OSS experience as part of your hiring criteria. Either experience contributing to specific projects or experience working with open source languages and frameworks.
- Be patient. Significant technology changes don't happen overnight. But with the right people and the right attitude amazing things can happen.
I'll conclude this post with a link to a little bit of internet history. This video made the rounds in 2010 and although it does not specifically endorse Scala it's still a harrowing story about leaving the Microsoft nest and embracing open source technologies. Plus it co-stars the lovely Scala Johansson :)
(NSFW, depending on where you work!)
EDIT 08/09/14: If you would like to know more about our transition from .NET to Scala then check out my follow up post: .NET to Scala developer Q&A with Typesafe.