Engineering Blog - June 29, 2015

Supercharge your API development with code generation

Brian Backhaus - VP of Engineering

At Zoosk Engineering, one of our biggest focuses is making everything we do as simple and efficient as possible. That applies to everything from our code and tooling to our processes, and it means constantly experimenting with new ideas to improve in these areas. One of those experiments, which is now a permanent part of how we work, was using code generation to create server/client contracts.

The Old Process

Zoosk, like most current web applications, uses a RESTful API as the contract between server and client. Our engineering teams are organized around this separation: API, Web, Mobile Web, iOS, and Android. The following graph shows a typical feature’s lifecycle from an API/client tech kickoff to feature completion.

The first step is a technical kickoff between the client and server teams, in which they agree on the contract for any new endpoints that need to be created. The server team then documents the agreed-upon endpoints on our wiki and starts implementing them. With the contracts settled, the client teams have what they need to start building the model objects and parsers that correspond to the endpoint spec. When those are complete, the client teams can build a subset of the views/controllers while the API team finishes the endpoints in parallel. But client work can progress only so far before those teams are blocked by the lack of functioning API endpoints. When that happens, developers typically shift to other projects while they wait for the endpoints to be finished.

Drawbacks

While this process worked well for us for a while, we began to notice a couple of areas we could improve.

Context Switching

If the server and client teams begin development of a feature at the same time (which often happens at Zoosk because of how quickly we move), we always lose some time when clients become blocked by the server team. The cost of context switching as a whole is outside the scope of this article, but suffice it to say that it’s significant. We do a good job of limiting that time by mocking out endpoints on some of the clients, but that requires work across all of our client teams.

One idea to limit context switching is to ensure all API endpoints are completed a sprint before client teams start working on them. There are a few issues with this process in practice.

The overall timeline of a feature goes up. We work in two-week sprints, so every decent-sized feature now takes a minimum of four weeks to go through our pipeline.
Managing these dependencies between teams adds overhead to the process. It’s not huge, but when you have a number of projects you’re tracking, it adds up.
Endpoints built in isolation often need to be reworked. Because clients are busy on other projects and collaborate little while the endpoints are being built, we often end up backtracking and changing completed endpoints.

This approach solves our context switching problem, but it introduces inefficiencies in other areas of our process.

Boilerplate Code

Another issue we noticed was the large amount of boilerplate code that the server and client teams had to rewrite, albeit slightly differently, for every API endpoint. Any time you’re spending writing boilerplate code could be time spent NOT writing boilerplate code. Boilerplate is redundant by definition, which always makes it a candidate for code generation. Each team alleviated some of these issues using IDEs or tools built within the team, with varying degrees of success.

In PHP

On the server side, we have a lot of code to create and handle the responses exchanged with the clients.

https://gist.github.com/bdb1234/1dfbf1080bf332dbcf27

In JavaScript

We use the Google Closure Compiler and Tools extensively at Zoosk, which led us to a one-of-a-kind integration of Angular and Closure for our web and mobile web single-page applications (SPAs). That said, as great as these tools have been for supporting an SPA with over 500K lines of JavaScript, they are definitely not easy on the typing. The JSDoc annotations and coding patterns have been great for code quality and stability but require a lot of extra keystrokes. For example, every new API endpoint requires a few different JavaScript classes to get the models ready for consumption by our views and controllers.

https://gist.github.com/bdb1234/e12732ec62ad574f7ced

In Java

On the Android client, the patterns we follow are a lot less verbose, but there is still boilerplate code we have to write for every new endpoint we build.

https://gist.github.com/bdb1234/1f6623897bdfe3f8a710

In our Wiki

We keep very thorough documentation that must be created for all API endpoints.

Enter Code Generation

Because an API is by definition a well-defined contract between the server and clients, we realized we should be able to build code generation tools that eliminate most of the redundant code. We named this new codebase Lotus. Lotus is written in Python and provides an Abstract Syntax Language called LSL (Lotus Syntax Language) for defining model and endpoint specifications. Using Lotus, we can generate code for all platforms that depend on the API interface.

Here’s an example of what LSL looks like:

https://gist.github.com/bdb1234/b42738a1f5c373c08e3f

All the code referenced earlier in the article is generated from the minimal LSL shown above. This includes:

PHP code for our API team to parse and build responses
HTML for the documentation for our internal Wiki
Google Closure Code for models and parsers for Web and Mobile Web teams
Java code for Android data models
XML for .plist files for iOS
Unit test skeletons
Live endpoints with mock data
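
To make the idea concrete, here is a minimal sketch of how a single specification can drive several of these outputs. It is not Lotus itself: the `USER_SPEC` dictionary only stands in for an LSL definition (the real syntax is in the gist above), and `render_java_model`, `render_html_doc`, the type mapping, and the field names are all hypothetical, chosen purely for illustration.

```python
import html

# Illustrative sketch only: a hypothetical spec format, not real LSL or Lotus code.
# One model definition, written once and shared by every generator.
USER_SPEC = {
    "name": "User",
    "fields": [
        {"name": "id", "type": "int", "doc": "Unique user identifier"},
        {"name": "displayName", "type": "string", "doc": "Name shown on the profile"},
        {"name": "isOnline", "type": "bool", "doc": "Whether the user is currently online"},
    ],
}

# Mapping from the spec's abstract types to Java types (assumed for this example).
JAVA_TYPES = {"int": "int", "string": "String", "bool": "boolean"}


def render_java_model(spec):
    """Return the source of a plain Java data class for the given spec."""
    lines = ["public class %s {" % spec["name"]]
    for field in spec["fields"]:
        lines.append("    public %s %s;" % (JAVA_TYPES[field["type"]], field["name"]))
    lines.append("}")
    return "\n".join(lines)


def render_html_doc(spec):
    """Return an HTML table documenting each field of the given spec."""
    rows = [
        "<tr><td>%s</td><td>%s</td><td>%s</td></tr>"
        % (field["name"], field["type"], html.escape(field["doc"]))
        for field in spec["fields"]
    ]
    return "<h2>%s</h2>\n<table>\n%s\n</table>" % (spec["name"], "\n".join(rows))


if __name__ == "__main__":
    # Two different artifacts produced from the same definition.
    print(render_java_model(USER_SPEC))
    print(render_html_doc(USER_SPEC))
```

The point of the design is that a new field is added in one place, the spec, and every generated artifact picks it up on the next run.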

Mock Endpoints on the Server

The other critical improvement Lotus provides us with is the ability to generate mock endpoints as soon as the API contract is agreed on between server/clients. Using LSL, we can generate functional endpoints on the server using mock data based on the data types that LSL supports. This essentially allows us to have working endpoints on day one and permits the clients a much longer window to develop before becoming blocked.
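
As a rough illustration of type-driven mock data, the sketch below builds a placeholder response from a model definition. The spec format, the `list<...>` notation, and the `mock_value` and `mock_response` helpers are assumptions made for this example; they are not the data types or code that Lotus actually uses.

```python
import json
import random

# Hypothetical spec format used only for this illustration (not real LSL).
PROFILE_SPEC = {
    "name": "Profile",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "displayName", "type": "string"},
        {"name": "isOnline", "type": "bool"},
        {"name": "photoUrls", "type": "list<string>"},
    ],
}


def mock_value(type_name, rng):
    """Produce a placeholder value for one abstract type."""
    if type_name.startswith("list<") and type_name.endswith(">"):
        inner = type_name[len("list<"):-1]
        # Lists get two elements of the inner type.
        return [mock_value(inner, rng) for _ in range(2)]
    if type_name == "int":
        return rng.randint(1, 1000)
    if type_name == "string":
        return "mock-%04d" % rng.randint(0, 9999)
    if type_name == "bool":
        return rng.choice([True, False])
    raise ValueError("unsupported type: %s" % type_name)


def mock_response(spec, seed=0):
    """Build a mock response body whose shape matches the spec."""
    rng = random.Random(seed)  # seeded so repeated calls return the same payload
    return {field["name"]: mock_value(field["type"], rng) for field in spec["fields"]}


if __name__ == "__main__":
    print(json.dumps(mock_response(PROFILE_SPEC), indent=2))
```

Serving a payload like this from the real route is enough for clients to exercise their parsers and views before the endpoint's actual logic exists.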

Old Process

In the chart below, you can see how the timeline of our feature development has changed significantly since the advent and maturation of Lotus. The boilerplate code has disappeared, and so has the context switching cost.

New Process

Code Generation Options

Lotus was developed in-house for the API we’ve built through the years, but there’s one promising code generation tool I’ve seen that’s built around a similar idea. It’s called Swagger. Swagger lets you describe a RESTful API and generate the corresponding client SDKs, much as Lotus does. Adopting it would be impractical if you already support a large API codebase, as we do at Zoosk, but it seems very appealing for greenfield development.
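
For a feel of what a Swagger description contains, the sketch below assembles a minimal Swagger 2.0 document as a Python dictionary. The top-level keys (`swagger`, `info`, `paths`, `definitions`) come from the Swagger 2.0 specification; the `User` model, the `/users/me` path, the API title, and the helper functions are hypothetical and have nothing to do with Zoosk's actual API.

```python
import json

# Hypothetical model definition, in a made-up spec format (not LSL).
USER_SPEC = {
    "name": "User",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "displayName", "type": "string"},
        {"name": "isOnline", "type": "bool"},
    ],
}

# Mapping from the example's abstract types to Swagger 2.0 schema types.
SWAGGER_TYPES = {
    "int": {"type": "integer", "format": "int32"},
    "string": {"type": "string"},
    "bool": {"type": "boolean"},
}


def to_swagger_definition(spec):
    """Convert the example spec into a Swagger 2.0 schema object."""
    return {
        "type": "object",
        "properties": {
            field["name"]: dict(SWAGGER_TYPES[field["type"]])
            for field in spec["fields"]
        },
    }


def build_swagger_document(spec, path):
    """Wrap the model in a minimal Swagger 2.0 document with one GET endpoint."""
    name = spec["name"]
    return {
        "swagger": "2.0",
        "info": {"title": "Example API", "version": "1.0.0"},
        "paths": {
            path: {
                "get": {
                    "produces": ["application/json"],
                    "responses": {
                        "200": {
                            "description": "A %s object" % name,
                            "schema": {"$ref": "#/definitions/%s" % name},
                        }
                    },
                }
            }
        },
        "definitions": {name: to_swagger_definition(spec)},
    }


if __name__ == "__main__":
    print(json.dumps(build_swagger_document(USER_SPEC, "/users/me"), indent=2))
```

A document like this is what Swagger's tooling consumes when it generates client SDKs and documentation.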

We’re always trying out different ways to be faster and work more efficiently at Zoosk, and in this case, leveraging code generation in Lotus has fundamentally changed the pace at which we’re able to ship features. Happy coding!