One of the things we pride ourselves on here at Zoosk is the user experience, and one key facet of that experience is performance. Performance is often assumed during the development cycle, but it is rarely called out explicitly as a “feature.” A blog post on High Scalability put it best:
The less interactive a site becomes the more likely users are to click away and do something else. Latency is the mother of interactivity. Though it’s possible through various UI techniques to make pages subjectively feel faster, slow sites generally lead to higher customer defection rates, which lead to lower conversion rates, which results in lower sales. Yet for some reason latency isn’t a topic talked a lot about for web apps. We talk a lot about building high-capacity sites, but very little about how to build low-latency sites. We apparently do so at the expense of our immortal bottom line.
Many techniques have been developed over the years to address the issue of latency and interactivity, the most obvious being the rise of Web 2.0 and the dynamic loading of content after page render. Ajax has been the panacea of low-latency websites, allowing for non-blocking page loads and deferring high-cost operations until that content is needed or requested. This has gotten us far, but we can do better.
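For contrast with the BigPipe approach described below, a conventional Ajax deferral looks roughly like this sketch (the /api/inbox endpoint and the #inbox element are invented for illustration): the page renders first, and the expensive content is fetched only after load.

```javascript
// Conventional Ajax deferral: render the page shell first, then fetch the
// expensive content once the page has loaded. The /api/inbox endpoint and
// the #inbox element are hypothetical, used only to illustrate the pattern.
window.addEventListener('load', function () {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/api/inbox');
  xhr.onload = function () {
    document.getElementById('inbox').innerHTML = xhr.responseText;
  };
  xhr.send();
});
```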
In 2010, Facebook published a blog post entitled BigPipe: Pipelining web pages for high performance. In it, they outline a rather novel solution for delivering highly dynamic, content-driven websites with low latency. I will leave you to read the article for the specifics, but to summarize: a page is broken down into highly decoupled widgets, or pagelets as they call them. Each pagelet is represented on the page by a div that acts as a placeholder for the content once it has been processed. This skeleton is flushed to the user immediately; however, instead of closing the connection, the server keeps the request alive. In the background, asynchronous processes fetch and process the data for each pagelet. Upon completion, each process flushes its data to the page as JSON, which initializes the pagelet with the desired content. Once all processes have completed, the request is closed and the page is “complete.” Facebook found that by moving to BigPipe, they were able to cut time-to-interact latency roughly in half. This is a significant win, and one that I wanted to see if we could replicate here at Zoosk.
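To make that flow a bit more concrete, here is a minimal sketch of the client-side glue such a page might use. The pagelet IDs, the payload shape, and the onPageletArrive helper are illustrative assumptions of mine, not Facebook’s actual BigPipe code.

```javascript
// Hypothetical client-side glue for a BigPipe-style page.
// The skeleton the server flushes first contains empty placeholders such as:
//   <div id="pagelet_profile"></div>
//   <div id="pagelet_inbox"></div>
// Each later chunk from the server is a <script> tag that calls
// onPageletArrive() with one pagelet's JSON payload.
function onPageletArrive(payload) {
  // Drop the rendered markup into the matching placeholder div.
  document.getElementById(payload.id).innerHTML = payload.html;

  // Load any scripts this pagelet declared so it can initialize itself.
  (payload.js || []).forEach(function (src) {
    var script = document.createElement('script');
    script.src = src;
    document.head.appendChild(script);
  });
}

// A flushed chunk might therefore look like:
// <script>onPageletArrive({"id":"pagelet_inbox","html":"<ul>...</ul>","js":[]});</script>
```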
In order to test BigPipe with our product, I decided to try out a new stack, one which lent itself more readily to asynchronous programming. We have typically been a LAMP shop here at Zoosk, but for this, I decided to use Node.js. Having most recently rebuilt our site as an HTML5 single-page application targeting smartphones, I found the JavaScript event loop ideally suited to the task at hand. For the test case, I chose to rewrite our internal admin tool. While not a user-facing application, the admin tool is the backbone of our user operations team and an ideal candidate for optimization. The end result? We reduced TTFB (time to first byte) by 50 percent, and page load time by 35-50 percent. For an initial, unoptimized proof of concept, this was a huge win. I am convinced that with further tuning, we could easily drop the latency even further.
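To give a flavor of how this pattern fits the Node.js event loop, here is a simplified, self-contained sketch of a BigPipe-style handler. This is not our admin tool’s actual code; fetchProfile() and fetchInbox() are hypothetical stand-ins for slow backend calls, and the pagelet names are invented.

```javascript
// Sketch of a BigPipe-style handler in Node.js: flush the skeleton right away,
// stream each pagelet as its data arrives, and only then end the response.
var http = require('http');

// Hypothetical stand-ins for slow backend work (database calls, service calls).
function fetchProfile() {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve('<h2>Profile</h2>'); }, 300);
  });
}
function fetchInbox() {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve('<ul><li>3 new messages</li></ul>'); }, 800);
  });
}

http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/html' });

  // 1. Flush the skeleton (placeholders plus the client-side glue) immediately.
  res.write(
    '<html><body>' +
    '<div id="pagelet_profile"></div>' +
    '<div id="pagelet_inbox"></div>' +
    '<script>function onPageletArrive(p){' +
    'document.getElementById(p.id).innerHTML = p.html;}</script>'
  );

  // 2. Stream each pagelet to the browser as soon as its data is ready.
  function flush(id, html) {
    res.write('<script>onPageletArrive(' +
      JSON.stringify({ id: id, html: html }) + ');</script>');
  }

  var pagelets = [
    fetchProfile().then(function (html) { flush('pagelet_profile', html); }),
    fetchInbox().then(function (html) { flush('pagelet_inbox', html); })
  ];

  // 3. Close the response only once every pagelet has been delivered.
  Promise.all(pagelets).then(function () {
    res.end('</body></html>');
  });
}).listen(3000);
```

In this sketch the browser can start rendering (and the user can start interacting with) the skeleton while the slower inbox pagelet is still being computed, which is where the TTFB and page-load improvements come from.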
While the Facebook article does a good job outlining the BigPipe methodology at a high level, I want to draw attention to a few items that were not immediately obvious candidates for optimization:
Over the coming months, we will be applying the lessons learned to our core product, resulting in a snappier, more delightful Zoosk.