Engineering Blog - March 6, 2015

The Magic of Automated Testing at Zoosk

Astro Ashtaralnakhai

When I came to Zoosk in September of 2013, I didn’t know much about automated testing or how helpful it could be. As far as I knew, Quality Assurance (QA) testing was done manually and did not involve programming skills. I soon discovered the impact that automated testing could have on the productivity of a QA team. I also found that writing a test automation framework could be an amazing experience, interlaced with some interesting challenges.

To give some insight into how we perform automated testing at Zoosk, I’ve broken down how our test lab is set up and what our framework code does.

Zoosk Automation Lab

Jenkins Server – The brains behind the operation. This creates and delegates jobs to all the “slave” nodes.
TestRail Server – Test case management software with some helpful .NET APIs found on NuGet.
75 Windows 7 Virtual Machines (VMs) – Each Windows VM is a Jenkins “slave” node which can run Android or browser tests.
8 Mac minis – Each Mac mini is a Jenkins “slave” node which can run iOS, Android, or browser tests.
4 Test Server VMs – Each one is a self-contained pre-production test environment.

Automated Test Framework

The logic behind our framework is to gather the list of tests from TestRail, trigger each test to run on a Jenkins slave node, and then update the TestRail plan with the result.
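The gather, run, and report loop described above can be sketched roughly as follows. This is a minimal illustration, not our actual framework code: `PlannedTest`, `run_plan`, and the three callables are hypothetical names, and the real TestRail and Jenkins calls are deliberately left behind plain function parameters.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class PlannedTest:
    """One entry in a test plan (hypothetical shape, for illustration)."""
    case_id: int
    name: str


def run_plan(
    fetch_tests: Callable[[], Iterable[PlannedTest]],
    run_on_node: Callable[[PlannedTest], bool],
    report_result: Callable[[PlannedTest, bool], None],
) -> Dict[int, bool]:
    """Gather the tests, run each one, and report each result.

    fetch_tests stands in for reading the plan from the test case manager,
    run_on_node for triggering a job on a build node, and report_result
    for writing the outcome back to the plan. All three are assumptions.
    """
    results: Dict[int, bool] = {}
    for test in fetch_tests():
        passed = run_on_node(test)      # True means the test passed
        report_result(test, passed)     # update the plan with the outcome
        results[test.case_id] = passed
    return results
```

Keeping the three steps behind plain callables means the loop can be exercised with fakes before it is wired to real servers.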

We use Selenium to test the web platform and Appium to test the mobile applications (iOS and Android).

We have two types of tests: parallel and serial. Parallel tests can run at the same time as other parallel tests against the same server without affecting each other’s outcomes. Serial tests can only run against a server one at a time. They typically modify the database, which makes them destructive.
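As a rough sketch of the distinction, the snippet below runs the parallel tests concurrently in a thread pool and then runs the serial tests one after another. The function name `run_suite`, the `(name, kind)` tuples, and the choice to run parallel tests first are all assumptions made for illustration, with a single server in mind.

```python
from concurrent.futures import ThreadPoolExecutor


def run_suite(tests, run_test, max_workers=4):
    """Run parallel tests concurrently, then serial tests one at a time.

    tests is an iterable of (name, kind) pairs where kind is either
    "parallel" or "serial"; run_test is a callable taking a test name
    and returning its outcome. Both shapes are hypothetical.
    """
    tests = list(tests)
    parallel = [name for name, kind in tests if kind == "parallel"]
    serial = [name for name, kind in tests if kind == "serial"]

    results = {}
    # Parallel tests do not interfere with each other, so they can overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, outcome in zip(parallel, pool.map(run_test, parallel)):
            results[name] = outcome

    # Serial tests modify shared state, so each one gets the server alone.
    for name in serial:
        results[name] = run_test(name)
    return results
```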

Over the last year and a half, we have transformed the initial automation framework into a faster and smarter application. The two main bottlenecks we’ve tackled have been 1) increasing the number of parallel tests which can run at one time and 2) reducing the number of serial tests.

We tackled these bottlenecks in two phases:

Phase 1: To increase the number of parallel tests we could run concurrently, we did the simplest thing possible: we created more servers to test against. (We now have 4 servers in total.) The automation code then had to decide which tests run on which server. Our approach was to send all the parallel tests to one server and spread the serial tests across the other three. Once all the serial tests were done, we pointed the remaining parallel tests at those three servers as well.
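One way to picture the Phase 1 scheduling is as a set of queues. In the sketch below, `build_queues` and `next_test` are hypothetical helpers, the round-robin split of serial tests is an assumption, and an empty serial queue is used as a stand-in for "all the serial tests are done".

```python
from collections import deque


def build_queues(parallel_tests, serial_tests, serial_servers):
    """Create one shared parallel queue and one serial queue per server."""
    parallel_queue = deque(parallel_tests)
    serial_queues = {server: deque() for server in serial_servers}
    # Assumption: serial tests are dealt out round-robin.
    for index, test in enumerate(serial_tests):
        server = serial_servers[index % len(serial_servers)]
        serial_queues[server].append(test)
    return parallel_queue, serial_queues


def next_test(server, parallel_queue, serial_queues):
    """Return the next test for a server, or None if it should wait."""
    own_queue = serial_queues.get(server)
    if own_queue:
        return own_queue.popleft()
    # A serial server only takes parallel work once every serial queue
    # is empty; the dedicated parallel server never has to wait.
    serial_pending = any(serial_queues.values())
    if server in serial_queues and serial_pending:
        return None
    return parallel_queue.popleft() if parallel_queue else None
```

With servers A through D, where A is the dedicated parallel server, A starts draining the parallel queue immediately, while B, C, and D only join in after the last serial test has been handed out.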

Phase 2: To reduce the number of serial tests, we investigated why these particular tests were destructive. In most cases, the cause was the database: a test would truncate a database table when it started. Since multiple tests needed those tables, the tests would interfere with each other. We needed a minimally invasive solution. Ours involved a locking mechanism, deleting individual table rows instead of truncating the whole table, or both, which helped us reduce the number of serial tests by about 90%.
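To illustrate the idea, here is a small sketch that uses SQLite and an in-process lock. Everything in it is an assumption made for the example: the `messages` table, the `owner` column, and the helper names are hypothetical, and a real setup would need a lock that works across machines rather than across threads.

```python
import sqlite3
import threading
from contextlib import contextmanager

_table_locks = {}
_registry_lock = threading.Lock()


@contextmanager
def table_lock(table):
    """Hold a per-table lock so only one test resets that table at a time."""
    with _registry_lock:
        lock = _table_locks.setdefault(table, threading.Lock())
    with lock:
        yield


def delete_rows_for_test(conn, owner):
    """Delete only the rows a test owns instead of truncating the table.

    Returns the number of rows removed. Rows that belong to other tests
    are left alone, so those tests can keep running at the same time.
    """
    with table_lock("messages"):
        cursor = conn.execute("DELETE FROM messages WHERE owner = ?", (owner,))
        conn.commit()
        return cursor.rowcount
```

Because the cleanup touches only one test's rows, a test that used to need the whole server to itself can be reclassified as parallel.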

The Numbers

Our results have been amazing. By finding the bottlenecks in our test harness, we were able to increase the number of tests while dramatically reducing the time they take to complete, from about 14 hours to about 1 hour: a 14x speedup.

Date	# of Test Scripts	Time to Complete
September 2013	297	~14 hours
July 2014	340	~4 hours
January 2015	427	~1 hour

These results far exceeded our expectations. Going forward, we will continue to find ways to expand our test automation coverage and maintain acceptable throughput levels.

Originally posted on the Astro Ashtaralnakhai blog.