Engineering Blog - October 7, 2014

Zoosk Test Tool “Test2k”

Jay Kremer

Using production data in a preproduction environment.

Testing a service with millions of users presents many challenges. No matter how many scenarios our team can envision and test, our users will inevitably find paths and situations we haven’t anticipated. These challenges are amplified in an engineering environment that ships as quickly as Zoosk, where we take pride in moving fast and trying new things. It’s not unusual for a new feature to go from concept to testable code within a few days. Our QA team’s vision is to validate features as fast as the developers can produce them. This requires us to maintain a test environment that allows us to test a massive number of scenarios in an extremely short period of time.

We have a number of ways to accomplish this: automation tools, multiple configurable environments, and good relationships and lines of communication between our developers and product managers. But our most valuable tool by far is a set of user data that is preinstalled on our test servers. This takes the form of test users with pictures and profiles already in place. Once a server is installed, we can log in to an existing user account, and browse and interact with other users.

Our first set of test user data was created entirely in-house. The user pictures were from sources freely available on the Internet, and the profile entries all consisted of “Lorem Ipsum” text to fill the page. Five hundred users with these characteristics were preloaded on every server in the test or dev environment. Upon installing a new feature, a developer or tester could immediately interact with it by accessing a test account. This shortened manual test time significantly, but lacked the completeness of actual production data, not to mention the turbulence and unpredictability that real users bring to any web service. The metadata from our users contains a lot more than just a picture and some profile data; it includes messages between users, connections, interests, winks, and the myriad data points that make up the core of our Behavioral Matchmaking™ engine. Without this kind of data, our test environments would only be a pale shadow of our production experience.

A while back, my team launched a project called “Test2k” to recreate real production data on our test servers, and build a set of test users with the kind of metadata we would expect from a production user account. This project had three main requirements: (1) respect the privacy and the data of our customers, as is standard for all projects we do at Zoosk; (2) include real production metadata for our default test users; and (3) make it easily installable, so any developer could make use of this test tool.

We started by copying a cross section of metadata from some of our most active users. We cleaned all the personal data out of this selection: we replaced user names with randomly selected names from our own list, changed their emails to Zoosk test accounts, and replaced the contents of conversations and profiles. Instead of using “Lorem Ipsum”, we used text from freely available online books, to better emulate real human writing. We also tweaked the data so that existing connections and conversations matched up with other members of the same data set. In other words, if a test user had a conversation with a user who was not in the data set, that conversation metadata was reconnected to a user who was within the data set.
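To make the scrubbing and reconnecting steps concrete, here is a minimal Python sketch of the idea. It is not the Test2k code: the `User` and `Conversation` records, the function names, and the `test.example` email domain are all hypothetical, and the real data model is certainly richer. The sketch also assumes that a conversation with neither participant in the data set is simply dropped, which the process described above does not specify.

```python
import random
from dataclasses import dataclass


@dataclass
class User:  # hypothetical record, not the real schema
    user_id: int
    name: str
    email: str
    profile_text: str


@dataclass
class Conversation:  # hypothetical record, not the real schema
    sender_id: int
    recipient_id: int
    body: str


def scrub_users(users, name_pool, filler, rng):
    """Keep each user's id (so metadata still links up) but replace personal fields."""
    return [
        User(
            user_id=user.user_id,
            name=rng.choice(name_pool),
            email=f"testuser{index}@test.example",  # placeholder test domain
            profile_text=rng.choice(filler),
        )
        for index, user in enumerate(users)
    ]


def _stand_in(outside_id, partner_id, in_set, stand_ins, rng):
    """Pick a data-set member to replace an outside user, reusing earlier picks."""
    key = (outside_id, partner_id)
    if key not in stand_ins:
        # Never pair a user with themselves.
        stand_ins[key] = rng.choice([u for u in in_set if u != partner_id])
    return stand_ins[key]


def remap_conversations(conversations, user_ids, filler, rng):
    """Replace message bodies and reconnect outside participants to members."""
    in_set = sorted(set(user_ids))  # needs at least two members
    members = set(in_set)
    stand_ins = {}
    remapped = []
    for conv in conversations:
        sender, recipient = conv.sender_id, conv.recipient_id
        if sender not in members and recipient not in members:
            continue  # assumption: unrelated conversations are dropped
        if sender not in members:
            sender = _stand_in(sender, recipient, in_set, stand_ins, rng)
        elif recipient not in members:
            recipient = _stand_in(recipient, sender, in_set, stand_ins, rng)
        remapped.append(Conversation(sender, recipient, rng.choice(filler)))
    return remapped
```

Caching the stand-in per pair of participants keeps both directions of a thread attached to the same replacement user, so a scrubbed conversation still reads as a back-and-forth exchange between two test accounts.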

The resulting test users had the data profile of an existing production user. When a developer does a fresh install of the Zoosk product on a test server, they can log in to a set of users who have existing conversations, interests, carousel matches, flirts from other users, views, and a host of other metadata to choose from. For all intents and purposes, it looks like logging in to production accounts with production data. We also included a set of international users from different countries where Zoosk is popular, so we can log in as a user from any of those countries, with the matching language and settings.

This gives us a much more realistic and powerful preproduction environment to test our Behavioral Matchmaking™ features, and helps us catch important issues before they reach our users.