Engineering Blog - March 28, 2011

What is “quality” and how do we make it?

Doug Wehmeier

My name’s Doug Wehmeier, and I’m the lead engineer for mobile development here at Zoosk. I write technical articles about mobile development on my own blog from time to time, so here on our team blog I decided I want to focus more on the cultural aspects of what we do here at Zoosk. That said, for my first article I thought I’d start with the basics and talk a little bit about how we maintain and measure quality when we release products. This stuff is by no means unique to us–in fact most of this is pretty standard. Still, it’s worth starting at the beginning.

When we talk about quality here at Zoosk, we’re talking about a few different things:

Does the product work?
Does the product delight the customer?
Does the customer find value in the product?

Each of these topics is worth taking a look at individually, so I’ll touch on each of them in time, but for now let’s talk about the first step, making sure things work.

Does the product work?

The first step towards success is making a product that works consistently. Here at Zoosk, we measure this primarily in terms of availability. If our website is down, or our iPhone application crashes, then our product is not available. Worse, if it crashes, that means that it’s not available at the exact moment a customer wants to use it! That’s obviously a really bad time to be unavailable.

In addition to making sure our product is available, we also want to make sure that every part of the experience is working as intended. For now, our focus is only on whether things work or not. There are many other questions we want to ask later about how they work, but step one is to just make sure the product does what we tell the customer it’s going to do. If they send a message to another user, we want to make sure it gets there. This is really important stuff. Just one “lost” message, and a user’s confidence may be damaged forever. It takes a long time to build that confidence up, so keeping it is incredibly important.

To make sure we deliver that message each and every time, we use a few different processes:

Best Practices
Unit Testing
Functional Testing
Interaction Testing
Ongoing QA Testing
Metrics

Using best practices doesn’t mean that we don’t innovate. Far from it. It means that when one person figures out the best way to do something, they share it with the team. This is both the most important and the most difficult step in ensuring quality. When you have 3 or 4 developers, it’s not a problem. But when you have 30, 100 or 1000 programmers, it becomes extremely difficult. Documentation, wikis and mentoring are all things we do to manage our best practices here.

Proper unit testing is a polarizing topic. Most people would agree that unit tests are good. The debate comes up over when to write them. You might have heard of “test-driven development”. I’ll save my own opinions on the matter for another post, but I think it is fair to say that proper unit testing depends on the situation. Test-driven development has many benefits and some people swear by it. Here at Zoosk, we find the regression detection properties of unit tests to be the most valuable part, and you get that part regardless of when you write your tests. Earlier is better, of course, but we find that writing tests too early when developing large systems restricts your freedom to make radical changes. Really, the biggest benefit we get from unit testing is increased confidence to make changes to code, knowing that if we do something really bad, it will trigger a unit test failure.
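To make the regression-detection point concrete, here is a minimal sketch in Python using the standard unittest module. The truncate_message helper and its 140-character limit are hypothetical, invented purely for illustration; they are not taken from our codebase.

```python
import unittest


def truncate_message(text: str, limit: int = 140) -> str:
    """Hypothetical helper: shorten a message to at most `limit` characters."""
    if len(text) <= limit:
        return text
    # Reserve three characters for the trailing ellipsis.
    return text[: limit - 3] + "..."


class TruncateMessageTest(unittest.TestCase):
    """Pins down current behavior so a later refactor cannot silently change it."""

    def test_short_message_is_unchanged(self):
        self.assertEqual(truncate_message("hi"), "hi")

    def test_long_message_fits_limit(self):
        result = truncate_message("x" * 200, limit=10)
        self.assertEqual(len(result), 10)
        self.assertTrue(result.endswith("..."))
```

Saved in a test module, these can be run with python -m unittest. If someone later rewrites truncate_message and forgets to leave room for the ellipsis, the second test fails right away, no matter whether the tests were written before or after the original code.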

Functional tests are pretty similar to unit tests. We run suites of them against our builds, and they fail or succeed in similar ways. The difference is that unit tests are deterministic, and functional tests are not. A functional test might contact a server that will probably respond, or fire events at random intervals that **should** generally behave the same. We usually fire these off along with nightly builds. We take failures here with a grain of salt, but it’s a good early warning system for systemic problems.
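As a rough illustration of a test that depends on a server answering, here is a sketch in Python using only the standard library. Everything in it is an assumption made for the example: the PingHandler stand-in server, the /ping path, and the retry count are hypothetical, and a real functional test would point at an actual staging or nightly-build server instead.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real service under test."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def check_server_responds(url, attempts=3, timeout=2.0):
    """Return True if any attempt gets an HTTP 200 response from the server."""
    # Ignore proxy settings so a request to a local server stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for _ in range(attempts):
        try:
            with opener.open(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection errors and timeouts; try again.
            continue
    return False


def run_functional_check():
    """Start the stand-in server on a free port and check that it responds."""
    server = HTTPServer(("127.0.0.1", 0), PingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return check_server_responds(f"http://127.0.0.1:{port}/ping")
    finally:
        server.shutdown()
        server.server_close()
```

The retries are the interesting part. Because the outcome depends on a network and a live process, a single failed attempt does not prove much, which is why failures in this kind of test are read as an early warning and not as a verdict.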

I personally draw a distinction between “functional testing” and “interaction testing”, so I’ll explain what I mean. Interaction tests, like functional tests, are not deterministic. The difference is that they exercise products at a higher level of abstraction. Where functional tests interface directly with the source code, interaction tests are run by a piece of software that interfaces with the product in the same way a person would. Selenium is a popular example of a framework for this kind of testing.

Along with all this automated testing, we also do ongoing QA. We have dedicated QA people (who are awesome!), but we also socialize QA testing across development teams. We run internal betas and we use the products all day long to make sure everything is working. Automated testing is great, but it just can’t put the spit-polish on a product the way people can (yet!).

Finally, we track and measure an immense set of different metrics in our live environment. In a system as large as ours, sampling is really the only way to keep in touch with the living health of the system. I can’t talk enough about how important it is to be familiar with the characteristics of your system before a problem comes up. When you visit a doctor for a sinus infection and she takes your blood pressure, she’s not worried you might have a heart attack right there. She’s establishing a baseline to give her a point of reference if problems come up later. In our case, we track things like “flirts per minute”. If that number drops, we might have a problem. But unless you’ve been watching it for a while, you have no idea what size of drop represents cause for concern. Think about what characteristics define your system as healthy, and become familiar with them.
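One simple way to turn "what size of drop is cause for concern" into something checkable is to compare the current reading against the mean and standard deviation of past samples. The sketch below is in Python; the function name, the sample numbers, and the three-standard-deviation threshold are all assumptions chosen for illustration, not a description of our actual monitoring.

```python
from statistics import mean, stdev


def is_concerning_drop(history, current, threshold=3.0):
    """Return True if `current` sits more than `threshold` standard
    deviations below the mean of the baseline samples in `history`."""
    if len(history) < 2:
        raise ValueError("need at least two baseline samples")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any drop at all is unusual.
        return current < baseline
    return (baseline - current) / spread > threshold


# Made-up "flirts per minute" samples standing in for a healthy baseline.
healthy_samples = [100, 104, 98, 101, 97, 103, 99, 102]

print(is_concerning_drop(healthy_samples, 96))  # within normal variation
print(is_concerning_drop(healthy_samples, 60))  # far below the baseline
```

With these made-up numbers, a reading of 96 is within normal variation while a reading of 60 is flagged. The check only works because the healthy samples were collected before anything went wrong, which is the whole point of establishing a baseline. A real system would likely also need to account for things like time of day.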