One of the first things taught to medical students is Primum non nocere, which roughly translates to First, do no harm. This is understood to mean that it’s better to do nothing than to treat a patient in a way that does more damage than good. Put another way: don’t let the cure be worse than the illness.
So what does this have to do with FriendRunner? Lots. In fact, it was the inspiration for our latest feature, one that you don’t even get to see since it works behind the scenes, but you’ll be thankful it’s there. To understand why, you need to think about the two jobs that FriendRunner performs:
- Calls the test application in the same way that Facebook would so the application can experience the load. While doing this, FriendRunner monitors the time it takes the application to respond to determine its health. (Steps 2 and 5 in the diagram below)
- Allows the application to call the FriendRunner “proxy” Facebook server to service any Facebook API calls that the application needs to make. (Steps 3 and 4 in the diagram below)
That’s really all that FriendRunner does. These two functions work together to form a system that can put enough load on the test application to teach us something meaningful, and hopefully make the application more scalable and more reliable.
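The two jobs above can be sketched in a few lines of Python. This is a minimal, illustrative model, not FriendRunner’s actual code: the function names are hypothetical, and the “application” is an in-process function rather than a real web app reached over HTTP.

```python
import time

def facebook_api_proxy(method, params):
    # Steps 3 and 4: stand in for the real Facebook API with canned data.
    # (Hypothetical method name and response shape, for illustration only.)
    canned = {"users.getInfo": {"uid": params.get("uid"), "name": "Virtual User"}}
    return canned.get(method, {})

def call_application(app, payload):
    # Steps 2 and 5: call the app the way Facebook would, and time the
    # response to judge the app's health under load.
    start = time.perf_counter()
    response = app(payload)  # the app may call back into the proxy
    elapsed = time.perf_counter() - start
    return response, elapsed

def sample_app(payload):
    # A toy application under test: it asks the proxy for profile data,
    # just as a real app would call the Facebook API mid-request.
    profile = facebook_api_proxy("users.getInfo", {"uid": payload["uid"]})
    return {"greeting": "Hello, " + profile["name"]}

response, elapsed = call_application(sample_app, {"uid": 42})
```

The key structural point the sketch captures is the loop between the two roles: the load generator times the whole request, and that timing silently includes any time the app spends waiting on the proxy.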
But what about FriendRunner itself? What happens when large loads are put on it? Once we start running a few thousand virtual users, will the proxy Facebook server be able to satisfy API requests fast enough? And what if it can’t? That’s easy to answer: a poorly performing API will result in poor performance at the test application level. In other words, FriendRunner, the tool used to measure an app’s performance, would itself be the reason the app being tested appears slow.
This is the stuff that keeps us up at night. In reality, though, things aren’t so bad. FriendRunner’s server that replies to API calls is not a single machine, but a collection of stateless, scalable servers deployed onto a computing infrastructure with almost unlimited capacity. It’s very unlikely that it would ever be the cause of a problem and make an application under test appear to be slowing down. Yet we still think about these things to ensure that our customers get our best service.
Our initial attempt to allay these concerns was simply to monitor the servers that the FriendRunner software ran on: we’d watch the health of the machines, along with their memory and CPU usage. This works pretty well at the macro level, but it still doesn’t say what happens at the individual call level. The server’s memory may be doing just fine while we are nonetheless affecting the performance of the application under test. Truthfully, we’ve never seen any evidence of this, but it still weighs heavily on us. Remember: First, do no harm. We take our customers very seriously, and don’t want to be a source of misleading results.
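That macro-level check amounts to comparing host metrics against thresholds. Here is a minimal sketch of the idea, with hypothetical threshold values and hard-coded sample readings standing in for live CPU and memory figures:

```python
def host_is_healthy(cpu_pct, mem_pct, cpu_limit=85.0, mem_limit=90.0):
    # A host passes only if both CPU and memory sit under their limits.
    # The limits here are hypothetical, chosen for illustration.
    return cpu_pct < cpu_limit and mem_pct < mem_limit

# Simulated samples of (cpu %, memory %) from three FriendRunner hosts.
samples = [(40.0, 55.0), (92.0, 60.0), (70.0, 95.0)]
healthy = [host_is_healthy(cpu, mem) for cpu, mem in samples]
# healthy -> [True, False, False]
```

The limitation described above is visible in the sketch itself: the first host reads as healthy even though nothing here says whether any individual API call on it was slow.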
Which brings us to our latest feature, Advanced Monitoring. We’ve now instrumented our Facebook API server so that we can measure the performance of the FriendRunner server at the individual API call level and prove that the system behaves consistently. And that’s the important point of this whole process: we aren’t measuring our performance because we want to make it faster. We measure to prove consistency, something that’s essential when measuring the performance of another system. Our calls should return in about the same amount of time whether there are ten or ten thousand users running in the system. That’s a difficult standard to live up to, and Advanced Monitoring can now prove that we meet it.
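Instrumenting at the call level generally means wrapping each API handler so every invocation records its own latency, then summarizing those latencies so runs at different load levels can be compared. The sketch below shows one way to do that; the names and the in-process timing are our own illustrative assumptions, not FriendRunner’s actual implementation.

```python
import time
from collections import defaultdict
from statistics import median

# Per-method latency samples, keyed by API method name.
latencies = defaultdict(list)

def instrumented(method_name, handler):
    # Wrap an API handler so every call records its own latency,
    # even if the handler raises.
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            latencies[method_name].append(time.perf_counter() - start)
    return wrapper

def median_latency_ms(method_name):
    # Summarize one method's behavior so a ten-user run and a
    # ten-thousand-user run can be compared for consistency.
    return median(latencies[method_name]) * 1000.0

# Hypothetical handler standing in for a real API endpoint.
get_info = instrumented("users.getInfo", lambda uid: {"uid": uid})
for uid in range(100):
    get_info(uid)
```

Comparing the median (or a high percentile) across load levels is what turns raw timings into a consistency claim: if the summary barely moves between ten and ten thousand virtual users, the proxy is not distorting the test.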
Unfortunately, despite how well this sophisticated system works, there’s really nothing that you, the user, can see. But we’ll sleep much better at night knowing that FriendRunner is not doing any harm. And you … well, you should be happy that FriendRunner will always provide you with a large and consistent load, so that testing and tuning your application is as simple as possible.