Now that the dust has settled on one of the more recent public outages in the news, the Census, we thought we would share some methodologies from our Application Testing Solution that we use elsewhere.
High volumes of traffic are not unique to the Census. Recently the Canadian Immigration site was inundated with visitors during the recent election, and Ticketek saw the same during the Adele ticket sale frenzy. There was also the Clickfrenzy fail a few years ago.
This article does not imply that the testing at Census was incomplete. It is simply a timely opportunity to share our experience, as more and more sites are being subjected to loads beyond their capacity.
The use case:
Acme Corporation is about to release a new and improved web site. This new site includes lots of new features and functionality and has been designed to handle increased load from a brand new marketing campaign or upcoming significant event. To make it a realistic scenario, the site goes live in 2 days and no performance testing has been done. To add further realism, there is no budget left for testing, so the testing needs to be as cheap and quick as possible.
This is also not a plug for tools or solutions; use your own if they can generate the required traffic. This is simply a methodology and a risk reduction exercise.
The type of test is what we would call a 1-arm test, where the load testing equipment interacts with the actual web application. By contrast, a 2-arm test is where both clients and servers are emulated around a network (an example would be a Firewall / IPS scalability test).
The deployment is a 2 tier application, with a web layer and a database layer. We will also assume perimeter protection with a next generation firewall. The firewall itself could have a suite of standalone tests that won’t be covered in this article.
Expected traffic volumes are in the realm of 50,000 concurrent users at any one time. This is based on current network data and projected load expectations.
This article doesn’t go into the actual design of the application; we’ll leave that for folks who know. It does assume redundancy in the different layers, such as web and database. You might want to compare different web servers, such as nginx vs apache vs xyz, while leaving everything else consistent. That is a valid reason to rerun the scenario below.
The test targets Acme Corporation’s new website, www.thiswontfail.com. It takes into account what a unique visitor would do, based on previous metrics supplied by Acme Corporation:
This mix of actions will exercise the web-to-database connection, as opposed to hitting the home page and being served static content. Each connection will stay open for at least a minute. We won’t do a fabricated connections test, where we set up and tear down connections and download no data.
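To get a feel for what the traffic generator has to sustain, the concurrency target and the one-minute connection time can be combined using Little's law (concurrency = arrival rate x session duration). The sketch below is only illustrative arithmetic: the function name is made up for this example, and it assumes every session lasts exactly 60 seconds, which is the shortest case described above.

```python
def required_arrival_rate(concurrent_users: int, session_seconds: float) -> float:
    """Little's law rearranged: arrival rate = concurrency / session duration."""
    if session_seconds <= 0:
        raise ValueError("session_seconds must be positive")
    return concurrent_users / session_seconds


TARGET_CONCURRENT_USERS = 50_000  # expected traffic volume from the article
MIN_SESSION_SECONDS = 60          # assumption: each connection lasts exactly one minute

rate = required_arrival_rate(TARGET_CONCURRENT_USERS, MIN_SESSION_SECONDS)
print(f"New sessions per second needed: {rate:.0f}")  # prints 833
```

Because connections stay open for at least a minute, roughly 833 new sessions per second is an upper bound. Longer sessions reach the same concurrency with a lower arrival rate.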
One key to successful testing and isolating performance issues is real-time monitoring of each device under test (DUT) in the architecture. In this scenario we would expect staff to be monitoring the firewalls, web servers and database tiers, along with network traffic statistics.
An example of the metrics for each platform follows. Your subject matter experts can add more:
When doing an end-to-end test, there are many cogs in the wheel that can break, so you need to be monitoring as much as possible in real time.
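As a rough illustration of why per-device monitoring matters, the sketch below scans timestamped samples from several devices and reports the first one to cross a threshold, which is usually the first place to look for a bottleneck. The device names, metric names, thresholds and sample values are all hypothetical; in practice the samples would come from whatever monitoring tooling you already run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    elapsed_s: int  # seconds since the load test started
    device: str     # hypothetical device name, e.g. "web-1" or "db-1"
    metric: str     # hypothetical metric name, e.g. "cpu_pct"
    value: float


# Hypothetical thresholds; real values come from your subject matter experts.
THRESHOLDS = {"cpu_pct": 90.0, "mem_pct": 85.0}


def first_breach(samples: list[Sample]) -> Optional[Sample]:
    """Return the earliest sample that exceeds its metric's threshold, if any."""
    breaches = [
        s for s in samples
        if s.metric in THRESHOLDS and s.value > THRESHOLDS[s.metric]
    ]
    return min(breaches, key=lambda s: s.elapsed_s, default=None)


# Made-up samples for illustration only.
samples = [
    Sample(10, "web-1", "cpu_pct", 45.0),
    Sample(10, "db-1", "cpu_pct", 60.0),
    Sample(20, "web-1", "cpu_pct", 70.0),
    Sample(20, "db-1", "cpu_pct", 93.0),
    Sample(30, "web-1", "cpu_pct", 95.0),
]

hit = first_breach(samples)
if hit is not None:
    print(f"First threshold breach: {hit.device} {hit.metric}={hit.value} at {hit.elapsed_s}s")
```

With these made-up samples the database tier crosses its threshold before the web tier does, so that is where the investigation would start.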
In an ideal scenario, after running each load test you would reboot each element, or at least any element that was driven to failure, so that results stay consistent. How far you go also depends on how much time you have available.
Acme is going for a static deployment with 4 web servers and 4 databases all behind load balancers (this same test scenario could be used if web and database layers autoscaled):
Here is our list of fundamental tests:
By completing the above tests you will know the performance of your application deployment as well as any performance bottlenecks.
For future testing, take 70% of the maximum metric recorded and use it as your regression test load for future changes and validation.
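The 70% rule is simple arithmetic, sketched below. The measured maxima are hypothetical placeholders, not results from any real test; substitute the figures your own runs produce.

```python
def regression_load(max_metric: float, fraction: float = 0.70) -> int:
    """Scale a measured maximum down to the regression test load (70% by default)."""
    return round(max_metric * fraction)


# Hypothetical maxima found during load testing, for illustration only.
measured_maximums = {
    "concurrent_users": 60_000,
    "requests_per_second": 4_000,
}

baseline = {name: regression_load(value) for name, value in measured_maximums.items()}
print(baseline)  # {'concurrent_users': 42000, 'requests_per_second': 2800}
```

Running future changes against this baseline shows whether they degrade performance, without driving the deployment to failure each time.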
For further information about this test methodology, or for information on other testing methodologies, reach out to email@example.com
ARTICLE BY THE MATRIUM TECHNOLOGIES TEAM