Sunday, February 07, 2010

Random stuff on JMeter testing

Update: I realized that much of what I wanted to say has already been said, and in a far better manner, in Performance testing patterns and practices.
One of the questions asked on the JMeter mailing list was
How do I analyse my application with JMeter?
There are various reasons one might run a load test, broadly: detection, simulation, verification and analysis. The type of test, the scripts you write and what you measure or track will vary a bit depending on what you want to do. JMeter is good for simulation and verification. It is an aid to detection and analysis, but you usually need other tools to help you out.

Detection
  • You want to find out if your application behaves correctly when accessed by multiple threads: whether your database starts showing deadlocks, or whether you have race conditions.
  • You want to find out if your application has memory leaks.
Simulation
  • You want to find out how your application behaves under standard load, or how it behaves under some peak load (e.g. shopping during the holidays): response times etc.
  • In the specific case of Java-based web applications, you want to know how often your GC cycles run and how long they take.
  • Your application makes remote calls (e.g. web service calls) and you want to know whether all the resources are recovered correctly. Perhaps you have some throttling mechanisms in place and you want to see that they are working correctly.

Verification
  • You want to verify the result of changing some tuning parameters.
  • You want to check that your SLAs are met.

Analysis
  • You might already know there is a problem and you want to simulate load while you are profiling the application.

These areas overlap, and I've used the above categories quite broadly, merely to illustrate that your test script will vary based on your objective.

For example:
You want to find out if your application behaves correctly when accessed by multiple threads. In this case your test script would only be concerned with running some parts of the application at exactly the same time. You'd want to exercise multiple parts of your application. Perhaps you are aware that some part of the application internally spawns threads, and you'd run a test that exercises that area for a long time or with a high load. At this point you don't really care whether these are unrealistic or non-representative scenarios, nor are you really looking at what the response times are. All you care about is: do you see stuck threads or deadlocks? Do you see a really long wait time for most threads while some threads finish really fast?

Or perhaps you want to tune a memory parameter and you want to verify the change.
In this case response times and throughput really matter (for the same test, of course). You'd first take a baseline reading without the tuning, and then another with the tuning. The test scripts must be representative of actual user behavior.
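
A minimal sketch of that baseline-versus-tuned comparison, in Python. The sample timings, run durations and helper names (summarize, percent_change) are made up for illustration; in practice the numbers would come from two runs of the same JMeter script.

```python
from statistics import mean

def summarize(elapsed_ms, duration_s):
    """Mean response time (ms) and throughput (requests/s) for one test run."""
    return {
        "mean_ms": mean(elapsed_ms),
        "throughput_rps": len(elapsed_ms) / duration_s,
    }

def percent_change(before, after):
    """Relative change from the baseline value, in percent."""
    return (after - before) / before * 100.0

# Hypothetical samples: same script, same number of requests, before and after tuning.
baseline = summarize([420, 510, 480, 600, 490], duration_s=10)
tuned = summarize([380, 450, 430, 520, 420], duration_s=8)

for metric in ("mean_ms", "throughput_rps"):
    change = percent_change(baseline[metric], tuned[metric])
    print(f"{metric}: {baseline[metric]:.3f} -> {tuned[metric]:.3f} ({change:+.1f}%)")
```

The comparison only means something if both runs use the same script, the same load and the same environment, so that the tuning parameter is the only thing that changed.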

Perhaps you want to check whether your site can handle holiday shopping onslaughts. In this case you would modify your tests to show bursts of activity. You'd closely monitor response times, but you'd also want to check what happens on the server: how much memory, how much CPU. You might also want to see what load would actually bring down your servers, and to check whether your load balancers distribute the load evenly.

Or perhaps you have certain Service Level Agreements and you need to know response times accurately for the load specified in your agreement. In this case you need a representative user journey and you also need representative background users.

All of which means there is no easy answer to 'How do I analyse my application with JMeter?'. It can only be answered with another question: what is it that you want to analyse? (Normally answered with: well, performance.)

Let's take the most common use case: what is the 'response time' for my application?
However, actually getting the response time is more difficult than reading the response time figure from the JMeter test results.
This is problematic because:
a. JMeter is not a browser and does not render the page. Different browsers take different times to render the same page. Compare older versions of Internet Explorer with Chrome, for example.
b. A returning user with some files cached will probably see shorter times than a first time user.
c. The network / connection speed from which the user is accessing the application may be significant. And your users may be spread out throughout the world.
d. AJAX-based / DHTML applications are difficult to predict: behavior varies by browser, the number of calls a browser makes in parallel also differs, and it is difficult to know which calls will be made in parallel.

So any response time would have (roughly speaking)
a. The time it takes for the application to actually respond with all the data
b. The time it takes for this data to be transferred over the network
c. The time it takes to download static files (bearing in mind that not all files may be downloaded and that browsers may request multiple static files in parallel)
d. The time it actually takes to render the page.

JMeter can help you out with a, b and c, but what it is really good at is finding out a, as seen from the network on which JMeter itself is running.

Typically your requirements might define a Service Level Agreement for your site as: browsing operations must take < 6 seconds 90% of the time and shopping operations must take < 8 seconds 90% of the time. You also know how large your pages are, and you can guesstimate how much time it would take for a page to be transferred over the internet. You might take an average with some safety factor, or you might take a worst case scenario. Using a browser tool like YSlow or Google's PageSpeed, you can also get some insight into how your static files are downloaded, how long they take, etc. And you might add some time for how long the browser takes to render. After considering all of this you might arrive at a new figure: on a high bandwidth intranet (which eliminates most of the network variables) your browsing operations must take < 2 seconds just to get the data and your shopping operations must take < 4 seconds for your SLAs to hold, because the rest of the time has already been used by the other factors.
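
The arithmetic behind those figures can be sketched as below. Only the 6 s / 8 s targets and the resulting 2 s / 4 s budgets come from the example above; the split of the remaining 4 seconds into transfer, static files and rendering is a made-up assumption, as is the function name server_budget_s.

```python
def server_budget_s(sla_s, transfer_s, static_s, render_s):
    """Time left for the application itself once the other factors are subtracted."""
    budget = sla_s - (transfer_s + static_s + render_s)
    if budget <= 0:
        raise ValueError("the other factors already use up the whole SLA")
    return budget

# Assumed (hypothetical) estimates for the factors JMeter does not measure well.
TRANSFER_S = 1.5  # page data over the internet, with a safety factor
STATIC_S = 1.5    # static files, estimated with a tool like YSlow
RENDER_S = 1.0    # browser rendering

browse_budget = server_budget_s(6.0, TRANSFER_S, STATIC_S, RENDER_S)
shop_budget = server_budget_s(8.0, TRANSFER_S, STATIC_S, RENDER_S)

print(f"browsing budget on the intranet: {browse_budget} s")
print(f"shopping budget on the intranet: {shop_budget} s")
```
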
After this you would have to write a script which generates representative loads (for the operations being verified and the operations that would happen in the background), run the test and verify that the 90th percentile lies below the value you have calculated above. But perhaps it doesn't. Static files can be optimised by reducing their number and their size, gzipping them, adding expiry headers etc., but maybe you have already done this. The client's network and browser aren't within your control, so there isn't much you can do there. The next step is figuring out where your problem lies. JMeter can't help you there; you need a different set of tools. But JMeter can help you to simulate the load, or parts of it, so that you can monitor your application with the tools of your choice. Some of your findings may be infrastructure related and some may be in code; you'd have to make changes, retest and repeat.
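
The 90th percentile check can be sketched in Python as below. It assumes the JMeter results were saved as CSV with a header row containing elapsed (milliseconds) and label columns; the labels Browse and Shop, the sample timings and the function names are invented for illustration. It uses the nearest-rank definition of a percentile, which may differ slightly from what a given JMeter listener reports.

```python
import csv
import io
import math
from collections import defaultdict

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def check_budgets(csv_file, budgets_ms, pct=90):
    """Map each label to (percentile in ms, True if it is within the budget)."""
    by_label = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_label[row["label"]].append(int(row["elapsed"]))
    results = {}
    for label, budget in budgets_ms.items():
        value = percentile(by_label[label], pct)
        results[label] = (value, value <= budget)
    return results

# Hypothetical results; a real run would open the results file instead.
browse = [900, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1900, 5000]
shop = [2100, 2500, 2800, 3000, 3200, 3400, 3600, 3900, 4200, 6000]
lines = ["elapsed,label"]
lines += [f"{ms},Browse" for ms in browse]
lines += [f"{ms},Shop" for ms in shop]

results = check_budgets(io.StringIO("\n".join(lines)), {"Browse": 2000, "Shop": 4000})
for label, (p90, ok) in results.items():
    print(f"{label}: 90th percentile {p90} ms, {'within' if ok else 'over'} budget")
```

With these made-up samples the browsing operations fit the 2 second budget and the shopping operations do not, which is the point at which you would reach for other tools to find out why.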