Thursday, June 02, 2016

(Non-standard) use cases for a load test tool

There is a saying that goes, "If all you have is a hammer, then everything looks like a nail."
Over the years that I have used JMeter, I've used it to implement solutions that would traditionally have been the domain of a programming or scripting language. What follows is a discussion of the types of problems I have solved with it.

I've excluded the standard use cases for a load test tool - performance tests, capacity planning, soak tests, etc.

Exhaustive tests

A long time ago, when I was a trainee software engineer, I read a book on testing. One of the examples asked how many test cases you would need to run to cover all scenarios for the following program: given 3 lengths as input, it determines whether they can make up a legal triangle. How many sets of test data do you need in order to be confident that the program always works correctly? (The answer was seven or 17, and I guessed 2 fewer.) The takeaway was that testing is hard! In the triangle case the theoretically possible input data is infinite, so you have to think in order to come up with good test data. (Hey, agile unit testers, ever think about that instead of unit tests and coverage?)
However, there are cases where you can indeed run all possible test data. In my case, there are "Category" pages - about 3,000 of them - and each has a bunch of attributes (images, titles, alt text, overviews, descriptions, etc.), some mandatory, some optional. If we approach testing this as "let's come up with all the possible combinations", it's likely going to take some time. Also, as attributes get added or removed, there is a chance something will be missed.
But we could use a different approach. The categories (and their data) are listed in some DB, so it's reasonably simple to extract all of them (in the case of JMeter, to a CSV data set) and then have the test script access each and every one of them. The script can then add validations as needed (e.g. did the alt text appear for the image?). The number of categories can increase from 3,000 to 30,000 or even 300,000 - JMeter would do just fine. As a bonus, I could use this to generate some background load for other tests. Doing it this way takes out some of the guesswork that otherwise needs good testers to come up with good variations.
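As a rough illustration of the validation step (the variable names are mine, not from the real script), a BeanShell Assertion attached to the category-page request could check the alt text that the CSV Data Set supplies for the current row:

// BeanShell Assertion sketch - categoryId and altText are illustrative CSV columns.
String expectedAlt = vars.get("altText");          // expected alt text for this category
String page = new String(ResponseData);            // body of the category page just fetched

if (expectedAlt != null && expectedAlt.length() > 0 && page.indexOf(expectedAlt) < 0) {
    Failure = true;
    FailureMessage = "Alt text not found for category " + vars.get("categoryId");
}
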
Note: this still doesn't eliminate the need for good testers (e.g. we still need to remember to run negative tests - does an expired category show the correct error message? Does a non-existent category correctly return the 404 error page? Does passing an apostrophe in the URL mess up the JavaScript, and is the site vulnerable?).

Exhaustive tests, take 2: 404s

There is a bulk load process that uploads safety documents to the site. An index file contains metadata, one field of which is the file path to the PDF. The system processes the file, indexes the documents for search, and transfers the PDFs to the webserver.
This being a real system, coded by real developers, there are bugs. However, this bug only got discovered 6 months later: not all the documents were searchable and not all the PDFs got transferred (about 300K documents).
Now, we could re-request all the documents from a different system (but that would take time). What we needed to determine was which documents were missing from search and/or missing their PDFs.
With JMeter, this was a 30-minute activity to script, and it took a couple of hours to run (mainly because it was run against production, where we could only use 10 threads in parallel).
The script goes something like this:
Extract all the bulletin numbers into the data file.
Write the script to establish a user session, then search for each bulletin number.
If the result is not found, those samples are unsuccessful.
If the results are found, do a HEAD request for the PDF (a sketch follows below).
If the PDF is not found, that is unsuccessful too (but logged by a different listener to its own file).
The listeners only log errors to a CSV file; load it into Excel and we are done - we have the bulletin numbers that we pass on to the generating system and say, can you please give us all of these again?
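In the real script the HEAD check was simply an HTTP Request sampler using the HEAD method; purely to show the logic in one place, the same step written as a BeanShell Sampler might look roughly like this (the URL layout and variable names are assumptions):

// BeanShell Sampler sketch - bulletinNum comes from the CSV Data Set.
import java.net.URL;
import java.net.HttpURLConnection;

String bulletin = vars.get("bulletinNum");
String pdfUrl = "https://www.example.com/docs/" + bulletin + ".pdf";   // assumed path layout

HttpURLConnection conn = (HttpURLConnection) new URL(pdfUrl).openConnection();
conn.setRequestMethod("HEAD");
int code = conn.getResponseCode();
conn.disconnect();

if (code != 200) {
    IsSuccess = false;                          // the errors-only listener then logs this bulletin
    ResponseCode = String.valueOf(code);
    ResponseMessage = "PDF missing for bulletin " + bulletin;
}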

Search tuning

One of the most complained-about features on our site is search. We were fiddling with some of the relevancy settings - but how do we know whether the end result is better or worse?
We used the approach below.
Take the top 500 terms that were searched for. These were then distributed to various business units, who specified what they thought the engine should return (we could also have used an independent engine that does better than our internal search, i.e. Google). This is the expected result.
Then script a test which executes a search for these terms in production. We came up with a formula that scored how well the actual results matched the expected ones (one illustrative scoring is sketched at the end of this section).
Then make the relevancy tweak (in the test environment, which was synchronized with production), rerun the test, and see whether it does better or worse than production.
The tool used for the script/report? JMeter!
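The exact formula doesn't matter much here; just as an illustration (and not necessarily the one we used), a rank-weighted overlap score in plain Java could look like this:

// Illustrative scoring only - not necessarily the formula that was actually used.
// Each expected URL earns credit that decays with the rank at which the engine
// returned it, and the sum is normalised against the best achievable score.
import java.util.Arrays;
import java.util.List;

public class SearchScore {

    static double score(List<String> expected, List<String> actual) {
        double total = 0.0, ideal = 0.0;
        for (int i = 0; i < expected.size(); i++) {
            ideal += 1.0 / (i + 1);                      // best case: expected item i at rank i
            int rank = actual.indexOf(expected.get(i));  // -1 if the engine missed it entirely
            if (rank >= 0) {
                total += 1.0 / (rank + 1);               // full credit at rank 1, less further down
            }
        }
        return total / ideal;                            // 1.0 = all expected hits at the top
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("/cat/resins", "/cat/solvents");
        List<String> actual = Arrays.asList("/cat/solvents", "/cat/pumps", "/cat/resins");
        System.out.println(score(expected, actual));     // prints 0.88... for this example
    }
}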

ERP migration

We skip to a point where we have migrated ERPs, and for some reason someone(s) decided that we should not migrate all the contracted customer prices - instead we would manually inspect orders as they come in and fix the price. After doing the above for three months, someone(s) finally realised this isn't a good idea. Ours not to reason why - ours just to sit and cry (over why our salaries have one digit fewer than the worthies who come up with such schemes).
But some higher-up also wanted to know - well, how many such problems do we have?
The only way to find that out was to figure out the contracted price for each customer * product combination (millions of rows) and compare it with the actual price.
This data cannot be extracted directly - there's only a webservice that takes a single customer and many products and calculates the prices.
So I ended up scripting a simple test. The first part hit the old system and saved a data file; the second part hit the new system and flagged discrepancies. Running time: a day (again mainly due to having to run the test against production).
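The discrepancy check itself is only a few lines; a BeanShell Assertion along these lines would flag mismatches (the variable names are illustrative - oldPrice would come from the data file saved in the first part, newPrice from an extractor on the new system's webservice response):

// BeanShell Assertion sketch - variable names are illustrative.
double oldPrice = Double.parseDouble(vars.get("oldPrice"));   // saved from the old system
double newPrice = Double.parseDouble(vars.get("newPrice"));   // extracted from the new system's response

if (Math.abs(oldPrice - newPrice) > 0.005) {                  // allow for rounding differences
    Failure = true;
    FailureMessage = "Price mismatch for customer " + vars.get("customerId")
        + " / product " + vars.get("productId")
        + ": old=" + oldPrice + ", new=" + newPrice;
}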

Deadlock detection

We use a framework for our web application. Unfortunately, a recent code change triggered an existing bug: when some user operations happened in parallel (a double click, a page refresh, or the back button), threads deadlocked with each other due to the way the MVC framework (3rd party, commercial, yay!) had synchronizations in its code. We thought we had fixed the problem - but how do you verify it?
Easy - script it in JMeter (simulate the issue with the bad code, retest with the new code).
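The core of the reproduction is just firing the same user operation from two threads at the same instant against the same session (in JMeter a couple of threads plus a Synchronizing Timer can do this); the plain-Java sketch below, with made-up URL and session values, shows the idea:

// Plain-Java sketch of the reproduction idea - URL and session cookie are made up.
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.CountDownLatch;

public class DoubleClick {
    public static void main(String[] args) throws Exception {
        final CountDownLatch start = new CountDownLatch(1);
        Runnable click = new Runnable() {
            public void run() {
                try {
                    start.await();                        // both "clicks" released together
                    HttpURLConnection c = (HttpURLConnection)
                            new URL("https://test.example.com/app/order").openConnection();
                    c.setRequestProperty("Cookie", "JSESSIONID=same-user-session");
                    System.out.println(Thread.currentThread().getName() + " -> " + c.getResponseCode());
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };
        new Thread(click, "click-1").start();
        new Thread(click, "click-2").start();
        start.countDown();                                // fire the "double click"
        // if the threads deadlock server-side, one or both responses never come back
    }
}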

Cache statistics

We use an old, primitive cache. Using the logs from production we could build a JMeter test that replayed the traffic, which in turn allowed us to measure how well (or badly) the caches were being used. As a bonus we also detected a code bug - a developer had used an object as a cache key but hadn't overridden equals, effectively ensuring the cache never got hit.
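For anyone who hasn't run into this one, the bug looks something like the sketch below (class and field names invented): without equals/hashCode, the key created on the next request is never "equal" to the one that was cached, so every lookup misses.

// Sketch of the equals/hashCode cache-key bug - names are invented.
import java.util.HashMap;
import java.util.Map;

public class CacheKeyBug {

    static class CacheKey {                              // no equals() or hashCode() override
        final String categoryId;
        CacheKey(String categoryId) { this.categoryId = categoryId; }
    }

    public static void main(String[] args) {
        Map<CacheKey, String> cache = new HashMap<CacheKey, String>();
        cache.put(new CacheKey("123"), "rendered page");

        // a logically identical key built on the next request never matches:
        System.out.println(cache.get(new CacheKey("123")));   // prints null - cache miss every time
    }
}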

Downloading files

We have functionality where a user can download a run file for a plate. This "file" is dynamically generated and isn't stored in the system anywhere. There was a request from our business unit: they wanted to load all the run files into an instrument. So I scripted a test that spoofed the user actions for each plate, simulated the download and saved the file. Bonus - that's when we discovered that plates with a "/" in their name wouldn't save the file correctly inside the zip.
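Saving the response is a few lines in a BeanShell PostProcessor attached to the download request (the variable name, target folder and "/" handling below are illustrative):

// BeanShell PostProcessor sketch - plateName and the target folder are illustrative.
import java.io.FileOutputStream;

String plate = vars.get("plateName");                // current plate from the data file
String safeName = plate.replace("/", "_");           // the "/" plates were the ones that broke
byte[] runFile = prev.getResponseData();             // the dynamically generated download

FileOutputStream out = new FileOutputStream("runfiles/" + safeName);
out.write(runFile);
out.close();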

Website migration

We migrated our website from an old platform to a new one. Many DNS names and many vanity URLs, each with somewhat different behaviour, had to be migrated. The tool to check that everything worked? A JMeter script.
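For the redirect-style checks, a CSV of old URL / expected target pairs plus an assertion is enough; a BeanShell Assertion on a request with "Follow Redirects" switched off could look roughly like this (the column names are assumptions):

// BeanShell Assertion sketch - oldUrl and expectedTarget are assumed CSV columns.
String expected = vars.get("expectedTarget");        // where the old URL should now land
String code = ResponseCode;                          // e.g. "301"
String headers = ResponseHeaders;                    // raw response headers as one string

if (!code.startsWith("30") || headers.indexOf("Location: " + expected) < 0) {
    Failure = true;
    FailureMessage = vars.get("oldUrl") + " did not redirect to " + expected + " (got " + code + ")";
}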