Wednesday, May 27, 2015

Graphs for Jmeter using Elasticsearch and Kibana

Disclaimer: I have just done some initial tests with Elasticsearch (thank you, Google) - I have no production experience with it and no idea how to set it up properly. It's possible the example below has some really basic mistakes (in which case, please do point them out!). Note also that the easiest way to integrate this is to configure the JMeter result to be a CSV file and simply import that using LogStash - but then I don't have a blog post to write.


JMeter 2.13 came out with added support for a BackendListener, i.e. a way to send your results to some store where you can analyze them and draw pretty pictures for the powers that be. This was always possible with a custom listener, but the person writing it had to make it perform reasonably well. The new listener runs on its own thread with a buffer, which makes it somewhat easier to implement. I also stumbled upon the LogStash + Elasticsearch + Kibana family while solving a log parsing problem, so here is an attempt to integrate that with JMeter.

Prerequisites - This assumes you have downloaded Elasticsearch and Kibana and have managed to get them running and connected to each other.
https://www.elastic.co/downloads/elasticsearch
https://www.elastic.co/downloads/kibana
I used Elasticsearch v1.5.2 and Kibana v4.0.2.

Step 1) Define the mapping for the data that we are going to send to Elasticsearch
This step is, strictly speaking, not necessary - Elasticsearch can dynamically add indexes, types, or fields for a type. More on why I needed to do this later.
For now: everything we store in Elasticsearch (referred to as a document) goes into an index and has a type. The type is similar to a class definition - it defines which fields we store, whether they are strings, numbers or dates, and what we expect Elasticsearch to do with them (index, analyze, store etc.).
In our example, we will store our data in indexes whose names are always of the form jmeter-elasticsearch-yyyy-MM-dd (I appended the timestamp because Logstash does it by default - it limits the size of any particular index, which is probably a good thing when you have many tests and distribute Elasticsearch over multiple nodes).
We will define a type called SampleResult that closely matches the JMeter Java object SampleResult. We don't really need all the fields, and if we wanted to optimise the interface we'd probably drop a few. For now we'll take almost everything other than the response data.

Elasticsearch allows you to define an index template, so that every index created whose name matches the template gets the same settings - we'll use that feature.

Here is the template we will use
{
  "template": "jmeter*",
  "mappings": {
    "SampleResult": {
      "properties": {
        "AllThreads": {"type": "long"},
        "Assertions": {"properties": {"Failure": {"type": "boolean"}, "FailureMessage": {"type": "string", "index": "not_analyzed"}, "Name": {"type": "string", "index": "not_analyzed"}}},
        "BodySize": {"type": "long"},
        "Bytes": {"type": "long"},
        "ConnectTime": {"type": "long"},
        "ContentType": {"type": "string"},
        "DataType": {"type": "string", "index": "not_analyzed"},
        "ElapsedTime": {"type": "long"},
        "EndTime": {"type": "date", "format": "dateOptionalTime"},
        "ErrorCount": {"type": "long"},
        "GrpThreads": {"type": "long"},
        "IdleTime": {"type": "long"},
        "Latency": {"type": "long"},
        "ResponseCode": {"type": "string", "index": "not_analyzed"},
        "ResponseMessage": {"type": "string", "index": "not_analyzed"},
        "ResponseTime": {"type": "long"},
        "SampleCount": {"type": "long"},
        "SampleLabel": {"type": "string", "index": "not_analyzed"},
        "StartTime": {"type": "date", "format": "dateOptionalTime"},
        "Success": {"type": "string", "index": "not_analyzed"},
        "ThreadName": {"type": "string", "index": "not_analyzed"},
        "URL": {"type": "string", "index": "not_analyzed"},
        "sku": {"type": "string", "index": "not_analyzed"},
        "RunId": {"type": "string", "index": "not_analyzed"},
        "timestamp": {"type": "date", "format": "dateOptionalTime"},
        "NormalizedTimestamp": {"type": "date", "format": "dateOptionalTime"}
      }
    }
  }
}

We indicate that our template should apply to any index whose name begins with "jmeter".
We also define a mapping called SampleResult, listing the properties it uses and their types. Some properties are marked "not_analyzed" - they should still be indexed and searchable, but no analysis is needed on them. If I hadn't set this, it would have caused trouble: some of the URLs in my example contain '-', which search tools (like Lucene, which is what Elasticsearch uses) generally treat as a word separator, and that was messing up my graphs. This is the main reason for defining the mapping explicitly (especially as Elasticsearch doesn't allow you to change this setting once defined, short of deleting and recreating the index).

We now need to tell Elasticsearch about this template. That's done by making an HTTP PUT request. I use Poster for this purpose: fill in the server and (HTTP) port used by Elasticsearch, paste the template into the body, and PUT.
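
The equivalent with curl would look something like this (assuming Elasticsearch is on localhost:9200, the template JSON above is saved as jmeter-template.json, and "jmeter" is just the name we give the template):

curl -XPUT "http://localhost:9200/_template/jmeter" -d @jmeter-template.json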

The server should acknowledge it.
You can verify that it worked by opening http://[elasticsearchserver]:[httpport]/_template

Step 2: Implement JMeter's BackendListenerClient
In order to get JMeter to pass our code the results, we need to implement the BackendListenerClient interface. This has the following methods:
setupTest/teardownTest - As the names suggest, these are initialization/cleanup methods. They are called once each time the test is run.
getDefaultParameters - Useful for hinting things to the end user (e.g. the parameter name for the Elasticsearch server and a default value for it, like 'localhost').
handleSampleResults - This is the method JMeter invokes to provide the results of the test. A List of SampleResult objects is passed in; the size of the list will typically be up to the Async Queue Size specified when the listener is configured.
createSampleResult - This allows you to create a copy of the SampleResult, so that if you modify it in some way it doesn't change anything for the rest of the listeners.

The Javadoc for BackendListenerClient helpfully tells you to extend AbstractBackendListenerClient instead of implementing BackendListenerClient directly - so that is what we will do. This class has a few additional methods (and since I haven't been able to figure out what they do, we will simply pretend they don't exist). We can also download the Java source code and read the existing implementation, GraphiteBackendListenerClient (and basically copy-paste as much as we can - a Java development best practice).
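
To make the snippets below easier to place, here is a rough outline of the class (the class name, package name and field names here are my own; the real source is at [1]):

package com.example.jmeter.elasticsearch; // hypothetical package name

import java.util.List;

import org.apache.jmeter.config.Arguments;
import org.apache.jmeter.samplers.SampleResult;
import org.apache.jmeter.visualizers.backend.AbstractBackendListenerClient;
import org.apache.jmeter.visualizers.backend.BackendListenerContext;
import org.elasticsearch.client.Client;

public class ElasticsearchBackendListenerClient extends AbstractBackendListenerClient {

    private static final int DEFAULT_ELASTICSEARCH_PORT = 9300; // Java API port, not the HTTP port

    private Client client;               // Elasticsearch transport client
    private String indexName;            // index name prefix, e.g. "jmeter-elasticsearch"
    private String dateTimeAppendFormat; // suffix appended to the index name
    private String sampleType;           // document type, e.g. "SampleResult"

    @Override
    public Arguments getDefaultParameters() { /* shown below */ return new Arguments(); }

    @Override
    public void setupTest(BackendListenerContext context) throws Exception { /* shown below */ }

    @Override
    public void handleSampleResults(List<SampleResult> results, BackendListenerContext context) { /* shown below */ }

    @Override
    public void teardownTest(BackendListenerContext context) throws Exception { /* shown below */ }
}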

So with that, here are some snippets - the complete source / project / JAR is at [1].
    @Override
    public Arguments getDefaultParameters() {
        Arguments arguments = new Arguments();
        arguments.addArgument("elasticsearchCluster", "localhost:" + DEFAULT_ELASTICSEARCH_PORT);
        arguments.addArgument("indexName", "jmeter-elasticsearch");
        arguments.addArgument("sampleType", "SampleResult");
        arguments.addArgument("dateTimeAppendFormat", "-yyyy-MM-DD");
        arguments.addArgument("normalizedTime","2015-01-01 00:00:00.000-00:00");
        arguments.addArgument("runId", "${__UUID()}");
        return arguments;
   }
We define some parameters that we will use. For example, the elasticsearchCluster parameter defines which Elasticsearch server(s) to connect to. The indexName parameter defines the prefix used to create the index (remember that our template gets applied to all indexes whose names begin with "jmeter"). The dateTimeAppendFormat is a suffix appended to the indexName. We can ignore the other parameters for now.

Next, by referring to the Java API for Elasticsearch, we can write some code in the setup and teardown methods to connect to and disconnect from the cluster.
@Override
public void setupTest(BackendListenerContext context) throws Exception {
    String elasticsearchCluster = context.getParameter("elasticsearchCluster");
    String[] servers = elasticsearchCluster.split(",");

    indexName = context.getParameter("indexName");
    dateTimeAppendFormat = context.getParameter("dateTimeAppendFormat");
    if (dateTimeAppendFormat != null && dateTimeAppendFormat.trim().equals("")) {
        dateTimeAppendFormat = null;
    }
    sampleType = context.getParameter("sampleType");
    client = new TransportClient();
    for (String serverPort : servers) {
        String[] serverAndPort = serverPort.split(":");
        int port = DEFAULT_ELASTICSEARCH_PORT;
        if (serverAndPort.length == 2) {
            port = Integer.parseInt(serverAndPort[1]);
        }
        ((TransportClient) client).addTransportAddress(new InetSocketTransportAddress(serverAndPort[0], port));
    }
    //snip
    super.setupTest(context);
}


@Override
public void teardownTest(BackendListenerContext context) throws Exception {
    client.close();    
    super.teardownTest(context);
}


The setupTest code reads the elasticsearchCluster parameter (set by the user in the JMeter GUI), parses it, and sets the appropriate addresses on the TransportClient (note that the Java API port - 9300 by default - is different from the HTTP API port - 9200 by default). The teardownTest method shuts down the client.
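
One assumption hidden in that snippet: creating the client with new TransportClient() and no settings only works if the cluster uses the default cluster name "elasticsearch". If the cluster name was changed, something like this would be needed instead (a sketch against the ES 1.x Java API; "my-cluster" is a placeholder):

// classes are in org.elasticsearch.common.settings
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "my-cluster") // whatever cluster.name is set to in elasticsearch.yml
        .build();
client = new TransportClient(settings);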

And with that we get to the bulk of our code (again, this code might be inefficient).
First, let's create a Map from a JMeter SampleResult object. Elasticsearch can treat a Map as if it were a JSON object.
private Map<String, Object> getMap(SampleResult result) {
    Map<String, Object> map = new HashMap<String, Object>();
    map.put("ResponseTime", result.getTime());
    map.put("ResponseCode", result.getResponseCode());
        //snip lots more code
    return map;
}
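
A few more of the fields from the mapping, as a sketch (the author's full version is in [1]; the getters are the standard JMeter SampleResult API):

    // sketch - not the author's exact code, but consistent with the mapping above
    map.put("Latency", result.getLatency());
    map.put("ErrorCount", result.getErrorCount());
    map.put("Bytes", result.getBytes());
    map.put("SampleCount", result.getSampleCount());
    map.put("SampleLabel", result.getSampleLabel());
    map.put("ThreadName", result.getThreadName());
    map.put("GrpThreads", result.getGroupThreads());
    map.put("AllThreads", result.getAllThreads());
    map.put("Success", Boolean.toString(result.isSuccessful()));
    map.put("StartTime", new Date(result.getStartTime()));
    map.put("EndTime", new Date(result.getEndTime()));
    map.put("timestamp", new Date(result.getTimeStamp()));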

Next, let's use the API to create a document for each result.
@Override
public void handleSampleResults(List<SampleResult> results,
        BackendListenerContext context) {
    String indexNameToUse = indexName;
    for(SampleResult result : results) {
        Map<String,Object> jsonObject = getMap(result);
        if(dateTimeAppendFormat != null) {
            SimpleDateFormat sdf = new SimpleDateFormat(dateTimeAppendFormat);
            indexNameToUse = indexName + sdf.format(jsonObject.get(TIMESTAMP));
        }
        client.prepareIndex(indexNameToUse, sampleType).setSource(jsonObject).execute().actionGet();                    
    }
}

We dynamically create an index name based on the timestamp of the SampleResult object, ask Elasticsearch to prepare the index (which it will create if it doesn't already exist), pass in the Map object (the document we want to store) that we just created, and execute the request - i.e. add the document to the index.
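
As noted earlier, this might be inefficient - each sample is one index request and therefore one network round trip. If that becomes a problem, the Elasticsearch Java API also offers a bulk request. A rough sketch of a bulk variant of handleSampleResults (using the same fields and constants as the snippets above; this is not the author's actual code):

BulkRequestBuilder bulk = client.prepareBulk();
String indexNameToUse = indexName;
for (SampleResult result : results) {
    Map<String, Object> jsonObject = getMap(result);
    if (dateTimeAppendFormat != null) {
        indexNameToUse = indexName + new SimpleDateFormat(dateTimeAppendFormat).format(jsonObject.get(TIMESTAMP));
    }
    bulk.add(client.prepareIndex(indexNameToUse, sampleType).setSource(jsonObject));
}
if (bulk.numberOfActions() > 0) {
    bulk.execute().actionGet(); // one round trip for the whole batch
}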

Next we compile our code, create a JAR file for it and put it in $JMETER_HOME/lib/ext.
The dependencies are listed in the project's .classpath file.

We also need to add the JARs our listener depends on (the Elasticsearch Java API and Lucene) to $JMETER_HOME/lib. Copy them from the Elasticsearch installation:
elasticsearch-1.5.2.jar
lucene-core-4.10.4.jar
lucene-analyzers-common-4.10.4.jar

Step 3: Modify the JMeter test to use the Listener we created
We should be able to add the Backend Listener like any other listener to our JMeter plan (Add -- Listener -- Backend Listener).
In the Backend Listener implementation dropdown we should see the class we created, and selecting it should fill the Parameters section with the parameters we returned from getDefaultParameters. If not, check the logs and verify that the code was compiled with the same Java version JMeter is running on, that the JAR was copied to the right place, that the dependencies were copied, and so on. Next we update the value of the elasticsearchCluster parameter to point to our server, and we can then run the test.


Let's assume we are running the test and the log isn't printing error after error.
The Elasticsearch HTTP API will let you look at the data, but we can also use a plugin like Elasticsearch Head. Install the plugin, then navigate to http://server:httpport/_plugin/head/ and check whether our index has been created and whether it is receiving data. Looks like it is.



Step 4: Looking at the data in Kibana
If Kibana isn't started already, start it.

Navigate to the Kibana server:port in a browser, click Settings and then the Indices tab.
Our indices are named jmeter-..., so jmeter* is what we give in the index pattern field. Ensure that "Index contains time-based events" is checked, and in the Time-field name dropdown choose timestamp.


On the next screen click the star icon to make this index pattern the default - we can also see that the fields match what we defined in our template.


Now click Discover - if you just finished running your test, or it's still running, you should see some data. Kibana's default time range is Last 15 minutes, but you can change that by clicking in the top right-hand corner. Note that the index selected is jmeter* - which is what we want; change it if it's something else by clicking the settings icon on the upper right-hand side.




Let's try to see 90th percentile response times. Click Visualize. It asks what type of chart you want - let's use a Line chart and a new search. When prompted, use jmeter* for the index pattern.

On the left-hand side, for the Y axis choose the aggregation "Percentile", the field ResponseTime, and 90 as the percentile value. For the X axis choose a Date Histogram aggregation on the timestamp field with the interval set to Auto. Then apply.


And we see a nice graph of the 90th percentile across all samples - nice, but not very useful.

But that's OK - we can also filter the data by typing in queries. In the example below, only the category pages (where the JMeter sample label is "Request Category Page") are used to calculate the percentiles.
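
The filter is just a Lucene query string typed into the Kibana search bar; with the mapping above (SampleLabel is not_analyzed) it would be something like:

SampleLabel:"Request Category Page"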


Let's drill down further. My example JMeter script iterated over a number of URLs (category pages), and the JMeter request had a common label, "Request Category Page". Let's see the top 3 worst-performing categories. The Y axis stays the same. For the buckets we first choose Split Lines with a Terms aggregation on the URL field (as each category has its own URL), taking the top 3 URLs ordered by the 90th percentile, and then add an X axis sub-aggregation of Date Histogram on the timestamp field.


You can even save this visualization and add it to a dashboard.
Once the data is in Elasticsearch, all the features of Elasticsearch and Kibana are available.

Let's look at a couple of more involved examples.

Saving variables
Let's assume that one of the steps in the test script is a search. The script reads a set of terms from a file into a variable (sku) and executes a search transaction. However, the application we are testing has some issues: if a search term is a complex phrase, returns too many results, or has too many rules defined for it (synonyms, etc.) then it takes time. We want to be able to analyze the results either for search as a whole or for the specific terms that were searched for. One option is to modify the sample label to be something like Search - ${SKU} and use wildcards whenever we want to look at search overall. But it would be cleaner if we could keep the SampleLabel as just Search and somehow save the variable "sku" and its value as part of the document we index.
A normal listener has access to the vars object or can get the variables from the thread. But the BackendListener only receives the SampleResult and, because of its asynchronous nature, has no access to the original thread. One option would be to have a different listener modify the SampleResult to add these variables to it (in a sub-SampleResult or something). But since in this example I only want a single variable, let's just use a poor man's delimiter inside the SampleLabel.
So what we will do is change the SampleLabel to something like
Search~sku=${SKU}
a) We tell Elasticsearch about this field in the mapping (actually we already did - if you look back at the mapping, we already have a "sku" field defined as a not_analyzed string).
b) We then modify our JMeter script to change the SampleLabel.

c) In our listener, we change the code to parse the SampleLabel and set the values in the getMap method, where we transform the JMeter result into a Map object.
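
The delimiters themselves aren't shown in the snippet below; based on the label format above they would be something like this (the constant names are my own):

private static final String VAR_DELIMITER = "~";   // separates the label from the key=value pairs
private static final String VALUE_DELIMITER = "="; // separates a variable name from its value
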
String[] sampleLabels = result.getSampleLabel().split(VAR_DELIMITER);
map.put("SampleLabel", sampleLabels[0]);
for (int i = 1; i < sampleLabels.length; i++) {
    String[] varNameAndValue = sampleLabels[i].split(VALUE_DELIMITER);
    map.put(varNameAndValue[0], varNameAndValue[1]);
}

And that's it (well, compile, copy the listener JAR and restart JMeter). Run the JMeter script and we should start seeing data. Here's the 90th percentile graph for search in general - the Y axis is a Percentile aggregation and the X axis is a Split Line on a Terms aggregation of the SampleLabel field, sub-aggregated by a Date Histogram on timestamp.

And here's the same thing for individual SKUs - the Y axis is a Percentile aggregation and the X axis is a Split Line on a Terms aggregation of the sku field, sub-aggregated by a Date Histogram on timestamp.


Any variable (or multiple variables) can be handled in the same way, using ~ as a separator and key=value pairs for the variables.
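
For example, a (hypothetical) label carrying two variables could look like the one below; any additional field, like category here, would also need to be added to the mapping as a not_analyzed string:

Search~sku=${SKU}~category=${CATEGORY}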


Comparing runs
A common performance task: you have a set of data, you make a change, and you want to compare the runs before and after the change. Let's try to compare graphs. I couldn't actually figure out how to do this in Kibana, so let's use a quick and dirty fix:
1) We'll settle on an epoch date, say 2015-01-01 00:00:00. No matter when we actually started our test, we are going to pretend we started it at 2015-01-01 00:00:00. For all requests that are part of the script, we shift the times so that they are all relative to 2015-01-01 00:00:00.
2) Because of 1), we no longer have to compare different time series - all the results will appear to lie on the same time frame. However, we do need a way to tell the two runs apart, so we'll introduce another variable, runId. This just needs to be something unique per run.

a) We add the new fields to the Elasticsearch mapping (already done) - the fields are NormalizedTimestamp and RunId.
b) We modify the Java code to add the corresponding parameters and default values - that was also done in the first code snippet (normalizedTime and runId).
c) Next we modify the listener code to compute the new timestamp.
We calculate the offset between the time the test starts and the epoch:
String normalizedTime = context.getParameter("normalizedTime");
if (normalizedTime != null && normalizedTime.trim().length() > 0) {
    SimpleDateFormat sdf2 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSX");
    Date d = sdf2.parse(normalizedTime);
    long normalizedDate = d.getTime();
    Date now = new Date();
    offset = now.getTime() - normalizedDate;
}


And then we simply subtract this offset from the sample's timestamp in the mapping code, so a sample recorded, say, five minutes after the test started gets a NormalizedTimestamp of roughly 2015-01-01 00:05:00:
map.put("NormalizedTimestamp", new Date(result.getTimeStamp() - offset));

And now we can run our tests (after compiling, copying the JAR and restarting JMeter). So let's run the test once and then run it again after an interval, preferably with some change in between (for example, I turned a cache off and then turned it back on).
On the Kibana homepage, if we change the interval to cover both runs, we can see that we do have data for two separate runs. (We could note down the RunIds and filter on just those - that would be needed if we had run other tests in between, but we didn't, so we are good.)
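
If we did need to filter, it would again just be a query on the RunId field using the two run ids (the values below are placeholders):

RunId:"uuid-of-first-run" OR RunId:"uuid-of-second-run"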


Next, go back to Visualize. The Y axis can still be a 90th percentile aggregation of ResponseTime. Then use Split Lines over a Terms aggregation of RunId, a Split Chart over a Terms aggregation of SampleLabel, and finally an X axis Date Histogram on NormalizedTimestamp. Note that the time filter still operates on the timestamp field, so we have to make sure it covers the actual times of both runs.


[1] Source code and Jar available at https://onedrive.live.com/?cid=1BD02FE33F80B8AC&id=1bd02fe33f80b8ac!953
Note: the JAR is built with Java 1.8.