The purpose of this document is to give a high-level introduction to software load and performance testing methodologies and strategies. It is intended for software test managers, project managers, software engineers, test leads, test engineers, QA leads, and anyone else responsible for planning and/or implementing a successful and cost-effective performance testing program.
The scope of the concepts discussed in this document is limited to test execution in an automation context.
In this context, the attributes of load and performance testing covered are as follows:
Load / Performance Test Planning
Load / Performance Tool Evaluation / Selection
Load Test Process / Methodology and Test Strategy
Load / Performance Test Start/Stop Criteria
Test Environmental Setup and Pre-Requisites
Test Scenarios Definition including Load Scenario, Data Volume, Virtual Users Ramp Rates and Scripting Guidelines etc.
Pass / Fail / Exit Criteria
Analysis and Report Generation
2 Load / Performance Test Planning
Planning plays the most vital role in making any operation or mission successful; according to the 80-20 rule, 80% of the time should be spent in planning and 20% in actual execution. Software performance test planning is just as crucial.
Any software performance test plan should contain, at a minimum:
-Performance Test Strategy and Scope Definitions.
-Test Process and Methodologies to follow in different test phases.
-Test Tool Details (Tool Evaluation, Selection, Configuration, Add-ins, Third Party Utilities Integration, OS Configurations etc.)
-Test Case Details including Scripting, Library Development and Script Maintenance Mechanisms to be used in every Performance Test Phase.
-Resource Allocations and Responsibilities for Test Participants.
-Test Life Cycle Tasks Management and Communication Media.
-Risk Management Definitions.
-Test Start /Stop Criteria along with Pass/Fail Criteria Definitions.
-Test Environment Setup Requirements (Hardware Setup, Software Setup, OS Configurations, Third Party Utilities Setup etc.)
-Multi-Platform Performance Test Strategies.
-Application Deployment Setup Requirements for Load Test.
-Virtual Users, Load (Iterations Vs Users), Volume Load Definitions for Different Load/Performance Test Phases.
-Results Analysis Algorithms and Reporting Format Definitions.
3 Load / Performance Tool Evaluation & Selection
Tool evaluation is another important task in performance test automation, and there are several things to consider. In cases where tool evaluation and selection is done entirely by the client, readers may skip this topic and proceed to the next.
While selecting any tool for load and/or performance testing, the following factors should be analyzed:
-Test budget vs. the load/performance tools available in the market that fit within that budget.
-Protocols, development and deployment technologies, platforms, middleware tools/servers, and third-party integrations of the application under test vs. support for these factors in the available tools, prioritized by their relevance to the scope of the expected test.
-Complexity of tool usage vs. availability of tool experts, along with the timeline requirements for tool scripting, load scenario creation, and tool configuration in terms of man-hours and other resource requirements.
-Tool limitations and available work-arounds, mapped against the current scope of testing.
-The tool's integration/portability with other tools used for monitoring, analysis, and test management.
-Once the base tool is evaluated and selected, the third-party monitors/tools to be used in integration with the main load testing tool should be defined.
(Third-party monitors/tools such as 'Optimize IT', 'WebLogic', 'Oracle Tools', 'Spotlight On Oracle', 'Fog-Light', and test management tools such as Mercury Test Director, Segue Silk Plan Pro, Compuware QA Center etc.)
4 Test Process / Methodology and Test Strategy
The test process, methodology, and strategy for load/performance testing will vary from project to project based on client requirements and company process implementations, but here we shed some light on the common generic performance test processes, methodologies, and strategies followed in the industry.
By using a methodology, a project can make sure that resources are applied to the problem effectively and that the people involved approach the work in a structured way. In the performance testing process there are a few classifications, as described below.
Performance Feature Testing
This task tests newly coded performance features to assert that they actually improve the performance of the application. There is always a danger that a new performance feature decreases the performance of the application instead of increasing it. Tests should be constructed so that they execute a standard benchmark-style test with the feature switched on and with the feature switched off.
The first time this task is performed, it consists of developing the benchmark tests. In later iterations it becomes re-running the benchmark tests against the new version of the code in order to track whether or not the code is improving.
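The on/off benchmark comparison described above can be sketched as follows. This is a minimal illustration, not a real load tool: the two workload functions are hypothetical stand-ins for the same task with a performance feature (here, a cache) disabled and enabled.

```python
import time

def run_benchmark(workload, iterations=5):
    """Time a fixed workload several times and return the best run.

    Taking the minimum reduces noise from other processes on the machine.
    """
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings)

# Hypothetical workloads: the same task with a caching feature off and on.
def workload_feature_off():
    return sum(i * i for i in range(50_000))

_cache = {}
def workload_feature_on():
    if "total" not in _cache:
        _cache["total"] = sum(i * i for i in range(50_000))
    return _cache["total"]

feature_off_time = run_benchmark(workload_feature_off)
feature_on_time = run_benchmark(workload_feature_on)
print(f"feature off: {feature_off_time:.5f}s  feature on: {feature_on_time:.5f}s")

# Flag a regression when the new feature is measurably slower (10% noise margin).
regression = feature_on_time > feature_off_time * 1.10
```

Re-running the same harness against each new code version gives the improvement trend over time that this section describes.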
Benchmark tests are good for building a basic understanding of what is happening and are very useful for tracking improvements over time, but they are not so good at isolating the causes of the next major performance problem.
Analysis Testing
Analysis testing refers to designing tests that attempt to isolate the next major performance problem. These tests may do something completely different from the benchmark tests in order to explore what is happening in the target application. They are designed to explore theories about where the next performance problem may be.
This is also where supplementary tools are used most often. Additional tools such as method-level profilers and operating system performance monitors can be useful in working out where a problem might be occurring.
Long Running Tests
It is important to include some tests that run for long periods of time. A number of problems may occur only after the target application has been running for more than a day, or even a week. Memory leaks in particular may take many hours before they become measurable. The length of these tests should be related to how often you plan to restart the target application: if you plan to restart it every day, the test length should simulate one day's worth of work, but if you plan to restart it only once a week, the test length should simulate a week's worth of work.
This type of testing involves long periods of time, which causes problems such as:
1. Since the tests take so long you are not typically able to perform as many of them.
2. These tests take so long that they tie up resources for long periods of time.
One thing you can do is compact the work performed over one day into a shorter period of time, taking advantage of the fact that a server machine is normally not run at one hundred percent CPU for the whole day. By creating a sustained workload for a shorter period of time you can usually simulate one day's worth of work in less time. For problems like memory leaks it normally does not matter how long the test takes; what matters is how much work the test does. A quick half-hour test may not exercise the application for long enough, but a sustained high-workload test that simulates one day's worth of work in six hours should identify any problems.
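The arithmetic behind compacting a day's work into a shorter window is straightforward; the sketch below uses made-up figures (86,400 transactions per day, a 6-hour window) to show the sustained rate the load generator would need to drive.

```python
def compacted_rate(daily_transactions, compacted_hours):
    """Sustained transaction rate needed to push one full day's
    workload through the system in a shorter test window."""
    return daily_transactions / (compacted_hours * 3600)

# Hypothetical figures: 86,400 transactions per day compacted into 6 hours.
rate = compacted_rate(86_400, 6)
print(f"required sustained rate: {rate:.1f} transactions/second")  # -> 4.0
```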
For example, a typical interactive application that services internal users might look like the following chart, with peaks at around 11 a.m. and 2 p.m.
By making the CPU do more work in less time you can compact the overall length of the test into a shorter period and so perform long running tests more easily.
Due to the hassle and restrictions involved, this kind of testing should be done only after the application is behaving reasonably well over shorter testing periods. It should not be left too late either: since these tests take so long to run, it is a good idea to start them well before the end of the time allocated to performance testing.
Distributed Testing
Another issue that often comes up is where the clients are located for testing. If the clients are located near the server, the test may not adequately simulate the eventual production configuration. If the eventual production configuration has client processes distributed nation-wide over a company's private network, then some testing should be done to simulate this; otherwise, if there is a problem with the network, testing with the client processes near the server may not uncover it.
There are practical problems with this kind of testing. There may be many different sites from which the clients are executed, and managing the load testing clients at all of these sites can be a hassle. Because of this, not all of the performance testing has to be done in a distributed way, but some of it should be. Like the long running tests, this sort of testing should be done after the application is behaving reasonably well in a non-distributed way. In this way, the methodologies can be applied based on the scope of the testing requirements at each phase of performance testing.
As for the test process in general practice: once the benchmark attributes are finalized and documented, such as the load scenario with its load (iterations per scenario/script), virtual users with ramp-up rates, business transaction definitions, the tools selected for testing, the hardware environment, the software environment (OS configuration, database, data dependencies, tool configurations), and the deployment details of the application under test, an initial base benchmark test can be conducted under normal circumstances to make sure that the application is stable enough to go through a heavy load test. If any issues are found during this test, release cycles may be repeated until the application reaches enough stability to start the heavy load/performance test.
The next task at the end of the baseline performance test is to start the load test. Load testing is an important task accomplished through stress and data volume testing. This methodology is used to establish an initial sound structure upon which the product is developed. A load test is meant to exercise the design of the application and reveal weaknesses in the targeted product before General Availability. By revealing problems during development we not only prepare for a successful deployment, but also for a long, prosperous product life cycle.
Load testing comprises individual module tests run on an iterative basis. Load tests are conducted against the application under test using a varying number of virtual users, and the results are used by developers/architects to validate and tune the code, design, and configurations to optimize product performance. The metrics established from the load test are compared against the baseline to identify performance discrepancies.
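The baseline comparison step can be sketched as below. The per-transaction timings and the 10% tolerance are hypothetical; in practice the numbers would come from the load tool's results and the tolerance from the test plan.

```python
def compare_to_baseline(baseline, current, tolerance_pct=10.0):
    """Flag transactions whose current timing regressed past the
    baseline by more than the given tolerance."""
    regressions = {}
    for txn, base_time in baseline.items():
        cur = current.get(txn)
        if cur is not None and cur > base_time * (1 + tolerance_pct / 100):
            regressions[txn] = (base_time, cur)
    return regressions

# Hypothetical per-transaction average response times in seconds.
baseline = {"login": 1.0, "search": 2.5, "checkout": 3.0}
current  = {"login": 1.05, "search": 3.2, "checkout": 2.8}
discrepancies = compare_to_baseline(baseline, current)
print(discrepancies)  # only "search" exceeds baseline + 10%
```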
These cycles of load testing and tuning can be repeated until the product is stable enough to go to production, or until it reaches the stability level defined by the client.
Along with load testing, stress testing is implemented as a subset of load testing. The stress is nothing but applying load by ramping up virtual users in groups at pre-defined rates (e.g. Group 1 ramps at a rate of 5 users per scenario: 5, 10, 15, 20). In some cases the client expects to benchmark the bottlenecks for specific application functionalities that are most commonly accessed by all users at the same time. For such cases, we can put rendezvous points at the appropriate transaction actions: all users ramp up at the pre-defined rate, but when it comes time to execute the rendezvous action, each user waits until all virtual users have reached that point, and then all perform the action at once. In such a scenario bottlenecks can be easily located. Stress tests are intended to be modular in design so that each functional script can be run independently.
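The rendezvous mechanism can be illustrated with Python's `threading.Barrier`, which behaves like a rendezvous point: every thread blocks at `wait()` until all parties arrive, then all proceed together. The user count and sleep times here are made up; a real load tool manages this internally.

```python
import threading
import time

NUM_VUSERS = 8
rendezvous = threading.Barrier(NUM_VUSERS)   # the rendezvous point
results = []
lock = threading.Lock()

def virtual_user(user_id):
    # Each virtual user performs its own earlier transactions first...
    time.sleep(0.01 * user_id)          # stand-in for preceding test steps
    # ...then blocks until ALL virtual users reach the rendezvous point.
    rendezvous.wait()
    # Every user now fires the critical transaction at the same instant.
    started = time.monotonic()
    with lock:
        results.append((user_id, started))

threads = [threading.Thread(target=virtual_user, args=(i,))
           for i in range(NUM_VUSERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The spread of start times shows how tightly the load was concentrated.
starts = [s for _, s in results]
spread = max(starts) - min(starts)
print(f"{len(results)} users fired within {spread * 1000:.2f} ms of each other")
```

Because all users hit the target action in a narrow window, any bottleneck in that transaction shows up clearly in the results.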
In the performance testing process there are cases in which some part of the application is data-volume dependent, such as search screens whose query/stored-procedure performance depends on the amount of data in the database. For such cases, a large dump of data should be inserted into the database through SQL scripts or automation scripts before starting the test, so that the results from those screens can help with database tuning and optimization.
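A minimal sketch of such a pre-test data seeding step is shown below, using an in-memory SQLite database as a stand-in for the application's real database. The table name, columns, and row count are all illustrative.

```python
import sqlite3

# In-memory database as a stand-in for the application's real DB;
# the table and column names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Bulk-insert a large synthetic data set BEFORE the load test starts,
# so data-dependent screens are exercised against realistic volumes.
rows = ((i, f"customer_{i}", f"city_{i % 100}") for i in range(100_000))
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(f"{count} rows seeded for the data-volume test")
```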
At the end of all such tests, the results from the load testing tool and monitoring tools give a clear picture of the tuning requirements in the system (CPU, memory, OS, hardware, software, code, application configuration, middle-tier servers etc.), and these cycles of load testing, results analysis, and tuning can be repeated until the application reaches the stability expected by the client.
5 Load / Performance Test Start/Stop Criteria
For any performance testing there should be defined and documented start and end criteria, to avoid an ad hoc process producing unreliable results. These start/end criteria should be well defined in the performance test plan along with all the risk management factors.
Performance or load testing of an application, or of any module of it, should start only when the design, database structure, and application development platform are finalized and no major changes are expected at any layer of the application that would directly or indirectly affect performance. Again, this depends on the scope of testing, i.e. whether the test is to be conducted for a specific module or for the whole system.
Once the load or performance test is conducted and the results are collected and analyzed, the tuning cycles will carry on, but the end criteria should be well defined to stop these testing and tuning cycles, based on factors such as the product GA release timelines (project deadlines) and the client requirements in terms of system stability, scalability, and reliability under load, as well as performance tolerance criteria.
In some projects the end criteria are defined from the client's performance measurement requirements for each section of the application; if the product reaches that expected level, this may be considered the end criteria for performance testing.
6 Test Environmental Setup and Pre-Requisites
In performance testing, the term test environment encompasses the hardware environment (machine, memory, virtual memory, number of CPUs etc.) and the software environment (OS, OS configuration, file system settings, free space for logs, application configuration along with the middle-tier servers and associated configurations, database, network etc.).
Of all the environmental variables defined above, some or all might be used in the setup, depending on the deployment the client/customer expects to have in production.
This setup also involves automation tool installation and configuration, agent installation and setup on the load clients, third-party monitors on clients/servers etc.
If all of the above variables are set up before starting the test, there may be few remaining pre-requisites, but items such as reference data and the insertion of test input data into the database must still be in place.
System settings at the OS level, tier level, server level etc. should also be listed in a document as checklists and verified before starting any performance test.
7 Test Scenarios
Test scenario definition includes various tasks such as defining business scenarios, writing automation scripts for the test cases corresponding to the defined business/functional scenarios, mapping the expected test data volume to virtual user ramp rates, and scripting guidelines.
In test scenario development, the scripts can be made data-driven to facilitate iterative testing in load test scenarios. Once these details are well defined in the test automation plan, virtual user test scripts can be written to simulate user client actions and client-server request/response hits. On completion of baseline scripting, and based on the data-driven requirements, the test scripts can be parameterized so that the same functionality/user actions can be executed iteratively against different sets of test data.
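Data-driven parameterization can be sketched as below: the same script body runs once per row of a data file. The CSV contents and the transaction function are hypothetical stand-ins for real test data and a real virtual-user action.

```python
import csv
import io

# Hypothetical test data: each row parameterizes one iteration of the script.
test_data = io.StringIO(
    "username,search_term\n"
    "vuser01,printer\n"
    "vuser02,laptop\n"
    "vuser03,monitor\n"
)

def run_search_transaction(username, search_term):
    """Stand-in for one parameterized virtual-user action."""
    return f"{username} searched for {search_term}"

# The same script body executes once per data row: iterative and data-driven.
results = [run_search_transaction(**row) for row in csv.DictReader(test_data)]
for line in results:
    print(line)
```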
In scenarios where the performance of some sections of the application depends on the volume of data residing in the database, a data-volume-based performance test is a must. In such cases, data insertion SQL scripts can be written and executed before the load test, and the data-volume-based test scripts should be mapped to this data.
Along with these factors, the number of virtual users and the amount of load (i.e. a preset number of iterations per script) should be pre-defined from the customer's production load volume in peak and non-peak hours. In the stress test, which is itself a part of the load test, the virtual users need to ramp up at pre-defined rates, e.g. 5, 10, 15, 20, 25 users. In addition to setting the stress ramping rate of the virtual users in the load scenario, there are cases where the requirement is to perform a particular action/operation with all virtual users at the same time in order to find the bottleneck points. For this, the test scripts should contain rendezvous points for such actions: while the virtual users are performing different operations for different test cases, each waits until all virtual users reach the rendezvous point, they execute that action at the same time, and at the end of the action the ramp-up of virtual users is restarted.
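The ramp-up described above is just a stepped schedule; the sketch below computes one for a hypothetical scenario (20 users total, 5 more every 60 seconds) so the load tool configuration can be checked against it.

```python
def ramp_schedule(total_users, step, interval_seconds):
    """Times at which each group of virtual users becomes active,
    e.g. stepping 5, 10, 15, 20 users at a fixed interval."""
    schedule = []
    active = 0
    t = 0
    while active < total_users:
        active = min(active + step, total_users)
        schedule.append((t, active))
        t += interval_seconds
    return schedule

# Hypothetical scenario: ramp to 20 users, 5 at a time, every 60 seconds.
for offset, active in ramp_schedule(20, 5, 60):
    print(f"t+{offset:>3}s: {active} active users")
```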
8 Analysis and Report Generation
Once the results are ready, analysis is the most important job in the whole performance test, especially when explicit load or performance requirements have not been defined.
In the analysis, the main goal should be to find the bottleneck points, i.e. to track down the transactions, users, or hits causing the greatest reduction in performance. The main factors to be considered are:
-Request / Response Timings to and from Client and Server
-Transaction Timings Vs Users
-Transaction Timings Vs Load
-Transaction Timings Vs CPU Utilization
-Server Memory Utilization
-Database Server Performance using DB Server Monitors
-Web / Application Server Performance
-Paging Rate Ratios
-Open Cursor Factors in Database
-Load Balancer Factors
-Bean Performance Factors using Monitors like Optimize IT for Java Based Applications
-Other Monitors like Application Server (e.g. Web-logic) Monitors etc.
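As a minimal sketch of the statistical consolidation applied to factors like these, the code below computes average, 90th-percentile, and maximum response times per transaction. All transaction names and timings are made up; real figures would come from the load tool's result set.

```python
def percentile(sorted_samples, pct):
    """Nearest-rank percentile of a pre-sorted sample list."""
    k = max(0, round(pct / 100 * len(sorted_samples)) - 1)
    return sorted_samples[k]

# Hypothetical response times (seconds) per transaction from the load tool.
raw = {
    "login":  [0.8, 0.9, 1.1, 1.4, 3.9, 0.7, 1.0, 1.2, 1.1, 0.9],
    "search": [2.1, 2.4, 2.2, 7.8, 2.3, 2.0, 2.6, 2.5, 2.2, 2.4],
}

summary = {}
for name, samples in raw.items():
    s = sorted(samples)
    summary[name] = {
        "avg": sum(s) / len(s),
        "p90": percentile(s, 90),
        "max": s[-1],
    }
    stats = summary[name]
    print(f"{name:<8} avg={stats['avg']:.2f}s "
          f"p90={stats['p90']:.2f}s max={stats['max']:.2f}s")
```

Percentiles matter here because a handful of slow outliers (like the 3.9 s login or 7.8 s search above) can hide behind a healthy-looking average.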
After consolidating all the results for the above points, statistical analysis should be done and the performance test report created, such that the development, design/architecture, database/IT, and PMG groups get a clear idea of the application's performance and all the factors affecting it.
After analysis, and based on the scope, the corresponding tuning can be done: OS tuning, database tuning, memory upgrades, CPU counts in multi-CPU machines, code tuning, application server cache tuning, or any other tuning within scope. The same tests executed earlier should then be repeated in the tuned environment to measure the differences in results before and after tuning. These tuning cycles can be repeated until the pre-defined required performance is achieved. In the next version of this document we will provide examples for each type of performance analysis, with the corresponding graphs and all the algorithms used for analysis.
9 Pass / Fail / Exit Criteria
At the end of the load/performance test execution we can count the number of failed users, failed transactions, or failed scripts, but the pass/fail criteria for a particular test should be well defined before the test is executed, as these criteria depend entirely on the scope of testing and the environment. For example, in some cases a few failed transaction steps have workarounds, or transactions and virtual users fail due to unusual behavior of attributes unrelated to the application; at other times, tests, scripts, virtual users, or transactions fail because of load tool configuration or load tool defects. It is therefore very important to analyze and precisely classify the defects that occurred in the test before deciding whether the test passed or failed.
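One way to encode such a pre-defined pass/fail criterion is sketched below. The failure log, cause labels, and 1% tolerance threshold are all hypothetical; the point is that tool-side failures are separated out before the verdict is computed, as the text recommends.

```python
def evaluate_test(total_transactions, failures, tolerated_failure_pct=1.0):
    """Apply a pre-defined pass/fail criterion: failures attributable to
    the load tool itself are excluded before computing the failure rate."""
    app_failures = [f for f in failures if f["cause"] == "application"]
    rate = 100.0 * len(app_failures) / total_transactions
    return ("PASS" if rate <= tolerated_failure_pct else "FAIL", rate)

# Hypothetical failure log mixing tool-side and application-side causes.
failures = [
    {"txn": "checkout", "cause": "application"},
    {"txn": "login",    "cause": "load_tool"},   # tool config issue: excluded
    {"txn": "search",   "cause": "application"},
]
verdict, rate = evaluate_test(total_transactions=500, failures=failures)
print(verdict, f"({rate:.2f}% application failure rate)")
```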
Based on the items that failed due to any of the performance test parameters, the failed item details should be included in the application's limitations list. This list gives a clear picture of the application's scalability, load, and performance limitations with respect to the corresponding load parameters, which in turn provides input for tuning and enhancing the application.
Tuning cycles will be driven by the test results and test phases will be repeated, but the exit criteria should be pre-defined to avoid an open-ended tuning life cycle.