
A Maturity Model for Application Performance Management Process Evolution

A model for evolving an organization's application performance management process

As IT systems form the backbone of business operations, their performance plays a key role in business growth. Understanding this, organizations work toward obtaining the best performance from their software systems to maximize the ROI on IT. An application's performance can be improved by tuning numerous factors such as the underlying infrastructure, deployment configuration, application architecture, design, and workload. Yet there is another important factor driving the performance of all of an organization's applications: the performance management processes the organization adopts. The performance management process consists of the activities performed to gain a better understanding and control of application performance.

Here we present a maturity model that will help organizations evaluate and evolve their processes along certain key dimensions. The scope of the model is limited to the activities that have to be carried out as part of the performance management process. In addition, because the people and technology used for process implementation are equally important to achieving the required success, equal emphasis has been given to them. The model describes a six-level evolutionary path for progressively maturing performance management processes in an organized and systematic manner. With increasing maturity of the performance management process implementation, organizations can see a positive impact of performance engineering adoption on the business.

Levels of Performance Management Maturity
Based on the research and experience gained through various performance engineering exercises, we define a performance management process maturity model with maturity levels ranging from level 0 to level 5, as shown in Figure 1.

Figure 1: Performance Management Maturity Model

Next we detail the characteristics of a project at various maturity levels. The necessary activities, team, and technology required to achieve higher maturity are also highlighted.

Level 0: Ad-hoc
At this level, the application's performance gets the least attention, as the major focus is on meeting functional requirements. Performance gets attention only when there is a problem in the production environment. Organizations function in an ad-hoc manner, as no performance management process exists. Vendors and performance and technical domain experts are called in to fix performance problems as and when they occur. The use of technology is also unplanned: available tools are used only to identify and resolve performance issues whenever they occur. This unplanned use of people and technology increases application maintenance cost and downtime.

Transition to Level 1: Systematic Problem Resolution Mechanisms
Level 1: Systematic Performance Resolution

As the downtime spent resolving ad-hoc performance issues in production increases, resulting in heavy business losses, the importance of a properly organized performance exercise becomes apparent. An immediate priority then is to define and follow a systematic problem resolution mechanism, ensuring that performance problems are resolved as quickly as possible and system downtime is reduced.

Systematic performance resolution processes with phases such as discovery, detection, isolation, and resolution are used to fix the problems. Though a process is followed, it focuses only on performance issue resolution. The same tools used at maturity level 0 are used for problem identification and resolution, but in a more structured manner. The performance management strategy is limited to bottleneck analysis and tuning, so technology domain experts are called in to fix the performance problems in production. They follow a defined methodology to do application (configuration) tuning until the performance targets are achieved.
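
To make this concrete, the sketch below (illustrative only, not prescribed by the model) shows how a production performance incident might be tracked through the same discovery, detection, isolation and resolution phases so that downtime per incident can be measured and reduced; the class and its phase-tracking mechanics are assumptions for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.EnumMap;
import java.util.Map;

/**
 * Illustrative sketch of a systematic problem-resolution record: every
 * production incident is walked through the same phases, and the elapsed
 * time is captured so downtime (time to resolve) can be tracked.
 */
public class PerformanceIncident {

    enum Phase { DISCOVERY, DETECTION, ISOLATION, RESOLUTION }

    private final Map<Phase, Instant> completedAt = new EnumMap<>(Phase.class);
    private final Instant openedAt = Instant.now();

    /** Marks a phase complete; phases are expected to be closed in order. */
    public void complete(Phase phase) {
        completedAt.put(phase, Instant.now());
    }

    /** Total time from incident start to resolution, i.e., the downtime incurred. */
    public Duration timeToResolve() {
        Instant resolved = completedAt.get(Phase.RESOLUTION);
        if (resolved == null) {
            throw new IllegalStateException("Incident not yet resolved");
        }
        return Duration.between(openedAt, resolved);
    }
}
```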

Though using a systematic process reduces application downtime, performance problems and the risk of their recurrence still persist. The steps taken are still aimed at problem resolution, not at identifying and removing imminent performance problems. So under unusual circumstances, such as holiday transaction volumes, performance SLA violations can still occur.

Transition to Level 2: Robust go-no-go gating mechanism
Level 2: Performance Testing

With increasing performance problems in production, organizations realize the inadequacy of ad-hoc and systematic resolution processes. As problem resolution exercises multiply, organizations start feeling the cost of resolving a performance problem in production. Subsequently, management realizes the importance of identifying and eliminating performance problems before applications go into production.

To ensure this, a comprehensive performance testing methodology is put in place so that applications are deployed in production without performance bottlenecks. Such a robust gating mechanism ensures that applications are tested thoroughly for all the business-critical steps under production-like conditions.

At this level, performance testing is recognized as a separate function of the software development process. The focus shifts toward a more planned process of problem identification before the deployment of an application. During performance testing, various load testing and monitoring tools are used for performance problem identification. Exhaustive performance test plans are prepared by experienced performance analysts who understand the application and its performance goals. Performance test engineers, skilled in using performance testing and monitoring tools, record and run the scripts. In addition, the performance experts analyze the performance test results and recommend the appropriate steps to be followed to resolve the performance bottlenecks. Thus, there is a cost for the specialized people involved in the project.
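
As an illustration of such a gate (a minimal sketch rather than any particular tool's API), the following Java snippet checks measured response times for business-critical steps against SLA thresholds and returns a go/no-go decision; the transaction names and SLA values are invented for the example.

```java
import java.util.Arrays;
import java.util.Map;

/** Minimal sketch of a go/no-go performance gate over load-test results. */
public class PerformanceGate {

    // SLA: allowed 95th-percentile response time (ms) per business-critical step (assumed values)
    private static final Map<String, Double> SLA_P95_MS = Map.of(
            "login", 800.0,
            "search", 1200.0,
            "checkout", 2000.0);

    /** Returns the 95th percentile of the measured response times. */
    static double percentile95(double[] samplesMs) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int index = (int) Math.ceil(0.95 * sorted.length) - 1;
        return sorted[Math.max(index, 0)];
    }

    /** A release passes the gate only if every critical step meets its SLA. */
    static boolean passesGate(Map<String, double[]> measuredMs) {
        boolean pass = true;
        for (Map.Entry<String, Double> sla : SLA_P95_MS.entrySet()) {
            double p95 = percentile95(measuredMs.get(sla.getKey()));
            boolean ok = p95 <= sla.getValue();
            System.out.printf("%-10s p95=%7.1f ms (SLA %7.1f ms) -> %s%n",
                    sla.getKey(), p95, sla.getValue(), ok ? "GO" : "NO-GO");
            pass &= ok;
        }
        return pass;
    }
}
```

A release candidate that fails any check is held back for tuning before it is promoted to production.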

Though a problem is thus detected and resolved before it occurs in production, this is a one-time activity performed at a much later stage of the software development life cycle. For early detection of performance issues, there is a need for a more strategic method that puts a planned performance analysis process in place throughout application development.

Transition to Level 3: Early detection of performance issues through validation
Level 3: Early Performance Validation

With performance problems now resolved before production, organizations see fewer performance problems reported and reduced downtime, and thus start realizing the significance of early problem detection and resolution. In the course of performance testing, issue identification and resolution, they discover that issues injected during early life-cycle stages such as architecture or design are the costliest to fix when detected this late. Organizations therefore look for processes that conserve resources, in terms of both effort and cost. This takes them to the next level of maturity.

At maturity level 3, organizations use processes that can help them identify problems as the application is built by validating each stage of an application development life cycle. The defined architecture, along with each of the individual components and modules, are validated for performance. The deployment configurations and hardware capacity are also validated before procuring them.

The "V" model performance validation process is followed to validate the performance at each development life cycle phase. The process can include activities like NFR analysis, workload analysis, architecture validation using performance modeling, and code profiling. Performance profiling, testing and monitoring tools are used to continuously track the performance, which is an important task for performance validation. Performance modeling and simulation tools are used to carry out the architecture and design validation. Here the performance analysts are involved in all phases of the development life cycle, unlike level 3 where they are required only at the system integration phase. The technology experts are required to review the design and code for performance. Also the people knowledgeable about performance modeling, profiling and testing are required to verify the performance at all stages.

Even though the strategy followed ensures that performance issues are identified and eliminated quite early in an application's development life cycle, it is still carried out as a reactive process. It thus remains a one-time reactive activity, performed only somewhat earlier than the level 2 maturity processes.

Transition to Level 4: Proactive holistic performance engineering solution
Level 4: Performance Engineering

At the previous levels, all the processes followed are reactive; that is, they react to performance problems after they occur. The strategies make incremental improvements, across the levels, in how early a problem is detected and fixed. However, with a reactive approach, a lot of effort is spent identifying and resolving performance issues in the application. Organizations therefore need a complete performance engineering approach that aims to prevent performance issues from being injected into an application in the first place.

At this level, a proactive and holistic performance engineering process is followed to take care of application performance during all phases, with validation phases to further ensure the absence of performance issues. The exercise starts right at the requirements-gathering stage, where performance requirements are also captured. Then, throughout the development life cycle and across the application's technology stack, the way performance is engineered into the application is tracked against available best practices.

A performance engineering process is integrated with the development process to build performance into the application. During the requirements-gathering phase, performance requirements are gathered alongside functional ones. The architecture and design phase involves architecture creation using patterns, anti-patterns, modeling and validation. The build phase follows technology best practices and includes code performance review and profiling. The system testing phase is followed by performance testing and analysis phases to deliver a scalable application.

The technology adopted supports the engineering exercise right from the beginning of the application development life cycle. Non-functional requirements gathering is done using workload modeling tools and techniques. Then performance modeling and simulation tools are used for architecture validation. Later stages use profiling tools for code validation, and performance testing and monitoring tools to benchmark the application's performance.
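
For example, workload modeling during NFR gathering can be as simple as translating expected users, think time and transaction mix into per-transaction throughput targets using the interactive response-time law; the sketch below is illustrative, and every figure in it is an assumption.

```java
import java.util.Map;

/** Sketch of turning a business workload into throughput targets during NFR gathering. */
public class WorkloadModel {

    public static void main(String[] args) {
        int concurrentUsers = 500;        // expected peak concurrency, assumed
        double thinkTimeSec = 10.0;       // average user think time, assumed
        double avgResponseSec = 2.0;      // assumed average response time

        // Interactive response-time law: system throughput X = N / (R + Z)
        double totalThroughput = concurrentUsers / (avgResponseSec + thinkTimeSec);

        // Business transaction mix (fractions of total traffic), assumed
        Map<String, Double> mix = Map.of("browse", 0.60, "search", 0.30, "order", 0.10);

        System.out.printf("Total required throughput: %.1f requests/s%n", totalThroughput);
        mix.forEach((txn, share) -> System.out.printf(
                "  %-7s -> %.1f requests/s%n", txn, totalThroughput * share));
    }
}
```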

Performance experts are involved in the project from the requirements-gathering phase onward to focus on the application's performance. Architects experienced in the underlying technology create the application architecture, and performance architects review it. People with performance modeling knowledge evaluate the architecture for bottlenecks and scalability. Next, the performance analysts review and test the design and code. Last, but not least, specialized testers, engineers and architects do the performance testing, analysis and tuning.

Though organizations here have a proactive approach for preventing performance-related problems, it ends once the application is developed and deployed in a production environment. It lacks a feedback mechanism for converting the proactive approach toward performance into a continuous one.

Transition to Level 5: Proactive continuous performance optimization solution
Level 5: Continuous Performance Optimization

While applications are in production, they need to cope with constantly changing requirements and an ever-increasing workload. Managing their performance is of utmost importance for an operations team. At earlier levels, the objective was restricted to attaining the required performance goals up to the deployment of an application.

Organizations at level 5 make sure that the application's performance goals are always met in production, even as time passes. They establish continuous monitoring, enhancement and optimization mechanisms. An application's performance is continuously monitored to track its behavior. Performance modeling and simulation exercises are carried out to foresee any imminent performance issues. A feedback mechanism is put in place to take corrective measures before performance starts degrading.
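
A minimal sketch of such a feedback mechanism is shown below: it keeps a rolling window of response-time samples, fits a simple linear trend, and raises an alert when the projected value would breach the SLA, that is, before the degradation actually violates it. The window size, SLA and look-ahead horizon are illustrative assumptions, not recommendations.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of a proactive, trend-based monitor that alerts before an SLA breach. */
public class ProactiveMonitor {

    private static final int WINDOW = 60;          // samples kept in the rolling window (assumed)
    private static final double SLA_MS = 2000.0;   // response-time SLA (assumed)
    private static final int LOOK_AHEAD = 30;      // samples to project forward (assumed)

    private final Deque<Double> samples = new ArrayDeque<>();

    /** Records a new response-time sample and returns true if a proactive alert fires. */
    public boolean record(double responseMs) {
        if (samples.size() == WINDOW) {
            samples.removeFirst();
        }
        samples.addLast(responseMs);
        return projectedBreach();
    }

    /** Fits a least-squares line to the window and projects it LOOK_AHEAD samples ahead. */
    private boolean projectedBreach() {
        int n = samples.size();
        if (n < 2) {
            return false;
        }
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        int x = 0;
        for (double y : samples) {
            sumX += x; sumY += y; sumXY += x * y; sumXX += (double) x * x;
            x++;
        }
        double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double intercept = (sumY - slope * sumX) / n;
        double projected = intercept + slope * (n - 1 + LOOK_AHEAD);
        return projected > SLA_MS;   // the degradation trend will breach the SLA soon
    }
}
```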

At this level, the Performance Management process is considerably evolved, with a proactive and continuous vision toward application performance maintenance.

Proactive and continuous performance engineering and capacity planning processes are put in place to take care of application development, re-engineering and problem resolution.

A thorough capacity management solution, including tools for monitoring, modeling, forecasting and simulation, is used. These tools help convert the one-time performance engineering exercise followed at level 4 into a continuous and proactive engineering solution.

The profiles of the people involved in project development are the same as at level 4, as all those activities are carried out here as well. In addition, at this level, system administrators use the available set of tools for continuous monitoring and understand the alerts generated at various stages.

The performance management process at maturity level 5 has the following characteristics:

  • Integrated approach: The performance management approach integrates the three dimensions of people, process and technology throughout the application life cycle.
  • Enduring application performance: Continuous performance management principles embedded throughout development and maintenance result in better application performance at all times.
  • Faster outputs: The process is able to address performance concerns proactively and continuously in parallel rather than sequentially, which produces faster results.
  • Higher project quality: Well-documented and repeatable processes eventually contribute to higher project quality.

Table 1 summarizes the performance management characteristics across the dimensions of each level of maturity.

Alignment Among Dimensions for Transitioning to Higher Maturity Levels
The three dimensions of people, process and technology are critical for the success of any project. This is reflected in the previous section, where all maturity levels are explained in terms of the use of those dimensions. We further emphasize that for the performance management process to mature to higher levels, alignment among all the dimensions is required. Achieving maturity in any one dimension alone is never enough. For example, involving all the experts in performance exercises alone is not sufficient; nor is providing the best technologies to the best people in the business of performance engineering without laying down a process. It is far more effective when all the required technologies, and the processes to apply them, are provided to performance experts. The alignment of these dimensions in a management approach decides where an organization stands on the maturity model. To illustrate this, three classes of organizations focused on different dimensions are shown in Figure 2.

Figure 2: Alignment of 3 Dimensions in different organizations and their maturity level

Impact of Maturity Levels on Businesses
Organizations that evolve their application performance management process guided by this maturity model can improve their applications' performance, reduce downtime in the production environment, and subsequently increase customer satisfaction [1]. Also, with the increased predictability of an application's performance, the risks of failure are minimized. So organizations save a lot, directly and indirectly, on cost.

Cost
For immature organizations, a high percentage of development effort is spent on fixing performance issues (i.e., on rework). As no heed is paid to performance practices, scores of issues go undetected until production. When the application is used by end users, a "tsunami" of performance problems emerges and overwhelms the project with ad-hoc and/or systematic tuning work. With mounting defect levels, the application's downtime increases, resulting in business losses that multiply the costs. Mature organizations (i.e., at levels 4 and 5) proactively target and resolve performance problems, making application performance predictable at all times. Thus, though the development cost increases by some percentage, the cost saved on downtime and tuning brings down the net performance management cost.

Figure 3 depicts the application's cost distribution across various buckets at each maturity level of the performance management process. The cost is divided into the percentages spent at each stage of an application's life cycle. Organizations at lower levels do not consider the business impact of the abrupt degradation of an application's performance and the subsequent ad-hoc responses. So at level 0, the majority of performance management exercises are carried out at the production stage, mostly as a response to a performance issue in an application. As the cost spent on performance is ad-hoc and highly unplanned, it becomes difficult to track the ROI on system development and maintenance. As an organization moves up the maturity levels, the importance of performance management becomes apparent, so the cost spent on performance management increases in the earlier phases of the application development life cycle and reduces in production.

As seen in Figure 3, the cost spent in production reduces across the maturity levels, while for the other phases, covering design, development and testing, it increases slightly with increasing maturity.

Though there are no immediate returns from the investments in performance exercises during the earlier phases, a long-term vision is in place. This leads to a lower overall expenditure on performance issue resolution, as shown in Figure 3.

Figure 3: Change in cost distribution across SDLC for different maturity levels

Risks
For performance considerations, software risk can be defined as a combination of the probability that a software system may fail or become slow and the severity of the damage caused by the failure [1]. So when an organization uses the processes at levels 0 and 1, there is a high risk of failure or poor performance even under normal system usage. A lack of process leads to reduced predictability, in turn proliferating performance-related risks. Since the system's performance is handled in an ad-hoc manner, user complaints about application performance and downtime are a regular feature.
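
Expressed as a back-of-the-envelope calculation (the scales and sample values below are assumptions for illustration, not data from the model), this definition can be read as a simple risk score:

```java
/** Sketch of the risk definition above: failure probability times damage severity. */
public class PerformanceRisk {

    /** probability in [0,1]; severity on an assumed 1-10 business-impact scale */
    static double riskScore(double failureProbability, double severity) {
        return failureProbability * severity;
    }

    public static void main(String[] args) {
        // Ad-hoc (level 0/1) process: frequent failures, high business impact (assumed values)
        System.out.println("Level 0/1 risk: " + riskScore(0.30, 9));
        // Mature (level 4/5) process: rare failures, contained impact (assumed values)
        System.out.println("Level 4/5 risk: " + riskScore(0.02, 4));
    }
}
```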

As an organization's processes move to levels 2 and 3, using performance validation to filter out critical performance issues, the risk of performance issues appearing after deployment decreases. Yet performance is still not very predictable; any changes in the application's usage, enhancements and/or environment can increase the risk of failures.

To tackle this problem, the process at the next level involves system architecture, design and code reviews and validations to ensure performance under changing environments, thus lowering the risks.

At the next level, system behaviors, metrics affecting system performance and other business measures are captured, reported, analyzed and predicted to help find ways to deliver good performance continuously, thus minimizing the risks of failure proactively. Consequently there is a lower risk of failure and customer dissatisfaction.

Summary and Conclusion
Performance exceptions under unusual circumstances, such as holiday transaction volumes, incur high maintenance costs and cause a huge loss of revenue and damage to a business's image. Thus performance management processes need to be focused on and improved. The performance management process maturity model proposed in this article describes the key aspects of processes at the various maturity levels. The article also highlights the benefits obtained by mature organizations, such as high-performance products and highly available systems. It is recommended that organizations following ad-hoc performance tuning processes move to systematic and robust gating processes. Further, companies with reactive strategies need to adopt a proactive and continuous performance engineering methodology for pre- and post-deployment performance management.

Reference

  1. Note that setting an appropriate performance management strategy alone does not guarantee the mentioned benefits; it has to be supported by correct infrastructure design and sizing, along with other application architecture decisions.

More Stories By Shyam Kumar Doddavula

Shyam Kumar Doddavula works as a Principal Technology Architect at the Cloud Computing Center of Excellence Group at Infosys Technologies Ltd. He has an MS in computer science from Texas Tech University and over 13 years of experience in enterprise application architecture and development.

More Stories By Nidhi Tiwari

Nidhi Tiwari is a Senior Technical Architect with SETLabs, Infosys Technologies. She has over 10 years of experience in varied software technologies. She has been working in the field of performance engineering and cloud computing for 6 years. Her research interests include adoption of cloud computing and cloud databases along with performance modeling. She has authored papers for international conferences, journals and has a granted patent.

More Stories By Amit Gawande

Amit Gawande works as a Technology Lead at Infosys Labs, the research wing of Infosys Limited. He has gained considerable experience in performance engineering methodologies and cloud computing over his 5 years in the field. His research interests include performance modeling and simulation techniques.
