IT Counterpoint chat: APM done right
April 27, 2011 at 4:00 PM
Mike Sydor, a software solutions architect at CA Technologies, specializes in helping clients define, begin or restart application performance management (APM) initiatives. Drawing on the experiences he’s had and strategies he’s devised over the last five years, Sydor recently authored a definitive book on the topic: APM Best Practices: Realizing Application Performance Management. In this IT Counterpoint interactive chat, moderated by Beth Schultz, contributing editor, Sydor talks about the differences between APM and availability monitoring, the customer experience, picking the right APM tools – and more. The following is an edited transcript of the chat. For the on-demand version, click here.
Mike, start off by sharing your background with our audience. What makes you an APM expert, and why’d you decide to share that expertise in this book? I am a software solutions architect who has worked on large-scale distributed computing solutions for over 20 years now. I specialize in helping clients define, begin or restart APM initiatives. So the book is really a distillation of my experiences and strategies from the last five years.
Why do I need APM? Is it more of an imperative with today’s IT environments than previously? Begin [by realizing] that availability monitoring (platform up or down) is not helping you as much as it used to. When your helpdesk is a better indicator of your application performance than your monitoring dashboards, then it is definitely time to consider APM.
The emphasis today is on using the customer experience as the ultimate measure of a software system’s effectiveness. APM gives you visibility into the customer experience, as well as the underlying software and physical infrastructure.
So what are the problems with availability monitoring, then – it once was adequate but no longer? Why? The immediate difference between availability and performance monitoring is the resolution at which you get updates. Availability runs at about five to 30 minutes, depending on the agent or polling capabilities, while APM is at reporting intervals of one minute or less.
APM is also able to handle a much richer set of metrics – many thousands – while availability is often limited to about a dozen or so.
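The resolution point can be made concrete with a small sketch. It is only an illustration: the function name, the 60-minute horizon and the slowdown window are hypothetical, and polls are assumed to fire at whole-minute marks starting at minute 0.

```python
def polls_that_see_problem(problem_start_min, problem_end_min,
                           interval_min, horizon_min=60):
    """Count polls (at t = 0, interval, 2*interval, ...) that land
    inside the problem window [problem_start_min, problem_end_min).

    All values are whole minutes. This is an illustrative model,
    not how any particular monitoring product schedules its polls.
    """
    poll_times = range(0, horizon_min, interval_min)
    return sum(1 for t in poll_times
               if problem_start_min <= t < problem_end_min)


# Hypothetical slowdown lasting from minute 6 up to minute 9.
five_minute_hits = polls_that_see_problem(6, 9, interval_min=5)
one_minute_hits = polls_that_see_problem(6, 9, interval_min=1)
print(five_minute_hits, one_minute_hits)
```

Under these assumptions the five-minute poller fires at minutes 5 and 10 and never observes the slowdown, while the one-minute poller observes it three times.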
You mention customer experience as being the measure today. How does that fit into APM tools? The customer attention span is the subjective measure of a good experience. If you don’t bore or annoy the customer, it’s a great Web experience! Whether you measure this with synthetics, real transactions, logging or actual component interactions, you really need to have your measurements at the same resolution as your customer interactions. If they will sit for a minute, grabbing your page, then you need to measure at a minute (or less) to know when they are experiencing a degraded experience.
In order to know what the customer/end user is actually doing, you need to install a piece of hardware to collect the actual end-user experience, don't you? Hardware to capture the end-user experience is not always necessary. The HTTP protocol has some handshaking that can be leveraged so that you can get accurate measurements of their interactions. This becomes very powerful when you have a population of users doing the same thing. Now you get statistics and the measure of customer experience becomes quite precise.
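As a rough sketch of what "statistics" over a population of users might look like, the following summarizes a list of response times with a median and a 95th percentile. The function name and the choice of these two statistics are assumptions for illustration; the chat does not say which statistics an APM tool reports.

```python
import statistics


def summarize_response_times(samples_ms):
    """Summarize response times (milliseconds) from many users
    performing the same transaction.

    Returns the sample count, the median and the 95th percentile.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # n=20 yields 19 cut points at 5% steps; the last one is the
    # 95th percentile.
    cuts = statistics.quantiles(samples_ms, n=20, method="inclusive")
    return {
        "count": len(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": cuts[18],
    }
```

With only a handful of samples the 95th percentile is dominated by one or two requests; with a large population of users doing the same thing, it becomes a stable number that can be tracked over time.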
When the front-end user interface is not Web, how can you attempt to understand the transaction generated by it? This is a bit more difficult and you have to look at the protocols being used, client-server, in order to see if there is an opportunity. Some green-screen applications have this constraint. In those cases, we look to the terminal server for the monitoring opportunity.
Why is APM restricted to Java/.NET? Java and .NET use JIT (just-in-time) compilation. The compiler generates code that is interpreted at runtime. This presents an opportunity to look for specific signatures and help guide where instrumentation can be introduced. Older languages are less component-architected and thus do not have this advantage.
APM is NOT, however, limited to Java/.NET. The best practices are about defining the processes and organization to reliably get useful metrics out of whatever apps you have. Choosing the right APM technology is an important activity – to use what is most appropriate, without doing too much, or too little.
So how do you pick the right APM technology? The assessment activity, in particular through an application survey, is when these determinations are made. This collects together some of your organizational DNA about the apps, including the business contacts, technology, deployment scope, software architecture, transaction types, etc. The book goes through the assessment strategy and questions in some detail. But until you have some details, I generally try to have at least a minimum of log file information, and then add on other technologies as the customer experience becomes more relevant.
One of your stated goals in the book is to help APM stakeholders get derailed deployments back on track. From the work you’ve done with clients, why do APM projects most often jump off course? The biggest cause is a mismatch of expectations between the business sponsor and the IT team. The IT team is expecting to ‘stand-up’ the technology and hand it off to someone to use. The business is expecting ‘root-cause analysis’ and deep understanding of the application, based on the performance information. The missing piece is any person, technique or process that helps the team move from installation to power user.
This gap comes about because APM looks very familiar. It looks just like the SNMP technology that it augments. SNMP required very little expertise to interpret and employ. Mostly, you just got alerts when systems went down. With APM, you get alerts when systems are ‘in trouble.’ What does that exactly mean for the operations team? What are they expected to do when an app is having problems but otherwise is not going to be restarted? This turns out to be a big gap, and where expectations are trashed.
Is APM a good place to track batch job completion and status, jobs started by an automated process such as Cron? Even when there are 50 or 100 different jobs? Batch jobs are nicely handled by tools like AutoSys, etc. While batch is critical for many businesses, it is largely sequential. As long as the sequence flows in order and completes within its time slice, the batch is good. When the batch fails to complete, availability monitoring is the best fit. When the batch starts to go long, due to a performance/resource issue, it looks like an opportunity for APM, but more often it is not. A sequence of database operations, as many batch jobs are, simply doesn't allow any 'space' to hide instrumentation, so you will likely incur overhead, which further slips the processing window.
You also do not have any client interactions, so transaction filtering and synthetics are not much help here either. What is important for batch, in the APM processes, is to capture baselines of typical processing loads. These can be used to qualify new architectures or processing strategies. So we want batch baselines to help provide a holistic view of the solution capabilities for the various business sponsors, following APM processes (even as APM technology may not be appropriate).
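A batch baseline of this kind needs no instrumentation at all; recorded run times from the scheduler are enough. The sketch below is one hypothetical way to do it. The function names, the use of minutes and the choice of median and longest run are assumptions, not something prescribed in the chat or the book.

```python
from statistics import median


def batch_baseline(durations_min):
    """Build a simple baseline from past run times (in minutes)
    of one batch job."""
    if not durations_min:
        raise ValueError("need at least one recorded run")
    return {
        "runs": len(durations_min),
        "typical_min": median(durations_min),
        "longest_min": max(durations_min),
    }


def headroom_min(baseline, window_min):
    """Minutes left in the processing window after the longest
    baseline run. A negative value means the window was overrun."""
    return window_min - baseline["longest_min"]
```

Comparing the headroom before and after a change in architecture or processing strategy is one way such a baseline could be used to qualify that change.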
My company is choosing to stay away from the User Experience CA package aspect at this time. Should I still call Introscope APM without UE? APM is any combination of logging, protocol capture, synthetics, real transactions and instrumentation. I don’t explicitly favor one approach over another – it is what is most appropriate for the types of applications you have to manage. What differs is the amount of visibility you have and the timeliness in accessing that data. If you are monitoring performance then you are going to benefit from APM techniques even if you do not have a full suite of technology.
While an APM can do a good job with customer experience via HTTP, how about transactions generated by a system, going to another system via TIBCO? Integration middleware, like TIBCO, is a bit more difficult, from the protocol perspective. The approach here is to use instrumentation to pull the internal metrics 'up' from the protocol activities. It takes a little more work to set up initially but this is already productized and could be done for any type of integration middleware (Vitria, etc.).
What is the approach to instrument an application that is not Java or .NET – i.e., you have just a set of executables? The application may be communicating with other components in the environment such as file services or database services, and the intent of the instrumentation is to understand where and when a supporting component may start to respond slowly and degrade performance of the application. I always try to start from the bottom up, in terms of APM technology – protocol, logging, synthetics, transactions, instrumentation – and then select the level that gives the best visibility. In your case, instrumentation (via bytecode) is just not an option. So I would go back and look first for compatible protocols, then logging, maybe a command interface that I could poll for information, sham transactions that I could use as synthetics. It is surprising what an established application may have available that generates interesting metrics.
After this initial bit of engineering, to get some metrics, the APM best practice is then to use load testing to validate that the metrics are useful and whether they will be predictive as load increases. I talk about how to do stress testing, to maximize the benefit to APM as well as the operation of the application, in the book.
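For the polling and sham-transaction options, a synthetic probe can be as small as a timer wrapped around whatever call the application already exposes. This is a minimal sketch: `timed_probe` is a hypothetical helper, and `action` stands for any callable you supply (a status command, a test query, a file read).

```python
import time


def timed_probe(action):
    """Run one synthetic transaction and report (ok, elapsed_ms).

    `action` is any zero-argument callable. Any exception it raises
    is treated as a failed probe rather than propagated.
    """
    start = time.perf_counter()
    try:
        action()
        ok = True
    except Exception:
        ok = False
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ok, elapsed_ms
```

Run on a fixed schedule, a probe like this yields both an availability signal (did it succeed?) and a response-time metric from the same call.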
Whenever you suspect that you are experiencing degradation, you need to confirm that by comparing with a baseline of 'normal' for your application. APM is not magical; it is a kind of discipline to take the early steps so that when you encounter a problem you will have the best information at hand.
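Comparing against a baseline can be sketched in a few lines. The three-standard-deviation threshold and the function name below are assumptions chosen for illustration; the chat does not specify how 'normal' should be bounded.

```python
import statistics


def is_degraded(baseline_ms, current_ms, threshold_sigmas=3.0):
    """Return True when the mean of the current response times is
    more than `threshold_sigmas` standard deviations above the
    baseline mean.

    `baseline_ms` needs at least two samples, collected while the
    application was known to be behaving normally.
    """
    normal_mean = statistics.fmean(baseline_ms)
    normal_stdev = statistics.stdev(baseline_ms)
    current_mean = statistics.fmean(current_ms)
    return current_mean > normal_mean + threshold_sigmas * normal_stdev
```

The check is only as good as the baseline behind it, which is why the baseline has to be captured before the problem shows up.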
Firefighting, which is when you address performance problems against apps that have not enjoyed the discipline of APM best practices, requires a lot of experience. The book takes you through the skills and processes that you need to do this but – and this is important – you should never assume that, because a product is easy to use, you will be able to solve really tough problems with it. Tough problems require a very disciplined approach, which is actually fairly easy to train folks for. But it is not something that you should attempt after your first day, or month, with APM.
What are some other APM best practices you’d like to be sure to share? The most significant best practices are:
1) Be able to deploy your chosen technology quickly, and by anyone who can read an e-mail.
2) Understand how to audit an application and extract the baselines for configuration, transaction visibility and performance so that you can operate the app with confidence.
3) Use the assessment techniques to get a hold of your application landscape, look for opportunities to show value, and get APM started on your turf.
What topics are you interested in having IT Counterpoint dig into through interactive chats and debates? E-mail your ideas here.
Beth Schultz, contributing editor, has more than two decades of experience as an IT writer and editor. You can find her work at a number of leading IT publications, where she writes on a variety of topics including cloud computing, mobility, network/systems management and security. Find her LinkedIn profile here or e-mail her here.