Getting the Big Picture – Part 2
As applications become more distributed, they require a different kind of monitoring to understand trends, faults, and reduced performance.
One of the easiest things to monitor is duration. How long does it take to call a service and receive a response? Your measurements may need to be weighted by the complexity of the work performed. Keeping a running average of throughput, duration, and success/failure rate for each service helps smooth out the edge cases.
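To make that concrete, here is a minimal sketch in Python of what such instrumentation might look like. The class name, the window size, and the wrapper function are all illustrative assumptions rather than a prescribed API; the idea is simply to wrap each service call and keep rolling averages of duration and success/fail:

    import time
    from collections import deque

    class ServiceMetrics:
        """Rolling duration and success/fail stats for one service."""

        def __init__(self, window=100):
            # Keep only the most recent `window` calls so the averages
            # track current behavior, not the lifetime of the process.
            self.samples = deque(maxlen=window)  # (duration_seconds, succeeded)

        def record(self, duration, succeeded):
            self.samples.append((duration, succeeded))

        def average_duration(self):
            if not self.samples:
                return 0.0
            return sum(d for d, _ in self.samples) / len(self.samples)

        def success_rate(self):
            if not self.samples:
                return 0.0
            return sum(1 for _, ok in self.samples if ok) / len(self.samples)

    def timed_call(metrics, func, *args, **kwargs):
        """Invoke a service call, recording its duration and outcome."""
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            metrics.record(time.perf_counter() - start, succeeded=True)
            return result
        except Exception:
            metrics.record(time.perf_counter() - start, succeeded=False)
            raise

The fixed-size window is a design choice: it lets the averages reflect how the service behaves now, which is what you want when watching for drift.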
What this kind of monitoring cannot tell you is what actually caused a change. If the root cause is the physical failure of a specific piece of hardware, other monitoring will be required to determine the cause and report the issue. Your monitoring trends instead serve as the starting point, directing lower-level monitors toward the right place to narrow down the problem.
If you have established expected norms for your services at each layer, then when things begin to degrade you can narrow down the issue by identifying which services are contributing to the problem you detected.
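A minimal sketch of that kind of check might look like the following; the service names, the norm values, and the ten-percent tolerance are hypothetical, chosen only to illustrate the idea:

    # Expected norms (average seconds per call) for each service, by layer.
    # These numbers are illustrative; yours would come from your own baselines.
    EXPECTED_NORMS = {
        "web/login": 0.15,
        "api/orders": 0.40,
        "db/order-lookup": 0.08,
    }

    def degraded_services(current_averages, tolerance=0.10):
        """Return services whose current average duration exceeds
        the expected norm by more than the tolerance (10% here)."""
        flagged = []
        for service, norm in EXPECTED_NORMS.items():
            current = current_averages.get(service)
            if current is not None and current > norm * (1 + tolerance):
                flagged.append((service, norm, current))
        return flagged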
Going back to the example of a customer's concern about reduced application performance, you can narrow down the issue more readily by tracking the different services called by that customer's application. Comparing their current execution against historical trends provides immediate feedback on the veracity of the claim. It also helps you determine which service, if any, has degraded. That information should be enough for you to continue the examination at the service level.
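One hedged way to turn that comparison into immediate feedback is to test today's average against the historical mean and standard deviation. The two-sigma threshold and the sample data below are assumptions for illustration; you would tune both to your own baselines:

    from statistics import mean, stdev

    def deviates_from_trend(history, current_avg, sigmas=2.0):
        """Compare a current average duration against historical samples.

        Returns True when the current value sits more than `sigmas`
        standard deviations above the historical mean, i.e. the
        customer's claim of degraded performance looks credible."""
        if len(history) < 2:
            return False  # not enough history to judge
        mu = mean(history)
        sd = stdev(history)
        return current_avg > mu + sigmas * sd

    # Example: daily average durations (seconds) for one service, then today.
    history = [0.42, 0.39, 0.41, 0.40, 0.43, 0.38]
    print(deviates_from_trend(history, current_avg=0.61))  # True: worth a closer look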
Do you have techniques for instrumenting your software for monitoring that you'd like to share? Please do so by leaving a comment below, or by sending an email to btaylor@sswug.org.
Cheers,
Ben
Featured White Paper(s)
Key Considerations in Evaluating Data Warehouse Appliances
(read more)
Featured Script
admin db – baseline and trends using sysperfinfo
Procedure to capture and use performance data from the sysperfinfo system table. Additional details for this script can be fo… (read more)