<?xml version="1.0" encoding="UTF-8" ?><!-- generator=Zoho Sites --><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><atom:link href="https://www.highperformance.tech/blogs/tag/work-metrics/feed" rel="self" type="application/rss+xml"/><title>High Performance Technologies - High Performance Technologies Blog #work metrics</title><description>High Performance Technologies - High Performance Technologies Blog #work metrics</description><link>https://www.highperformance.tech/blogs/tag/work-metrics</link><lastBuildDate>Sun, 08 Mar 2026 22:17:55 -0700</lastBuildDate><generator>http://zoho.com/sites/</generator><item><title><![CDATA[Monitoring What Matters]]></title><link>https://www.highperformance.tech/blogs/post/monitoring-what-matters</link><description><![CDATA[<img align="left" hspace="5" src="https://www.highperformance.tech/files/img/systematic-approach.jpeg"/>I've monitored many different types of systems over the past 20 years: from modem banks, routers, and authentication systems to storage area networks, ]]></description><content:encoded><![CDATA[<div class="zpcontent-container blogpost-container "><div data-element-id="elm_AbogcubNRV--Pu_xkxfaWQ" data-element-type="section" class="zpsection "><style type="text/css"></style><div class="zpcontainer-fluid zpcontainer"><div data-element-id="elm_4-2cqjXhT9-fGnrjszM6zQ" data-element-type="row" class="zprow zprow-container zpalign-items- zpjustify-content- " data-equal-column=""><style type="text/css"></style><div data-element-id="elm_4PbGJjKHRAGU_CWYi2cPnw" data-element-type="column" class="zpelem-col zpcol-12 zpcol-md-12 zpcol-sm-12 zpalign-self- "><style type="text/css"></style><div data-element-id="elm_WTlwRss9ueRlnDCxN6oI3Q" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_WTlwRss9ueRlnDCxN6oI3Q"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " 
data-editor="true"><p><span style="font-size:16px;">I've monitored many different types of systems over the past 20 years: from modem banks, routers, and authentication systems to storage area networks, database clusters and analytics systems. It’s not always easy to know what to monitor, <i>especially</i> in complicated interrelated systems that each produce tons of metrics. App developers and product managers still find it hard to know where to start, even when it comes to monitoring their own creations. In lieu of a better alternative, many of them start where I once did – the four core resources:</span></p><ul><li><span style="font-size:16px;">Processor</span></li><li><span style="font-size:16px;">Memory</span></li><li><span style="font-size:16px;">Network</span></li><li><span style="font-size:16px;">Disk</span></li></ul><p style="font-size:12pt;"><span style="font-size:16px;">It's a reasonable place to start, because you can trace many performance issues back to those resources. But it can take a long time to figure out which metrics matter, and it misses an entire class of interesting issues. Now I prefer a more focused approach, which I’ve found yields better results faster.</span><br></p></div>
</div><div data-element-id="elm_txjMo6UP0AScEk-uFMHnFA" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_txjMo6UP0AScEk-uFMHnFA"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Start with the End in Mind — What's the Work?</h3></div>
<div data-element-id="elm_YyWiZ5UM1Wnn0S6RDaHXVg" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_YyWiZ5UM1Wnn0S6RDaHXVg"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">I believe that technology should help people accomplish their goals. It should be in service of others. It should generate useful output. It should <i>just work</i>. So, I prefer to start there, with the end in mind. When analyzing any system, technical or otherwise, I ask myself &quot;what work does this system produce?&quot; Consider it<span> from the perspective of the person who consumes that work, too — that's where the system's value is captured, after all. For example:</span></span></p><ul><li><span style="font-size:16px;">For a visual analytics system such as Tableau or Power BI, the work product might be to render a dashboard, or compile a list of dashboards that a user can select from. It might be to send an email with time-sensitive information in order to give context to an executive who is making a decision.</span></li><li><span style="font-size:16px;">This method works for non-technical systems too. Consider receptionists sitting at an office's front desk. Their work might be to answer incoming phone calls, or to greet office visitors, or to sign for and distribute packages.</span></li></ul><div><span style="font-size:16px;"><br></span></div><div><span style="font-size:16px;">This &quot;work&quot; might be considered the system's purpose, and you would probably be interested in knowing when the quality/quantity of work changes.</span></div></div>
</div><div data-element-id="elm_5r2riq9HHbvklDNL9vgr8w" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_5r2riq9HHbvklDNL9vgr8w"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Measure What Matters — Work Metrics</h3></div>
<div data-element-id="elm_0lqwvvhDc6i52Xmh9RCrGw" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_0lqwvvhDc6i52Xmh9RCrGw"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">Once you've named a system’s work product, think about how to measure that output. Common types of measures for work include:</span></p><ul><li><span style="font-size:16px;">Throughput - How much work per unit of time is the system doing?</span></li><li><span style="font-size:16px;">Success/Error Rates - What percentage of work is considered successful during a specific time frame? What percentage of work is considered erroneous?</span></li><li><span style="font-size:16px;">Duration - How long does it take to produce an output or unit of work?</span></li></ul><p><span style="font-size:16px;"><br></span></p><p><span style="font-size:16px;">You should also account for the different dimensions associated with the work, to help spot patterns that might otherwise remain hidden. In many monitoring systems, this additional info can be added to metrics as tags. For instance:</span></p><ul><li><span style="font-size:16px;">Visual Analytics System: Break the above metrics down per node, per end-user, per location, per dashboard, etc. This allows you to view the metrics across different dimensions, and quickly isolate the relevant variables.</span></li><li><span style="font-size:16px;">Receptionists: Capture metrics per receptionist, per location, or per interaction type (greeting a caller on the phone vs. greeting an office visitor).</span></li></ul><p style="font-size:12pt;"><span style="font-size:16px;"><br></span></p><p style="font-size:12pt;"><span style="font-size:16px;">You can infer quite a bit about a system's internal state given only these work metrics. 
For instance, tracking the &quot;duration&quot; metric would allow you to quickly see when a dashboard takes longer to load than it normally does. This allows you to get in front of problems before they spiral out of control! We can take it one step further, though, so that we can zoom in on the <i>cause</i> of a problem. How do we zoom in? Resource metrics!</span></p></div>
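<p><span style="font-size:16px;">To make the three work measures and their tags concrete, here is a minimal Python sketch. The <code>WorkRecord</code> schema, its field names, and the sample values are assumptions made up for illustration; a real monitoring system would collect and store these differently.</span></p>

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WorkRecord:
    """One unit of work, e.g. a single dashboard render (hypothetical schema)."""
    dashboard: str      # tag: which dashboard was rendered
    node: str           # tag: which node served the request
    duration_s: float   # how long the render took, in seconds
    ok: bool            # whether the render succeeded


def summarize(records, tag, window_s):
    """Group records by one tag and compute throughput, error rate, and mean duration."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, tag)].append(record)

    summary = {}
    for key, items in groups.items():
        errors = sum(1 for r in items if not r.ok)
        summary[key] = {
            "throughput_per_s": len(items) / window_s,
            "error_rate": errors / len(items),
            "mean_duration_s": sum(r.duration_s for r in items) / len(items),
        }
    return summary


# Illustrative records, assumed to cover one 60-second window.
records = [
    WorkRecord("sales", "node-a", 1.0, True),
    WorkRecord("sales", "node-b", 3.0, True),
    WorkRecord("ops", "node-a", 2.0, False),
    WorkRecord("ops", "node-a", 4.0, True),
]
by_dashboard = summarize(records, tag="dashboard", window_s=60)
by_node = summarize(records, tag="node", window_s=60)
```

<p><span style="font-size:16px;">Summarizing the same records by a different tag is what lets you switch dimensions, for example from &quot;which dashboard is slow?&quot; to &quot;which node is slow?&quot;.</span></p>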
</div><div data-element-id="elm_ySf7yF51-if86Sq9RcgIsw" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_ySf7yF51-if86Sq9RcgIsw"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Record the Resources</h3></div>
<div data-element-id="elm_5gJpNbI5_Ua9YrjCydSycg" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_5gJpNbI5_Ua9YrjCydSycg"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">Once you know a system's output, list the resources the system uses to generate that output. It's okay if it's an incomplete list at first, because people tend to be surprisingly good at identifying the important ones. For example:</span></p><ul><li><span style="font-size:16px;">Visual Analytics System: In order to render a dashboard for an end user, the system might depend on a database that contains the dashboard definition; the data to populate the dashboard; a connection pool that maintains open connections to those databases, ready to query; a place to cache results; a way to determine which user can see what data; a way to crunch the numbers; a way to send the results to the end user; the list goes on. Notice that I've included a few of the four core resources, but I'm not focused specifically on them — <i>there's so much more to this system</i>!</span></li><li><span style="font-size:16px;">Receptionists: The phone, pen and paper, the desk, the lobby, a pushcart for packages: all of these are resources that a receptionist might use to do their work.</span></li></ul></div>
</div><div data-element-id="elm_HG0FIixROFXkZGqI7ycg-A" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_HG0FIixROFXkZGqI7ycg-A"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Measure What Matters — Resource Metrics</h3></div>
<div data-element-id="elm_1tpGEkcXt4G54Lzv4jUW9g" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_1tpGEkcXt4G54Lzv4jUW9g"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">After you're satisfied with your list of resources, determine which metrics would help you understand how the resource is being used:</span></p><ul><li><span style="font-size:16px;">Utilization - The percent of time the resource is not idle, or how much of a resource's finite capacity is used. For example, the percent of time that connections were querying a database; or the percent of time a phone was in use.</span></li><li><span style="font-size:16px;">Saturation - The amount of work waiting to be serviced by the resource, such as disk queue length or the number of calls in the queue for the receptionist.</span></li><li><span style="font-size:16px;">Errors - The number of errors that might not be visible in the work/output itself, like cache misses or TCP retransmits, or failed call transfers due to invalid/misdialed extensions.</span></li><li><span style="font-size:16px;">Availability - The percent of time the resource is available to respond to requests. Alternatively, the percent of time the resource&nbsp;<i>did</i>&nbsp;respond to requests. A server that can handle multiple requests at once might be non-idle but still available, leading to Utilization metrics &gt; 100%. On the other hand, a receptionist might be tending to a customer in the lobby, and therefore unable to answer an incoming phone call.</span></li></ul><p><span style="font-size:16px;">&nbsp;</span></p><p><span style="font-size:16px;">Don’t forget to tag these with the same kinds of dimensions you tagged your work metrics with!</span></p></div>
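<p><span style="font-size:16px;">As a rough illustration of the four resource measures, the sketch below summarizes periodic readings from a database connection pool. The <code>PoolSample</code> fields and the numbers are hypothetical; they stand in for whatever your pool or monitoring agent actually exposes.</span></p>

```python
from dataclasses import dataclass


@dataclass
class PoolSample:
    """A point-in-time reading from a connection pool (hypothetical fields)."""
    in_use: int        # connections currently running queries
    capacity: int      # maximum connections the pool allows
    waiting: int       # requests queued for a free connection
    errors: int        # failed checkouts since the previous sample
    responded: bool    # whether the pool answered the health probe


def resource_metrics(samples):
    """Summarize utilization, saturation, errors, and availability over samples."""
    n = len(samples)
    return {
        "utilization_pct": 100 * sum(s.in_use / s.capacity for s in samples) / n,
        "saturation_avg_waiting": sum(s.waiting for s in samples) / n,
        "errors_total": sum(s.errors for s in samples),
        "availability_pct": 100 * sum(1 for s in samples if s.responded) / n,
    }


# Four illustrative samples, assumed to be taken at a fixed interval.
samples = [
    PoolSample(in_use=5, capacity=10, waiting=0, errors=0, responded=True),
    PoolSample(in_use=10, capacity=10, waiting=4, errors=1, responded=True),
    PoolSample(in_use=10, capacity=10, waiting=8, errors=2, responded=False),
    PoolSample(in_use=5, capacity=10, waiting=0, errors=0, responded=True),
]
metrics = resource_metrics(samples)
```

<p><span style="font-size:16px;">In this sketch, average utilization alone would not show the problem; the waiting queue (saturation) and the failed checkouts (errors) are what reveal that the pool was exhausted for part of the window.</span></p>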
</div><div data-element-id="elm_P3zJVNojC2JyQ6ZnHMXeEw" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_P3zJVNojC2JyQ6ZnHMXeEw"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Getting Resourceful — Zooming In</h3></div>
<div data-element-id="elm_vgKHLNLOpG0AcoP68mfEgQ" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_vgKHLNLOpG0AcoP68mfEgQ"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">You may have realized that a &quot;resource&quot; could be considered another system altogether. From the perspective of a visual analytics system, a database server is a resource it needs to do its work: its query result is one of the inputs needed to render the dashboard. But from the perspective of the database server, a query result&nbsp;<i>is</i>&nbsp;the work, and it uses different resources to generate that output. When we treat each resource as a system of its own, with its own work/resource metrics, we can zoom in and solve problems faster –&nbsp;<i>especially</i>&nbsp;in complicated systems.<br></span></p><p><span style="font-size:16px;">&nbsp;</span></p><p><span style="font-size:16px;">There are a couple of notable differences between this and the traditional &quot;just watch the four core resources&quot; approach:</span></p><ul><li><span style="font-size:16px;">We can account for logical resources, so we can find issues that don't manifest themselves physically. Connection pool exhaustion is a great example of something that can affect end-user experience while being difficult to spot via traditional IT monitoring tools.&nbsp;</span></li><li><span style="font-size:16px;">We may identify resources we don't yet have visibility into. For instance, we might be aware that an application has implemented a cache, but have no way to extract its metrics. We might also know that we are using shared equipment in a public cloud environment, but we don't really have a way to know when a “noisy neighbor” is affecting our work. 
I believe &quot;known unknowns&quot; are better than &quot;unknown unknowns&quot;, because we can still trace issues to their probable cause, and we can focus efforts on improving observability <span style="font-style:italic;">when and where it matters most</span>.</span></li></ul></div>
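<p><span style="font-size:16px;">One way to picture &quot;zooming in&quot; is as a walk down a tree in which every resource is itself a system. The Python sketch below is only an illustration under simplifying assumptions: the <code>System</code> class, the boolean <code>healthy</code> flag (standing in for &quot;its work metrics look normal&quot;), and the example names are all hypothetical.</span></p>

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """A system whose resources are themselves systems (hypothetical model)."""
    name: str
    healthy: bool = True       # stand-in for "work metrics look normal"
    observable: bool = True    # False when we cannot extract its metrics
    resources: list = field(default_factory=list)


def zoom_in(system, path=()):
    """Follow unhealthy work metrics down the resource tree.

    Returns (suspects, known_unknowns): the deepest unhealthy systems found,
    and the resources under an unhealthy system that we cannot observe.
    """
    here = path + (system.name,)
    suspects, unknowns = [], []
    if system.healthy:
        return suspects, unknowns
    for resource in system.resources:
        if not resource.observable:
            unknowns.append(" / ".join(here + (resource.name,)))
            continue
        found, hidden = zoom_in(resource, here)
        suspects.extend(found)
        unknowns.extend(hidden)
    if not suspects:
        # No unhealthy observable resource below: this system is the deepest suspect.
        suspects.append(" / ".join(here))
    return suspects, unknowns


analytics = System("analytics", healthy=False, resources=[
    System("database"),
    System("connection_pool", healthy=False),
    System("cache", observable=False),
])
suspects, known_unknowns = zoom_in(analytics)
```

<p><span style="font-size:16px;">Here the walk points at the connection pool as the probable cause, and it also flags the cache as a known unknown worth instrumenting.</span></p>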
</div><div data-element-id="elm_FCpPiAWibF707gpqsNOWHA" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_FCpPiAWibF707gpqsNOWHA"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Enrich with Events</h3></div>
<div data-element-id="elm_I7MFbizEDTMTuL2rHsE3uw" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_I7MFbizEDTMTuL2rHsE3uw"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">If you've ever called tech support, you might be familiar with the question &quot;have you made any changes recently?&quot; It's a relevant question, because you can normally trace changes in a system's behavior back to a specific event. Recording events as they happen can help you quickly find an issue’s root cause. When you enrich your metrics data with events, you'll also have the added benefit of being able to measure an event's real impact! For any given system, you should list out the kinds of events that might affect the system's behavior. These commonly fall into a handful of categories:</span></p><ul><li><span style="font-size:16px;">Code Changes - updates, upgrades, and installations</span></li><li><span style="font-size:16px;">Configuration Changes - any change in a system's configuration, whether hardware or software</span></li><li><span style="font-size:16px;">Tasks - recurring tasks like backups, and any one-off tasks that administrators might run</span></li><li><span style="font-size:16px;">Infrastructure Changes - adding or removing RAM, CPU, storage; scaling up/down/out</span></li><li><span style="font-size:16px;">Alerts - any alerts generated by this or other monitoring or management systems</span></li></ul><p><span style="font-size:16px;">&nbsp;</span></p><p><span style="font-size:16px;">After you've listed these events, think about how you can capture and record them as they occur. When capturing an event, you should also tag it with metadata such as version numbers, task names, timestamps, commit messages, or any other details that might be relevant.</span></p></div>
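<p><span style="font-size:16px;">As a small sketch of what &quot;measuring an event's real impact&quot; could look like, the Python below compares a metric's average in a window before and after a tagged event. The <code>Event</code> class, the <code>impact</code> helper, the tag values, and the sample data are all assumptions for illustration, not the interface of any particular monitoring product.</span></p>

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Event:
    """Something that happened to the system (hypothetical schema)."""
    timestamp: float    # seconds, on the same clock as the metric samples
    category: str       # e.g. "code_change", "config_change", "task"
    tags: dict = field(default_factory=dict)


def impact(event, samples, window_s):
    """Return mean(after) - mean(before) for a metric around an event.

    samples is a list of (timestamp, value) pairs. Returns None when either
    window is empty, because no comparison is possible.
    """
    start = event.timestamp - window_s
    end = event.timestamp + window_s
    before = [v for t, v in samples if start <= t < event.timestamp]
    after = [v for t, v in samples if event.timestamp <= t < end]
    if not before or not after:
        return None
    return mean(after) - mean(before)


# Illustrative data: dashboard load time in seconds around a deployment.
deploy = Event(100.0, "code_change", {"version": "example-2", "task": "deploy"})
durations = [(70.0, 1.0), (90.0, 1.0), (110.0, 2.0), (130.0, 3.0)]
delta = impact(deploy, durations, window_s=60)
```

<p><span style="font-size:16px;">A positive <code>delta</code> here means dashboards got slower after the deployment, and the event's tags tell you which version to look at first.</span></p>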
</div><div data-element-id="elm_c4njm07y2e1N-CwnzSU4gQ" data-element-type="heading" class="zpelement zpelem-heading "><style> [data-element-id="elm_c4njm07y2e1N-CwnzSU4gQ"].zpelem-heading { border-radius:1px; } </style><h3
 class="zpheading zpheading-style-none zpheading-align-left " data-editor="true">Putting It All Together</h3></div>
<div data-element-id="elm_-q7iL7S1_X3uzHk_EZFJzw" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_-q7iL7S1_X3uzHk_EZFJzw"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-left " data-editor="true"><p><span style="font-size:16px;">By identifying your relevant work and resource metrics, tags, and events, you’ve taken the most essential step towards having your own powerful monitoring system. Where do you go from here? Here’s what we tend to do:&nbsp;</span></p><ol><li><span style="font-size:16px;">Collect the work metrics, resource metrics, tags, and events data in one place,</span></li><li><span style="font-size:16px;">Add more context by collecting and analyzing your system’s logs as well,</span></li><li><span style="font-size:16px;">Build a high-level dashboard that helps you immediately understand the state of your systems,</span></li><li><span style="font-size:16px;">Build dashboards that drill into the most important work/resource metrics for each piece of the system,</span></li><li><span style="font-size:16px;">Implement alerts to help you address problems while they’re still small,</span></li><li><span style="font-size:16px;">Instrument your custom code to add helpful context to metrics, events, and logs; and lastly,</span></li><li><span style="font-size:16px;">Iterate! As the system changes and improves, so should its monitoring. Solidify any lessons learned by integrating them into your monitoring and alerting system.</span></li></ol><p><span style="font-size:16px;">&nbsp;</span></p><p><span style="font-size:16px;">It might seem like a lot, but when you do it step by step, it feels like a natural progression. And it’s always worth the effort. If we can help you move from one stage to the next, please give us a call or send us an email!</span></p></div>
</div><div data-element-id="elm_KfhX596lcUUlOs-GgPQ03g" data-element-type="row" class="zprow zprow-container zpalign-items-flex-start zpjustify-content-flex-start zpdefault-section zpdefault-section-bg " data-equal-column=""><style type="text/css"> [data-element-id="elm_KfhX596lcUUlOs-GgPQ03g"].zprow{ border-radius:1px; margin-block-start:52px; } </style><div data-element-id="elm_dmEhgFRjkWBXueMLti6F8A" data-element-type="column" class="zpelem-col zpcol-12 zpcol-md-6 zpcol-sm-12 zpalign-self- zpdefault-section zpdefault-section-bg "><style type="text/css"> [data-element-id="elm_dmEhgFRjkWBXueMLti6F8A"].zpelem-col{ border-radius:1px; } </style><div data-element-id="elm_H3pi7Oq-5z5uPI8ZYjp1yw" data-element-type="buttonicon" class="zpelement zpelem-buttonicon "><style> [data-element-id="elm_H3pi7Oq-5z5uPI8ZYjp1yw"].zpelem-buttonicon{ border-radius:1px; } </style><div class="zpbutton-container zpbutton-align-center "><style type="text/css"></style><a class="zpbutton-wrapper zpbutton zpbutton-type-primary zpbutton-size-md zpbutton-style-none zpbutton-icon-align-left " href="tel:9189486777" rel="nofollow" title="Call Us Now"><span class="zpbutton-icon "><svg viewBox="0 0 1792 1792" height="1792" width="1792" xmlns="http://www.w3.org/2000/svg"><path d="M1600 1240q0 27-10 70.5t-21 68.5q-21 50-122 106-94 51-186 51-27 0-53-3.5t-57.5-12.5-47-14.5-55.5-20.5-49-18q-98-35-175-83-127-79-264-216T344 904q-48-77-83-175-3-9-18-49t-20.5-55.5-14.5-47-12.5-57.5-3.5-53q0-92 51-186 56-101 106-122 25-11 68.5-21t70.5-10q14 0 21 3 18 6 53 76 11 19 30 54t35 63.5 31 53.5q3 4 17.5 25t21.5 35.5 7 28.5q0 20-28.5 50t-62 55-62 53-28.5 46q0 9 5 22.5t8.5 20.5 14 24 11.5 19q76 137 174 235t235 174q2 1 19 11.5t24 14 20.5 8.5 22.5 5q18 0 46-28.5t53-62 55-62 50-28.5q14 0 28.5 7t35.5 21.5 25 17.5q25 15 53.5 31t63.5 35 54 30q70 35 76 53 3 7 3 21z"></path></svg></span><span class="zpbutton-content">Call (918) 948-6777</span></a></div>
</div></div><div data-element-id="elm_cx7fcRoeOIRHXR0seXVx_g" data-element-type="column" class="zpelem-col zpcol-12 zpcol-md-6 zpcol-sm-12 zpalign-self- zpdefault-section zpdefault-section-bg "><style type="text/css"> [data-element-id="elm_cx7fcRoeOIRHXR0seXVx_g"].zpelem-col{ border-radius:1px; } </style><div data-element-id="elm_bZ-zM13frhzaOn7gF2idHQ" data-element-type="buttonicon" class="zpelement zpelem-buttonicon "><style> [data-element-id="elm_bZ-zM13frhzaOn7gF2idHQ"].zpelem-buttonicon{ border-radius:1px; } </style><div class="zpbutton-container zpbutton-align-center "><style type="text/css"></style><a class="zpbutton-wrapper zpbutton zpbutton-type-primary zpbutton-size-md zpbutton-style-none zpbutton-icon-align-left " href="mailto:hello@highperformance.tech?subject=Need Some Info" rel="nofollow"><span class="zpbutton-icon "><svg viewBox="0 0 1792 1792" height="1792" width="1792" xmlns="http://www.w3.org/2000/svg"><path d="M1792 710v794q0 66-47 113t-113 47H160q-66 0-113-47T0 1504V710q44 49 101 87 362 246 497 345 57 42 92.5 65.5t94.5 48 110 24.5h2q51 0 110-24.5t94.5-48 92.5-65.5q170-123 498-345 57-39 100-87zm0-294q0 79-49 151t-122 123q-376 261-468 325-10 7-42.5 30.5t-54 38-52 32.5-57.5 27-50 9h-2q-23 0-50-9t-57.5-27-52-32.5-54-38T639 1015q-91-64-262-182.5T172 690q-62-42-117-115.5T0 438q0-78 41.5-130T160 256h1472q65 0 112.5 47t47.5 113z"></path></svg></span><span class="zpbutton-content">Email Us</span></a></div>
</div></div></div></div></div></div></div></div> ]]></content:encoded><pubDate>Tue, 08 Sep 2020 18:28:31 -0500</pubDate></item></channel></rss>