In a well-run IT environment, servers, networking equipment and even web applications are monitored. Workstations generally are not, even though they make up the largest number of devices on the network. Is there value in monitoring workstations in an enterprise environment?
I won’t explain how to implement workstation monitoring in this article, as that would make it exceptionally long. This article is meant to start the discussion around monitoring workstations and the value inherent in doing so. Future articles in this series will dive deeper, and I will link them back to this article as they are published.
But don’t we already have data for our workstations?
Yes, you most likely do. If you have a platform such as MEMCM (née SCCM) or alternatives such as Lansweeper or PDQ Inventory, you will have data on the workstations in your environment. This will range from a list of installed applications through to hardware information as of the last inventory run. But there is a wealth of information that these platforms do not gather and that would be helpful when troubleshooting issues in the environment.
Ok. So what other information would be helpful?
The following information would be helpful when troubleshooting issues with workstations:
- Performance trends
- Entries from various logs
Why would I capture performance trends?
Being able to quantify the delta in performance when deploying an agent or software update would give the business a level of confidence. How many times have you gone in front of a business stakeholder, been asked “How will this affect the end users?”, and the best you’ve got is an “It should be right”? Pilot groups are usually used to help identify the impact on end users, but those reports usually come from the end users themselves. This may not always be adequate, as the performance impact that they feel may not be related to the change being piloted.
For this to succeed, I believe that performance needs to be monitored in two locations:
- Test environment
  - Monitoring the test environment will capture the delta prior to it hitting any production workstations
- Pilot groups
  - Pilot users are the guinea pigs. Capturing performance data here should provide a good indication of how the larger production environment will be affected.
Being able to identify trends in workstation performance also helps to show whether a reported performance issue is widespread within the environment or isolated to a small number of devices.
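As a rough illustration of what “quantifying the delta” could look like, here is a minimal Python sketch. It assumes you already have performance samples (for example CPU utilisation percentages) collected per workstation before and after a change. The function names, the host names and the sample values are all made up for this example, and comparing medians is just one simple choice of measure, not a recommendation.

```python
from statistics import median


def performance_delta(baseline, after):
    """Compare the median of two sets of samples for one workstation."""
    base_med = median(baseline)
    after_med = median(after)
    change = after_med - base_med
    # Avoid dividing by zero when the baseline median is 0.
    percent = change / base_med * 100 if base_med else float("nan")
    return {
        "baseline": base_med,
        "after": after_med,
        "change": change,
        "percent": percent,
    }


def affected_hosts(samples_by_host, threshold_pct):
    """Return the hosts whose median rose by more than threshold_pct."""
    return sorted(
        host
        for host, (baseline, after) in samples_by_host.items()
        if performance_delta(baseline, after)["percent"] > threshold_pct
    )


# Hypothetical CPU utilisation samples (%), before and after an agent rollout.
fleet = {
    "WS-001": ([12.0, 15.0, 11.0, 14.0, 13.0], [18.0, 21.0, 17.0, 19.0, 20.0]),
    "WS-002": ([20.0, 22.0, 21.0], [21.0, 22.0, 20.0]),
    "WS-003": ([10.0, 10.0, 10.0], [10.0, 11.0, 12.0]),
}

result = performance_delta(*fleet["WS-001"])
print(f"WS-001 median CPU changed by {result['change']:.1f} points "
      f"({result['percent']:.1f}%)")

flagged = affected_hosts(fleet, threshold_pct=25)
print(f"{len(flagged)} of {len(fleet)} workstations exceeded the threshold: {flagged}")
```

Running the same comparison across a test environment and then a pilot group gives a number to put in front of a stakeholder, and the share of flagged workstations hints at whether an issue is widespread or isolated.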
Capturing log entries
Logs. We should always look at the logs when we start troubleshooting workstations. It isn’t that great a leap to say that retaining information from logs will help in identifying trends within the environment. Capturing log information from workstations and putting it all in one location makes it possible to spot those trends without interrogating each workstation individually (as would normally be done as part of root cause analysis). Using a tool such as the ELK stack to gather these log entries and search through them would make that particular job easier.
Capturing log entries will assist with identifying trends on workstations within the environment.
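To show why having the entries in one place helps, here is a small Python sketch. It assumes the log entries have already been collected centrally and exported as simple records with a host, a source and an event ID. The field names, host names and sample records are invented for illustration; a real pipeline would query whatever log platform you use rather than an in-memory list.

```python
from collections import defaultdict


def hosts_per_event(entries):
    """Count how many distinct workstations reported each (source, event_id)."""
    hosts = defaultdict(set)
    for entry in entries:
        hosts[(entry["source"], entry["event_id"])].add(entry["host"])
    return {event: len(seen) for event, seen in hosts.items()}


def widespread(entries, min_hosts):
    """Return events seen on at least min_hosts distinct workstations."""
    counts = hosts_per_event(entries)
    return sorted(event for event, n in counts.items() if n >= min_hosts)


# Hypothetical records, as they might look after central collection.
entries = [
    {"host": "WS-001", "source": "Application Error", "event_id": 1000},
    {"host": "WS-001", "source": "Application Error", "event_id": 1000},
    {"host": "WS-002", "source": "Application Error", "event_id": 1000},
    {"host": "WS-003", "source": "Application Error", "event_id": 1000},
    {"host": "WS-002", "source": "disk", "event_id": 7},
]

counts = hosts_per_event(entries)
for (source, event_id), n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{source} / {event_id}: seen on {n} workstation(s)")

common = widespread(entries, min_hosts=2)
print(f"Events on two or more workstations: {common}")
```

Counting distinct hosts rather than raw entries is deliberate: one noisy workstation logging the same error repeatedly should not look like an environment-wide trend.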
Is that all that you should capture?
The answer to this question depends on your environment. There may be value in capturing further information, ranging from the PowerShell commands that are run on workstations to alerts from Sysmon rules. My aim here is to document some starting points that will help on the journey to getting good workstation monitoring in place.
Wrapping this up
In summary, I feel that there is value in monitoring workstations in an enterprise environment, in much the same way that servers, web applications and networking equipment are monitored. It enables a larger-picture view of the health of workstations in an environment. The level of monitoring will differ for each environment. A cursory investigation into workstation monitoring shows that there is far less information catering specifically to Windows workstations (which I am more familiar with) than there is for monitoring Windows servers. Hopefully we can change that in due course.