# observability
a
Hi there, I'm trying to get some feedback regarding hardware observability. Does anyone here monitor data center infrastructure? How do you know when the infrastructure is down due to a fault in a physical device? Do you use one of the observability tools like Datadog, Dynatrace, Splunk or New Relic, or something like SolarWinds, Nagios or Zabbix?
k
Hi @Akhil Vishwanath, for physical devices I would guess an observability tool only shows a hardware fault indirectly, when something becomes slower. For example, if a disk in a RAID set goes down, that will probably show up as higher disk latency. If you want a more direct way of measuring, you can use something like vCenter, XenServer or Virtual Machine Manager, which can be integrated into the observability tools. These managers know some of the hardware statuses of the hosts they manage. If you want to go a level deeper still, you will need to take a look at API, Redfish or SNMP ingestion into the O11y tools. For example, HP iLO used to be able to predict when a disk needed to be replaced. That could then be read out by your tool of preference.
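To make the Redfish option a bit more concrete, here is a rough sketch of polling a BMC and listing anything that does not report a healthy status. It relies on the standard Redfish `/redfish/v1/Systems` collection and the `Status.Health` property; the helper names (`fetch_json`, `health_of`, `unhealthy`, `poll_systems`), the use of HTTP Basic auth, and the host and credentials in the usage note are assumptions for illustration, not something taken from a specific vendor or tool.

```python
import base64
import json
import ssl
import urllib.request


def fetch_json(base_url, path, user, password, verify_tls=True):
    """GET one Redfish resource and return it as a dict (assumes Basic auth)."""
    req = urllib.request.Request(base_url.rstrip("/") + path)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/json")
    ctx = ssl.create_default_context()
    if not verify_tls:
        # BMCs often ship with self-signed certificates; only disable
        # verification on a trusted management network.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
        return json.load(resp)


def health_of(resource):
    """Return the Redfish Status.Health value, or 'Unknown' if it is absent."""
    status = resource.get("Status") or {}
    return status.get("Health") or "Unknown"


def unhealthy(resources):
    """Return (Id, health) pairs for every resource that is not 'OK'."""
    return [
        (r.get("Id", "?"), health_of(r))
        for r in resources
        if health_of(r) != "OK"
    ]


def poll_systems(base_url, user, password, verify_tls=True):
    """Walk the Systems collection and report members that are not healthy."""
    collection = fetch_json(base_url, "/redfish/v1/Systems", user, password, verify_tls)
    systems = [
        fetch_json(base_url, member["@odata.id"], user, password, verify_tls)
        for member in collection.get("Members", [])
    ]
    return unhealthy(systems)
```

Calling something like `poll_systems("https://bmc.example.internal", "monitor", "secret")` (placeholder host and credentials) on a schedule and forwarding the result as a custom metric or event is one way to get this into an observability tool. The same `health_of` check should also work on other Redfish resources that carry a `Status` object, such as drives, as long as the BMC exposes them.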