Monitoring (and preventing) excessive hard drive head parking on Linux
14 August, 2021

It is fairly well-known among techies that hard drives used in server-like workloads can suffer from poor default configuration that makes them load and unload their heads frequently, which can cause disks to fail much faster than they otherwise would. While I have been aware of this for my home server as well, it is easy to forget to check that disks are not silently killing themselves by cycling the heads. Since I use Prometheus to capture information on the server's operation, however, I can use it to monitor that my hard drives are doing well.

The Prometheus Node Exporter is the canonical tool for capturing machine metrics like utilization and hardware information with Prometheus, but on its own it does not support probing SMART data from storage drives. It does, however, support reading arbitrary metrics from text files written by other programs with its textfile collector, which makes it fairly easy to integrate with other tools.

Smartmontools is the typical package of tools for reading SMART information from drives on Linux, and conveniently the community-maintained example scripts for collecting system information with Prometheus include both a Python and a shell script (smartmon.py and smartmon.sh) that generate metrics for all devices on a system that support SMART reporting.

I run the shell script every 5 minutes with a systemd service triggered by a timer.
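
For illustration, a service and timer pair along the following lines could run smartmon.sh on a 5-minute schedule. This is only a sketch under assumptions: the unit names, the script location `/usr/local/bin/smartmon.sh`, and the output directory `/var/lib/node_exporter` are hypothetical and need to match the local installation, and the directory must be the one given to the Node Exporter with `--collector.textfile.directory`.

```ini
# /etc/systemd/system/smartmon.service (hypothetical unit name)
[Unit]
Description=Write SMART metrics for the node exporter textfile collector

[Service]
Type=oneshot
# Assumed paths; adjust to where smartmon.sh lives and to the textfile directory.
# The output goes to a temporary file first and is then renamed, so the
# exporter never reads a half-written file. The textfile collector only
# picks up files ending in .prom, so the .tmp file is ignored.
ExecStart=/bin/sh -c '/usr/local/bin/smartmon.sh > /var/lib/node_exporter/smartmon.prom.tmp && mv /var/lib/node_exporter/smartmon.prom.tmp /var/lib/node_exporter/smartmon.prom'
```

```ini
# /etc/systemd/system/smartmon.timer (hypothetical unit name)
[Unit]
Description=Collect SMART metrics every 5 minutes

[Timer]
# Fire at every minute divisible by 5; the timer activates smartmon.service
# because the two units share the same base name.
OnCalendar=*:0/5

[Install]
WantedBy=timers.target
```

With both files in place, `systemctl daemon-reload` followed by `systemctl enable --now smartmon.timer` would start the schedule. A oneshot service is used here because the script runs to completion and exits rather than staying resident.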