Shell script to check hard disk temperatures (Linux)

This shell script checks the temperature of every hard disk present in your system.
You only need to make sure these two tools are installed:
smartmontools
lsscsi
After that, just run the script. Nothing could be simpler or quicker.
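If you want to confirm the two prerequisites before the first run, something like the following could do. It is only a sketch: missing_tools is a hypothetical helper name of mine, not part of the script, and it looks the commands up in PATH, whereas the script itself expects them at /usr/sbin/smartctl and /usr/bin/lsscsi.

```bash
#!/bin/sh
# Hypothetical helper (not part of check_disks_temp.sh): print every command
# from the argument list that cannot be found in the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools smartctl lsscsi)
if [ -n "$missing" ]; then
    # Unquoted on purpose so that the names end up on a single line.
    echo "Missing tools:" $missing
else
    echo "smartctl and lsscsi are both available."
fi
```

Keep in mind that smartctl usually lives in /usr/sbin, which may not be in the PATH of a regular user, so run the check as root if it reports smartctl as missing.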
You can also set your e-mail address to receive alerts, if there are any.
Temperature alert thresholds vary from one hard disk to another depending on the manufacturer, but the script retrieves a per-disk value from the drive's own data with smartctl (it takes the maximum from the "Lifetime Min/Max Temperature" line of the "smartctl -x" output).
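To make the extraction concrete, here is a minimal sketch of the same parsing applied to a few sample lines. The sample text is only illustrative: it is what this part of the smartctl output looks like on my kind of setup, and the exact layout may differ with your smartctl version or drive. parse_temps is a hypothetical name, and it uses a single awk call instead of the chain of three used in the script.

```bash
#!/bin/sh
# Hypothetical helper: read smartctl-style text on stdin and print
# "<current temperature> <lifetime maximum temperature>".
parse_temps() {
    awk -F':' '
        /^Current[ \t]+Temperature/ {
            split($2, cur, " ")          # "   28 Celsius" -> cur[1] = "28"
            current = cur[1]
        }
        /^Lifetime[ \t]+.*Max[ \t]+Temperature/ {
            split($2, field, " ")        # "   18/45 Celsius" -> "18/45"
            n = split(field[1], minmax, "/")
            max = minmax[n]              # last part of "min/max"
        }
        END { print current, max }
    '
}

# Illustrative sample, not real output from my disks.
parse_temps <<'EOF'
Current Temperature:                    28 Celsius
Power Cycle Min/Max Temperature:     24/30 Celsius
Lifetime    Min/Max Temperature:     18/45 Celsius
EOF
```

On a real system you would pipe the output of "smartctl -x /dev/sdX" (as root) into the function instead of the sample text.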
You can also force a temperature limit with the "forceLimit" variable.
Alternatively, you can have the script lower each disk's default limit by a fixed amount by setting the "decrease" variable. If "forceLimit" is set, "decrease" is ignored.
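The way the two variables combine can be summed up in a few lines. This is a sketch of the decision only; effective_limit and its argument order are hypothetical and do not appear in the script.

```bash
#!/bin/sh
# Hypothetical helper: print the limit that applies to one disk.
#   $1 = limit read from the disk, $2 = forceLimit, $3 = decrease
effective_limit() {
    real_limit=$1
    forced=$2
    delta=$3
    if [ -n "$forced" ]; then
        echo "$forced"                    # forceLimit always wins
    elif [ -n "$delta" ]; then
        echo $((real_limit - delta))      # default limit lowered by 'decrease'
    else
        echo "$real_limit"                # nothing set: keep the disk's own limit
    fi
}

effective_limit 49 "" 10    # decrease only
effective_limit 49 40 10    # forceLimit set, decrease ignored
effective_limit 49 "" ""    # neither set
```

With a disk limit of 49, the three calls print 39, 40 and 49.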
The script returns "1" on error or alert.
Personally, I run a cron job that checks the disks every hour, with the temperature limit set 10°C below the one given by the manufacturer.
And of course, you can set the variables either on the command line or directly in the script.
@hourly mailTo=address@mail.com decrease=10 /path/to/script.sh
#!/bin/bash
#
# -*- Mode: bash -*-
#
# check_disks_temp.sh --------------------------------------------------------
#
# Author : paissad (paissad_temp-spam@yahoo.fr)
# Site : http://blog.paissad.org
# Created On : Wed Dec 8 00:11:46 CET 2010
# Revision N° : 2
# Status : Stable
# Description : This script will check the temperature of all current hard
#               disks. By default, the temperature threshold (warning) will
#               be retrieved automatically from the output of smartctl.
# Depends : smartctl (required)
#           lsscsi (required)
# Licence : GPLv3 (http://www.gnu.org/licenses/)
# __ __
# .-----..---.-.|__|.-----..-----..---.-..--| |
# | _ || _ || ||__ --||__ --|| _ || _ |
# | __||___._||__||_____||_____||___._||_____|
# |__|
# ----------------------------------------------------------------------------
# Examples of usage:
# ./check_disks_temp.sh
# exclude="/dev/sdb /dev/sdc" ./check_disks_temp.sh
# mailTo=foo@bar.com exclude="/dev/sda" ./check_disks_temp.sh
smartctl="/usr/sbin/smartctl"
lsscsi="/usr/bin/lsscsi"
# Instead of letting the script find the default temperature limit for each
# hard disk, you can force the limit with a desired value, 40 for example !
# If set, then every disk whose temperature reaches that value will be marked
# as abnormal.
# forceLimit=${forceLimit:-"40"}
forceLimit=${forceLimit:-""} # (optional)
# This will force the limit for each disk to its default value decreased by
# the value set in this variable.
# For example, if the default temperature limit for /dev/sda is 49°C and you
# set decrease to '10', you will get the WARNING at 39°C & not at 49°C.
# Personally, I think it's better to set this value between '5' and '10'.
# If 'forceLimit' is set, 'decrease' will NOT be used !
decrease=${decrease:-""}
mailTo=${mailTo:-"root"} # send mail if there's a problem with a disk
# (optional)
# Example:
# exclude=${exclude:-"/dev/sdc /dev/sde"}
exclude=${exclude:-""} # disks to exclude
# ============================================================================
# You should not change anything below this line, unless you know what you are
# doing
# ============================================================================
trap bashtrap EXIT
tempFile="$(mktemp "/tmp/$(basename "$0")_XXXXXX")"
# Send stdout & stderr both to the terminal and to $tempFile
exec > >(tee -a "$tempFile") 2>&1
# function to execute with Trap#{{{
bashtrap(){
    if (($?)); then
        echo >&2 "**** There is an error during the run of the script."
        rm -f "$tempFile"
        # Close the stderr and stdout file descriptors.
        exec 1>&- 2>&-
        exit 1;
    fi
}
##}}}
# Check whether the commands are executable and the user has root privileges #{{{
if [[ $(id -u) != 0 ]]; then
    echo >&2 "You need root privileges to run this script ..."; exit 1
fi
test -x "$smartctl" || \
    { echo "[$smartctl] NOT FOUND. Install 'smartmontools' package."; exit 1; }
test -x "$lsscsi" || \
    { echo "[$lsscsi] NOT FOUND. Install 'lsscsi' package."; exit 1; }
##}}}
# Print header#{{{
printf "%-30s\n" "-------------------------------------------------------"
# Pass the date as an argument so that it is never interpreted as a format
printf "Date : %s\n" "$(date +"%a %d-%b-%Y %T (%Z)")"
printf "%-27s: %-5s\n" "Decrease temperature limit" "${decrease:-"no"}"
printf "%-27s: %-5s\n" "Force temperature limit" "${forceLimit:-"no"}"
printf "%-30s\n" "+-----------+-------------+---------------+-----------+"
printf "| %-10s| %-12s| %-14s| %-10s|\n" \
    "DISKS" "REAL LIMIT" "FORCED LIMIT" "CURRENT"
printf "%-30s\n" "+-----------+-------------+---------------+-----------+"
##}}}
# The last column of the lsscsi output is the device node (e.g. /dev/sda)
allDisks="$("$lsscsi" | awk '{ print $NF}')"
failDisks=""
warn=false
msg="-----------------------------------------------------\n"
for disk in $allDisks; do
    # The disk must not be present in the exclude list !
    if [[ " $exclude " != *" $disk "* ]]; then
        # Keep the max value of the "Lifetime Min/Max Temperature" line
        threshold_TEMP=$("$smartctl" -x "$disk" | awk -F':' \
            '/^Lifetime[[:space:]]+.*Max[[:space:]]+Temperature/ {print $2}'\
            | awk '{print $1}' \
            | awk -F'/' '{print $NF}')
        limit="$threshold_TEMP"
        if [[ -n "$forceLimit" ]]; then
            forced_threshold_TEMP="$forceLimit"
            limit="$forced_threshold_TEMP"
        elif [[ -n "$decrease" && -n "$threshold_TEMP" ]]; then
            forced_threshold_TEMP=$((threshold_TEMP - decrease))
            limit="$forced_threshold_TEMP"
        fi
        current_TEMP=$("$smartctl" -x "$disk" \
            | awk '/Current[[:space:]]+Temperature/' \
            | awk -F':' '{print $2}' \
            | awk '{print $1}')
        # Disks that report no temperature are silently skipped
        if [[ -n ${limit} && -n ${current_TEMP} ]]; then
            printf "| %-10s| %-12s| %-14s| %-10s|\n" \
                "$disk" "$threshold_TEMP" \
                "${forced_threshold_TEMP:-"not forced"}" "$current_TEMP"
            if [[ "$current_TEMP" -ge "$limit" ]]; then
                warn=true; failDisks+=" $disk"
                msg+="WARNING: Disk '$disk' has an abnormal temperature.\n"
                msg+="The current temperature is : $current_TEMP\n"
                msg+="The limit given for this disk is: $limit\n"
                msg+="-----------------------------------------------------\n"
            fi
        fi
    fi
done
printf "%-30s\n" "+-----------+-------------+---------------+-----------+"
# If there is any kind of temperature warning !
if [[ "$warn" = "true" ]]; then
    echo >&2 -e "$msg"
    echo "******************************************************************"
    echo "* Informations about disks with abnormal temperatures ... *"
    echo "******************************************************************"
    echo ""
    for fd in $failDisks; do
        # stdout is already copied to $tempFile by the tee set up earlier,
        # so a second tee here would write the same text twice.
        "$smartctl" -i "$fd" 2>&1
        echo "============================================================="
    done
    # Close the stderr and stdout file descriptors.
    exec 1>&- 2>&-
    # Send mail ($mailTo is left unquoted to allow several addresses)
    if [[ -n "$mailTo" ]]; then
        subject="Disk Temperature ALERT on '($(hostname -f))'"
        mail -s "$subject" $mailTo < "$tempFile"
    fi
    rm -f "$tempFile"
    exit 1 # report the alert to the caller
fi
rm -f "$tempFile"
# vim: ts=4 sw=4 tw=78
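A word on the "exclude" list used in the script: the test wraps both the list and the disk name in spaces, so only whole names match (/dev/sd does not match /dev/sdb). Here is a small sketch of the same idea, written with "case" so that it also runs in plain sh. is_excluded is a hypothetical name, not a function of the script.

```bash
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if disk $1 appears as a whole
# word in the space-separated list $2, fail (status 1) otherwise.
is_excluded() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

exclude="/dev/sdb /dev/sdc"
for disk in /dev/sda /dev/sdb /dev/sd; do
    if is_excluded "$disk" "$exclude"; then
        echo "$disk: excluded"
    else
        echo "$disk: checked"
    fi
done
```

Only /dev/sdb is reported as excluded here; /dev/sda and the partial name /dev/sd are both checked.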
Here are two examples of normal output, without errors or warnings.
My system has 4 hard disks.
Example 1:
sudo ./check_disks_temp.sh
-------------------------------------------------------
Date : sam. 11-déc.-2010 21:14:08 (CET)
Decrease temperature limit : no
Force temperature limit    : no
+-----------+-------------+---------------+-----------+
| DISKS     | REAL LIMIT  | FORCED LIMIT  | CURRENT   |
+-----------+-------------+---------------+-----------+
| /dev/sda  | 45          | not forced    | 25        |
| /dev/sdb  | 49          | not forced    | 29        |
| /dev/sdc  | 38          | not forced    | 28        |
| /dev/sdd  | 69          | not forced    | 28        |
+-----------+-------------+---------------+-----------+
Example 2:
I force the limit to 35°C for every disk in the system.
sudo forceLimit=35 ./check_disks_temp.sh
-------------------------------------------------------
Date : sam. 11-déc.-2010 21:15:01 (CET)
Decrease temperature limit : no
Force temperature limit    : 35
+-----------+-------------+---------------+-----------+
| DISKS     | REAL LIMIT  | FORCED LIMIT  | CURRENT   |
+-----------+-------------+---------------+-----------+
| /dev/sda  | 45          | 35            | 25        |
| /dev/sdb  | 49          | 35            | 29        |
| /dev/sdc  | 38          | 35            | 28        |
| /dev/sdd  | 69          | 35            | 28        |
+-----------+-------------+---------------+-----------+
Here is an example of output with an alert on one hard disk.
Example 3:
This really happened on my system!
I set the alert 10°C below the manufacturer's values (to be on the safe side).
sudo decrease=10 ./check_disks_temp.sh
-------------------------------------------------------
Date : sam. 11-déc.-2010 21:17:18 (CET)
Decrease temperature limit : 10
Force temperature limit    : no
+-----------+-------------+---------------+-----------+
| DISKS     | REAL LIMIT  | FORCED LIMIT  | CURRENT   |
+-----------+-------------+---------------+-----------+
| /dev/sda  | 45          | 35            | 25        |
| /dev/sdb  | 49          | 39            | 29        |
| /dev/sdc  | 38          | 28            | 28        |
| /dev/sdd  | 69          | 59            | 28        |
+-----------+-------------+---------------+-----------+
-----------------------------------------------------
WARNING: Disk '/dev/sdc' has an abnormal temperature.
The current temperature is : 28
The limit given for this disk is: 28
-----------------------------------------------------
******************************************************************
* Informations about disks with abnormal temperatures ... *
******************************************************************

smartctl 5.39.1 2010-01-28 r3054 [x86_64-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Maxtor DiamondMax 23
Device Model:     STM3500418AS
Serial Number:    9VM4BY8A
Firmware Version: CC37
User Capacity:    500 107 862 016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Sat Dec 11 20:23:48 2010 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=============================================================
That's it!