Watchdog


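The watchdog driver was developed by Alan Cox and works together with a userspace daemon: the daemon has to write to the watchdog device at regular intervals, and if the writes stop, the machine is rebooted. A dedicated hardware watchdog card is preferred, but a (slightly less effective) software watchdog can be used too: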

  • Compile a kernel with CONFIG_SOFT_WATCHDOG set
  • Load the softdog kernel module, which recognizes the following parameters:
 soft_noboot   - set to 1 to ignore reboots, 0 to reboot (default depends on ONLY_TESTING)
 nowayout      - Watchdog cannot be stopped once started (default=CONFIG_WATCHDOG_NOWAYOUT)
 soft_margin   - soft_margin in seconds. (default=60)

For example:

sudo modprobe softdog soft_margin=30 soft_noboot=0
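
After loading the module, it may be worth confirming that the device node exists before pointing the daemon at it. The following is a minimal sketch, assuming a POSIX shell. The function name check_watchdog_dev is made up for this example. It only inspects the node and deliberately does not open it, because opening the device arms the timer.

```shell
# Hypothetical helper: succeed only if the given path is a character device.
# Defaults to /dev/watchdog when no argument is given.
check_watchdog_dev() {
    dev=${1:-/dev/watchdog}
    if [ -c "$dev" ]; then
        echo "$dev: character device present"
    else
        echo "$dev: missing or not a character device" >&2
        return 1
    fi
}
```

Calling check_watchdog_dev without an argument checks /dev/watchdog. To see which parameters the module on your kernel actually accepts, 'modinfo -p softdog' can be used.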

Install the watchdog userspace daemon and edit watchdog.conf:

# normally created by udev; otherwise create it with 'mknod -m 660 /dev/watchdog c 10 130'
watchdog-device         = /dev/watchdog

# watchdog can check if certain processes are still running
pidfile                 = /var/run/apache2.pid
pidfile                 = /var/run/imapd.pid
pidfile                 = /var/run/sshd.pid

# monitors whether the interface saw any traffic between two watchdog intervals.
# FIXME: has http://bugs.gentoo.org/show_bug.cgi?id=123404 been fixed?
interface               = eth0
 
# checks that enough memory is available. The value is measured in pages, not bytes (see 'getconf PAGE_SIZE').
min-memory              = 1

# monitors the 1-, 5- and 15-minute load averages and reboots if a limit is exceeded
max-load-1              = 24
max-load-5              = 18
max-load-15             = 12

# pings a host and assumes the network is unreachable if the host doesn't reply.
# FIXME: see also Watchdog Bugreports
ping                   = 172.26.1.255

# monitors a file for changes. change specifies how long the file may remain unmodified.
# The value is counted in watchdog intervals (normally 10 seconds each).
file                   = /var/log/everything/syslog
change                 = 20
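
Because min-memory in the configuration above is a page count rather than a byte count, a small conversion can help when you think in megabytes. This is a sketch, assuming a POSIX shell. The helper name mb_to_pages is hypothetical, and the optional second argument exists only so that the page size can be fixed for illustration.

```shell
# Hypothetical helper: convert megabytes to a page count for min-memory.
# The page size comes from the running system unless passed as $2.
mb_to_pages() {
    mb=$1
    page_size=${2:-$(getconf PAGE_SIZE)}
    echo $(( mb * 1024 * 1024 / page_size ))
}

mb_to_pages 64 4096   # prints 16384 (64 MB with 4096-byte pages)
```

Without the second argument, the result reflects the page size of the machine the function runs on, which is the value that matters for watchdog.conf.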

Start the watchdog daemon and tail your log file to see what is going on.
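
For intuition about what happens on the device side: the kernel's watchdog interface expects periodic writes to the device, and drivers that support the "magic close" feature disarm the timer when the character V is written just before the device is closed (unless nowayout is set). The sketch below mimics only that keepalive, assuming a POSIX shell. The function pet_watchdog and its arguments are hypothetical, and the real daemon additionally performs the checks configured in watchdog.conf.

```shell
# Hypothetical helper: write a keepalive to a device 'count' times,
# 'interval' seconds apart, then write the magic 'V' before closing.
# Usage: pet_watchdog DEVICE INTERVAL COUNT
pet_watchdog() {
    dev=$1
    interval=$2
    count=$3
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '.' >&3            # any write resets the timer
            i=$((i + 1))
            if [ "$i" -lt "$count" ]; then
                sleep "$interval"
            fi
        done
        printf 'V' >&3                # magic close: ask the driver to disarm
    } 3>"$dev"                        # device stays open for the whole loop
}
```

On a test machine with softdog loaded and no daemon running, something like 'pet_watchdog /dev/watchdog 10 6' would keep the timer fed for about a minute. Try it only on a machine that may reboot: if the loop is interrupted before the final V is written, the timer is not disarmed.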
