Watchdog Linux

May 7, 2021
Almost every hardware board has an on-chip watchdog device. From an operating system's perspective the watchdog can be vital; you can read more on watchdog use in the Linux operating system here. In this post, I'll mostly talk about using the watchdog from the U-Boot boot loader's perspective.
I rent a dedicated server (with an Intel Haswell CPU and custom hardware) at a low-cost hosting service and use it with CentOS 6.4 / 64-bit Linux (with the stock kernel: 2.6.32-358.14.1.el6.x86_64). Every few weeks it hangs, and the other customers seem to have similar problems. In the dmesg output I see (here is the full dmesg output): CPU0: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz stepping 03.
The Linux kernel can reset the system if serious problems are detected. This can be implemented via special watchdog hardware, or via a slightly less reliable software-only watchdog inside the kernel.
A hardware watchdog is a device that, once enabled, resets the hardware when its timeout expires unless software resets the timeout first. In other words, if the watchdog is not kicked (its timeout value reset), it will bite you (reset the hardware).
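To make the kick/bite idea concrete, here is a minimal sketch in Python. It is only a toy model, not driver code: the class name `SimulatedWatchdog` and its methods are made up for illustration, and time is passed in by hand instead of being read from a clock.

```python
class SimulatedWatchdog:
    """Toy model of a hardware watchdog; the caller supplies the time."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s
        self.deadline = now + timeout_s  # moment at which the dog would bite

    def kick(self, now):
        """Software is alive: push the deadline out by one full timeout."""
        self.deadline = now + self.timeout_s

    def has_bitten(self, now):
        """True once the deadline passed without a kick (hardware reset)."""
        return now >= self.deadline


wd = SimulatedWatchdog(timeout_s=60)
wd.kick(now=30)                 # kicked in time, the new deadline is t=90
print(wd.has_bitten(now=80))    # still before the deadline
print(wd.has_bitten(now=95))    # nobody kicked again, so the dog bites
```

The point of the model is that the reset is the default outcome: software has to keep proving it is alive, otherwise the deadline simply arrives.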
This watchdog mechanism can be used to provide a redundant boot path in case the default boot fails.
U-Boot has a bootcounter variable that counts the number of boots. Together with it we can set a bootlimit, so that once the bootcounter exceeds the bootlimit, U-Boot takes whatever action we have programmed for our use case.
The bootcounter starts with a value of 1 on the very first boot.
Now imagine that your default boot fails for any of the reasons below:
*A kernel crash occurs
*Invalid boot argument
*Root file system partition not found
*Crash in U-Boot itself
To handle these events we can set an altbootcmd, which is the boot command executed once the bootcounter exceeds the bootlimit variable defined in U-Boot.
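The decision described above can be sketched in plain Python. This is a model of the logic only, not U-Boot code: the dict stands in for the U-Boot environment, and the command strings `run default_boot` and `run rescue_boot` are hypothetical placeholders.

```python
def select_boot_command(env):
    """Increment the boot counter and pick the command to run.

    `env` is a plain dict standing in for the U-Boot environment.
    """
    env["bootcount"] = env.get("bootcount", 0) + 1  # first boot yields 1
    limit = env.get("bootlimit")
    if limit is not None and env["bootcount"] > limit:
        return env["altbootcmd"]  # too many failed attempts: fallback
    return env["bootcmd"]         # normal path


env = {
    "bootlimit": 3,
    "bootcmd": "run default_boot",     # hypothetical command
    "altbootcmd": "run rescue_boot",   # hypothetical command
}
for attempt in range(1, 6):
    print(attempt, select_boot_command(env))
```

In this model the counter only ever grows, so something has to reset it after a boot that is known to be good; otherwise even a healthy system would eventually exceed the limit.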
The only thing we need to take care of is where to store the bootcounter (the CONFIG_SYS_BOOTCOUNT_ADDR config option in U-Boot), because once Linux starts booting it takes control of the entire hardware and our bootcounter variable may be overwritten by Linux operations.
It is recommended to use scratch registers or the SRAM of your hardware device to store the bootcounter value, but this varies from SoC to SoC and should be checked carefully. I've listed some corner cases to check:
*Once the watchdog is enabled, test the value of the bootcounter from Linux using the fw_printenv command
*Check the watchdog timeout value in Linux and U-Boot
*Check the bootcounter value from the U-Boot command line on every successive boot
The Software Watchdog
First: build the Linux kernel with watchdog support; the full guide is located here:
After a reboot with the new kernel there should be a /dev/watchdog file:
Next: you will need to install a watchdog daemon:
List the files that get installed by the watchdog package:
This looks interesting, /usr/lib/systemd/system/watchdog.service is a systemd service file.
Starting and stopping the watchdog:
The watchdog is started automatically as soon as you open /dev/watchdog. To stop the watchdog, you will need to:
*Write the character V into /dev/watchdog first; requiring this magic character prevents stopping the watchdog accidentally
*Close the file /dev/watchdog. This does not work if your kernel is compiled with the CONFIG_WATCHDOG_NOWAYOUT option enabled; with that option, the watchdog cannot be stopped at all.
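The open / kick / stop sequence can be sketched like this in Python. The class name `WatchdogDevice` is my own, and the device path is a parameter so the code can be tried against an ordinary file. Be careful: pointing it at the real /dev/watchdog arms the timer, and if the process then dies without the magic close, the machine will reboot.

```python
import os


class WatchdogDevice:
    """Tiny wrapper around a watchdog character device.

    WARNING: opening the real /dev/watchdog starts the countdown.
    """

    def __init__(self, path="/dev/watchdog"):
        # os.open/os.write are unbuffered, so each write reaches the
        # driver immediately.
        self._fd = os.open(path, os.O_WRONLY)

    def kick(self):
        """Writing data restarts the countdown; a NUL byte is used here."""
        os.write(self._fd, b"\0")

    def stop(self):
        """Magic close: write 'V', then close the device."""
        os.write(self._fd, b"V")
        os.close(self._fd)
```

For a dry run, create a temporary file, pass its name as `path`, call `kick()` a few times and then `stop()`; the file should end with the byte `V`. With CONFIG_WATCHDOG_NOWAYOUT enabled, `stop()` on the real device will not disarm the timer.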
After the watchdog has been enabled you have to reset the watchdog timer at least once every 60 seconds, or else your system gets rebooted. The watchdog daemon resets the timer as long as none of its tests fails.
Supported tests by the watchdog daemon to check the system status:
*Is the process table full?
*Is there enough free memory?
*Are some files accessible?
*Have some files changed within a given interval?
*Is the average work load too high?
*Has a file table overflow occurred?
*Is a process still running? The process is specified by a pid file.
*Do some IP addresses answer to ping?
*Do network interfaces receive traffic?
*Is the temperature too high? (Temperature data not always available.)
*Execute a user defined command to do arbitrary tests.
*Execute one or more test/repair commands found in /etc/watchdog.d. These commands are called with the argument test or repair.
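To illustrate the principle behind this list, here is a heavily simplified sketch in Python: run a set of checks and kick the watchdog only when all of them pass. This is not how the watchdog daemon is implemented (the real daemon can also run repair commands or shut the system down itself). The function names and the threshold parameter are my own.

```python
import os


def file_accessible(path):
    """Check in the spirit of 'Are some files accessible?'."""
    return os.path.exists(path)


def load_below(max_load_1min):
    """Check in the spirit of 'Is the average work load too high?'."""
    return os.getloadavg()[0] < max_load_1min


def supervise_once(checks, kick):
    """Run the checks; call kick() only if all of them pass."""
    healthy = all(check() for check in checks)
    if healthy:
        kick()
    return healthy
```

A caller would pass zero-argument callables, for example `supervise_once([lambda: file_accessible("/etc/fstab"), lambda: load_below(24)], kick)`, and repeat that well within the watchdog timeout. If any check fails, the timer is left alone and the reset follows on its own.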
The configuration file should be self-explanatory:
Now we will enable the watchdog daemon; currently it should be disabled:
For testing purposes I've added the following to my /etc/watchdog.conf:
So when my WiFi connection gets lost my system should reboot.
Start the watchdog daemon:
OK, then I will have to use the IP address because the watchdog daemon fails to start. The ping option of watchdog only supports numeric IPv4 addresses:
In general you are safer pinging your router: packets to a remote host can get lost or delayed, Google's IP may change, or your IP may get blocked if you send ping requests to Google 24/7.
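Since a hostname in the ping option keeps the daemon from starting, it can be handy to check the configuration beforehand. The following is a small sketch of such a check in Python, assuming the usual `key = value` layout of watchdog.conf with `#` comments. The function name is made up, and the sample values are hypothetical, not my actual configuration.

```python
import ipaddress


def invalid_ping_targets(conf_text):
    """Return the `ping` values that are not numeric IPv4 addresses."""
    bad = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key != "ping":
            continue
        try:
            ipaddress.IPv4Address(value)
        except ValueError:  # raised for hostnames, IPv6, empty values
            bad.append(value)
    return bad


sample = """
# hypothetical values for illustration
ping = 192.168.1.1
ping = router.example
"""
print(invalid_ping_targets(sample))
```

An empty result means every ping entry is a plain IPv4 address.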
And it works:
Now disconnect the WiFi and voila, after max. 60 seconds it will reboot:
Later we can enable the watchdog on boot when everything is working correctly:
The Hardware Watchdog
The software watchdog module is, of course, no protection against a kernel fault, but hardware watchdog support is coming for the iMX233-OLinuXino.
Have a look at chapter 23 of the iMX233 Reference Manual (17,5 MB):
23.7 Watchdog Reset Function
The watchdog reset is a CPU-configurable device. It is programmed by software to generate a chip-wide reset after HW_RTC_WATCHDOG milliseconds. The watchdog generates this reset if software does not rewrite this register before this time elapses.
The watchdog timer decrements the register value once for every tick of the 1-kHz clock supplied from the RTC analog section (see Figure 23-1). The reset generated by the watchdog timer has no effect on the values retained in the master registers of the real-time clock seconds counter, alarm, or persistent registers (analog persistent storage).
The watchdog timer is initially disabled and set to count 4,294,967,295 milliseconds before generating a watchdog reset.
The watchdog timer does not run when the chip is in its powered-down state. Therefore, there is no master/shadow register pairing for the watchdog timer, and it must be reprogrammed after cycling power or resetting the block.
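To get a feeling for the numbers in the quoted section, here is a little arithmetic in Python. The default count is taken from the manual text above; the helper `to_watchdog_count` is my own illustration of converting seconds into 1-kHz ticks, not a function from any driver.

```python
from datetime import timedelta

DEFAULT_COUNT_MS = 4_294_967_295        # value quoted from the manual
assert DEFAULT_COUNT_MS == 2**32 - 1    # the largest unsigned 32-bit value

# One tick per millisecond, so the count is directly a duration.
default_timeout = timedelta(milliseconds=DEFAULT_COUNT_MS)
print(default_timeout.days)             # whole days until the default count expires


def to_watchdog_count(seconds):
    """Convert a timeout in seconds to 1-kHz ticks (milliseconds)."""
    count = int(seconds * 1000)
    if not 0 <= count <= DEFAULT_COUNT_MS:
        raise ValueError("count does not fit in 32 bits")
    return count


print(to_watchdog_count(60))            # ticks for a 60-second timeout
```

So the default count corresponds to roughly seven weeks, which is why the register has to be rewritten with a much smaller value before the watchdog is of any practical use.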
I've seen a kernel option (<*> Freescale STMP3XXX & i.MX23/28 watchdog) on newer kernels and also some log messages:
Now I have 3 watchdog devices:
But which is the hardware watchdog?
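One possible way to find out is to read the identity string each watchdog driver reports. The sketch below assumes a kernel that exposes watchdog attributes under /sys/class/watchdog (this needs watchdog sysfs support, which kernels as old as the one used here may not have). The function name is my own, and the directory is a parameter so the code can be tried against a test tree.

```python
import os


def watchdog_identities(sysfs_root="/sys/class/watchdog"):
    """Map each watchdogN entry to its identity string, if exposed."""
    result = {}
    if not os.path.isdir(sysfs_root):
        return result  # no watchdog class in sysfs at all
    for name in sorted(os.listdir(sysfs_root)):
        identity_path = os.path.join(sysfs_root, name, "identity")
        try:
            with open(identity_path) as f:
                result[name] = f.read().strip()
        except OSError:
            result[name] = None  # attribute not available on this kernel
    return result


print(watchdog_identities())
```

An entry whose identity names the SoC watchdog driver rather than the software watchdog would be the hardware one.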
So by default the hardware watchdog timer gets assigned to /dev/watchdog, which makes sense. I haven't tested yet whether the hardware watchdog timer is working on the OLinuXino, but I think so.