[OpenWrt-Devel] Ubus based service watchdog?

Michael Jones mike at meshplusplus.com
Fri May 15 03:39:52 EDT 2020

On Fri, May 15, 2020 at 1:58 AM Petr Štetiar <ynezz at true.cz> wrote:

> Michael Jones <mike at meshplusplus.com> [2020-05-13 12:48:49]:
> Hi,
> > I have a critical service on my OpenWRT system that needs monitoring and
> > re-starting if it's failed.
> What's wrong with monit[1]? It was designed exactly for this purpose and is
> much more flexible.
What's wrong with monit is that its documentation is gigantic for a
relatively trivial need. That disqualifies it as being designed exactly for
this purpose.

> > I've been looking for a mechanism in procd that would allow me to request
> > that my service be terminated if it did not periodically notify some
> > watchdog endpoint via ubus.
> So instead of proper error handling and crashing your service ASAP, you're
> now going to add another ubus layer which might possibly fail as well.

If ubus is failing, there's a much larger problem than my service failing.

> You know, your service could happily ping the watchdog endpoint, yet still
> fail in other parts. You want something more robust.

Ubus would only be pinged when the service does the thing it's designed to
do. In this case, that is a bi-directional exchange with a host on the
internet, so a ping means the service is actually working. No risk of
false positives.

> I would simply add ubus status method to that critical service,

This requires that my program be able to communicate with ubus natively and
offer a ubus endpoint that can be queried.

Ubus is fundamentally incompatible with programs that have their own event
loop, or at least it was when I last investigated. I have not had time to
dig into ubox to make the improvements needed to allow external loop
drivers.

Having the managed program call "ubus call
service.$servicename.watchdog ...." in whatever way it wants to is more
flexible. All programs can launch subprocesses, even if they have to
resort to fork+exec.
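
To illustrate the fork+exec fallback, here is a minimal sketch in Python
(chosen only for brevity; the same pattern applies in C). The helper name
run_command is made up for this example. It runs whatever command line it
is given, so it makes no assumption about the arguments of the proposed
watchdog call, which does not exist in procd today.

```python
import os
import sys


def run_command(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image. execvp only returns on failure.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # conventional status for "command could not be run"
    # Parent: block until the child finishes, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # child was terminated by a signal


# Hypothetical use from the managed service, once it has done real work:
#   run_command(["ubus", "call", ...])
# where the remaining arguments name the proposed per-service watchdog
# object and whatever method it would end up exposing.
```

The service would call this only after a successful round trip, so a hung
or wedged service simply stops pinging.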

> then check the output in the cron shell/Lua script and kill the service
> if the output of the ubus status method wouldnt match liveliness for that
> service.

More complicated than a simple timer in procd.
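
For reference, the timer logic I have in mind is roughly the following.
This is only a sketch of the idea in Python, not procd code (procd is
written in C), and the class and method names are invented for
illustration. The clock is injectable so the behaviour can be checked
without sleeping.

```python
import time


class WatchdogTimer:
    """Deadline that is pushed back every time the service checks in."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout  # seconds allowed between pings
        self.clock = clock      # monotonic by default, immune to clock jumps
        self.deadline = clock() + timeout

    def ping(self):
        # Called when the service reports that it is alive.
        self.deadline = self.clock() + self.timeout

    def expired(self):
        # True once the service has been silent for at least `timeout`.
        return self.clock() >= self.deadline
```

The service manager would check expired() periodically and terminate the
service when it returns true, then let the normal respawn logic take over.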

> In other words I think, that one can solve this use case with current
> solutions, no need to bloat procd.
It's hardly bloat. It's a very simple feature that serves a core need in
service management as a generic concern.