[OpenWrt-Devel] Ubus based service watchdog?

Michael Jones mike at meshplusplus.com
Fri May 15 03:39:52 EDT 2020


On Fri, May 15, 2020 at 1:58 AM Petr Štetiar <ynezz at true.cz> wrote:

> Michael Jones <mike at meshplusplus.com> [2020-05-13 12:48:49]:
>
> Hi,
>
> > I have a critical service on my OpenWRT system that needs monitoring and
> > re-starting if it's failed.
>
> What's wrong with monit[1]? It was designed exactly for this purpose and is
> much more flexible.
>
>
What's wrong with monit is that its documentation is gigantic for a
relatively trivial need. That disqualifies it from being "designed exactly
for this purpose".


> > I've been looking for a mechanism in procd that would allow me to request
> > that my service be terminated if it did not periodically notify some
> > watchdog endpoint via ubus.
>
> So instead of proper error handling and crashing your service ASAP, you're
> now going to add another ubus layer which might possibly fail as well.


If ubus is failing, there's a much larger problem than my service failing.


> You know, your service could happily ping the watchdog endpoint, yet still
> fail in other parts. You want something more robust.
>

Ubus would only be pinged when the service does the thing it's designed to
do. In this case, that means completing a bi-directional exchange with a
host on the internet, so a ping implies the service is actually working.
No risk of false positives.


>
> I would simply add ubus status method to that critical service,


This requires that my program be able to communicate with ubus natively and
offer a ubus endpoint that can be queried.

Ubus is fundamentally incompatible with programs that have their own event
loop, or at least it was when I last investigated. I have not had time to
dig into ubox to make the improvements needed to allow external loop
drivers.

Having the managed program call "ubus call
service.$servicename.watchdog ...." in whatever way it wants to is more
flexible. All programs can launch subprocesses, even if they have to
resort to fork+exec.
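
A minimal sketch of that approach, in Python for brevity. The
service.<name>.watchdog object is the feature being proposed in this thread,
not something procd provides today, and the "ping" method name is a
placeholder I made up for illustration. The only existing interface assumed
is the "ubus call <object> <method> [<json>]" command line.

```python
import json
import subprocess


def watchdog_argv(service_name, method="ping", payload=None):
    """Build the argv for the hypothetical watchdog call."""
    # Hypothetical object path, following the naming proposed above.
    obj = "service.%s.watchdog" % service_name
    argv = ["ubus", "call", obj, method]
    if payload is not None:
        # ubus takes the message as a single JSON string argument.
        argv.append(json.dumps(payload))
    return argv


def ping_watchdog(service_name, runner=subprocess.run):
    """Spawn `ubus call ...`; return True if the command exited with 0."""
    try:
        result = runner(watchdog_argv(service_name), timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        # ubus binary missing or hung: treat it as a failed ping.
        return False
    return result.returncode == 0
```

The service would call ping_watchdog() only after a successful round trip
with the remote end, so a hung or half-working service simply stops pinging.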



> then check the output in the cron shell/Lua script and kill the service if
> the output of the ubus status method wouldn't match liveliness for that
> service.
>

That is more complicated than a simple timer in procd.
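
To show how little logic the timer side needs, here is a rough sketch. It is
not procd code: procd is written in C and would presumably use a uloop timer
for this. The class and method names are hypothetical and only illustrate
the deadline bookkeeping.

```python
import time


class WatchdogTimer:
    """Deadline that must be pushed forward by periodic pings."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the logic can be tested
        self.deadline = clock() + timeout_s

    def ping(self):
        # Each ping from the service restarts the countdown.
        self.deadline = self.clock() + self.timeout_s

    def expired(self):
        # True once no ping has arrived within timeout_s seconds;
        # the service manager would then terminate and restart the service.
        return self.clock() >= self.deadline
```

The manager arms the timer when the service starts, resets it on every ping,
and kills the service when expired() becomes true.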


> In other words, I think that one can solve this use case with current
> solutions, no need to bloat procd.
>
>
It's hardly bloat. It's a very simple feature that serves a core need in
service management as a generic concern.