[OpenWrt-Devel] [RFC] netifd: Crash when netifd reload is handled during netifd startup

Hans Dedecker dedeckeh at gmail.com
Wed Jun 25 06:10:08 EDT 2014


Hi Felix,

I applied the new ubus patch and it solves the netifd crash. As the traces below show, the ubus requests are now deferred: netifd_reload is called only after config_init_all has finished.

Thanks for the patch,
Hans

Jun 25 09:53:53 OpenWrt daemon.notice netifd: config_init_all : Enter                                             
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create interface 'loopback'                                         
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create simple device 'lo'                                           
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Initialize device 'lo'                                              
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Network device 'lo' is now present                                  
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=1                                
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Interface 'loopback', available=1                                   
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=2                                
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create new device 'br-lan' (Bridge)                                 
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Initialize device 'br-lan'                                          
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create interface 'lan'                                              
...
Jun 25 09:53:54 OpenWrt daemon.notice netifd: config_init_all : Exit                                              
Jun 25 09:53:54 OpenWrt daemon.notice netifd: netifd_reload : Enter                                               
Jun 25 09:53:54 OpenWrt daemon.notice netifd: config_init_all : Enter                                             
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Device 'br-lan': config applied                                     
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Update interface 'lan'                                              
....
Jun 25 09:53:54 OpenWrt daemon.notice netifd: netifd_reload : Exit                                                
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Network device 'eth1' link is up                                    
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Bridge 'br-lan' link is up                                          
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' has link connectivity                               
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' is setting up now                                   
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Queue hotplug handler for interface 'lan', event 'ifup'             
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Call hotplug handler for interface 'lan', event 'ifup' (br-lan)     
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' is now up                                           
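
The ordering in the trace above (config_init_all exits before netifd_reload enters) can be pictured with a small re-entrancy guard. This is only a minimal sketch in plain C, not the libubus patch itself, which according to Felix's mail defers incoming invoke messages with a uloop timer. All names below (config_busy, reload_pending, parse_config, request_reload, run_pending, startup_with_racing_reload) are hypothetical stand-ins and do not exist in netifd or ubus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state for illustration; not netifd's real variables. */
static bool config_busy;     /* true while the config is being parsed */
static bool reload_pending;  /* a reload request arrived while busy   */
static int  parse_count;     /* number of completed config parses     */

void request_reload(void);

/* Stand-in for config_init_all(): parses the config once.
 * If inject_reload is set, a reload request arrives mid-parse,
 * mimicking a pending ubus call being dispatched during an invoke. */
void parse_config(bool inject_reload)
{
    assert(!config_busy);    /* the parser must never be re-entered */
    config_busy = true;
    printf("parse_config : Enter\n");

    if (inject_reload)
        request_reload();    /* gets deferred because config_busy is set */

    parse_count++;
    printf("parse_config : Exit\n");
    config_busy = false;
}

/* Stand-in for the reload handler: defer instead of re-entering. */
void request_reload(void)
{
    if (config_busy) {
        reload_pending = true;
        printf("reload deferred\n");
        return;
    }
    printf("reload : Enter\n");
    parse_config(false);
    printf("reload : Exit\n");
}

/* Runs any deferred reload once the parser is idle. In the real fix
 * this role is played by a timer firing from the event loop. */
void run_pending(void)
{
    while (reload_pending) {
        reload_pending = false;
        request_reload();
    }
}

/* Startup with a reload racing against the initial parse.
 * Returns the total number of completed parses. */
int startup_with_racing_reload(void)
{
    parse_config(true);
    run_pending();
    return parse_count;
}
```

Calling startup_with_racing_reload() parses the config twice, but strictly one after the other, which matches the Enter/Exit ordering in the trace. Without the config_busy check the second parse would start inside the first one, which is the nesting visible in the Jun 24 trace quoted below.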


On Tue, Jun 24, 2014 at 5:36 PM, Felix Fietkau <nbd at openwrt.org> wrote:
> Hi Hans,
>
> thanks for testing. I uploaded a new patch (same URL), which uses a
> uloop timer to defer processing of incoming invoke msgs.
> Note that this changes the ubus context data structure and thus affects
> everything that depends on ubus, so it's better to reflash after rebuilding.
>
> - Felix
>
> On 2014-06-24 16:11, Hans Dedecker wrote:
>> Hi,
>>
>> I applied the ubus patch, but netifd_reload is still called while
>> config_init_all is processing the config, which leads to a crash
>> when netifd_reload returns.
>>
>> I added extra traces in netifd, which confirm this:
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: config_init_all : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create interface
>> 'loopback'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create simple device
>> 'lo'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Initialize device 'lo'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from
>> device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from
>> device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from
>> device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Network device 'lo' is
>> now present
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Add user for device
>> 'lo', refcount=1
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Interface 'loopback',
>> available=1
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Add user for device
>> 'lo', refcount=2
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: netifd_reload : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: config_init_all : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Update interface
>> 'loopback'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create new device
>> 'br-lan' (Bridge)
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Initialize device
>> 'br-lan'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create interface 'lan'
>> .....
>> Jun 24 16:00:45 OpenWrt daemon.notice netifd: config_init_all : Exit
>> Jun 24 16:00:45 OpenWrt daemon.notice netifd: netifd_reload : Exit
>>
>>
>> Hans
>> On Tue, Jun 24, 2014 at 2:05 PM, Felix Fietkau <nbd at openwrt.org> wrote:
>>> On 2014-06-24 12:46, Hans Dedecker wrote:
>>>> Netifd crashes when a network reload (ubus call network reload) is handled while the network config is being parsed in config_init_all (called from main) at startup.
>>>> Because a ubus_invoke call is issued when the interfaces are created, ubus also processes any pending ubus calls during that invoke, in this case the network reload.
>>>> Because netifd_reload calls config_init_all again, the network config is parsed a second time. On return from netifd_reload the original config_init_all call continues, but it crashes because the references it holds to the interface/device/etc. lists are no longer valid.
>>>> This potential problem has always been present, but since the change in netifd_reload timing behavior in netifd commit 5db02763d61785529bef538f196c180e968b7c26 it can easily be triggered.
>>>> To solve the issue I was thinking about deferring the network reload while config_init_all is parsing the config.
>>>> Is this the correct way to go, or are there other alternatives?
>>> Please try applying this patch to ubus:
>>> http://nbd.name/libubus-req-defer.patch
>>>
>>> It should ensure that no invoke will be processed while netifd is busy
>>> with registering/unregistering objects or sending notify calls.
>>>
>>> - Felix
>>
_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


