[PATCH] realtek: mark clock source as continuous

Sander Vanheule sander at svanheule.net
Mon Oct 31 13:40:18 PDT 2022


Hi Jan,

On Mon, 2022-10-31 at 21:21 +0100, Jan Hoffmann wrote:
> On 31.10.22 at 10:11, Sander Vanheule wrote:
> > After replacing the R4K event timer and clock source with the new
> > Realtek Otto timer, performance for RTL839x devices was severely
> > impacted, as reported by Hiroshi.
> > 
> > Research by Markus showed that after commit 4657a5301eb5 (realtek: avoid
> > busy waiting for RTL839x PHY read/write, 2022-10-14), the ethernet
> > driver could only update a phy once per timer interval, which also
> > heavily impacted boot time. On e.g. a Zyxel GS1900-48, this added around
> > a minute to the time to fully initialise the switch.
> > 
> > By marking the otto clocksource as continuous, the kernel enables it to
> > be used for high resolution timers. This allows readx_poll_timeout() to
> > sleep for less than one system timer interval, reducing system dead
> > time.
> 
> I can confirm this brings usleep_range sleep duration back to a more 
> reasonable amount of time.
> 
> Average time for usleep_range(10, 20); when called in a loop 10000 times 
> (tested on HPE 1920-8G for RTL838x and 1920-48G for RTL839x):
> 
>                            RTL838x     RTL839x
> r4k timer                   48 us       31 us
> otto timer               10003 us    10031 us
> otto timer with patch       60 us       32 us
> 
> Average runtime of the polling loop in smi_wait_op is now also back to 
> where it was with the r4k timer (under 0.3 ms for RTL838x and under 0.5 
> ms for RTL839x on otherwise idle device, compared to 5 ms / 10 ms when 
> using the otto timer without the patch).

Thanks for doing these checks!

Markus expressed his concern earlier today (on IRC) with the sleep duration in
the polling loop. usleep_range((20 >> 2) + 1, 20) for a something that takes
300/500 us might be a bit fast, especially considering how large the overhead is
(+40 us on RTL838x, +12 us on RTL839x) on a 10-20 us sleep.

Would it make sense to you to bump the timeout for the smi poller to, say, 80
microseconds? Could be a different duration too, just something that strikes a
good balance between not waiting too long, and not keeping the CPU busy setting
up new timers.

Best,
Sander

> 
> > Link: https://github.com/openwrt/openwrt/issues/11117
> > Reported-by: INAGAKI Hiroshi <musashino.open at gmail.com>
> > Cc: Markus Stockhausen <markus.stockhausen at gmx.de>
> > Signed-off-by: Sander Vanheule <sander at svanheule.net>
> > ---
> > With this patch, initialisation time for my GS1900-48 drops from 110
> > seconds to 50 seconds. Please check if you can reproduce this. The 'why
> > this works' from the commit message is from a quick look at the places
> > where this CLOCK_SOURCE flag is checked, so I hope it actually makes sense.
> > 
> > Best,
> > Sander
> > 
> >   .../realtek/files-5.10/drivers/clocksource/timer-rtl-otto.c      | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/target/linux/realtek/files-5.10/drivers/clocksource/timer-rtl-
> > otto.c b/target/linux/realtek/files-5.10/drivers/clocksource/timer-rtl-
> > otto.c
> > index 12eed78653d0..14e28e50f40e 100644
> > --- a/target/linux/realtek/files-5.10/drivers/clocksource/timer-rtl-otto.c
> > +++ b/target/linux/realtek/files-5.10/drivers/clocksource/timer-rtl-otto.c
> > @@ -227,6 +227,7 @@ struct rttm_cs rttm_cs = {
> >                 .name   = "realtek_otto_timer",
> >                 .rating = 400,
> >                 .mask   = CLOCKSOURCE_MASK(RTTM_BIT_COUNT),
> > +               .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
> >                 .read   = rttm_read_clocksource,
> >                 .enable = rttm_enable_clocksource
> >         }




More information about the openwrt-devel mailing list