Future of git.openwrt.org [Was: Re: Moving git.openwrt.org behind Fastly CDN]

Bas Mevissen abuse at basmevissen.nl
Fri Dec 6 04:28:45 PST 2024


On 06/12/2024 06:59, Petr Štetiar wrote:
> Hannu Nyman <hannu.nyman at iki.fi> [2024-12-04 17:23:39]:
> 
> Hi,
> 
> tl;dr CDN is not going to cut it, we need some other solution
> 
>> a) setting feeds.conf.default to point to the actual root feeds at GitHub
>> (e.g. https://github.com/openwrt/packages) instead of the git.openwrt.org
>> mirror?
> 
> yes, using more powerful Git mirrors is one of the options.
> 
> It would need to be git.builds.openwrt.org or similar, so we still have it under our control.
> It would need to be hosted somewhere other than GitHub (sanctions, no IPv6, etc.).
> The following options have been circulating for some time:
> 
>   - codeberg.org
>   - sourcehut.org
> 
>>    Or alternatively, set feeds.conf.default to point to the new
>> git.cdn.openwrt.org?
> 
> CDNs are not yet able to handle/cache the Git smart HTTP protocol, so a CDN
> wouldn't help with Git fetch operations; it would basically still be
> pass-through mode. A CDN is going to lower the load from gitweb-based scrapers
> which are not using a proper bot user agent in their HTTP headers (those are
> already rate limited).
> 
> I looked into this more closely and there were actually multiple issues going
> on simultaneously:
> 
>   1. sudden spikes of requests from various gitweb-based scrapers, usually
>      requesting source code tarballs (heavy CPU and I/O operation) of random projects
> 
>      * bots using proper user agent identification are already denied these requests
> 
>      * bots not using proper user agent identification are a PITA because you can't
>        distinguish them from humans
>   
>   2. strange vulnerability scanners, generating a lot of concurrent requests
> 
>   3. relatively high numbers of concurrent builds starting at the same time
> 
>      * probably some build farms and/or CI jobs (Hi Qualcomm! :))
> 
> This led to CPU and I/O saturation on the box, a long backlog of requests,
> resource exhaustion and 500s, hugely impacting our buildbot builds.
> 

Would an internal mirror accessible only to the buildbots help? Even if 
the mirror falls behind because the main git server is unavailable, the 
builds can continue.

Or is there a way to prioritize our "own" traffic over the rest of the 
world on the git server?

> As a quick fix, I've done the following over the past few days:
> 
>   - disabled tarballs for everyone with 403
>   - enabled IP based rate limits on everyone
> 
>     * heavy projects like luci.git, packages.git and openwrt.git
> 
>       - after 5r/m additional requests are delayed up to 15r/m, then 429 sorry
> 
>     * other requests 15r/m, delayed after the 8th request, up to 30r/m, then 429 sorry
> 
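
To make the tiers above concrete, here is a rough Python sketch of a
per-IP limiter with the same shape for the heavy repositories: serve up to
5 requests per minute, delay up to 15, then answer 429. This is not the
actual server configuration. The class name, the sliding-window approach
and the "serve"/"delay"/"reject" labels are my own assumptions for
illustration.

```python
from collections import defaultdict, deque


class TieredRateLimiter:
    """Hypothetical per-IP limiter: serve, then delay, then reject (429)."""

    def __init__(self, soft: int, hard: int, window: float = 60.0):
        self.soft = soft      # requests per window served immediately
        self.hard = hard      # requests per window accepted at all
        self.window = window  # window length in seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def check(self, ip: str, now: float) -> str:
        hits = self._hits[ip]
        # Forget accepted requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.hard:
            return "reject"  # caller would answer 429; not counted as a hit
        hits.append(now)
        return "serve" if len(hits) <= self.soft else "delay"


# Assumed values taken from the "heavy projects" tier: 5r/m soft, 15r/m hard.
heavy = TieredRateLimiter(soft=5, hard=15)
```

A caller would invoke heavy.check(client_ip, time.monotonic()) per request
and sleep briefly before serving when the answer is "delay".
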
> Seems to work, VPS can manage the load, no git fetch issues on buildbots, thus
> we can focus on the long term solution:
> 
>   A. outsource Git operations
> 
>      - this is the git.builds.openwrt.org option explained above, hence the
>        following (shortened) diff
> 
>          --- a/feeds.conf.default
>          +++ b/feeds.conf.default
>          -src-git packages https://git.openwrt.org/feed/packages.git
>          -src-git luci https://git.openwrt.org/project/luci.git
>          -src-git routing https://git.openwrt.org/feed/routing.git
>          -src-git telephony https://git.openwrt.org/feed/telephony.git
>          +src-git packages https://git.builds.openwrt.org/feed/packages.git
>          +src-git luci https://git.builds.openwrt.org/project/luci.git
>          +src-git routing https://git.builds.openwrt.org/feed/routing.git
>          +src-git telephony https://git.builds.openwrt.org/feed/telephony.git
> 
>          --- a/include/download.mk
>          +++ b/include/download.mk
>          -PROJECT_GIT = https://git.openwrt.org
>          +PROJECT_GIT = https://git.builds.openwrt.org
>   
>          --- a/package/boot/uboot-bcm4908/Makefile
>          +++ b/package/boot/uboot-bcm4908/Makefile
>          -PKG_SOURCE_URL:=https://git.openwrt.org/project/bcm63xx/u-boot.git
>          +PKG_SOURCE_URL:=https://git.builds.openwrt.org/project/bcm63xx/u-boot.git
> 
>      - the other option is to keep using git.openwrt.org and handle this via HTTP
>        redirects, which should probably work as well
> 
That would also avoid people trying to get priority by changing the URLs.

>   B. improve scripts/feeds
> 
>      - add some kind of --retry backoff mechanism to Git operations
>      - add a fallback list of additional Git repository mirrors; if one fails,
>        use another, etc.
> 

Can't these be set to be shallow clones by default? In most cases, you 
only need the latest HEAD to build against.
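
A minimal sketch of how the retry/backoff and mirror fallback from option B
could combine with shallow clones. It is written in Python only for
illustration; it is not how scripts/feeds works today. The function name,
the retry count and the delays are assumptions, and the mirror list is
whatever the caller passes in.

```python
import subprocess
import time
from typing import Callable, Sequence


def clone_with_fallback(
    mirrors: Sequence[str],
    dest: str,
    retries: int = 3,
    base_delay: float = 2.0,
    run: Callable[[Sequence[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each mirror in order, retrying each with exponential backoff.

    Returns the URL that succeeded; raises RuntimeError if every mirror fails.
    The `run` and `sleep` hooks exist only so the logic can be exercised
    without network access.
    """
    for url in mirrors:
        for attempt in range(retries):
            # --depth 1 makes a shallow clone containing only the tip commit.
            if run(["git", "clone", "--depth", "1", url, dest]) == 0:
                return url
            if attempt < retries - 1:
                # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all mirrors failed")
```

Note that a shallow clone only helps when the build wants the branch tip; a
feed pinned to an older commit would still need more history.
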

>   C. upgrade the box
> 
>      - this means $$ which IMO would be better spent on funding/improving projects like
>        codeberg.org or sourcehut.org
> 
> Cheers,
> 
> Petr