Future of git.openwrt.org [Was: Re: Moving git.openwrt.org behind Fastly CDN]
Bas Mevissen
abuse at basmevissen.nl
Fri Dec 6 04:28:45 PST 2024
On 06/12/2024 06:59, Petr Štetiar wrote:
> Hannu Nyman <hannu.nyman at iki.fi> [2024-12-04 17:23:39]:
>
> Hi,
>
> tl;dr CDN is not going to cut it, we need some other solution
>
>> a) setting feeds.conf.default to point to the actual root feeds at GitHub
>> (e.g. https://github.com/openwrt/packages) instead of the git.openwrt.org
>> mirror?
>
> yes, using more powerful Git mirrors is one of the options.
>
> It would need to be git.builds.openwrt.org or such, so we still have it under control.
> It would need to be somewhere else than on GitHub (sanctions, no IPv6 etc.).
> The following options have been circulating for some time:
>
> - codeberg.org
> - sourcehut.org
>
>> Or alternatively, set feeds.conf.default to point to the new
>> git.cdn.openwrt.org?
>
> CDNs are not able to handle/cache the Git smart HTTP protocol yet, so it
> wouldn't help with Git fetch operations; this would basically still be
> pass-through mode. The CDN is going to lower the load from gitweb-based
> scrapers which are not using a proper bot user agent in HTTP headers (those
> are already rate limited).
>
> I looked into this more closely and there were actually multiple issues going
> on simultaneously:
>
> 1. sudden spikes of requests from various gitweb-based scrapers, usually
> requesting source code tarballs (a heavy CPU and I/O operation) of random projects
>
> * bots using proper user agent identification are already forbidden from making these requests
>
> * bots not using proper user agent identification are a PITA because you can't
> distinguish them from humans
>
> 2. strange vulnerability scanners, generating a lot of concurrent requests
>
> 3. relatively high numbers of concurrent builds starting at the same time
>
> * probably some build farms and/or CI jobs (Hi Qualcomm! :))
>
> This was leading to the saturation of CPU and I/O on the box, long backlog of
> requests, running out of resources and 500s hugely impacting our buildbot
> builds.
>
Would an internal mirror that is accessible only to the buildbots help?
Even if the mirror falls behind because the main git server is
unavailable, the builds can continue.
Or is there a way for the git server to prioritize our "own" traffic
over the rest of the world?
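To make the second question a bit more concrete, here is a rough sketch of
what I have in mind, in Python purely for illustration. The address ranges
are documentation prefixes and the limits are invented numbers; I don't know
how the rate limiting on the server is actually implemented, so the names
BUILDBOT_NETS, LIMITS, classify and allowed are all hypothetical.

```python
import ipaddress

# Hypothetical address ranges for "our own" buildbots. These are
# documentation prefixes, not real OpenWrt infrastructure.
BUILDBOT_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:1::/48"),
]

# Invented per-class limits in requests per minute; None means "no limit".
LIMITS = {"buildbot": None, "world": 15}


def classify(client_ip: str) -> str:
    """Return "buildbot" for our own addresses, "world" for everyone else."""
    addr = ipaddress.ip_address(client_ip)
    for net in BUILDBOT_NETS:
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return "buildbot"
    return "world"


def allowed(client_ip: str, requests_in_last_minute: int) -> bool:
    """Decide whether one more request from this client should be served."""
    limit = LIMITS[classify(client_ip)]
    return limit is None or requests_in_last_minute < limit
```

With something like this, the buildbots would never be throttled, while
everybody else stays under the per-IP limit.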
> As a quick fix, I've done the following in the past days:
>
> - disabled tarballs for everyone with 403
> - enabled IP based rate limits on everyone
>
> * heavy projects like luci.git, packages.git and openwrt.git
>
> - after 5r/m additional requests are delayed up to 15r/m, then 429 sorry
>
> * other requests 15r/m, delayed after the 8th request, up to 30r/m, then 429 sorry
>
> Seems to work, VPS can manage the load, no git fetch issues on buildbots, thus
> we can focus on the long term solution:
>
> A. outsource Git operations
>
> - this is the git.builds.openwrt.org option explained above, hence the
> following (shortened) diff
>
> --- a/feeds.conf.default
> +++ b/feeds.conf.default
> -src-git packages https://git.openwrt.org/feed/packages.git
> -src-git luci https://git.openwrt.org/project/luci.git
> -src-git routing https://git.openwrt.org/feed/routing.git
> -src-git telephony https://git.openwrt.org/feed/telephony.git
> +src-git packages https://git.builds.openwrt.org/feed/packages.git
> +src-git luci https://git.builds.openwrt.org/project/luci.git
> +src-git routing https://git.builds.openwrt.org/feed/routing.git
> +src-git telephony https://git.builds.openwrt.org/feed/telephony.git
>
> --- a/include/download.mk
> +++ b/include/download.mk
> -PROJECT_GIT = https://git.openwrt.org
> +PROJECT_GIT = https://git.builds.openwrt.org
>
> --- a/package/boot/uboot-bcm4908/Makefile
> +++ b/package/boot/uboot-bcm4908/Makefile
> -PKG_SOURCE_URL:=https://git.openwrt.org/project/bcm63xx/u-boot.git
> +PKG_SOURCE_URL:=https://git.builds.openwrt.org/project/bcm63xx/u-boot.git
>
> - the other option is to keep using git.openwrt.org and handle this via HTTP
> redirects, which should probably work as well
>
That would also avoid people trying to get priority by changing the URLs.
> B. improve scripts/feeds
>
> - add some kind of --retry backoff mechanism to Git operations
> - add a fallback list of additional Git repository mirrors; if one fails, use another
> etc.
>
Can't these be set to be shallow clones by default? In most cases, you
only need the latest HEAD to build against.
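Combined with the retry and fallback ideas above, it could look roughly like
the sketch below. scripts/feeds is Perl, so this Python is only meant to show
the logic; fetch_feed, run_git and their parameters are names I made up, and
the only git feature assumed is "git clone --depth 1".

```python
import subprocess
import time


def run_git(args):
    """Run git with the given arguments; return True on success."""
    return subprocess.run(["git", *args]).returncode == 0


def fetch_feed(urls, dest, retries=3, base_delay=2.0,
               run=run_git, sleep=time.sleep):
    """Shallow-clone from the first mirror that works.

    Each mirror is tried up to `retries` times with exponential backoff
    before moving on to the next one. Returns the URL that succeeded, or
    None if every mirror failed.
    """
    for url in urls:
        for attempt in range(retries):
            # --depth 1 fetches only the latest commit instead of full history.
            if run(["clone", "--depth", "1", url, dest]):
                return url
            # Back off before retrying the same mirror: base_delay, then
            # twice that, and so on. No wait after the last attempt.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

For example, fetch_feed(["https://git.openwrt.org/feed/packages.git",
"https://github.com/openwrt/packages"], "feeds/packages") would try the
first URL three times before falling back to the second. A feed pinned to a
specific commit would need more than --depth 1, so that case is not covered
here.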
> C. upgrade the box
>
> - this means $$ which IMO would be better spent on funding/improving projects like
> codeberg.org or sourcehut.org
>
> Cheers,
>
> Petr
>