Switch issues and CI to GitHub

Paul Spooren mail at aparcar.org
Thu Jan 20 01:42:15 PST 2022

Hi Sam,

> A big thank you for doing this.

Half the fun.

> Must confess: I was unaware of the ~16k issue body character limit when
> I proposed SourceHut.  Did you find a public bug report or feature
> request about that?  (I looked just now.  Could not find one myself, but
> perhaps my search-fu is off today.)

I discussed this with Drew (the SourceHut developer); the limit exists for email compatibility.

>> A quick bug tracker conclusion, I'd be happy to use codeberg.org for
>> issue tracking. Both sr.ht and codeberg.org are FOSS, GitHub not so
>> much. [..]
> GitHub not at all, last time I checked.

Some of their frameworks, actions, etc. are open source, but their core isn’t, right.

>> As an immediate action, we might as well close down bugs.openwrt.org
>> and open issues on GitHub.com without any migration of existing
>> issues. Both users and developers already know the workflows over
>> there and issues have a higher visibility. A migration away from
>> GitHub over to codeberg or sr.ht is possible with much less effort
>> than migrating away from flyspray.
> I wish to caution against this.
> Here are some reasons not to use GitHub for hosting issue/bug-reports.
> Best practice for handling user account deletions is to either:
> 1.  If the user is happy for a record of their contributions to remain
>    attributed to them:
>    Leave the username shown unchanged in the remaining webpages where
>    it was used, so that at-mentions ("@username") within discussions
>    still work (aren't broken), and quotations remain correctly
>    attributed ("username commented MMM DD, YYYY").
>    Or...
> 2.  If the user is *not* happy for that:
>    Replace all instances of the username (at-mentions, quotation
>    attributions, etc) with a non-personally-identifying pseudonym, e.g.
>    "user12345".
>    This, too, retains comprehensibility and avoids link-breakage.
> GitHub does neither.
> Instead, GitHub replaces *some but not all* instances of a deleted
> user's username with "Ghost".  That can make it difficult to follow a
> discussion (bug report, pull request, etc) featuring a now-deleted user.
> See e.g. https://github.com/GothenburgBitFactory/taskwarrior/issues/2088
> .  If you didn't know that the comments therein that are now attributed
> to "Ghost" were in fact made by me, it would be a confusing discussion
> to follow.
> (I later closed my GitHub account due to the increasing accessibility
> problems I encountered on GitHub.)
> That would be bad enough.  But because *every* deleted user account is
> processed this way by GitHub, it effectively conflates *all* deleted
> users into one confusing account.  For instance, the "Ghost" account
> here is *not* me: https://github.com/matrix-org/synapse/issues/5778  .
> But a third party would be unable to know that.
> This is especially problematic if more than one now-deleted user
> contributed to a single discussion.  Both users' posts would now be
> attributed - by GitHub's incompetence - to the same user, making it look
> as though one, rather than several, people made those comments.  (I
> don't have an example at hand, but I'd be amazed if this hasn't happened
> several times now, given GitHub's size.)

I think they are in a bit of a pickle there. If you delete everything, a lot of issues lose comments and stop making sense. If you rename the user account “aparcar” to a random string like “mystery-blob-64”, other users can still “recreate” the deleted user's behavior by specifically looking for that _new_ name. Their solution seems to combine “anonymity” with usability (i.e. not ruining issue discussions entirely).
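
To make the trade-off concrete, here is a minimal Python sketch of the two relabeling strategies. It is only an illustration under my own assumptions: the function names, the `userN` alias scheme and the sample data are hypothetical, and this is not how GitHub actually implements account deletion. It also only relabels comment authors; at-mentions inside the comment text are not handled.

```python
def per_user_pseudonyms(comments, deleted):
    """Give each deleted author a distinct, stable alias (hypothetical scheme)."""
    aliases = {}
    result = []
    for author, text in comments:
        if author in deleted:
            # First deleted author seen becomes "user1", the next "user2", ...
            author = aliases.setdefault(author, f"user{len(aliases) + 1}")
        result.append((author, text))
    return result


def shared_ghost(comments, deleted):
    """Relabel every deleted author with the same shared name."""
    return [("ghost" if author in deleted else author, text)
            for author, text in comments]


# Hypothetical discussion: alice and bob later delete their accounts.
comments = [("alice", "a"), ("bob", "b"), ("alice", "c"), ("carol", "d")]
deleted = {"alice", "bob"}

distinct = per_user_pseudonyms(comments, deleted)
merged = shared_ghost(comments, deleted)
```

With per-user aliases the thread stays readable, but all of a deleted user's comments remain linkable through the alias. With the shared name nothing is linkable, but alice and bob look like one person.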

> Worse still, because GitHub is proprietary and doesn't have a good way
> for users to report GitHub bugs or submit patches to fix them, bugs like
> this tend to go undiscussed and unfixed for years, leading to
> progressive corruption in GitHub discussions.

They have a forum[0] and a “Discussion” thing[1].

[0]: https://github.community
[1]: https://github.com/github/feedback/discussions

> There is no way within GitHub to avoid irrelevant search results.  For
> instance, if I search in the TaskWarrior repo for
>    is:issue in:title "TW-10"
> I get results like "[TW-1733] taskwarrior 2.5.0 can not compile FreeBSD
> 10.1", because they have a "TW" and a "10" in the title.  In other
> words, GitHub fails to perform exact string matching.
> Try it yourself:
> https://github.com/GothenburgBitFactory/taskwarrior/issues?utf8=%E2%9C%93&q=is%3Aissue+in%3Atitle+%22TW-10%22
> This makes GitHub's search feature a real pain to use.
> Again, because GitHub is proprietary and lacks good ways to track or fix
> GitHub bugs, ones like this go unfixed for years.

This critique has come up multiple times. Are you aware of a better search implementation? I’d be keen to find something better. In my experience, bugs.openwrt.org (i.e. Flyspray) doesn’t do a much better job here.
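
For what it's worth, the behavior described above is what you would expect from a search that splits titles into tokens. The Python sketch below contrasts that with exact matching. It is an assumption about the general technique, not a description of GitHub's actual search backend, and the function names are made up for illustration.

```python
import re


def tokens(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text.lower()) if t]


def token_match(query, title):
    """Match if every token of the query occurs somewhere among the title's tokens."""
    title_tokens = set(tokens(title))
    return all(t in title_tokens for t in tokens(query))


def exact_match(query, title):
    """Match only if the query appears verbatim, not embedded in a longer word or number."""
    pattern = r"(?<![0-9A-Za-z])" + re.escape(query) + r"(?![0-9A-Za-z])"
    return re.search(pattern, title, re.IGNORECASE) is not None


title = "[TW-1733] taskwarrior 2.5.0 can not compile FreeBSD 10.1"
print(token_match("TW-10", title))  # True: "tw" and "10" both occur as tokens
print(exact_match("TW-10", title))  # False: the string "TW-10" is not in the title
```

A tracker that wants exact matching for ticket IDs would need something like the second function, at least for quoted queries.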

> As previously discussed, e.g.:
> https://lists.openwrt.org/pipermail/openwrt-devel/2022-January/037546.html
> Understand that moving OpenWRT's issue-hosting to GitHub would make it
> impossible for some users to subscribe to OpenWRT's bug tracker to
> receive bug reports by email.

I’m not familiar with Internet connectivity in Syria, Crimea and North Korea. Do you know if sr.ht and codeberg.org are reachable from there?

If a user _watches_ a GitHub project, every new issue and response is sent via email. Is that an option?

Since I plan to have issue backups anyway, I could publish them somewhere.

> Also remember, Microsoft is a key player in the surveillance-industrial
> complex:
> https://www.theguardian.com/technology/2016/may/02/google-microsoft-pact-antitrust-surveillance-capitalism
> https://theintercept.com/2020/07/14/microsoft-police-state-mass-surveillance-facial-recognition/
> Sure, comments on OpenWRT issues might be public, but do you really want
> OpenWRT users giving Microsoft their browser fingerprints or IP
> addresses in order to participate?

If you use their CLI (`gh`), you can also avoid using a browser.

> (You might say: users can work around this by using Tor.  But can they?
> What if they live in jurisdictions where Tor usage would get them
> flagged by law enforcement?  What if GitHub blocks sign-ups from Tor?
> And do you really want a situation where people have to weigh up their
> threat models and what steps to take to protect themselves from OpenWRT
> infrastructure because it has been outsourced to a malevolent entity?)

Do mirrors of source, mirrors of issues and email lists solve this problem? If so, we don’t plan to shut down that infrastructure. There are currently no plans to have GitHub as our only (or main) Git host.

> If OpenWRT were, as you said, to "open issues on GitHub.com without any
> migration of existing issues", then this could lead to broken links in
> OpenWRT commit messages, bug reports, and comments.
> One reason for this is that the issue numbering on GitHub might not
> remain coordinated with the issue numbering on bugs.openwrt.org .  For
> instance, there might end up being two bug reports with the same number.
> That would be like the ambiguity of "#4206", which could currently
> (sadly, as a result of OpenWRT allowing pull requests on GitHub) refer
> to either

It’s not possible to disable PRs on GitHub, and a massive chunk of contributions come via GitHub; I’d estimate more than from the mailing list at this point.

> https://bugs.openwrt.org/index.php?do=details&task_id=4206
> or
> https://github.com/openwrt/openwrt/pull/4206
> but worse.

Yes, that’s indeed a _flaw_ of distributing services across different entities. When people refer to Flyspray in commit messages, they usually prefix the reference with `FS`. SourceHut requires users to post the full address, which seems the most stable (though least convenient) method[3].

[3]: https://man.sr.ht/git.sr.ht/#referencing-tickets-in-git-commit-messages
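
To illustrate why the prefix matters, here is a rough Python sketch that resolves ticket references found in a commit message. The two URL templates are taken from the links quoted above. The `FS#<number>` spelling, the regular expression and the function name are assumptions of mine for the sake of the example, not an existing OpenWrt tool.

```python
import re

FLYSPRAY_URL = "https://bugs.openwrt.org/index.php?do=details&task_id={}"
GITHUB_URL = "https://github.com/openwrt/openwrt/pull/{}"

# Matches "FS#4206" or a bare "#4206" that is not glued to a preceding word or path.
REF = re.compile(r"(?<![\w/])(FS)?#(\d+)")


def resolve_refs(message):
    """Return (reference, candidate URLs) pairs for each ticket reference found."""
    refs = []
    for m in REF.finditer(message):
        number = int(m.group(2))
        if m.group(1):
            # Prefixed reference: unambiguously a Flyspray task.
            candidates = [FLYSPRAY_URL.format(number)]
        else:
            # Bare number: could be either tracker, so report both.
            candidates = [FLYSPRAY_URL.format(number), GITHUB_URL.format(number)]
        refs.append((m.group(0), candidates))
    return refs


refs = resolve_refs("Fixes FS#4206, see also #4206")
```

The prefixed reference resolves to one URL, the bare one to two candidates. A full address, as SourceHut requires, would not need resolving at all.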

> And even if OpenWRT resists, for now, the *migration* of issues to
> GitHub, every additional endorsement of GitHub by OpenWRT sadly
> increases the likelihood that some future OpenWRT dev/maintainer will
> attempt such a migration in future - probably in ignorance of these
> problems.
> I have seen projects mess up such migrations quite badly, in ways that
> have knock-on effects for years to come.  For instance TaskWarrior,
> whose devs/maintainers did not notice that quite a lot of data
> corruption and link-breakage had occurred during the migration, until it
> was too late to correct because on GitHub, people had already started to
> refer to issue numbers that should properly have been reserved for
> existing issues.
> As a result, the many references from one bug report or pull requests to
> another (e.g. "Fixes #XX" or "See #YY", that sort of thing) and that
> were silently auto-linked by GitHub to the wrong bug report or pull
> request, could no longer be fixed without extensive effort (more than
> could be spared - and so IIUC the issue still persists).  See e.g.
> https://github.com/GothenburgBitFactory/taskwarrior/issues/2088 .
> This is a subtle and insidious kind of corruption that GitHub makes it
> hard to avoid.

There are no plans to delete the existing tickets created on bugs.openwrt.org. Migrating tickets over to GitHub (if possible) could include prefixing them with `FS: 123` to make them findable via the search function.
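
A small Python sketch of what such a prefix could look like. Only the `FS: 123` form comes from the paragraph above. The helper names and the exact title layout are hypothetical, and the lookup runs over a local list of titles (for example from an issue backup) rather than through GitHub's search.

```python
import re


def migrated_title(task_id, title):
    """Build the title of a migrated ticket, prefixed with its Flyspray ID."""
    return f"FS: {task_id} {title}"


def find_by_flyspray_id(titles, task_id):
    """Return the titles whose prefix carries exactly this Flyspray ID."""
    # \b keeps "FS: 123" from also matching "FS: 1234".
    pattern = re.compile(rf"^FS: {task_id}\b")
    return [t for t in titles if pattern.match(t)]


# Hypothetical ticket titles, for illustration only.
titles = [
    migrated_title(123, "WiFi drops after upgrade"),
    migrated_title(1234, "LED config ignored"),
]
found = find_by_flyspray_id(titles, 123)
```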

>> [..]
>> ## Conclusion
>> From a FOSS perspective I'd skip GitHub entirely and move to Codeberg
>> or sr.ht. Codeberg (Gitea) is a fine clone of GitHub and sr.ht comes
>> with a great _no bloat_ attitude and priority on email integration for
>> tickets and git (they created git-send-email.io).
> Yes.  Those are both great options.

Agree, minus the mentioned limitations.

>> [OpenWRT] community repositories are on GitHub, people are actively
>> and happily contributing there
> Except for the people who aren't.

Correct, and I’m happy to find ways to support those as well.

>> and mostly think about "how to make OpenWrt better" and less "how to
>> improve our workflow and infrastructure".
> Those are not contradictory goals!

Correct. However, if you’re excited about debugging WiFi drivers or understanding vendor firmware hacks, finding the best FOSS Git host can be out of your scope. Obviously it doesn’t have to be, but I understand people being tired of maintaining things they don’t like to maintain.

If nobody steps up to offer and maintain (!) a fitting solution, we have to outsource.

