incubator-general mailing list archives

From David Nalley <>
Subject Re: Podling request: Gerrit
Date Wed, 15 Jul 2015 03:56:54 GMT
On Tue, Jul 14, 2015 at 8:08 PM, Till Westmann <> wrote:
> On 14 Jul 2015, at 15:31, David Nalley wrote:
>> On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <> wrote:
>>> We use Gerrit as
>>> a tool to do code reviews and to organize the commits, as well as to
>>> facilitate easy testing. However that's all it's used for- we still
>>> clone from repositories that come downstream from ASF, not the other
>>> way around. I'd be interested to understand how this would be
>>> considered any different than what is done with Github Pull Requests.
>> So GH PR have a subtle distinction (at least in the way that they are
>> handled at the ASF). Projects can't merge pull requests into the repo
>> at github. Non-committers see a workflow that is the Github workflow,
>> because that's very familiar, and lowers the barrier to contribution.
>> Committers, however, have a very different workflow than the folks who
>> typically review and close pull requests on github. They have to take
>> the patch [1], and merge it into the canonical repository at the ASF,
>> which then appears in the github repository because of the mirror
>> process.  This stops the problem of diverging codebases that you are
>> currently experiencing, calls to rewrite history to align the ASF repo
>> with the external repo, etc.
> As Ian indicated AsterixDB's process also requires manual interaction of
> a committer. The current steps are now documented on the website [2].

So, that's marginally better than some previous examples of similar behavior.
But I think there are still multiple problems, and I'll try and be
more explicit about them:

1. People are not clearly contributing to Apache AsterixDB when
submitting a patch via Gerrit at Think about Section 5 of
2. The ASF has no record of any contributions that are happening on
the Gerrit instance at UCI, until a committer decides to push code to
the ASF repo. And from a provenance perspective, we have no records of
submission of contributions at all.
3. Discussion and code review are happening at UCI, within their Gerrit
instance; there is no record of those discussions at the ASF. (With
reviews.a.o, Jira, and GH Pull Requests, all of that information gets
copied to one of the project's mailing lists for posterity.)
4. And this is the real issue for me. Gerrit is possessive of git
repos it manages by nature; it needs and wants control. The very
nature of Gerrit demands that it be the canonical repo. We can play
word games and say that it isn't, or that the repo of record that
releases are produced from is the ASF repo, but there are a number of
realities that reflect that it isn't. First, when the mirroring goes
wrong, the initial call is to rewrite history on the ASF repo [3].
This suggests to me that the gerrit repo is the de facto repo for the
project. Second, Gerrit is where everything is really happening:
contributions, code review, testing (from a Jenkins instance at UCI).

>> There are some other problems, that aren't necessarily as worrisome,
>> but should be something to consider. First, you're relying on a third
>> party to provide that resource. That's not inherently a problem, but
>> we have a number of examples of projects using external tools and
>> those being shut down or phased out which causes tremendous disruption
>> to projects. It's also at the old project's home, which might cause
>> some folks to question whether the project is truly independent, or
>> not.
> In my view Gerrit is "just" a tool that the AsterixDB community chose
> to keep when starting the incubation process. It is non-essential and
> has been used by developers from different organizations before the
> incubation started. But I think that its use was and is very beneficial
> to the project.
> When we started incubation it seemed to us, that keeping the existing
> tool would be a good idea as it
> a) allows for a smoother transition and
> b) would not put additional requirements on the ASF infrastructure.

I personally like Gerrit. I think it's probably one of the more robust
review tools in existence, and it's certainly the most extensible
based on what I've seen. That said, its use in this case is not
without problems.

> However, I do agree that a shut down of the service (which seems very
> unlikely at the current point in time) could be a disruption to the
> project.

We would have said the same thing about Codehaus not too many years ago.

> So it might be better to run this tool on the ASF
> infrastructure.
> Should we pursue this?

We've explored Gerrit 2-3 times in the past 24 months. We have seen
several projects request it over the years. As I've mentioned
elsewhere in this thread, our most recent exploration was in December,
and there are a number of issues that would make an ASF-wide instance
of Gerrit impractically costly to deploy. I also think that, due to
the provenance requirements that come with version control as I
understand them, as well as some of the other issues that would come
into play, infrastructure would not permit a project-specific
instance of Gerrit to be run on ASF infrastructure.

> Or is it acceptable to keep the tool on external hardware for now?
> Or do you see fundamental issues with AsterixDB's use of Gerrit?

I do not think it's acceptable to use the tool on external hardware. I
don't see inherent issues with the tool itself, but I also don't think
it's pragmatic to have it running internally. I know that's a bad
position that seems to be inflexible for the project itself, but with
around 200 active projects a bit of flexibility is assumed to be lost.


>> [1]
> [2]

