incubator-general mailing list archives

From Matei Zaharia
Subject Re: [VOTE] Graduation of Apache Spark from the Incubator
Date Wed, 05 Feb 2014 18:33:48 GMT
Hi Craig,

On Feb 5, 2014, at 9:05 AM, Craig L Russell wrote:

>> Hi Craig,
>> Thanks for the list, I’m following up with these folks to get them accounts. I
>> think some people filed an ICLA but never received an account and were thus never added to
>> the repo.
> This is a significant failure of the leadership of this project to request accounts.

> The project status on says that "all
> active committers have submitted a contributors agreement" as of a week ago. The project started
> seven months ago. Setting up the project, filing ICLAs, and getting accounts for committers
> is supposed to be part of the initial activities, not a graduation exercise.

There was definitely a failure here in setting up their expectations and following up to make
sure everyone got an account. All the committers who sent an ICLA requested an account name,
but I at least wasn’t clear what the process is for getting one (I assumed that secretary@
does that, since as a non-IPMC member, I can’t create accounts). We also didn’t tell the
new committers to ask if they haven’t received an account in X time, so because people were
immediately added on the private list they might’ve thought everything is under way. But
we can definitely take a much more proactive stance on this as a TLP. I would both 1) tell
them to expect an account within a week and 2) make sure the PMC leads this process for new
committers.

As you must’ve seen from our discussion with secretary@, all the proposed committers actually
had submitted ICLAs, and the only one you hadn’t received one from was Kay Ousterhout due
to spam filtering. She was only added in December. If you look at the VOTE threads or GitHub
review activity, all the added committers were highly active members in terms of both new
contributions and reviewing before being proposed as committers.

>> A couple of questions:
>> - What do you mean by “does not appear to be a committer”? That they weren’t
>> added to the repo?
> They were not given credentials to commit to the repo.
>> All of these individuals have contributed code, but it was merged by someone else.
> This is a major issue. At Apache, committers update the repo with their own code. Occasionally,
> they commit code on behalf of others but this should be a rare exception, such as a person
> from the outside contributing a patch or two.
> If committers on the project are routinely committing patches on behalf of other active
> members of the project, there is something fundamentally wrong with the leadership of the
> project.

What I meant by this is that all code is reviewed by another committer and merged by them.
Different projects operate differently, but I believe this is a very normal way to operate.
I’ve been a committer on Apache Hadoop, one of the most active Apache projects, since 2009,
and nearly all the patches I sent there were reviewed and merged by someone else.

If you look at the GitHub code reviews,
you’ll see that lots of people are contributing to reviewing. But I agree that the new committer
onboarding process should include having them do a test commit.

> Perhaps it is a feature of using git that it's so easy to write code, create a pull request,
> and have someone else do the "easy" job of merging.

That might definitely be part of it. GitHub not only makes it easy to send patches but also
makes it easy for the reviewer to say “this patch does not merge cleanly”, or “does
not pass unit tests”, so a lot of the merging happens at review time. The actual merging
is a couple of shell commands. You can look at some of the pull requests if you’re worried
about people’s activity; discussion is extremely active and many of our proposed committers
were reviewing patches before being proposed.

>> - Andrew Xia is listed as having an ICLA on file here:
> Yes, I missed this. Andrew's public name is different.
> There are some folks on the proposed PMC list who do not appear to have been active on
> the mail lists, which are the life blood of a project.

It’s true that some members have been less active in the past six months, but keep in mind
that the project existed for 3.5 years before joining the Incubator, and was highly active
even then. All the initial committers made major contributions during these first 3.5 years,
and as Chris said, our philosophy was to recognize them for their contributions and give them
the ability to participate in the project once it moved to Apache. All of the initial committers
agreed to being made committers (I asked them before the proposal). If we end up with a problem
of inactive committers in the future, we may consider some kind of emeritus status, but I’d
personally feel uncomfortable asking anyone to drop their committer status due to the past
six months when they’ve been participating in the project for 2+ years.

> I'm still -1 on this project graduating without demonstrated understanding of how Apache
> projects work.

If you still have specific concerns after reading the discussion above and looking at some
of the reviewing process, let me know. As Chris said, I believe we operated the project very
closely to the Apache spirit even before submitting it to the incubator. Some specific examples:

- Out of the proposed committers, myself, Patrick Wendell, Tom Graves, Bobby Evans, Thomas
Dudziak, and Andy Konwinski were already committers on other Apache projects before joining
Spark. (I might be missing others; these are the ones I have off the top of my head.)

- All activity and review has always been done electronically through email and GitHub. Even
for people who sat in the same room (the Berkeley contributors), we did all reviews online.
Contributors from throughout the community routinely participate in these reviews.

- Our review process closely matches Apache Hadoop’s, which most of us were familiar with.

- We chose the initial committer set to include a wide range of organizations, and have continued
to do so in the new committers added since incubation.

- We’ve done not just minor tweak releases, but two major releases, our largest to date,
through the Apache process. In these releases, the VOTE process worked as intended, often
finding bugs and packaging issues, fixing the artifacts, etc.

In any case, we appreciate that you are checking these things thoroughly, and we realize that
from the infrastructure point of view we haven’t done a great job onboarding people. But
I believe we can easily start doing this as a TLP, perhaps more easily now that more members
of the project will be able to carry on these tasks.
