incubator-general mailing list archives

From Marko Rodriguez
Subject Apache Metrics, Not Apache Humans
Date Sun, 15 Nov 2015 18:20:02 GMT

I was talking with Daniel Gruno and wrote the following ideas to him. Note that these are
just ideas. They are not based on any immediate issue or concern, but on a more general concern
about how Apache should evolve.

Apache should NOT use a binary "podling" / "top-level" model. All projects should simply have
a "health score," and that health score should be derived from measurables. Because of Apache
Infrastructure's centralized server model (email lists, version control, distributions,
homepages, etc.), it has the ability to gather metrics such as, for example, the distribution
of pushes to the repository, the branching factor of the mailing list, the centrality of the
project in the Maven Central repository dependency graph, the number of non-sequiturs (dead-end
conversations) in the email chain, the length of discussions in JIRA, etc. Which metrics are
important? Who cares -- just make up things to glean from the wealth of information you already
have access to. Watch...
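
To make a few of the metrics above concrete, here is a minimal sketch. Everything in it is
an assumption for illustration: the function names are hypothetical, the inputs are assumed
to be a plain list of author names (one per push) and a dict mapping each message id to its
parent id (None for a thread root), and "branch factor" is read as the average number of
direct replies per replied-to message. It is not how Apache Infrastructure actually stores
or exposes this data.

```python
import math
from collections import Counter


def push_evenness(authors):
    """Normalized Shannon entropy of pushes per author.

    0.0 means one person does all the pushing; 1.0 means pushes are
    spread evenly across all authors. `authors` has one entry per push.
    """
    counts = Counter(authors)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def branch_factor(parent_of):
    """Average number of direct replies per message that got any reply.

    `parent_of` maps message id -> parent message id (None for a root).
    """
    replies = Counter(p for p in parent_of.values() if p is not None)
    return sum(replies.values()) / len(replies) if replies else 0.0


def dead_end_ratio(parent_of):
    """Fraction of thread roots that never received a reply."""
    roots = [m for m, p in parent_of.items() if p is None]
    if not roots:
        return 0.0
    replied_to = {p for p in parent_of.values() if p is not None}
    return sum(1 for r in roots if r not in replied_to) / len(roots)


# Hypothetical toy data, made up for this example.
pushes = ["alice", "alice", "bob", "carol"]
thread = {"m1": None, "m2": "m1", "m3": "m1", "m4": "m2", "m5": None}
metrics = [push_evenness(pushes), branch_factor(thread), dead_end_ratio(thread)]
```

Each project would end up with one such vector of numbers. Which numbers go into it matters
less than having many of them.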

Next, the Apache members subjectively say which projects they think are "good" (healthy).
This can even be a global vote including everyone in the world, and it should be dynamic, as
projects evolve over time. Either way, let's say the ranking says Apache Hadoop, Apache
Solr, Apache Commons, etc. are the (collectively, subjectively) "best" Apache projects. Now,
there should exist a multi-dimensional projection of the aforementioned gleaned statistics
that will have Hadoop, Solr, Commons, etc. close to one another in metric-space (clustered).
Likewise, low-ranking projects should be close to one another in this space and far from Hadoop,
Solr, Commons, etc. Find that projection and that is your "healthy metric space."
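
One very simple way to sketch "find that projection" is shown below. It swaps in a plain
centroid-difference linear projection: standardize each metric, take the centroid of the
projects voted "good" and the centroid of the rest, and score every project by where it
falls along the line between the two. This is only one possible choice (a real attempt
might use a proper discriminant analysis or clustering method). The function names and the
toy numbers are hypothetical, and it assumes at least one project on each side of the vote.

```python
import statistics


def standardize(rows):
    """Rescale each metric (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead of 0.
    spreads = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, spreads)] for row in rows]


def centroid(rows):
    """Component-wise mean of a list of vectors."""
    return [statistics.fmean(c) for c in zip(*rows)]


def health_axis(vectors, voted_good):
    """Return (axis, midpoint) separating the 'good' centroid from the rest."""
    good = centroid([v for v, g in zip(vectors, voted_good) if g])
    rest = centroid([v for v, g in zip(vectors, voted_good) if not g])
    axis = [a - b for a, b in zip(good, rest)]
    midpoint = [(a + b) / 2 for a, b in zip(good, rest)]
    return axis, midpoint


def health_score(vector, axis, midpoint):
    """Signed position along the axis: positive leans toward the 'good' cluster."""
    return sum(a * (x - m) for a, x, m in zip(axis, vector, midpoint))


# Hypothetical metric vectors for four made-up projects (not real data).
raw = [[0.9, 3.0], [0.8, 2.5], [0.2, 1.0], [0.1, 1.2]]
votes = [True, True, False, False]  # the subjective "good" ranking

z = standardize(raw)
axis, midpoint = health_axis(z, votes)
scores = [health_score(v, axis, midpoint) for v in z]
```

A project that was never voted on still gets a score, since its metrics can be projected onto
the same axis. Rerunning the vote and the fit over time keeps the score dynamic.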

From here, all Apache projects have computed "healthy" score(s), and when users go to download,
let's say, Lucene, they go: "Cool. This is a healthy project." (It has a HEALTH.txt file distributed
with it, let's say.) What that means is that Lucene, at that release, was in the "healthy" cluster
of the metric space. This model has various benefits:

	1. There is no need to have philosophical arguments (not grounded in measurables) about what
rules a project should follow (bounded by law).
		- Perhaps a project that is exclusive, but is X, is still in the "healthy" subspace.
		- Perhaps having bad documentation is "unhealthy" even though Apache doesn't care about
		- Perhaps too much discussion causes a project to become "unhealthy."
		- Perhaps … who knows? … let the statistics do the talking.
		- Apache becomes a breeding ground for different models of open source (bounded by law),
not just "The Apache Way."
			- And these models are measurable! Let us study the act of open source.
	2. "Top-level" projects can fall from grace.
		- Currently, all "top-level" projects are "equal." This should be dynamic, as the mighty
do fall.
		- It is possible for what are now "podlings" to be "healthy" as they simply are coming into
			- "The student is the master."
		- Hadoop 1.2.1 might be the healthiest version of Hadoop (as I tend to believe). "Hadoop"
is not a thing eternal.
	3. Less work for people.
		- No more VOTEing on graduation.
		- No more amorphous aesthetic arguments about "The Apache Way."
		- No more long-winded, contradictory documentation about how things should be done (bounded
by law).

The Apache Way should be about metrics, not about philosophy, as different paths lead to the
same mountain top <--- See! Is that random Buddhist saying, which everyone just "believes,"
even true? :) Get the human out of the loop!

Thanks for reading,

P.S. The same should hold true for educational degrees. I graduate and now forever I'm an
expert in computers? Medical doctors too! A 90-year-old doctor can do surgery on me?!?!…
Binary graduation is not "real." Metrics, metrics, metrics --- we live in a world where this
is possible. For every "thing," good comes and goes, up and down…