incubator-general mailing list archives

From Romain Manni-Bucau <rmannibu...@gmail.com>
Subject Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator
Date Mon, 03 Nov 2014 20:22:50 GMT
2014-11-03 20:04 GMT+00:00 Jean-Louis MONTEIRO <jeanouii@gmail.com>:
> BTW, wondering how to get the Apache Sirona community involved or if there
> is a possible common road where the 2 projects could join.
>

+1. Having a single solution would be awesome, and Sirona is just
starting to get there, so the community will be happy to get help on
this part.

> 2014-11-03 20:27 GMT+01:00 Roman Shaposhnik <rvs@apache.org>:
>
>> Hi!
>>
>> Thanks for the positive feedback and volunteering. I think the
>> more mentors the merrier -- all the folks who volunteered
>> please add your names to the wiki.
>>
>> As for the name, personally, I really like Distrace. That said,
>> I'd leave this bikeshed to be painted for later ;-)
>>
>> Andrew, great point on the wording: I'll update the proposal.
>>
>> Finally, since I'm currently on vacation, I'll let this thread
>> go for a little longer and will start the official VOTE in a few
>> days.
>>
>> Thanks,
>> Roman.
>>
>> On Fri, Oct 31, 2014 at 4:06 PM, Roman Shaposhnik <rvs@apache.org> wrote:
>> > Hi!
>> >
>> > I would like to propose HTrace to be considered for
>> > Apache Incubator. The proposal is attached and
>> > is also available on the wiki:
>> >     https://wiki.apache.org/incubator/HTraceProposal
>> >
>> > Please let me know what you think, and also
>> > don't hesitate to massage the proposal on the wiki
>> > based on the feedback from this thread.
>> >
>> > Thanks,
>> > Roman.
>> >
>> > == Abstract ==
>> > HTrace is a tracing framework intended for use with distributed
>> > systems written in Java.
>> >
>> > == Proposal ==
>> > HTrace is an aid for understanding system behavior and for reasoning
>> > about performance issues in distributed systems. HTrace is primarily
>> > a low-impedance library that a Java distributed system can
>> > incorporate to generate ‘breadcrumbs’ or ‘traces’ along the path of
>> > execution, even as it crosses processes and machines. HTrace also
>> > includes various tools and glue for collecting, processing and
>> > ‘visualizing’ captured execution traces for ex post facto analysis
>> > of where time was spent and what resources were consumed.
>> >
>> > == Background ==
>> > Distributed systems are made up of multiple software components
>> > running on multiple computers connected by networks. Debugging or
>> > profiling operations run over non-trivial distributed systems --
>> > figuring out execution paths and what services, machines, and
>> > libraries participated in the processing of a request -- can be
>> > involved.
>> >
>> > == Rationale ==
>> > Rather than have each distributed system build its own custom
>> > ‘tracing’ libraries, ideally all would use a single project that
>> > provides the necessary primitives and saves each project from
>> > building its own visualizations and processing tools anew.
>> >
>> > Google described “...[a] large-scale distributed systems tracing
>> > infrastructure” in Dapper, a Large-Scale Distributed Systems Tracing
>> > Infrastructure. The paper tells a compelling story of what is
>> > possible when disparate systems standardize on a single tracing
>> > library and cooperate, ‘passing the baton’, filling out trace
>> > context as executions cross systems.
>> >
>> > HTrace aims to provide a rough equivalent in open source of the
>> > described core Dapper tools and library.  As it is adopted by more
>> > projects, there will be a ‘network effect’ as HTrace will provide a
>> > more comprehensive view of activity on the cluster.  For example, as
>> > HDFS gets HTrace support, we can connect this with the HTrace
>> > support in HBase to follow HBase requests as they enter HDFS.
>> >
>> > Given the success of HTrace depends on its being integrated by many
>> > projects, HTrace should be perceived as unhampered, free of any
>> > commercial, political, or legal ‘taint’. Being an Apache project
>> > would help in this regard.
>> >
>> > == Initial Goals ==
>> > HTrace is a small project of narrow scope but with a grand vision:
>> >   * Move the HTrace source and repository to Apache, a vendor-neutral
>> > location. Currently HTrace resides at a Cloudera-hosted repository.
>> >   * Add past contributors as committers and institute Apache governance.
>> >   * Evangelize and encourage HTrace diffusion. Initially we will
>> > continue a focus on the Hadoop space since that is where most of the
>> > initial contributors work and it is where HTrace has been initially
>> > deployed.
>> >   * Build out the standalone visualization tool that ships with
>> > HTrace.
>> >   * Build more community and add more committers.
>> >
>> > == Current Status ==
>> > Currently HTrace has a viable Java trace library that can be
>> > interpolated to create ‘traces’.  The work that needs to be done on
>> > this library is mostly bug fixes, ease-of-use improvements, and
>> > performance tweaks.  In the future, we may add libraries for other
>> > languages besides Java.
>> >
>> > HTrace has means of dumping traces to the filesystem, Twitter’s
>> > Zipkin (a tracing sink and visualization system developed by
>> > Twitter, https://github.com/twitter/zipkin), or Apache HBase.
>> > Executions can be viewed either in Zipkin or in pygraph
>> > (https://code.google.com/p/python-graph/).
>> >
>> > Since the initial sprint in the summer of 2012, which saw HTrace
>> > patches proposed for Apache HDFS and committed to Apache HBase,
>> > development has been sporadic; mostly a single developer or two
>> > adding a feature or fixing a bug. HTrace is currently undergoing a
>> > new “spurt” of development with the effort to get HTrace added to
>> > Apache HDFS revived and a new standalone viewing facility being
>> > added in to HTrace itself.
>> >
>> > HTrace has been integrated by Apache Phoenix.
>> >
>> >
>> > === Meritocracy ===
>> > HTrace, up to now, has been run by Apache committers and PMC
>> > members. We want to build out a diverse developer and user community
>> > and run the HTrace project in the Apache way.  Users and new
>> > contributors will be treated with respect and welcomed; they will
>> > earn merit in the project by tendering quality patches and support
>> > that move the project forward.  Those with a proven support and
>> > quality patch track record will be encouraged to become committers.
>> >
>> > === Community ===
>> > There are just a few developers involved at the moment. If our
>> > project is accepted by the Incubator, building community would be a
>> > primary initial goal.
>> >
>> > === Core Developers ===
>> >
>> > Core developers include Apache members and members of the Hadoop and
>> > HBase PMCs.
>> > Of those listed, all have contributed to HTrace. Half are from Cloudera.
>> > The remainder are Hortonworks, NTTData, Google, and Facebook employees.
>> >
>> > === Alignment ===
>> > HTrace has been integrated into Apache HBase and Apache Phoenix.
>> > Integration into Apache HDFS is currently being worked on.
>> > Approaching the Apache YARN project would be a likely next
>> > integration.
>> >
>> >
>> > == Known Risks ==
>> > As noted above, development has been sporadic up to now.  It may
>> > continue so.
>> >
>> > HTrace is not the primary focus of any of the current list of
>> > contributors. It is for all a side effort.  HTrace may lack
>> > sufficient impetus with such a state of affairs.
>> >
>> > For HTrace to tell a compelling story, it needs to be taken up by
>> > the significant projects that make up a traced distributed system.
>> > For example, say YARN and HBase take on HTrace but HDFS does not;
>> > then the HDFS portions of an end-to-end operation will render as
>> > opaque, compromising our ability to tell a good story around an
>> > execution. Because the picture painted has gaps, HTrace may be left
>> > aside as ineffective.
>> >
>> > === Orphaned products ===
>> > The proposers have a vested interest in making HTrace succeed,
>> > driving its development and its insertion into projects we all work
>> > on. Its dispersion will shine light on difficult-to-understand
>> > interactions amongst the various systems we all work on. A working,
>> > integrated HTrace will add a useful debugging mechanism to the
>> > Apache projects we all work on.
>> >
>> >
>> > === Inexperience with Open Source ===
>> > The majority of the proposers here have day jobs that have them
>> > working near full-time on (Apache) open source projects. A few of us
>> > have helped carry other projects through the Incubator.  HTrace to
>> > date has been developed as an open source project.
>> >
>> > === Homogenous Developers ===
>> > The initial group of committers is small but already we have a healthy
>> > diversity of participating companies.  We are bay-area challenged but
>> > a Japanese contributor makes for a good counter balance.
>> >
>> > === Reliance on Salaried Developers ===
>> > Most of the contributors are paid to work in the Hadoop ecosystem.
>> > While we might wander from our current employers, we probably won’t
>> > go far from the Hadoop tree.  Whoever the Hadoop employer, it is
>> > plain a successful HTrace project is in everyone’s interest.
>> > At least one of the developers has already changed employers but
>> > his interest in seeing HTrace succeed prevails.
>> >
>> > === Relationships with Other Apache Products ===
>> > For HTrace to succeed, it is critical we build good relations with
>> > other distributed systems projects.  We intend to initially build
>> > on relations we already have in place, mostly in the Hadoop space.
>> >
>> > The HTrace project has been incorporated by Apache HBase and
>> > Apache Phoenix. It is currently being actively integrated into
>> > Apache HDFS.
>> >
>> > We do not know of any equivalent or near-equivalent project
>> > in the Apache space.
>> >
>> > The Dapper paper notes precedent, in particular, the Berkeley
>> > Rad Lab X-Trace project.
>> >
>> > ==== How HTrace relates to Zipkin ====
>> > Zipkin is an Apache Licensed project from Twitter. It is a complete
>> > tracing tool with trace collectors, trace viewers and tools to help
>> > you generate traces. It is written in Scala.  If your project is
>> > not Scala or if it is Java and you cannot afford a Scala dependency,
>> > at a minimum, you need an alternate means of generating traces.
>> > HTrace provides this facility for Java as well as bridging tools
>> > to feed traces to Zipkin for query and display.
>> >
>> > The projects complement each other.
>> >
>> > === An Excessive Fascination with the Apache Brand ===
>> > While we intend to leverage the Apache ‘branding’ when talking to other
>> > projects as testament of our project’s ‘neutrality’, we have no plans
>> > for making use of the Apache brand in press releases nor posting
>> > billboards advertising acceptance of HTrace into Apache Incubator.
>> >
>> >
>> > == Documentation ==
>> > See [[http://htrace.org|htrace.org]] for the current state of the HTrace
>> > project and documentation.
>> >
>> > How to enable tracing in
>> > [[http://hbase.apache.org/book/tracing.html|HBase using HTrace]]
>> > Elliott Clark on
>> > [[http://files.meetup.com/1350427/HBase%20Meetup%20-%20Zipkin.pptx|tracing in HBase]]
>> >
>> > == Initial Source ==
>> > Jonathan Leavitt and Todd Lipcon built the first versions of HTrace
>> > in the summer of 2012.  Jonathan was Todd’s summer intern at
>> > Cloudera.
>> >
>> >
>> > == Source and Intellectual Property Submission Plan ==
>> > We know of no legal encumbrances in the way of transfer of source
>> > to Apache.
>> >
>> > == External Dependencies ==
>> > HTrace includes third-party libs. These include guava, jetty, junit,
>> > protobuf, hbase, and thrift.  All dependencies are Apache licensed
>> > or carry licenses that are palatable: e.g. junit is EPL (Eclipse
>> > Public License v1.0) and ProtoBufs are BSD licensed.
>> >
>> > == Cryptography ==
>> > N/A
>> >
>> > == Required Resources ==
>> >
>> > === Mailing lists ===
>> >   * private@htrace.incubator.apache.org (moderated subscriptions)
>> >   * commits@htrace.incubator.apache.org
>> >   * dev@htrace.incubator.apache.org
>> >   * issues@htrace.incubator.apache.org
>> >   * user@htrace.incubator.apache.org
>> >
>> > === Git Repository ===
>> > https://git-wip-us.apache.org/repos/asf/incubator-htrace.git
>> >
>> > === Issue Tracking ===
>> > JIRA HTrace (HTRACE)
>> >
>> > === Other Resources ===
>> > Means of setting up regular builds for htrace on builds.apache.org
>> >
>> > == Initial Committers ==
>> >   * Colin McCabe (cmccabe@apache.org)
>> >   * Elliott Clark (eclark@apache.org)
>> >   * Jonathan Leavitt (jon.s.leavitt@gmail.com) -- CLA being submitted
>> >   * Masatake Iwasaki (iwasakims@gmail.com) -- CLA being submitted
>> >   * Michael Stack (stack@apache.org)
>> >   * Nick Dimiduk (ndimiduk@apache.org)
>> >   * Todd Lipcon (todd@apache.org)
>> >
>> >
>> > == Affiliations ==
>> >   * Colin McCabe - Cloudera
>> >   * Elliott Clark - Facebook
>> >   * Jonathan Leavitt - Google
>> >   * Masatake Iwasaki - NTTData
>> >   * Michael Stack - Cloudera
>> >   * Nick Dimiduk - Hortonworks
>> >   * Todd Lipcon - Cloudera
>> >
>> > == Sponsors ==
>> >
>> > === Champion ===
>> > Roman Shaposhnik
>> >
>> > === Nominated Mentors ===
>> >   * Michael Stack - Apache Member
>> >   * Todd Lipcon - Apache Member
>> >
>> > We will be soliciting more mentors as part of the proposal process.
>> >
>> > === Sponsoring Entity ===
>> > We would like to propose the Apache Incubator as the sponsor of
>> > this project.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>>
>
>
> --
> Jean-Louis
