lucenenet-dev mailing list archives

From Andy Pook <andy.p...@gmail.com>
Subject Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
Date Fri, 30 Dec 2011 14:10:16 GMT
Thought I'd add my opinion as a user of Lucene.net...

My company processes content from several feeds (mainly web but also social
media). The volumes are fairly large (hundreds of millions of documents). The
results are stored in Lucene indexes.

Points of interest:
 - We use 2.9.4g compiled against .NET 4.0
 - We add our own tokens in parallel with the word tokens
 - We have our own parser (using Irony) so that we can extend the syntax
(related to the extra tokens)
 - We have created a wrapper to abstract/hide most of the Lucene API
   - maps to and from POCO objects
   - exposes IEnumerable<TPoco>

So much for the background.
I agree with much of what is being said here. Particularly, let's make a
choice and stop wasting the little resources the project has.

I don't care about Java, so I don't care that the API changes.
I do want 100% index compatibility.
I would like things like name capitalization, IDisposable, IEnumerable etc.
Though I think that g adds a little confusion by also using other
collection types, such as ICollection<T> and List<T>, unnecessarily.
I'd like to get to Lucene 4 as soon as possible. The NRT and Codec bits
would solve a lot of the issues we spend a lot of time on. So being able to
catch up to Java, currently at 3.5, is high on my list.
I like Troy's list of "I wants". I'd like all of that too. The question is,
how?

I think the ancient argument about "line by line" or "transliteration" or
"let's change the API" or "complete rewrite" can't be easily resolved
because it obscures two separate things.
 1. How do we port changes in Java to the .NET version?
 2. Most (all?) of us don't like the Java API.

I don't think the project will survive if this cannot be resolved. It
barely survived the last time. But I also don't believe that the project
can achieve a full rewrite (such as Lucere started). At least, not yet. It
would take too much and is too easily divisive.

I would like to see something like "line by line with formally defined
mutations".

I want to see Lucene.Net on par with Java Lucene and not continually a year
or more behind. I think the only way to get there is to adhere, mostly, to
the same basic code and structures as Java.

However, I also think that there should be a set of agreed and documented
"mutations" that are applied (both retrofitted to existing code and applied
to newly migrated changes). For example:
 - Method names are capitalized according to .NET conventions
 - A class implementing Close should have IDisposable added according to an
agreed template/style
 - A collection class should implement IEnumerable<T> or ICollection<T> or
IList<T> depending on agreed criteria and according to an agreed
template/style
 - Convert to enum
 - Convert to Func<T>/Action<T>
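
To make two of these mutations concrete (Close to IDisposable, and a
Java-style cursor to IEnumerable<T>), here is a rough C# sketch. It is only
an illustration under my own assumptions: LegacyCursor and CursorSequence
are hypothetical names made up for this example, not Lucene.Net types, and
the real template would be whatever the committers agree on.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a ported, Java-style cursor.
// This is NOT a real Lucene.Net class; it only mimics the Next()/Close() shape.
public sealed class LegacyCursor
{
    private readonly string[] _items;
    private int _pos = -1;

    public bool IsClosed { get; private set; }

    public LegacyCursor(params string[] items) { _items = items; }

    public bool Next() { return ++_pos < _items.Length; } // advance, false at end
    public string Current() { return _items[_pos]; }      // value at the cursor
    public void Close() { IsClosed = true; }               // release resources
}

// Sketch of the "mutations": Close() becomes Dispose(), and the cursor is
// exposed as IEnumerable<string>. Because it wraps a forward-only cursor,
// the sequence can be enumerated only once.
public sealed class CursorSequence : IEnumerable<string>, IDisposable
{
    private readonly LegacyCursor _inner;

    public CursorSequence(LegacyCursor inner) { _inner = inner; }

    public IEnumerator<string> GetEnumerator()
    {
        while (_inner.Next())
            yield return _inner.Current();
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public void Dispose() { _inner.Close(); }
}

public static class Program
{
    public static void Main()
    {
        var cursor = new LegacyCursor("alpha", "beta");
        using (var terms = new CursorSequence(cursor))
        {
            foreach (var term in terms)
                Console.WriteLine(term);
        }
        Console.WriteLine(cursor.IsClosed); // the using block called Close()
    }
}
```

The ported cursor stays untouched, so a change on the Java side can still be
applied line by line, while callers get foreach and using. Whether a mutation
like this is done as a wrapper or applied directly to the ported class is
exactly the sort of thing the agreed template would need to decide.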

I'm sure there are many more. All of the above need to be formalized into
guidelines/criteria/templates/styles and agreed by the core committers. The
experience of the g branch should provide a solid start.

Please use this or something like this so that we can accelerate towards
parity with Java Lucene.
As the project progresses more mutations can be added as they become
apparent.
This doesn't mean that other refactoring cannot be done but I would hope
that these can be discussed as mini-projects instead of opening this box
yet again.

My company is in the process of expanding its dev team significantly so
I'm hoping that I will be able to devote some time to help.

Regards,
  Andy

On 30 December 2011 04:55, Christopher Currens <currens.chris@gmail.com> wrote:

> If we could find a reasonable solution and get people to commit to it, I
> could see it being done, in fact, I'd like to see it done.  If we had more
> developers and more time to work on it, it would be awesome to see that
> kind of response and progress on the project.  I would hope their
> enthusiasm wouldn't fade with time; that it wouldn't just be a boost of
> energy at the beginning of the (welcome) change and then fizzle out if it
> got to be too much for people.  I will admit, I'm sure it's not surprising,
> but there are a lot of annoyances about Lucene.NET that could be done away
> with, if we did a re-write of the library.
>
> Anyway, I really have no idea who uses Lucene.NET besides those on this
> project, and StackOverflow.  I would hope we could get opinions and advice
> from those other users as well.
>
>
> Thanks,
> Christopher
>
> On Thu, Dec 29, 2011 at 8:47 PM, Troy Howard <thoward37@gmail.com> wrote:
>
> > Chris,
> >
> > Regarding release schedule and the amount of work to accomplish
> > porting... What if we had 20 developers working on the project?
> >
> > It's likely that by changing what we're doing, we'll attract more
> > people to work on the project and thus these concerns (which are
> > perfectly valid concerns if you're attempting to port the entire
> > library as a one or two person effort, as George, DIGY and you have
> > done) will no longer be relevant.
> >
> > If the *volume* of work is a problem, then a reasonable solution is to
> > scale up the quantity of devs on the project and get organized enough
> > to keep them all productive. I'm certain that moving away from the
> > line-by-line port will cause more developers to be interested in
> > working on it. No one wants to do that kind of code and then have to
> > look at the product of their work and still be annoyed by the API.
> > This is essentially *exactly* where things were a year ago when the
> > project was about to die and this is why.
> >
> > When I started up the Lucere project in response to these problems, I
> > was flooded with devs offering to help, to the extent that I couldn't
> > keep up with it and had to just start telling people we didn't need
> > any more help. There is no reason that the same thing couldn't happen
> > here.
> >
> > Thanks,
> > Troy
> >
> >
> > On Thu, Dec 29, 2011 at 8:27 PM, Christopher Currens
> > <currens.chris@gmail.com> wrote:
> > > Unfortunately, my desires for this project seem to change with the
> > > progress that we make on it.  What I mean is that what I want right
> > > now will likely be different from what I will want once we've released
> > > a few more versions.  What I KNOW I want right now:
> > >
> > > I want the line-by-line port to continue, but with respect to the API,
> > > I want things that "just don't make sense(tm)" in .NET to change.  By
> > > that I mean removing Close() and properly implementing the IDisposable
> > > pattern.  Also, the Java iterator has a perfect .NET analog,
> > > IEnumerable.  The code can essentially stay the same, but it enables
> > > real usage in .NET.  Fortunately, a great deal of 3.0.3 has already
> > > been moved over to generics, so I'm actually less concerned with that.
> > > I want .NET-style naming, and I want CLS compatibility where possible,
> > > at least allowing for use in case-insensitive languages.
> > >
> > > When the project started, I didn't want a line-by-line port to
> > > continue, but once I had touched every single part of this codebase, I
> > > understood how large this project is.  I've realized that, with the
> > > amount of time that everyone has been able to put into this project, I
> > > can't see a .NET version being made until it's up to date with Java.
> > > Maybe I'm being pessimistic, maybe I'm not.  I'm not trying to call
> > > anyone out or blame anyone, we all have other jobs, but given the
> > > amount of time that can be spent vs. the amount of work a .NET-centric
> > > rewrite would take, it just doesn't seem possible, considering the
> > > goals mentioned of trying to keep up with Lucene.
> > >
> > > I think it would be more likely that a goal like that would succeed if
> > > the codebase were caught up with Java, and as the .netification was
> > > being done, any features, bugfixes, changes, or whatever would be
> > > immediately obvious.  I care very much about the index formats being
> > > the same, as well as the query syntax, and think a search done in Java
> > > against an index should behave the exact same way in .NET.  I'm afraid
> > > that the amount of effort it would take to do it now, when we're
> > > already behind, would cause the project to end up stagnating, like it
> > > did before, which I'm committed to not letting happen.
> > >
> > > That being said, if everyone else disagrees with me, that is
> > > absolutely fine.  If no one would be against it, I would ask that I
> > > could work on a line-by-line port on my own, in a separate branch, if
> > > no one else wanted to.  For me, I think people want a) performance and
> > > b) the latest version with bug fixes.  That's what I DO want out of
> > > this project, since we're using it in ours.  I don't have exact
> > > benchmarks, but the performance of the 3.0.3 branch is much better
> > > than 2.9.4, from indexing to searching.  I should also mention that
> > > 2.9.4 introduced a memory leak that was not present in 2.9.2, and in
> > > 3.0.3, this memory leak no longer exists.  It is also more memory
> > > friendly.
> > >
> > > I don't think that the line-by-line port hurts the performance of
> > > Lucene.NET as much as it does annoy the crap out of everyone.  I'm
> > > seriously annoyed that I can't enumerate over Terms or TermDocs
> > > without that terrible Next() and HasNext() crap.  That's not to say
> > > that moving from a line-by-line port won't increase performance, of
> > > course. :)  I'm definitely not against changing the API to facilitate
> > > a more .NET experience.
> > >
> > > That's what I want.  If no one else wants it, that's okay.  If we
> > > decide not to have the line-by-line port as an official part of
> > > Lucene.NET, that's fine too; I'm not going to stop working on it, and
> > > I would likely wind up working on it outside this project.  However, I
> > > think it's valuable to have as an official part of our releases, and I
> > > like to think it's something the community wants, since I believe it
> > > would allow a faster release schedule than our own interpretation of a
> > > Lucene library.
> > >
> > >
> > > Thanks,
> > > Christopher
> > >
> > >
> > > On Thu, Dec 29, 2011 at 7:42 PM, Troy Howard <thoward37@gmail.com> wrote:
> > >
> > >> I completely agree with Michael's mentality on thinking of this as a
> > >> business and coming from the perspective of "what will wow our
> > >> customers" ...
> > >>
> > >> I also completely agree with Prescott, that we just need to get down
> > >> to brass tacks and say what we want it to be specifically and
> > >> subjectively.
> > >>
> > >> Here's what I want:
> > >>
> > >> * I want an extremely modern .NET API full of injection points where I
> > >> can pass lambdas, use IEnumerable<T> and all that System.Linq
> > >> provides, interfaces for everything, as well as excellent unit test
> > >> coverage.
> > >> * I want to write *very* minimal code to accomplish basic tasks
> > >> * I want an average .NET developer to be able to intuitively understand
> > >> and use the library with Intellisense as their only documentation
> > >> * I want performance that meets or exceeds Java Lucene
> > >> * I want no memory leaks
> > >> * I want no "surprises" in general
> > >> * I want minimal I/O
> > >> * I want any execution that can be deferred or optimized out to be
> > >> deferred or optimized out
> > >> * I want any data that could be large in size to be streamable
> > >> * I want no pointless unavoidable limitations on scale... and I want
> > >> to be able to horizontally distribute searching and indexing with ease
> > >> * I want every feature that Java Lucene's latest version has (and then
> > >> some)
> > >> * I want the index formats to be compatible with every other "Lucene"
> > >> out there in whatever language and I want the query language to work
> > >> identically across all of them.. That is to say given query Text "X"
> > >> and index "Y" you will always get result set "Z" from every
> > >> implementation of Lucene. Because when I have to get to my data via
> > >> Python, Java, C++, Ruby or whatever, I want everything to just work.
> > >> * I want to know which clauses in my query caused the result hit and
> > >> to what degree, without having to incur a huge performance hit
> > >> * I want real-time updates without having to do a little dance and
> > >> wave my hands to get it to work
> > >> * I want to get a new major version of the library roughly once or
> > >> twice a year and I want to be very impressed by the features in the
> > >> new version. I want bug fixes rolled out on a quarterly basis (at
> > >> minimum) between those major versions.
> > >> * I want to be able to trace or step-debug the execution of a search
> > >> or indexing process and not think "WTF" constantly. Some of that code
> > >> is extremely obtuse.
> > >> * I want the query parser to be generated from a PEG grammar so that I
> > >> can easily generate one in other languages
> > >>
> > >> ... and much much more. I didn't even get into things like being able
> > >> to create custom indexes that use something other than strings for the
> > >> inversion product, decorating my POCO's properties with attributes
> > >> like [Field("Description")] and just saying "Store", better query
> > >> expansion, and blah blah blah.  :)
> > >>
> > >> And I agree with Prescott on this one: I don't care *at all* about
> > >> Java, other than porting code out of it so that it can run on .NET. I
> > >> hate Java, but I love a lot of the libraries written in it. I feel
> > >> that the JVM is an inferior runtime to the CLR and the Java language
> > >> is like C#'s retarded cousin. I'll gladly write a new book on the new
> > >> API and publish it for free online, so people don't have to read
> > >> "Lucene in Action" to learn Lucene.Net. I'll gladly spend the time it
> > >> takes to understand a changeset from the Java project, then mentally
> > >> model what they were trying to accomplish by it, and then re-engineer
> > >> the change to apply to our library.
> > >>
> > >> Basically, I don't want to limit the project to a line-by-line port at
> > >> all. I also don't want to piss people off and destroy the project in
> > >> the process. Soo... I'm flexible as well. :)
> > >>
> > >> Thanks,
> > >> Troy
> > >>
> > >>
> > >> On Thu, Dec 29, 2011 at 6:18 PM, Prescott Nasser <geobmx540@hotmail.com> wrote:
> > >> >
> > >> > Someone has to take a stand and call out what they prefer - rather
> > >> > than shooting down all the alternatives, we need to start voicing
> > >> > our opinions of which direction we need to go. I'll get us started:
> > >> > I want to see something that is more .NET like, I want to see
> > >> > something that can run on the phone, xbox, pc, mono, etc. I want to
> > >> > use the latest and greatest .NET has to offer.  I do care that we
> > >> > keep the index files 100% compatible. I also care that we try to
> > >> > keep up with Java in feature set and extras (contribs). I couldn't
> > >> > care less about keeping the API in line with Java.
> > >> >
> > >> > I don't really care about the line by line - but others in the past
> > >> > have said they did. My energy isn't really behind keeping that in
> > >> > line but I'll help maintain it if that is what the community really
> > >> > wants. But I agree with Troy - there are lots of options if you want
> > >> > the Java Lucene available in .NET.
> > >> >
> > >> > That's my feeling - but at the same time, I realize we are a small
> > >> > community, and if we don't really agree on what we want to do, then
> > >> > we are SOL - I'm FLEXIBLE if others really want something or feel we
> > >> > should do something.  ~P
> > >> >
> > >> >> Date: Thu, 29 Dec 2011 20:51:09 -0500
> > >> >> From: mherndon@wickedsoftware.net
> > >> >> To: lucene-net-dev@lucene.apache.org
> > >> >> Subject: Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >>
> > >> >> Might I suggest that we all approach this as business owners,
> > >> >> community builders, startup entrepreneurs instead of developers for
> > >> >> a second.
> > >> >>
> > >> >> You have limited resources: time, budget, personnel, etc.
> > >> >>
> > >> >> What are our two biggest metrics of success for this product?
> > >> >>
> > >> >> My guess is adoption and customer involvement (contributing patches,
> > >> >> tutorials, tweets, etc).  Most likely both of those are going to be
> > >> >> carried by .NET developers as your inside promoters of Lucene.NET.
> > >> >>
> > >> >> So what is going to wow them? Bring them the most value?  What can
> > >> >> we provide so that it makes their job easier, more cost effective,
> > >> >> and lets them get home faster to their lives or significant other?
> > >> >> What is a breakout niche that Lucene.Net could have over
> > >> >> Solr/Lucene?
> > >> >>
> > >> >> What is going to make an average developer more willing to grow the
> > >> >> community and contribute?  What would encourage them to give up
> > >> >> their free time to do so?
> > >> >>
> > >> >> I would approach the answer from this angle rather than continue to
> > >> >> talk about it from a developer/committer perspective as we keep
> > >> >> going in circles. You're not going to be able to please everyone, so
> > >> >> let's figure out what is going to deliver the most value to .NET
> > >> >> developers and go from there.
> > >> >>
> > >> >> - michael
> > >> >>
> > >> >> On Thu, Dec 29, 2011 at 8:13 PM, Rory Plaire <codekaizen@gmail.com> wrote:
> > >> >>
> > >> >> > The other option for people not wanting a line-by-line port is to
> > >> >> > just stick with whichever was the last version that had a
> > >> >> > line-by-line transliteration done to it. This is done in a number
> > >> >> > of projects where new versions break compatibility. 2.9.4 is
> > >> >> > certainly a nice release...
> > >> >> >
> > >> >> > -r
> > >> >> >
> > >> >> > On Thu, Dec 29, 2011 at 4:32 PM, Troy Howard <thoward37@gmail.com> wrote:
> > >> >> >
> > >> >> > > Thinking about it, I should make myself more clear regarding
> > >> >> > > why I brought up IKVM again, just so no one gets the wrong idea
> > >> >> > > about my intentions there...
> > >> >> > >
> > >> >> > > I only mentioned it as a justification for dropping line-by-line
> > >> >> > > compatibility and as an alternative for people who really care
> > >> >> > > about that. As we discussed previously, IKVMed Lucene is not
> > >> >> > > Lucene.Net in a lot of important material ways. We are already
> > >> >> > > deviating significantly from Java Lucene even with the "mostly
> > >> >> > > line by line" approach. Compare Lucene.Net 2.9.4 and IKVMed Java
> > >> >> > > Lucene 2.9.4. They are very different user experiences on a lot
> > >> >> > > of levels (licensing, packaging, data types used, etc).
> > >> >> > >
> > >> >> > > But it's a *reasonable alternative* when a high degree of
> > >> >> > > consistency with Java Lucene is important to the end user, and
> > >> >> > > by pointing to IKVM as our answer to those users, we are free to
> > >> >> > > move forward without that concern.
> > >> >> > >
> > >> >> > > That means, supposing we move away from Java significantly, a
> > >> >> > > new end user looking to employ Lucene in their .NET product can
> > >> >> > > choose between IKVM Lucene (identical API to Java, can use the
> > >> >> > > latest Java build, performs well, may have some problems with
> > >> >> > > licensing and packaging) and Lucene.Net (different API but
> > >> >> > > hopefully one that is more palatable to .NET users so it'd be
> > >> >> > > easy to learn, performs better than IKVM, but has a dev cycle
> > >> >> > > that lags behind Java, possibly by a lot).
> > >> >> > >
> > >> >> > > Existing users who like Lucene.Net as it is now may feel
> > >> >> > > alienated because they would be forced to choose between
> > >> >> > > learning the new API and dealing with a slow dev cycle, or
> > >> >> > > adapting to IKVM, which could be very difficult or impossible
> > >> >> > > for them. Either one would require a code change. But of course,
> > >> >> > > we run this risk with any change we make to what we are doing. I
> > >> >> > > think a greater risk is that the project lacks direction.
> > >> >> > >
> > >> >> > > Anyway, it's just one idea/talking point towards the end goal of
> > >> >> > > getting the general topic off the table completely.
> > >> >> > >
> > >> >> > > Thanks,
> > >> >> > > Troy
> > >> >> > >
> > >> >> > >
> > >> >> > > On Thu, Dec 29, 2011 at 3:32 PM, Troy Howard <thoward37@gmail.com> wrote:
> > >> >> > > > Apologies upfront: another long email.
> > >> >> > > >
> > >> >> > > > My most firm opinion on this topic is that, as a community, we
> > >> >> > > > spend too much time on this discussion. We should just simply
> > >> >> > > > commit to one or the other path, or both, or some middle
> > >> >> > > > ground, or just commit to not discussing it anymore and go
> > >> >> > > > with "whatever code gets written and works is what we use" and
> > >> >> > > > leave it up to the discretion of the coder who is actually
> > >> >> > > > spending time improving the product. Obviously the last option
> > >> >> > > > is the worst of them.
> > >> >> > > >
> > >> >> > > > My view of our current roadmap is/was:
> > >> >> > > >
> > >> >> > > > 1. We'd maintain basic line-by-line consistency through the
> > >> >> > > > 2.x releases. But 3.x and beyond were open to changing the API
> > >> >> > > > significantly. We are committed to changing the API and
> > >> >> > > > internal implementations in order to improve performance and
> > >> >> > > > developer experience on .NET, but haven't yet made a plan for
> > >> >> > > > that (e.g., no spec for a new API).
> > >> >> > > >
> > >> >> > > > 2. We'd try to automate the porting process so that it was
> > >> >> > > > repeatable and easy to keep up with (or at least easier) and
> > >> >> > > > maintain a line-by-line port in a branch. That means the .NET
> > >> >> > > > version would ultimately be a very different product than the
> > >> >> > > > line-by-line port, and we'd be creating two separate but
> > >> >> > > > related products that, where possible, share code between
> > >> >> > > > them. Patching the line-by-line product from Java would be
> > >> >> > > > easier and faster than patching the .NET product, and so they
> > >> >> > > > may end up with different release schedules.
> > >> >> > > >
> > >> >> > > > It seems that effort on improving automation of the port has
> > >> >> > > > tapered off. As anyone who has done any of the porting from
> > >> >> > > > commit patches from Java knows, a good portion of that work
> > >> >> > > > can be automated with find/replace, but substantial portions
> > >> >> > > > and certain scenarios in the current code definitely cannot be
> > >> >> > > > and probably will never be able to be fully automated.
> > >> >> > > >
> > >> >> > > > While I have been advocating "doing both" and trying to find a
> > >> >> > > > strategy that makes sense for that, another option is to just
> > >> >> > > > officially drop any concern for line-by-line consistency with
> > >> >> > > > Java. A justification for that is simple: IKVM provides this
> > >> >> > > > already. The licensing allows use in commercial apps and its
> > >> >> > > > performance is close to the same, so, AFAIK, it's a viable
> > >> >> > > > replacement for a line-by-line version of Lucene.Net in just
> > >> >> > > > about any context as long as no one is modifying IKVM itself.
> > >> >> > > > I don't think it's unreasonable to suggest to people who want
> > >> >> > > > a line-by-line version to use IKVM instead of Lucene.Net.
> > >> >> > > >
> > >> >> > > > So, if we use that perspective and say that the need for a
> > >> >> > > > .NET-usable line-by-line version of Lucene is already met by
> > >> >> > > > IKVM, why would we bother hand-coding another one? It makes
> > >> >> > > > more sense to focus our valuable hand-coding work on making
> > >> >> > > > something that *improves* upon the .NET development
> > >> >> > > > experience. It may cause us to be slow to release, but for
> > >> >> > > > good reason.
> > >> >> > > >
> > >> >> > > > So it seems to me we have the following primary agenda items
> > >> >> > > > to deal with:
> > >> >> > > >
> > >> >> > > > 1. Make an official decision regarding line-by-line porting,
> > >> >> > > > publish it and document our reasoning, so that we can end the
> > >> >> > > > ambiguity and circular discussions
> > >> >> > > > 2. If line-by-line porting is still part of our plan after we
> > >> >> > > > accomplish Agenda Item #1, resume work on improving automation
> > >> >> > > > of porting, creating scripts/tools/etc and document the
> > >> >> > > > process
> > >> >> > > > 3. If having a different API for .NET is still part of our
> > >> >> > > > plan after we accomplish Agenda Item #1, spec those API
> > >> >> > > > changes and associated internal changes required and publish
> > >> >> > > > the spec
> > >> >> > > >
> > >> >> > > > And to drive home the point I made in my first sentence: If we
> > >> >> > > > had already accomplished those three agenda items, the time I
> > >> >> > > > just spent typing this email could have been spent working on
> > >> >> > > > Lucene.Net. We need to get to that point if we want to
> > >> >> > > > maintain any kind of development velocity.
> > >> >> > > >
> > >> >> > > > Thanks,
> > >> >> > > > Troy
> > >> >> > > >
> > >> >> > > >
> > >> >> > > > On Thu, Dec 29, 2011 at 2:38 PM, Prescott Nasser <geobmx540@hotmail.com> wrote:
> > >> >> > > >> I don't think at the end of the day we want to make just
> > >> >> > > >> cosmetic changes. We also have the issue of same name
> > >> >> > > >> different casing which needs to be fixed - but it's not clear
> > >> >> > > >> how to manage that without some large adjustments to the API.
> > >> >> > > >>
> > >> >> > > >>
> > >> >> > > >>
> > >> >> > > >> Sent from my Windows Phone
> > >> >> > > >> ________________________________
> > >> >> > > >> From: Troy Howard
> > >> >> > > >> Sent: 12/29/2011 2:19 PM
> > >> >> > > >> To: lucene-net-dev@lucene.apache.org
> > >> >> > > >> Subject: Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>
> > >> >> > > >> My vote goes to merging the two:
> > >> >> > > >>
> > >> >> > > >> Apply the same concepts from 2.9.4g to 3.X development, using
> > >> >> > > >> generics where possible, Disposable vs Close, and exposing
> > >> >> > > >> *additional* APIs for generics (but leaving the existing old
> > >> >> > > >> ones) to enable the underlying performance improvements the
> > >> >> > > >> generics offer. Also, expose IEnumerable<T> implementations vs
> > >> >> > > >> Java style enumerables/iterators.
> > >> >> > > >>
> > >> >> > > >> If we are only adding to the existing and making relatively
> > >> >> > > >> minor changes to enable generics, updating/maintenance should
> > >> >> > > >> be relatively easy and it won't break anyone's code.
> > >> >> > > >>
> > >> >> > > >> Thanks,
> > >> >> > > >> Troy
> > >> >> > > >>
> > >> >> > > >>
> > >> >> > > >> On Thu, Dec 29, 2011 at 2:08 PM, Prescott Nasser <geobmx540@hotmail.com> wrote:
> > >> >> > > >>> I agree it's a matter of taste. I'd vote to continue with g and
> > >> >> > > >>> evolve it to where we want a .NET version to be. What do others
> > >> >> > > >>> think?
> > >> >> > > >>>
> > >> >> > > >>> Sent from my Windows Phone
> > >> >> > > >>> ________________________________
> > >> >> > > >>> From: Digy
> > >> >> > > >>> Sent: 12/29/2011 1:16 PM
> > >> >> > > >>> To: lucene-net-dev@lucene.apache.org
> > >> >> > > >>> Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>>
> > >> >> > > >>> When I started that "g" branch, I had no intention to change the
> > >> >> > > >>> API, but at the end it resulted in a few changes
> > >> >> > > >>> like StopAnalyzer(List<string> stopWords),
> > >> >> > > >>> Query.ExtractTerms(ICollection<string>) etc.
> > >> >> > > >>> But I think a drop-in replacement will work for most of the
> > >> >> > > >>> Lucene.Net users (of course, some contribs have also been
> > >> >> > > >>> modified accordingly).
> > >> >> > > >>>
> > >> >> > > >>> Replacing ArrayLists/collections with generic counterparts,
> > >> >> > > >>> GetEnumerator's with foreach, AnonymousClass's with Func<> or
> > >> >> > > >>> Action<>'s, and fixing LUCENENET-172 are things most people
> > >> >> > > >>> would not notice.
> > >> >> > > >>>
> > >> >> > > >>> This "g" version also includes some other patches that were
> > >> >> > > >>> fixed for Lucene >= 3.1 (Which ones? I would have to go back
> > >> >> > > >>> through my commits.)
> > >> >> > > >>>
> > >> >> > > >>> So, there isn't much change in the API, more changes for
> > >> >> > > >>> developers, and more stable code (at least I think so, since I
> > >> >> > > >>> have used this "g" version in a production env. for months
> > >> >> > > >>> without any problem. In short, 2.9.4g is a superset of 2.9.4 at
> > >> >> > > >>> the bugfix level).
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>> As a result, creating a new branch for a .NET-friendly
> > >> >> > > >>> Lucene.Net or continuing on this branch is just a matter of
> > >> >> > > >>> taste.
> > >> >> > > >>>
> > >> >> > > >>> DIGY
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>> -----Original Message-----
> > >> >> > > >>> From: Scott Lombard [mailto:lombardenator@gmail.com]
> > >> >> > > >>> Sent: Thursday, December 29, 2011 5:05 PM
> > >> >> > > >>> To: lucene-net-dev@lucene.apache.org
> > >> >> > > >>> Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>> I don't see the g branch differing all that much from the
> > >> >> > > >>> line-by-line port. All the g branch does is change some data
> > >> >> > > >>> types to generics, but line by line the code is the same once
> > >> >> > > >>> the generics are declared.
> > >> >> > > >>>
> > >> >> > > >>> I don't see 2.9.4g being any closer to a .NET style version than
> > >> >> > > >>> 2.9.4. While it does use generics for list style variable types,
> > >> >> > > >>> the underlying classes are still the same and all of the
> > >> >> > > >>> problems with 2.9.4 not being .NET enough would be true in
> > >> >> > > >>> 2.9.4g.
> > >> >> > > >>>
> > >> >> > > >>> I would have to defer to Digy on whether it changes how an end
> > >> >> > > >>> user interacts with Lucene.NET.  If it does not affect how the
> > >> >> > > >>> end user interacts with Lucene.NET then I think we should merge
> > >> >> > > >>> it into the Trunk and go from there on 3.0.3.
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>> Scott
> > >> >> > > >>>
> > >> >> > > >>>
> > >> >> > > >>>> -----Original Message-----
> > >> >> > > >>>> From: Prescott Nasser [mailto:geobmx540@hotmail.com]
> > >> >> > > >>>> Sent: Wednesday, December 28, 2011 8:28 PM
> > >> >> > > >>>> To: lucene-net-dev@lucene.apache.org
> > >> >> > > >>>> Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>>>
> > >> >> > > >>>>
> > >> >> > > >>>> Any reason we can't continue this g branch and make it more
> > >> >> > > >>>> and more .net like? I was thinking about what we've expressed
> > >> >> > > >>>> as goals - we want a line by line port - it's easy to
> > >> >> > > >>>> maintain parity with java and easy to compare. We also want a
> > >> >> > > >>>> more .NET version - the g branch gets this started - although
> > >> >> > > >>>> it's not as .Net as people want (I think).
> > >> >> > > >>>>
> > >> >> > > >>>>
> > >> >> > > >>>>
> > >> >> > > >>>> What if we used the g branch as our .Net version and
> > >> >> > > >>>> continued to make it more .Net like? and kept the trunk as
> > >> >> > > >>>> the line by line? The G branch seems like a good start to the
> > >> >> > > >>>> more .Net version anyway - we might as well build off of that?
> > >> >> > > >>>>
> > >> >> > > >>>> ----------------------------------------
> > >> >> > > >>>> > From: digydigy@gmail.com
> > >> >> > > >>>> > To: lucene-net-dev@lucene.apache.org
> > >> >> > > >>>> > Date: Thu, 29 Dec 2011 02:45:23 +0200
> > >> >> > > >>>> > Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>>> >
> > >> >> > > >>>> > > but I guess the future of 2.9.4g depends on the extent that it is
> > >> >> > > >>>> > > becoming more .NET like
> > >> >> > > >>>> >
> > >> >> > > >>>> > My intention while I was creating that branch was just to make 2.9.4 a
> > >> >> > > >>>> > little bit more .Net like(+ maybe some performance).
> > >> >> > > >>>> > I used many codes from 3.0.3 Java. So it is somewhere between 2.9.4 & 3.0.3
> > >> >> > > >>>> >
> > >> >> > > >>>> > But I didn't think it as a separate branch to evolve on its own path. It
> > >> >> > > >>>> > is(or I think it is) the final version of 2.9
> > >> >> > > >>>> >
> > >> >> > > >>>> > DIGY
> > >> >> > > >>>> >
> > >> >> > > >>>> > -----Original Message-----
> > >> >> > > >>>> > From: Christopher Currens [mailto:currens.chris@gmail.com]
> > >> >> > > >>>> > Sent: Wednesday, December 28, 2011 9:20 PM
> > >> >> > > >>>> > To: lucene-net-dev@lucene.apache.org
> > >> >> > > >>>> > Cc: lucene-net-user@lucene.apache.org
> > >> >> > > >>>> > Subject: Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>>> >
> > >> >> > > >>>> > One of the benefits of moving forward with the conversion of the Java
> > >> >> > > >>>> > Lucene, is that they're using more recent versions of Java that support
> > >> >> > > >>>> > things like generics and enums, so the direct port is getting more and more
> > >> >> > > >>>> > like .NET, though not in all respects of course. I'm of the mind, though,
> > >> >> > > >>>> > that one of the larger annoyances, Iterables, should be converted to
> > >> >> > > >>>> > Enumerables in the direct port. It makes it a pain to use it in .NET
> > >> >> > > >>>> > without it inheriting from IEnumerable, since it can't be used in a foreach
> > >> >> > > >>>> > loop or with linq. Also, since the direct port isn't perfect anyway, it
> > >> >> > > >>>> > seems a port of the IDEA of iterating would be more in the spirit of what
> > >> >> > > >>>> > we're trying to accomplish, since the code would pretty much be the same,
> > >> >> > > >>>> > just with different method names.
> > >> >> > > >>>> >
> > >> >> > > >>>> > I sort of got off topic there for a second, but I guess the future of
> > >> >> > > >>>> > 2.9.4g depends on the extent that it is becoming more .NET like.
> > >> >> > > >>>> > Obviously, while java is starting to use similar constructs that we have
> > >> >> > > >>>> > in .NET, it will never be perfect. Admittedly, I haven't looked at 2.9.4g
> > >> >> > > >>>> > in a little while, so I'm not sure how much it now differs from 3.x, since
> > >> >> > > >>>> > there's a relatively large change there already.
> > >> >> > > >>>> >
> > >> >> > > >>>> > Thanks,
> > >> >> > > >>>> > Christopher
> > >> >> > > >>>> >
> > >> >> > > >>>> > On Thu, Dec 22, 2011 at 9:13 PM, Prescott Nasser wrote:
> > >> >> > > >>>> > >
> > >> >> > > >>>> > > That's a great question - I know a lot of people like the generics, and I
> > >> >> > > >>>> > > don't really want it to disappear. I'd like to keep it in parity with the
> > >> >> > > >>>> > > trunk. But I know we also have a goal of making Lucene.Net more .Net like
> > >> >> > > >>>> > > (further than 2.9.4g), and I don't know how that fits in. We are a pretty
> > >> >> > > >>>> > > small community and I know everyone has some pretty busy schedules so it
> > >> >> > > >>>> > > takes us considerable time to make big progress. Trying to keep three
> > >> >> > > >>>> > > different code bases probably isn't the right way to go.
> > >> >> > > >>>> > >
> > >> >> > > >>>> > > > Date: Fri, 23 Dec 2011 13:02:03 +1100
> > >> >> > > >>>> > > > From: mitiaguin@gmail.com
> > >> >> > > >>>> > > > To: lucene-net-user@lucene.apache.org
> > >> >> > > >>>> > > > Subject: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> > >> >> > > >>>> > > >
> > >> >> > > >>>> > > > I was browsing "Roadmap" emails from November in Lucene developer list.
> > >> >> > > >>>> > > > It remains unclear in what state Lucene 3 porting is , but my question more
> > >> >> > > >>>> > > > about 2.9.4g .
> > >> >> > > >>>> > > > Is it kind of experimental dead end variation of 2.9.4 with generics ?
> > >> >> > > >>>> > > > Am I right in classifying it as more .Net like 2.9.4 which is unrelated to
> > >> >> > > >>>> > > > roadmap Lucene 3 porting effort.
> > >> >> > > >>>>
> > >> >> > > >>>
> > >> >> > > >>>
