lucenenet-dev mailing list archives

From Michael Garski
Subject RE: port of contrib packages from java
Date Fri, 04 Dec 2009 03:35:41 GMT
So I've been thinking about this for a few days before throwing my 2
cents in.

Regarding the contrib section, I'm not sure how it is managed on the
Java side, but I see it as a repository where any user can contribute
items that they have developed (or ported from Java).  Keeping the
contrib section up to date with the latest release version would fall on
whoever had submitted the contrib code or, if that person is no longer
active with Lucene, anyone who wants to.  I have a few things that I'll
be contributing for it, some of which are ported from Java, some of
which are unique to Lucene.Net.

On the topic of altering Lucene.Net to take advantage of the .Net
Framework and employing best practices, I have mixed feelings.
Modifications to internal implementations are fine with me; however, I
would draw the line at modifying the between the classes as we just
don't have the critical mass of contributors to take the port from a
class- and method-based port to a functionality-based one.  A good
example of this is in the interfaces for TermEnum, TermDocs, and
TermPositions - they are a bit cumbersome to use compared to the best
practice of enumerating over .Net collections, but changing them would
radically alter Lucene.Net and make porting future Java Lucene
functionality challenging.  Providing wrappers around TermDocs, etc. is
a good candidate for the contrib section.  Nick's point on
ParallelMultiSearcher is a good one - it's a dog.  However, from my
testing in a high-load environment, you're better off using
MultiSearcher, searching multiple indexes serially, and handling
multiple requests concurrently, which minimizes thread contention and
resource starvation.
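
The wrapper idea above can be sketched in a few lines. This is a Java-side illustration only, not Lucene or Lucene.Net code: `DocCursor`, `asIterable`, `cursorOver`, and `collect` are hypothetical names, and `DocCursor` merely imitates the next()/doc() calling pattern of a cursor-style interface. The point is that such an adapter can live outside the core (for example in contrib) and leave the ported classes untouched.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class CursorWrapperSketch {

    /** Hypothetical cursor-style API (assumption: not the real Lucene interface). */
    interface DocCursor {
        boolean next(); // advance; returns false once the cursor is exhausted
        int doc();      // current document id, valid only after next() returned true
    }

    /** Adapts a cursor so callers can use a for-each loop. Single pass only. */
    static Iterable<Integer> asIterable(final DocCursor cursor) {
        return () -> new Iterator<Integer>() {
            private boolean fetched = false; // true when next() was called but not yet consumed
            private boolean hasMore = false; // result of that pending next() call

            @Override
            public boolean hasNext() {
                if (!fetched) {
                    hasMore = cursor.next();
                    fetched = true;
                }
                return hasMore;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                fetched = false; // consume the pending element
                return cursor.doc();
            }
        };
    }

    /** Tiny in-memory cursor, used only to exercise the adapter. */
    static DocCursor cursorOver(final int... docs) {
        return new DocCursor() {
            private int index = -1;
            @Override public boolean next() { return ++index < docs.length; }
            @Override public int doc() { return docs[index]; }
        };
    }

    /** Drains a cursor into a list through the adapter. */
    static List<Integer> collect(DocCursor cursor) {
        List<Integer> out = new ArrayList<>();
        for (int doc : asIterable(cursor)) {
            out.add(doc);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(cursorOver(3, 7, 42))); // prints [3, 7, 42]
    }
}
```

Because the adapter only calls the existing cursor methods, the underlying class keeps its line-for-line shape and future ports are unaffected.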

We (at MySpace) have never run a 'stock' version of Lucene.Net but a
customized build that tweaks a few things under the hood.  I have not
yet made these changes to the 2.9 version, and will do so once it is
tagged and contribute a patch that anyone can then use and apply.  I
would not see such changes being applied to the trunk as they
modify behavior in ways that would make future porting more challenging.

I don't think there is a timeline on when we target the ability to keep
up with Java Lucene commits. I'm hoping we can give it a whirl with 3.0
which was recently released, but how to approach that is a whole other


-----Original Message-----
From: Nicholas Paldino [.NET/C# MVP]
Sent: Sunday, November 29, 2009 11:00 PM
Subject: RE: port of contrib packages from java


	I appreciate the input, but I feel that I might have been

	When I said a custom port, I meant for private consumption, not
public consumption.

	To answer your question, there are a number of areas that would
benefit from a .NET overhaul, which have been outlined before.  Some
of them are the multi-threaded searchers (using Threads with calls to
Join for synchronization is a bit of a dog; the ThreadPool can help
there) and the replacement of ArrayList with List<T> (especially where
the element type is a structure, as there are performance issues due to
boxing when using ArrayList instances).
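
As a rough Java-side analogue of the thread-pool point above, the sketch below fans one query out over several "indexes" using a shared pool instead of creating a thread per index and joining each one. Everything here is an assumption for illustration: `countHits` is a hypothetical stand-in for searching one index, the indexes are plain lists of strings, and `ExecutorService` plays the role that the ThreadPool would play in .NET. It is not how any Lucene searcher is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PooledSearchSketch {

    /** Hypothetical stand-in for "search one index": counts entries containing the term. */
    static int countHits(List<String> index, String term) {
        int hits = 0;
        for (String doc : index) {
            if (doc.contains(term)) {
                hits++;
            }
        }
        return hits;
    }

    /** Submits one task per index to a shared pool and sums the per-index hit counts. */
    static int searchAll(ExecutorService pool, List<List<String>> indexes, String term) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (List<String> index : indexes) {
            tasks.add(() -> countHits(index, term));
        }
        try {
            int total = 0;
            // invokeAll blocks until every task has completed, replacing manual join calls.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<List<String>> indexes = Arrays.asList(
                    Arrays.asList("apache lucene", "lucene.net port"),
                    Arrays.asList("java contrib", "lucene spatial"));
            System.out.println(searchAll(pool, indexes, "lucene")); // prints 3
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool is created once and reused across calls, so thread start-up cost is not paid per search.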

	Those are the two off the top of my head which would fall within
the purview of the Lucene.NET project, but which no one seems to be
doing.  The replacement of ArrayList with List<T> isn't even a call site
change in most of the instances, just a declaration change.  This is
probably the lowest-hanging fruit of all, and no one is doing it
(Hashtables come to mind as well; those would require call site changes,
but could easily be handled with an extension method on
IDictionary<TKey, TValue>).

	Why?  To be honest, none of the answers are really satisfactory.
The current commits that are being made are only being made if they help
drive towards passing the test cases.  I'm not saying that's not a goal
to try and direct people towards, but being open source, you have to
take what you can get when it comes to the work that people contribute
(I'm not saying you have to ^accept it^ mind you).

	With that, not everyone wants to see a line-for-line port of
Lucene from Java to .NET.  People would like to address pain points that
come from an implementation that isn't very .NET friendly, as well as an
API that is unfriendly.  I know the latter point is not up for
discussion, but you indicate your desire for having the project fulfill
your particular vision.  I respect that vision, but I have one as well.
I don't think it's unfair to say that there are others that share it as
well.

	While I agree that catching up to Java is an achievable goal,
there is no timeline for that goal (nor do you give one, mind you), and
my impression is that it's not one that will be accomplished anytime
soon.  George implies (and if I am misrepresenting you George, I
apologize, this is how I read your response) that is the case given the
current level of

	All this being said, I see the discussion as moot, given my
statement about not making it available for public consumption.  I
want access to the process for my own individual consumption.  Given the
open source nature of the project, I don't see why it should be

	I should also note that I am not looking to stop contributing to
the project, but given the current direction that it is going, I have
needs and desires for it I would like to address, and feel comfortable
doing work that I know will not be shared with others, but which will
fully attribute the original source of the work.

	That being said, are those tools and information on the process

		- Nicholas Paldino [.NET/C# MVP]

-----Original Message-----
From: Ron Grabowski
Sent: Monday, November 30, 2009 12:53 AM
Subject: Re: port of contrib packages from java

I agree with George. Catching up to Java (within a week or so of
SVN commits) seems like an achievable goal. The work being done on 2.9
is only about a month off the Java release.

I'm concerned that having more of a .NET internal API would cause the
project to slow down adopting new features. Take the PHP Lucene port for
example...it's sort of a port of Lucene but I couldn't find anything on
their site detailing what version they branched from. I doubt they've
incorporated the new features of 2.4, 2.9, etc. into their port or even
have plans to be 3.0 compliant.

I'd rather have a .NET port that is 10% slower but can more easily adapt
features from the parent project than a super-sweet .NET API that people
have to bend over backwards to re-re-implement parent project features.

Do we need to make the internal API more .NET-ish if people aren't going
to use it much? Do you have specific areas that might benefit from a .NET

----- Original Message ----
From: Nicholas Paldino [.NET/C# MVP]
Sent: Sun, November 29, 2009 9:53:53 PM
Subject: RE: port of contrib packages from java


    If that is the case, then where can I get a hold of the
that is 
used to port over the java version to .NET?

    Being completely honest, I'd much rather just grab 3.0 from Java, do
and then have a custom version which is more to my liking implementation-
and API-wise (still honoring the Apache license, of course).

    While I very much like what Lucene does (and I am speaking in a
general sense, not the .NET specific version), the .NET version suffers
from a lack of resources, which unfortunately will keep it in this
perpetual state.

        - Nick

-----Original Message-----
From: George Aroush
Sent: Sunday, November 29, 2009 12:40 AM
Subject: RE: port of contrib packages from java

I'm not discouraging the use of .NET 3.5, or making Lucene.Net more
.NET compliant.  I'm simply trying to set expectations, as this is not
the first time this subject has come up.

As you can see, it has been over 1 month since I committed the initial
port of 2.9 and even with good community help (never had this much help
in previous releases, it was just 2 or 3 of us) we still have about 14
tests failing!  If the port were not a line-per-line port, not only
would we have to deal with NUnit tests, but we might very well have to
deal with index format, compatibility, corruption, and threading issues,
to name a few; the community would have to be well versed with Lucene's
internals to resolve such issues.  Are we ready for this?  IMHO, no, we
are not.  I believe we need to first prove that we can maintain a port
on a commit-per-commit basis (or no more than a week behind Lucene
Java), before we commit to be more .NET compliant and take full
advantage of it.

-- George

-----Original Message-----
From: Nicholas Paldino [.NET/C# MVP]
Sent: Wednesday, November 25, 2009 10:09 PM
Subject: RE: port of contrib packages from java


    This brings up the question of whether or not work will be done to
Lucene.NET to adhere to best practices in .NET development.  I'm not
suggesting the public-facing API, but doing internal work.

    While I respect the desire to be able to be on a commit-by-commit
basis with the Java project, there has been discussion in the past about
moving to .NET 3.5 when Lucene 3.0 comes out (they are upgrading to a
newer version of the JVM at that point, from what I understand).

    Even if the decision to move to .NET 3.5 is made, I can't see the
benefit if all that is desired for the Lucene.NET port is to be a mirror
of the Java version because there aren't enough people that can maintain
the project on a commit-per-commit basis.

    And while I don't have the metrics of those that have contributed,
it doesn't seem like the project has the critical mass necessary to do
so, which makes for a catch-22 situation.

    Basically, there aren't enough people to keep the project current on
a commit-by-commit basis with the Java project, and that's one of the
reasons that I think people aren't contributing, because they are
severely to this tenet to have literally line-by-line parity between
the two code bases.

    It's also a tenet which serves the limitations of the resources
that the project has available to it, as opposed to the betterment of
the project itself.

    I'm not looking to bash the project or the people who have
contributed (and I still want to contribute), but I don't see the point
where the goal of matching the Java version consistently will happen, so
it makes me ask if there shouldn't be a discussion about shifting the
priorities of the project to address some of the pain points for the
audience that is using the product now (some examples being a sloppy API
from a .NET perspective, inefficient internal implementations and other

    Perhaps this is something that should be put to a vote as well (not
that I know whose vote would matter or count, but it's something you
suggested for the ports of the contrib projects)?

        - Nick

-----Original Message-----
From: George Aroush
Sent: Monday, November 23, 2009 11:21 PM
Subject: RE: port of contrib packages from java

Porting all of the code in contrib is going to be a challenge; there is
a lot of code in there.  So it makes sense to first port packages that
give us the most value (maybe via a vote).  Also, what's ported now may
no longer work with 2.9.1's Lucene.Net port; this is because the
contrib.Net port has not been kept up to date.  And yes, virtually every
project in contrib has a JUnit test associated with it, thus it can be
used for validation of a project port.

Regarding the .NET'es of ports, this has come up a few times in the
past, and it's tempting to want to make Lucene.Net more .NET'es.
However, this is very hard to achieve without solid commitment and being
at a commit-per-commit port with Lucene Java (i.e.: anytime a commit in
Lucene Java happens, it must, within days, be ported over to Lucene.Net
and
For many of the projects in contrib, the task of porting them is much
simpler than it is for the core Lucene code.  However, here is where
things get challenging.  Any time you think about making a port more
.NET'es, you should keep the following in mind:

1) It will be more work and harder to keep the code in sync with the
Java version (per the above reasons), and
2) The code in contrib may no longer work with the code in Lucene core
due to the .NET'es of the port (mainly public APIs).  Thus, your effort
at .NET'es of contrib port may be limited if Lucene core code isn't.

What's the takeaway?  Until we can maintain a commit-per-commit port
with Lucene Java, trying to make Lucene.Net and / or contrib more
.NET'es isn't realistic.

-- George

-----Original Message-----
From: Eran Sevi
Sent: Monday, November 23, 2009 3:12 PM
Subject: Re: port of contrib packages from java

Although some contrib packages might not be in use by any lucene .net
users at the moment, I think we should port them all in accordance with the
version (it shouldn't be as hard as the core classes although I'm not
sure there are any tests for them).
When and if we'll diverge from the core java implementation in order to
benefit of .net and apply each patch as it comes, we can do the same for
contrib which also sees much less traffic anyway.


On Mon, Nov 23, 2009 at 8:27 PM, Digy wrote:

> I don't know whether there is such a preference for contribs or not, but
> diverging from Java makes life harder for further ports.
> Will someone be able to easily port the next release after your
> state-of-the-art work following .NET best practices?
> Or a new port from scratch?
> -----Original Message-----
> From: Nicholas Paldino [.NET/C# MVP]
> Sent: Monday, November 23, 2009 7:11 PM
> Subject: RE: port of contrib packages from java
>        On a somewhat related note, these ports, do they adhere to the
> tenets applied to the main trunk, or can they better follow .NET best
> practices if one wants to apply them?
>                - Nick
> -----Original Message-----
> From: Eran Sevi
> Sent: Monday, November 23, 2009 8:45 AM
> Subject: Re: port of contrib packages from java
> Thanks,
> I'm more into the "queries" package.
> If no one will beat me to it, I hope I can help and add it myself.
> How did you do the port? manually or using some conversion tools?
> Eran.
> On Mon, Nov 23, 2009 at 3:34 PM, Roger Chapman wrote:
> > I've done a first pass port of the Spatial Contrib project :
> >
> >
> > Roger.
> >
> > -----Original Message-----
> > From: Eran Sevi
> > Sent: 23 November 2009 13:17
> > Subject: port of contrib packages from java
> >
> > Hi,
> > Is there any thought to port all the contrib packages from java after
> > the porting of core 2.9.1 version is complete?
> > Currently there are 23 packages in java contrib compared to only 7
> > packages in .net contrib.
> >
> > Thanks,
> > Eran.
> >
