lucenenet-dev mailing list archives

From "George Aroush" <geo...@aroush.net>
Subject RE: port of contrib packages from java
Date Tue, 01 Dec 2009 16:22:02 GMT
I have posted about how I do an initial port several times in the past.  You
can search in the mail archives, but here are some pointers:

http://www.mail-archive.com/lucene-net-dev@incubator.apache.org/msg00401.html
http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg10860.html

Just do a search on "JLCA" in the mailing list for more background.

To sum up, a port (especially an initial port) isn't much fun, is time
consuming, and can't be divided into smaller tasks to be distributed.

With all of my past ports, it used to take me a little over a month to
complete one -- this includes getting up to 80% of NUnit tests passing.
For 2.9, it took me well over a month (with a lot more hours working on the
project) and I never got the chance to address any NUnit test failures.  Why?
The delta between 2.9 and previous releases was considerable (major
refactoring in Lucene Java), and a lot of new features and files were added
in 2.9 (code base grew by 30%).

-- George


-----Original Message-----
From: Nicholas Paldino [.NET/C# MVP] [mailto:casperOne@caspershouse.com] 
Sent: Monday, November 30, 2009 2:00 AM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

Rob,

	I appreciate the input, but I feel that I might have been
misunderstood.

	When I said a custom port, I meant for private consumption, not for
public consumption.

	To answer your question, there are a number of areas that would
benefit from a .NET overhaul, which have been outlined before.  Some specific
ones are the multi-threaded searchers (using Threads with calls to Join for
synchronization is a bit of a dog, the ThreadPool can help there) and the
replacement of ArrayList with List<T> (especially where the type parameter
is a structure, there are performance issues due to boxing when using
ArrayList instances).
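
	As a rough illustration of the first point, here is a minimal sketch
contrasting one thread per sub-search plus join against a shared pool.  It is
written in Java, the language of the upstream code, rather than C#; in .NET
the pooled version would go through the ThreadPool instead.  The class
PooledSearchSketch, the searchShard helper, and the hit counts are
hypothetical stand-ins, not Lucene or Lucene.Net APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledSearchSketch {
    // Hypothetical stand-in for one sub-searcher; returns a fake hit count.
    static int searchShard(int shard) {
        return shard * 10;
    }

    // One new thread per shard, synchronized with join().
    static int searchWithThreads(int shards) {
        int[] hits = new int[shards];
        Thread[] threads = new Thread[shards];
        for (int i = 0; i < shards; i++) {
            final int shard = i;
            threads[i] = new Thread(() -> hits[shard] = searchShard(shard));
            threads[i].start();
        }
        int total = 0;
        try {
            for (int i = 0; i < shards; i++) {
                threads[i].join(); // wait for this shard before reading its slot
                total += hits[i];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return total;
    }

    // Same work submitted to a shared pool; no thread is created per search.
    static int searchWithPool(ExecutorService pool, int shards) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            final int shard = i;
            tasks.add(() -> searchShard(shard));
        }
        int total = 0;
        try {
            // invokeAll blocks until every task has completed.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a sub-search failed", e);
        }
        return total;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(searchWithThreads(8));
            System.out.println(searchWithPool(pool, 8));
        } finally {
            pool.shutdown();
        }
    }
}
```

	Both methods return the same total; the pooled version reuses a fixed
set of worker threads across searches instead of creating and tearing down a
thread for every sub-search.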

	Those are the two off the top of my head which would fall within the
purview of the Lucene.NET project, but which no one seems to be doing.  The
replacement of ArrayList with List<T> isn't even a call site change in 99%
of the instances, just a declaration change.  This is probably the
lowest-hanging fruit of all, and no one is doing it.  (Hashtables come to mind
as well; those would require call site changes, but that could easily be
handled with an extension method on IDictionary<TKey, TValue>.)

	Why?  To be honest, none of the answers are really satisfactory.
The current commits that are being made are only being made if they help
drive towards passing the test cases.  I'm not saying that's not a goal to
try and direct people towards, but being open source, you have to take what
you can get when it comes to the work that people contribute (I'm not saying
you have to ^accept it^ mind you).

	With that, not everyone wants to see a line-for-line port of Lucene
from Java to .NET.  People would like to address pain points that come with
an implementation that isn't very .NET friendly, as well as an API that is
unfriendly.  I know the latter point is not up for discussion, but you
indicate your desire for having the project fulfill your particular vision.
I respect that vision, but I have one as well.  I don't think it's unfair to
say that there are others that share it as well.

	While I agree that catching up to Java is an achievable goal, there
is no timeline for that goal (nor do you give one, mind you), and my
impression is that it's not one that will be accomplished anytime soon.
George implies (and if I am misrepresenting you George, I apologize, this is
how I read your response) that is the case given the current level of
contribution.

	All this being said, I see the discussion as moot, given my first
statement about not making it available for public consumption.  I simply
want access to the process for my own individual consumption.  Given the
open source nature of the project, I don't see why it should be unavailable.

	I should also note that I am not looking to stop contributing to the
project, but given the current direction that it is going, I have needs and
desires for it I would like to address, and feel comfortable doing work that
I know will not be shared with others, but which will fully attribute the
original source of the work.

	That being said, are those tools and information on the process
available?

		- Nicholas Paldino [.NET/C# MVP]

-----Original Message-----
From: Ron Grabowski [mailto:rongrabowski@yahoo.com] 
Sent: Monday, November 30, 2009 12:53 AM
To: lucene-net-dev@incubator.apache.org
Subject: Re: port of contrib packages from java

I agree with George. Catching up to Java (within a week or so of their
SVN commits) seems like an achievable goal. The work being done on 2.9 is
only about a month off the Java release.


I'm concerned that having more of a .NET internal API would cause the
project to slow down adopting new features. Take the PHP Lucene port for
example... it's sort of a port of Lucene but I couldn't find anything on the
site detailing what version they branched from. I doubt they've incorporated
the new features of 2.4, 2.9, etc. into their port or even have plans to be
3.0 compliant.

I'd rather have a .NET port that is 10% slower but can more easily adopt new
features from the parent project than a super-sweet .NET API that people
have to bend over backwards to re-re-implement parent project features.

Do we need to make the internal API more .NET-ish if people aren't going to
use it much? Do you have specific areas that might benefit from a .NET
overhaul?


----- Original Message ----
From: Nicholas Paldino [.NET/C# MVP] <casperOne@caspershouse.com>
To: lucene-net-dev@incubator.apache.org
Sent: Sun, November 29, 2009 9:53:53 PM
Subject: RE: port of contrib packages from java

George,

    If that is the case, then where can I get a hold of the tools/process
that is used to port over the java version to .NET?

    Being completely honest, I'd much rather just grab 3.0 from Java, do a
port, and then have a custom version which is more to my liking
implementation and API-wise (still honoring the Apache license of course).

    While I very much like what Lucene does (and I am speaking in a general
sense, not the .NET specific version), the .NET version suffers from this
lack of resources, which unfortunately will keep it in this perpetual state.

        - Nick

-----Original Message-----
From: George Aroush [mailto:george@aroush.net]
Sent: Sunday, November 29, 2009 12:40 AM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

I'm not discouraging the use of .NET 3.5, or making Lucene.Net fully
.NET compliant.  I'm simply trying to set expectations, as this is not the
first time this subject has come up.

As you can see, it has been over 1 month since I committed the initial port
of 2.9 and even with good community help (never had this much help in any
previous releases, it was just 2 or 3 of us) we still have about 14 NUnit
tests failing!  If the port were not a line-by-line port, not only would we
have to deal with NUnit tests, but we might very well have to deal with
index format, compatibility, corruption, and threading issues to name some;
the community would have to be well versed in Lucene's internals to address
such issues.  Are we ready for this?  IMHO, no, we are not.  I believe we
need to first prove that we can maintain a port at a commit-per-commit level
(or no more than a week behind Lucene Java), before we commit to being fully
.NET compliant and taking full advantage of it.

-- George


-----Original Message-----
From: Nicholas Paldino [.NET/C# MVP] [mailto:casperOne@caspershouse.com]
Sent: Wednesday, November 25, 2009 10:09 PM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

George,

    This brings up the question of whether or not work will be done to
Lucene.NET to adhere to best practices in .NET development.  I'm not even
suggesting the public-facing API, but doing internal work.

    While I respect the desire to be able to be on a commit-by-commit
basis with the Java project, there has been discussion in the past about
moving to .NET 3.5 when Lucene 3.0 comes out (they are upgrading to a new
version of the JVM at that point, from what I understand).

    Even if the decision to move to .NET 3.5 is made, I can't see the
benefit if all that is desired for the Lucene.NET port is to be a mirror for
the Java version because there aren't enough people that can maintain the
project on a commit-per-commit basis.

    And while I don't have the metrics of those that have contributed,
it doesn't seem like the project has the critical mass necessary to do this,
which makes for a catch-22 situation.

    Basically, there aren't enough people to keep the project current on
a commit-by-commit basis with the Java project, and that's one of the big
reasons that I think people aren't contributing, because they are limited
severely by this tenet of having literally line-by-line parity between the
two code bases.

    It's also a tenet which serves the limitations of the resources
that the project has available to it, as opposed to the betterment of the
project itself.

    I'm not looking to bash the project or the people who have
contributed (and I still want to contribute), but I don't see the point
where the goal of matching the Java version consistently will happen, so it
makes me ask if there shouldn't be a discussion about shifting the
priorities of the project to address some of the pain points for the
audience that is using the product now (some examples being a sloppy API
from a .NET perspective, inefficient internal implementations and other such
"goodies").

    Perhaps this is something that should be put to a vote as well (not
that I know whose vote would matter or count, but it's something you
suggested for the ports of the contrib projects)?

        - Nick

-----Original Message-----
From: George Aroush [mailto:george@aroush.net]
Sent: Monday, November 23, 2009 11:21 PM
To: lucene-net-dev@incubator.apache.org
Subject: RE: port of contrib packages from java

Porting all of the code in contrib is going to be a challenge; there is a
lot of code in there.  So it makes sense to first port the packages that give
us the most value (maybe via a vote).  Also, what's ported now may no longer
work with 2.9.1's Lucene.Net port; this is because the contrib.Net port has
not been kept up to date.  And yes, virtually every project in contrib has a
JUnit test associated with it, thus it can be used for validation of a
project port.

Regarding making ports more .NET-ish, this has come up a few times in the
past, and it's tempting to want to make Lucene.Net more .NET-ish.  However,
this is very hard to achieve without solid commitment and being at a
commit-per-commit port with Lucene Java (i.e.: anytime a commit in Lucene
Java happens, it must, within days, be ported over to Lucene.Net and
committed).

For many of the projects in contrib, the task of porting them is much simpler
than it is for the core Lucene code.  However, here is where things get
challenging.  Any time you think about making a port more .NET-ish, you must
keep the following in mind:

1) It will be more work and harder to keep the code in sync with the Java
version (per the above reasons), and
2) The code in contrib may no longer work with the code in Lucene core due
to the .NET-ish changes in the port (mainly public APIs).  Thus, your effort
to make a contrib port more .NET-ish may be limited if Lucene core code isn't.

What's the takeaway?  Until we can maintain a commit-per-commit port
with Lucene Java, trying to make Lucene.Net and / or contrib more .NET-ish
isn't realistic.

-- George


-----Original Message-----
From: Eran Sevi [mailto:eransevi@gmail.com]
Sent: Monday, November 23, 2009 3:12 PM
To: lucene-net-dev@incubator.apache.org
Subject: Re: port of contrib packages from java

Although some contrib packages might not be in use by any lucene .net user
at the moment, I think we should port them all in accordance with the java
version (it shouldn't be as hard as the core classes although I'm not sure
there are any tests for them).
When and if we diverge from the core java implementation in order to take
advantage of .net and apply each patch as it comes, we can do the same for
contrib, which also sees much less traffic anyway.

Eran

On Mon, Nov 23, 2009 at 8:27 PM, Digy <digydigy@gmail.com> wrote:

> I don't know whether there is such a preference for contribs or not, but
> diverging from Java makes life harder for further ports.
> Will someone be able to easily port the next release after your
> state-of-the-art work following .NET best practices?
> Or a new port from scratch?
>
> DIGY
>
> -----Original Message-----
> From: Nicholas Paldino [.NET/C# MVP] [mailto:casperOne@caspershouse.com]
> Sent: Monday, November 23, 2009 7:11 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: RE: port of contrib packages from java
>
>        On a somewhat related note, these ports, do they adhere to the
> tenets applied to the main trunk, or can they better follow .NET best
> practices if one wants to apply them?
>
>                - Nick
>
> -----Original Message-----
> From: Eran Sevi [mailto:eransevi@gmail.com]
> Sent: Monday, November 23, 2009 8:45 AM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: port of contrib packages from java
>
> Thanks,
> I'm more into the "queries" package.
> If no one will beat me to it, I hope I can help and add it myself.
>
> How did you do the port? Manually or using some conversion tools?
>
> Eran.
>
> On Mon, Nov 23, 2009 at 3:34 PM, Roger Chapman <roger@stormid.com> wrote:
>
> > I've done a first pass port of the Spatial Contrib project :
> > https://issues.apache.org/jira/browse/LUCENENET-199
> >
> > Roger.
> >
> > -----Original Message-----
> > From: Eran Sevi [mailto:eransevi@gmail.com]
> > Sent: 23 November 2009 13:17
> > To: lucene-net-dev@incubator.apache.org
> > Subject: port of contrib packages from java
> >
> > Hi,
> > Is there any plan to port all the contrib packages from java lucene
> > after the porting of core 2.9.1 version is complete?
> > Currently there are 23 packages in java contrib compared to only 7
> > packages in .net contrib.
> >
> > Thanks,
> > Eran.
> >
>
>

