lucenenet-dev mailing list archives

From Vincent Daron <vda...@ask.be>
Subject Re: Lucene.NET Community Status
Date Tue, 02 Nov 2010 07:53:47 GMT
Hi all

Sharpen from db4o could maybe replace JLCA.

My 2 cents...

Vincent

On Nov 1, 2010, at 21:55, George Aroush <george@aroush.net> wrote:

> Let me jump in here and offer some perspective about Lucene.Net  
> (btw, it's not Lucene.NET :-) ).  This is based on my past  
> involvement with the project -- since 2003 when it was on
> SourceForge.net and called dotLucene.
>
> 1) Up until early this year, I have been porting and supporting
> Lucene.Net since ver 1.4 (back in 2004 on SourceForge.net) to the current
> release on trunk ver. 2.9.2.  This is in NO WAY to say that others  
> have not helped or contributed.  I'm just saying that I know the  
> history and have the experience (I wrote and worked on search  
> engines from 1998 to 2002).
>
> 2) Doing an initial port of a new Java Lucene release to C# Lucene
> is very hard; it's the most complex part of the port, even using
> automated tools such as JLCA and my own custom scripts, which I
> run pre- and post-JLCA (you can search the list archives for how I do
> the port).  What used to take me about 1 month with 90% of tests
> passing took me well over 4 months (for 2.9.x) with only 10% of
> tests passing.  This was no easy effort and won't be easier now,
> since Java Lucene is using new Java language features that JLCA is
> not aware of (MS is not maintaining JLCA).  Put another way, porting
> is hard, especially when you are dealing with > 5.6 GB of source code
> consisting of > 610 source files.  You will know this ONLY if you
> have tried it out and maintained it -- this is why no one has
> stepped up to do an initial port; otherwise there would be a port by
> now not only of Java Lucene but of other projects too.
>
> 3) To simplify ports of new releases, keeping the delta between
> releases as small as possible is very important.  This was a main
> pain point when I ported from 2.4 to 2.9.  The in-between ports were
> never done due to lack of time on my end.  See point #2.
>
> 4) Diverging from Java Lucene, in both API and algorithm, is
> risky and will just make point #2 more evident.  Not only will you
> now need a deep knowledge of search engines to catch bugs, but also
> a deep knowledge of Lucene's internals.  Also, you risk losing
> compatibility, as well as the books and existing resources on the web
> that cover Lucene -- heck, one can take any Java Lucene example and
> easily read it as Lucene.Net code, or use Luke to debug an index.
> Keep in mind, the current port model that we have for Lucene.Net
> keeps the API one-to-one in sync with Java Lucene; just upper case
> method names.  Yes, it's not fully .NET'es, but if you are looking
> for a search engine that is compatible with the open source search
> engine standard, and it is available in C#, Lucene.Net is it.
>
> 5) Besides making the port simpler, and per point #3 above, doing a
> line-per-line port, and maintaining the API naming as well as the
> algorithm and file format of Java Lucene in C# Lucene, means a Lucene
> index created by Java Lucene is usable, concurrently, by C# Lucene.
> I have worked on one such project where Java and C# code accessed
> the same index.  I'm not too interested in making Lucene.Net .NET'es
> and ending up adding more risk to the project.
>
> 6) If anyone wants a different flavor of Lucene.Net, the code is on
> Apache, just fork it and start a new project.  Make it more .NET'es,
> use the latest that .NET has to offer, and all.  However, until
> you have first-hand experience with the port, a good knowledge
> of Lucene and search engines, and the cycles to work on it, I really
> don't want to pursue this idea; it will die, as I know a few folks
> have tried.
>
> 7) I can't speak for the other committers or those who contributed,
> but for me, I do this entirely on my own time.  Each hour I spend
> on Lucene.Net is an hour away from my family or anything else.  I
> don't get paid, and I hardly get much out of my Lucene.Net work on
> the side.  As you may know, I was active with Lucene.Net until about
> early this year (I had a family emergency).  I want to step up
> again, but we need more participation than just offers to help or
> requests to diverge from the goal of the project, per the points that
> I made above.
>
> I can go on, but the above is meant to clarify some of the issues and
> background of Lucene.Net.  Please keep those in mind when thinking
> about this project and how you can contribute -- especially comments
> about making Lucene.Net more .NET'es -- that can't start until
> you first achieve a commit-per-commit port of Java Lucene to C# Lucene.
>
> If you agree with the above, and it makes sense to you, my  
> suggestion is as follows:
>
> 1) Lucene.Net goes back into incubation and starts all over again.
> 2) Start with cleaning up the webpage and make it more like other
> Apache project sites.
> 3) Put together an official Lucene.Net 2.9.2 and get it released.
> 4) Start working on the next port.
>
> #2 and #3 can happen right away, and all that it takes to do them is
> coming up to speed using the existing Apache documentation.
> Who is up to this task?
>
> #4 is a bit more complicated.  I don't want to go through the port
> pain that I had with 2.9.0 -- it was too much.  JLCA that comes with
> VS 2005 is out of date; I would love to try out a newer version from
> www.artinsoft.com, but it is $$.
>
> I hope the above helps and I have not offended or discouraged anyone,
> as it isn't my intention.  I just want to clarify a few things about
> Lucene.Net.
>
> PS: One final point.  Look at CLucene, NLucene, and a few other
> variations of Java Lucene ports that were done at the Lucene internal
> level with the goal of maintaining the target language's features and
> look-and-feel (C++, for example).  Those projects are either way out of
> date in terms of release version support or offer only partial support
> (index read only).  I don't want to use this to bad mouth another
> project, but to make the point that porting is hard if you diverge
> from the core.  As is, Lucene.Net is not dead; it's slow and needs
> contributors who will step up.
>
> Thanks,
>
> -- George
>
