lucenenet-dev mailing list archives

From Michael Garski
Subject Re: Lucene.Net project involvement
Date Wed, 28 Mar 2007 19:56:49 GMT
Everyone -

I feel I have to chip in my 2 cents regarding the 'throw' issue.  The 
exception throwing inside Lucene, particularly during indexing 
operations and, on a smaller scale, when using QueryParser, can be safely 
altered without affecting either of the 2 goals you list - making the 
index cross compatible with Java and maintaining consistent [external] 

The indexes we maintain are constantly being updated as they contain 
millions of small documents with relatively volatile data.  Seeing 
upwards of 8000 exceptions per second while maintaining those indexes 
prompted us to dig into the internals of Lucene.NET to alter the 
throws.  We also modified the internal data structures to use generic 
collections rather than synchronized arraylists and hashtables to cut 
down on the large amount of small object creation we were seeing in a 
profiler.  The end result cut the exceptions to 0 and significantly 
increased performance during index time.  All modifications we have made 
still result in passing unit tests.

I would venture to say that the vast majority of Lucene.NET users would 
not greatly benefit from these performance improvements unless they are 
working on a _very_ high-volume application such as we are.  We 
currently maintain our own branch of Lucene.NET, incorporating any 
changes made to the Subversion trunk into our branch.  As it appears 
these changes are not desired in the official Lucene.NET releases, the 
changes are not difficult for anyone to make on their own should they 
choose to do so.  One of the advantages of open source



PS: if you have experience with Lucene.NET, high volume server 
applications, live in the Los Angeles area, and are looking for a new 
job, please email me off the list at mgarski[at]mac[dot]com with a 
recent resume... we are hiring.

George Aroush wrote:
> Hi Michael, Ciaran and all,
> Ciaran: welcome aboard to the mailing list and I am glad to see your email
> generated some interest; I welcome any help you or anyone can offer working
> on Lucene.Net.
> My goals for Lucene.Net are the following:
> 1) The index is cross compatible with Java Lucene, such that you can read/write
> to the same index concurrently using C# or Java Lucene.
> 2) The APIs are consistent between C# and Java Lucene.  This is why I use
> "GetXYZ()" instead of C# properties.
> Up to release 2.0, I kept Lucene.Net on .NET 1.1 because I wanted to support
> as many .NET installations as possible.  With the Lucene.Net 2.1 release it's
> time to move to .NET 2.0 -- I don't think anyone has any objection to this, but
> Mono may have some issues.
> As for the code clean up, this may be difficult and it depends on what clean
> up you mean.  Take a look at the open JIRA issues against Lucene.Net and you
> will see a few about overusing "throw".  Those, unfortunately, we can't fix.
> Why?  Because those "throw" statements are also present in Java Lucene and
> trying to 'fix' them in Lucene.Net may in effect alter the behavior of
> Lucene.Net.  That said, any extra code or "throw" introduced into Lucene.Net
> due to conversion mistakes should be fixed.
> As for the warnings, I don't have direct experience looking at them using
> VS.NET 2005 (I still use VS.NET 2003).  But in VS.NET 2003, most of those
> warnings are from comments -- i.e.: the class and API XML documentation that
> doesn't get converted correctly from Java to C#.  If you can think of a tool
> to clean them up, please let me know.  If it's something else you are
> talking about, please let me know.
> Finally, making the Lucene.Net code more compliant with .NET / C# standards
> would be, in my opinion, a nice thing to have.  But before we can do so, we
> must get the port working and keep in mind my goal #2 above.
> Let's discuss this topic further.  Next week, I expect to publish an early
> release of Lucene.Net 2.1.  If folks can help to finish off the conversion,
> then we can get this out much sooner than the previous release.
> Regards,
> -- George Aroush
> -----Original Message-----
> From: Michael Mitiaguin
> Sent: Tuesday, March 27, 2007 9:19 PM
> Subject: Re: Lucene.Net project involvement
> Ciaran,
> What I can't understand is this: if the core of synchronising versions with
> Java Lucene is the Java Language Conversion Assistant, how is all this
> cleaning up/revising going to work?
> Would it be possible to build an automated procedure which preserves all .Net
> improvements after a conversion following a major upgrade from Java?  I am not
> sure.
> Even if we somehow tracked only the changed/added Java classes, each such
> class would still require merging the new/revised Java functionality with
> previous manual changes made to utilise .Net capabilities.
> You used the term component, but Lucene is rather an API with fine-grained
> classes, and a simple change may propagate into several classes (files in
> Java).
> I don't know how George is coping with that and what the plan would be if,
> say, Lucene Java 3 were released tomorrow.
> Michael
> Ciaran Roarty wrote:
>> Michael
>> I've been in touch with George about getting involved and he said to post to
>> the mailing list.
>> I reckon there's a fair amount of work that could be done in changing the
>> codebase without affecting the published interface and I reckon that's where
>> the bulk of the initial work would take place; as we know, the code is not
>> yet optimised for .NET.
>> Now, balanced against that, in my opinion are the following factors:
>> - The code currently compiles against 1.1 and 2.0 (albeit with some
>> obsolescence); any change to move Lucene.Net to 2.0 would leave the 1.1
>> codebase behind.
>> - There are different types of contribution to the codebase: cleaning up
>> code; revising methods and classes to benefit from .NET standards and
>> capabilities is a good thing. However, Lucene is a powerful IR component and
>> if the core development of those capabilities happens in the Java version
>> then we will need to follow that.
>> That's my thoughts for the moment. Maybe we could take a specific part of
>> the component and revise that. Learning lessons about the process and the
>> codebase from that exercise, we can move into the guts of the
>> component......
>> Any thoughts?
>> Ciaran
>> On 27/03/07, Michael Mitiaguin wrote:
>>> Ciaran,
>>> The only active contributor to the project is George Aroush and perhaps
>>> he is the only person who will give you the most definite answer.
>>> I am also interested only in the .Net 2/3 codebase. Currently version 2.0.4
>>> still uses VS 2003 projects and my main concern is the warning messages
>>> about deprecated and obsolete methods when compiled under .Net 2.
>>> Supposedly it'll be fixed in 2.1.
>>> Also, Java Lucene is a more mature project with a lot of people involved
>>> and it would be safer to cross-translate new things from there, taking
>>> into consideration .Net specifics.
>>> On the other hand, in my case, if Lucene is to be part of a project where
>>> all warning messages are considered errors which must be
>>> eliminated, it is beyond my competency what can be done to achieve
>>> that. (JavaCC generated code cross-translation creates a lot of them.)
>>> Michael
>>> Ciaran Roarty wrote:
>>>> Anthony
>>>> I too have used Lucene.Net with C# 2.0 to great effect. However, I am
>>>> discussing the use of .Net 2.0 in the codebase itself; and, if not, the
>>>> optimisation of the codebase for .Net in general.
>>>> Ciaran
>>>> On 26/03/07, tony njedeh wrote:
>>>>> I set up my lucene to a .net 2.0 framework, using VB and it works
>>>>> well in that environment.
>>>>> Anthony
>>>>> Ciaran Roarty wrote:
>>>>> George et al
>>>>> I have been using Lucene.Net in a proof-of-concept environment for the
>>>>> last couple of months - with my colleague Guy Steel - and we wanted to get
>>>>> involved in its development.
>>>>> I am a .NET developer for a large consultancy company and would like to
>>>>> get involved in making Lucene.Net more aligned to .NET and .NET 2/3 in
>>>>> particular. However, I am not sure if that is something which is initially
>>>>> planned for Lucene.Net. As I understand it, the majority of the conversion
>>>>> has been done, initially, using the Java Language Conversion Assistant.
>>>>> Some of the Java codebase uses patterns that are not best practice for
>>>>> .NET - such as using Exceptions for non-exceptional circumstances. This is
>>>>> not to denigrate Lucene.Net, it is one of the best pieces of software I
>>>>> have used.
>>>>> So, this email should be considered an introduction and a request to be
>>>>> allowed to get involved. I have never worked on an Open Source project
>>>>> before so I'll need some guidance but I am willing to learn. I do have a
>>>>> couple of questions to start with:
>>>>> - Is there a roadmap for the product? Is there a roadmap for Lucene that
>>>>> we will try and follow?
>>>>> - Is there a preferred version of the .NET Framework that it is planned to
>>>>> support?
>>>>> Enough for now, just wanted to introduce myself and get involved.
>>>>> Cheers,
>>>>> Ciaran
