lucenenet-dev mailing list archives

From Michael Garski <mgar...@mac.com>
Subject Re: Lucene.Net project involvement
Date Wed, 28 Mar 2007 22:42:21 GMT
I can make diffs available in a few days when I get the opportunity to 
do so.

Michael

Ciaran Roarty wrote:
> Michael
>
> I agree with you about the exceptions and the general point about
> maintaining the public API whilst not necessarily keeping the underlying
> processing.
>
> Can you make your modifications to the codebase available to the
> community or not?
>
> Ciaran
>
>
> On 28/03/07, Michael Garski <mgarski@mac.com> wrote:
>>
>> Everyone -
>>
>> I feel I have to chip my 2 cents in regarding the 'throw' issue.  The
>> exception throwing inside Lucene, particularly during indexing
>> operations and, on a smaller scale, when using QueryParser, can be
>> safely altered without affecting either of the 2 goals you list: making
>> the index cross-compatible with Java and maintaining a consistent
>> [external] API.
>>
>> The indexes we maintain are constantly being updated as they contain
>> millions of small documents with relatively volatile data.  Seeing
>> upwards of 8000 exceptions per second while maintaining those indexes
>> prompted us to dig into the internals of Lucene.NET to alter the
>> throws.  We also modified the internal data structures to use generic
>> collections rather than synchronized ArrayLists and Hashtables to cut
>> down on the large amount of small object creation we were seeing in a
>> profiler.  The end result cut the exceptions to 0 and significantly
>> increased performance during index time.  All modifications we have made
>> still result in passing unit tests.
>>
>> I would venture to say that the vast majority of Lucene.NET users would
>> not greatly benefit from these performance improvements unless they are
>> working on a _very_ high-volume application such as we are.  We
>> currently maintain our own branch of Lucene.NET, incorporating any
>> changes made to the Subversion trunk into our branch.  As it appears
>> these changes are not desired in the official Lucene.NET releases, the
>> changes are not difficult for anyone to make on their own should they
>> choose to do so.  One of the advantages of open source.
>>
>> Thanks,
>>
>> Michael
>>
>> PS: if you have experience with Lucene.NET, high volume server
>> applications, live in the Los Angeles area, and are looking for a new
>> job, please email me off the list at mgarski[at]mac[dot]com with a
>> recent resume... we are hiring.
>>
>> George Aroush wrote:
>> > Hi Michael, Ciaran and all,
>> >
>> > Ciaran: welcome aboard to the mailing list and I am glad to see your
>> > email generated some interest; I welcome any help you or anyone can
>> > offer working on Lucene.Net.
>> >
>> > My goals for Lucene.Net are to meet the following:
>> > 1) The index is cross-compatible with Java's Lucene such that you can
>> > read/write to the same index concurrently using C# or Java Lucene.
>> > 2) The APIs are consistent between C# and Java Lucene.  This is why I
>> > use "GetXYZ()" instead of C# properties.
>> >
>> > Up to release 2.0, I kept Lucene.Net on .NET 1.1 because I wanted to
>> > support as many .NET installations as possible.  With the Lucene.Net
>> > 2.1 release it's time to move to .NET 2.0 -- I don't think anyone has
>> > any objection to this, but Mono may have some issues.
>> >
>> > As for the code clean up, this may be difficult and it depends on what
>> > clean up you mean.  Take a look at the open JIRA issues against
>> > Lucene.Net and you will see a few about overusing "throw".  Those,
>> > unfortunately, we can't fix.  Why?  Because those "throw" statements
>> > are also present in Java Lucene, and trying to 'fix' them in
>> > Lucene.Net may in effect alter the behavior of Lucene.Net.  That
>> > said, any extra code or "throw" introduced into Lucene.Net due to
>> > conversion mistakes should be fixed.
>> >
>> > As for the warnings, I don't have direct experience looking at them
>> > using VS.NET 2005 (I still use VS.NET 2003).  But in VS.NET 2003, most
>> > of those warnings are from comments -- i.e., the class and API XML
>> > documentation that doesn't get converted correctly from Java to C#.
>> > If you can think of a tool to clean them up, please let me know.  If
>> > it's something else you are talking about, please let me know.
>> >
>> > Finally, making the Lucene.Net code more compliant with .NET / C#
>> > standards would be, in my opinion, a nice thing to have.  But before
>> > we can do so, we must get the port working and keep in mind my goal #2
>> > above.
>> >
>> > Let's discuss this topic further.  Next week, I expect to release an
>> > early release of Lucene.Net 2.1.  If folks can help to finish off the
>> > conversion, then we can get this out much sooner than the previous
>> > release.
>> >
>> > Regards,
>> >
>> > -- George Aroush
>> >
>> >
>> > -----Original Message-----
>> > From: Michael Mitiaguin [mailto:mitiaguinm@optusnet.com.au]
>> > Sent: Tuesday, March 27, 2007 9:19 PM
>> > To: lucene-net-dev@incubator.apache.org
>> > Subject: Re: Lucene.Net project involvement
>> >
>> > Ciaran,
>> >
>> > What I can't understand is, if the core of synchronising versions
>> > with Java Lucene is the Java Language Conversion Assistant, how all
>> > this cleaning up/revising is going to work.
>> > Would it be possible to build an automated procedure which preserves
>> > all .Net improvements after conversion from a major Java upgrade?  I
>> > am not sure.
>> > Even if we somehow track only changed/added Java classes, each such
>> > class still requires merging new/revised Java functionality with
>> > previous manual changes made to utilise .Net capabilities.
>> > You used the term component, but Lucene is rather an API with
>> > fine-grained classes, and a simple change may propagate into several
>> > classes (files in Java).
>> > I don't know how George is coping with that and what the plan would be
>> > if, say, Lucene Java 3 were released tomorrow.
>> >
>> > Michael
>> >
>> > Ciaran Roarty wrote:
>> >
>> >
>> >> Michael
>> >>
>> >> I've been in touch with George about getting involved and he said to
>> >> post to the mailing list.
>> >>
>> >> I reckon there's a fair amount of work that could be done in changing
>> >> the codebase without affecting the published interface, and I reckon
>> >> that's where the bulk of the initial work would take place; as we
>> >> know, the code is not yet optimised for .NET.
>> >>
>> >> Now, balanced against that, in my opinion are the following factors:
>> >>
>> >> - The code currently compiles against 1.1 and 2.0 (albeit with some
>> >> obsolescence); any change to move Lucene.Net to 2.0 would leave the
>> >> 1.1 codebase behind.
>> >> - There are different types of contribution to the codebase: cleaning
>> >> up code; revising methods and classes to benefit from .NET standards
>> >> and capabilities is a good thing. However, Lucene is a powerful IR
>> >> component and if the core development of those capabilities happens
>> >> in the Java version then we will need to follow that.
>> >>
>> >> That's my thoughts for the moment. Maybe we could take a specific
>> >> part of the component and revise that. Learning lessons about the
>> >> process and the codebase from that exercise, we can move into the
>> >> guts of the component......
>> >>
>> >> Any thoughts?
>> >>
>> >> Ciaran
>> >>
>> >> On 27/03/07, Michael Mitiaguin <mitiaguinm@optusnet.com.au> wrote:
>> >>
>> >>
>> >>> Ciaran,
>> >>>
>> >>> The only active contributor to the project is George Aroush and
>> >>> perhaps he is the only person who will give you the most definite
>> >>> answer.
>> >>> I am also interested only in the .Net 2/3 codebase. Currently version
>> >>> 2.0.4 still uses VS 2003 projects and my main concern is the warning
>> >>> messages about deprecated and obsolete methods when compiled under
>> >>> .Net 2. Supposedly it'll be fixed in 2.1.
>> >>> Also, Java Lucene is a more mature project with a lot of people
>> >>> involved, and it would be safer to cross-translate new things from
>> >>> there, taking .Net specifics into consideration.
>> >>> On the other hand, in my case, if Lucene is to be part of a project
>> >>> where all warning messages are considered errors which must be
>> >>> eliminated, it is beyond my competency what can be done to achieve
>> >>> that. (Cross-translation of JavaCC-generated code creates a lot of
>> >>> them.)
>> >>>
>> >>> Michael
>> >>>
>> >>> Ciaran Roarty wrote:
>> >>>
>> >>>
>> >>>> Anthony
>> >>>>
>> >>>> I too have used Lucene.Net with C# 2.0 to great effect. However, I
>> >>>> am discussing the use of .Net 2.0 in the codebase itself; and, if
>> >>>> not, the optimisation of the codebase for .Net in general.
>> >>>>
>> >>>> Ciaran
>> >>>>
>> >>>>
>> >>>> On 26/03/07, tony njedeh <njedeh@yahoo.com> wrote:
>> >>>>
>> >>>>
>> >>>>> I set up my lucene to a .net 2.0 framework, using VB and it works
>> >>>>> well in that environment.
>> >>>>>
>> >>>>> Anthony
>> >>>>>
>> >>>>> Ciaran Roarty <ciaran.roarty@gmail.com> wrote:
>> >>>>> George et al
>> >>>>>
>> >>>>> I have been using Lucene.Net in a proof-of-concept environment for
>> >>>>> the last couple of months - with my colleague Guy Steel - and we
>> >>>>> wanted to get involved in its development.
>> >>>>>
>> >>>>> I am a .NET developer for a large consultancy company and would
>> >>>>> like to get involved in making Lucene.Net more aligned to .NET and
>> >>>>> .NET 2/3 in particular. However, I am not sure if that is something
>> >>>>> which is initially planned for Lucene.Net. As I understand it, the
>> >>>>> majority of the conversion has been done, initially, using the Java
>> >>>>> Language Conversion Assistant. Some of the Java codebase uses
>> >>>>> patterns that are not best practice for .NET - such as using
>> >>>>> Exceptions for non-exceptional circumstances. This is not to
>> >>>>> denigrate Lucene.Net, it is one of the best pieces of software I
>> >>>>> have used.
>> >>>>>
>> >>>>> So, this email should be considered an introduction and a request
>> >>>>> to be allowed to get involved. I have never worked on an Open
>> >>>>> Source project before so I'll need some guidance but I am willing
>> >>>>> to learn. I do have a couple of questions to start with:
>> >>>>>
>> >>>>> - Is there a roadmap for the product? Is there a roadmap for Lucene
>> >>>>> that we will try and follow?
>> >>>>> - Is there a preferred version of the .NET Framework that it is
>> >>>>> planned to support?
>> >>>>>
>> >>>>> Enough for now, just wanted to introduce myself and get involved.
>> >>>>>
>> >>>>> Cheers,
>> >>>>> Ciaran
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>
>> >
>> >
>>
>
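
The 'throw' point debated in this thread (exceptions used for
non-exceptional circumstances, and altering internal throws without
touching the public API) can be illustrated with a small sketch. It is
written in Java, since the throws in question originate in Java Lucene.
It is not taken from Lucene or Lucene.Net, and every name in it
(ThrowCostSketch, parseWithThrow, tryParse) is hypothetical. It only
shows the general idea, assuming the common failure case can be detected
by a cheap check up front.

```java
import java.util.OptionalInt;

public class ThrowCostSketch {

    // Exception-driven version: every malformed input pays for a
    // throw and a catch, even though bad input is an expected case.
    public static int parseWithThrow(String s, int fallback) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Check-first version: reports failure through the return value
    // instead of throwing. Deliberately narrow: it accepts only unsigned
    // decimal strings of at most nine digits, so the result always fits
    // in an int without an overflow check.
    public static OptionalInt tryParse(String s) {
        if (s == null || s.isEmpty() || s.length() > 9) {
            return OptionalInt.empty();
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();
            }
            value = value * 10 + (c - '0');
        }
        return OptionalInt.of(value);
    }

    public static void main(String[] args) {
        String[] inputs = {"42", "abc", "", "7"};
        for (String in : inputs) {
            int viaThrow = parseWithThrow(in, -1);
            int viaCheck = tryParse(in).orElse(-1);
            System.out.println("'" + in + "' -> " + viaThrow + " / " + viaCheck);
        }
    }
}
```

For the inputs in main, both methods return the same values, so a caller
cannot tell which one ran; only the failure path differs. tryParse is
narrower than Integer.parseInt (no sign, no values above nine digits),
which is the kind of behavioral difference that would have to be checked
before swapping one for the other.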

