lucenenet-dev mailing list archives

From "Ciaran Roarty" <ciaran.roa...@gmail.com>
Subject Re: Lucene.Net project involvement
Date Wed, 28 Mar 2007 22:39:37 GMT
Michael

I agree with you about the exceptions and the general point about
maintaining the public API whilst not necessarily keeping the underlying
processing.

Can you make your modifications to the codebase available to the community
or not?

Ciaran


On 28/03/07, Michael Garski <mgarski@mac.com> wrote:
>
> Everyone -
>
> I feel I have to chip in my 2 cents regarding the 'throw' issue.  The
> exception throwing inside Lucene, particularly during indexing
> operations and, on a smaller scale, when using QueryParser, can be
> safely altered without affecting either of the 2 goals you list -
> making the index cross compatible with Java and maintaining a
> consistent [external] API.
>
> The indexes we maintain are constantly being updated as they contain
> millions of small documents with relatively volatile data.  Seeing
> upwards of 8000 exceptions per second while maintaining those indexes
> prompted us to dig into the internals of Lucene.NET to alter the
> throws.  We also modified the internal data structures to use generic
> collections rather than synchronized arraylists and hashtables to cut
> down on the large amount of small object creation we were seeing in a
> profiler.  The end result cut the exceptions to 0 and significantly
> increased performance during index time.  All modifications we have made
> still result in passing unit tests.
>
> I would venture to say that the vast majority of Lucene.NET users would
> not greatly benefit from these performance improvements unless they are
> working on a _very_ high-volume application such as we are.  We
> currently maintain our own branch of Lucene.NET, incorporating any
> changes made to the Subversion trunk into our branch.  It appears
> these changes are not desired in the official Lucene.NET releases, but
> they are not difficult for anyone to make on their own should they
> choose to do so.  One of the advantages of open source.
>
> Thanks,
>
> Michael
>
> PS: if you have experience with Lucene.NET, high volume server
> applications, live in the Los Angeles area, and are looking for a new
> job, please email me off the list at mgarski[at]mac[dot]com with a
> recent resume... we are hiring.
>
> George Aroush wrote:
> > Hi Michael, Ciaran and all,
> >
> > Ciaran: welcome aboard to the mailing list and I am glad to see your
> > email generated some interest; I welcome any help you or anyone can
> > offer working on Lucene.Net.
> >
> > My goals for Lucene.Net are the following:
> > 1) The index is cross compatible with Java's Lucene, such that you can
> > read/write to the same index concurrently using C# or Java Lucene.
> > 2) The APIs are consistent between C# and Java Lucene.  This is why I
> > use "GetXYZ()" instead of C# properties.
> >
> > Up to release 2.0, I kept Lucene.Net on .NET 1.1 because I wanted to
> > support as many .NET installations as possible.  With the Lucene.Net
> > 2.1 release it's time to move to .NET 2.0 -- I don't think anyone has
> > any objection to this, but Mono may have some issues.
> >
> > As for the code clean up, this may be difficult and it depends on what
> > clean up you mean.  Take a look at the open JIRA issues against
> > Lucene.Net and you will see a few about overusing "throw".  Those,
> > unfortunately, we can't fix.  Why?  Because those "throw" statements
> > are also present in Java Lucene, and trying to 'fix' them in
> > Lucene.Net may in effect alter the behavior of Lucene.Net.  That
> > said, any extra code or "throw" introduced into Lucene.Net due to
> > conversion mistakes should be fixed.
> >
> > As for the warnings, I don't have direct experience looking at them
> > using VS.NET 2005 (I still use VS.NET 2003).  But in VS.NET 2003, most
> > of those warnings are from comments -- i.e., the class and API XML
> > documentation that doesn't get converted correctly from Java to C#.
> > If you can think of a tool to clean them up, please let me know.  If
> > it's something else you are talking about, please let me know.
> >
> > Finally, making the Lucene.Net code more compliant with .NET / C#
> > standards would be, in my opinion, a nice thing to have.  But before
> > we can do so, we must get the port working and keep in mind my goal #2
> > above.
> >
> > Let's discuss this topic further.  Next week, I expect to release an
> > early release of Lucene.Net 2.1.  If folks can help to finish off the
> > conversion, then we can get this out much sooner than the previous
> > release.
> >
> > Regards,
> >
> > -- George Aroush
> >
> >
> > -----Original Message-----
> > From: Michael Mitiaguin [mailto:mitiaguinm@optusnet.com.au]
> > Sent: Tuesday, March 27, 2007 9:19 PM
> > To: lucene-net-dev@incubator.apache.org
> > Subject: Re: Lucene.Net project involvement
> >
> > Ciaran,
> >
> > What I can't understand is how all this cleaning up/revising is going
> > to work if the core of synchronising versions with Java Lucene is the
> > Java Language Conversion Assistant.
> > Would it be possible to build an automated procedure which preserves
> > all .NET improvements after conversion from a major Java upgrade?  I
> > am not sure.
> > Even if we somehow track only the changed/added Java classes, each
> > such class still requires merging the new/revised Java functionality
> > with the previous manual changes made to utilise .NET capabilities.
> > You used the term component, but Lucene is rather an API with
> > fine-grained classes, and a simple change may propagate into several
> > classes (files in Java).
> > I don't know how George is coping with that and what the plan would be
> > if, say, Lucene Java 3 were released tomorrow.
> >
> > Michael
> >
> > Ciaran Roarty wrote:
> >
> >
> >> Michael
> >>
> >> I've been in touch with George about getting involved and he said to
> >> post to the mailing list.
> >>
> >> I reckon there's a fair amount of work that could be done in changing
> >> the codebase without affecting the published interface, and I reckon
> >> that's where the bulk of the initial work would take place; as we
> >> know, the code is not yet optimised for .NET.
> >>
> >> Now, balanced against that, in my opinion are the following factors:
> >>
> >> - The code currently compiles against 1.1 and 2.0 (albeit with some
> >> obsolescence); any change to move Lucene.Net to 2.0 would leave the
> >> 1.1 codebase behind.
> >> - There are different types of contribution to the codebase: cleaning
> >> up code and revising methods and classes to benefit from .NET
> >> standards and capabilities is a good thing. However, Lucene is a
> >> powerful IR component and if the core development of those
> >> capabilities happens in the Java version then we will need to follow
> >> that.
> >>
> >> Those are my thoughts for the moment. Maybe we could take a specific
> >> part of the component and revise that. Learning lessons about the
> >> process and the codebase from that exercise, we can move into the guts
> >> of the component...
> >>
> >> Any thoughts?
> >>
> >> Ciaran
> >>
> >> On 27/03/07, Michael Mitiaguin <mitiaguinm@optusnet.com.au> wrote:
> >>
> >>
> >>> Ciaran,
> >>>
> >>> The only active contributor to the project is George Aroush, and
> >>> perhaps he is the only person who can give you a definite answer.
> >>> I am also interested only in the .NET 2/3 codebase. Currently version
> >>> 2.0.4 still uses VS 2003 projects, and my main concern is the warning
> >>> messages about deprecated and obsolete methods when compiled under
> >>> .NET 2. Supposedly this will be fixed in 2.1.
> >>> Also, Java Lucene is a more mature project with a lot of people
> >>> involved, and it would be safer to cross-translate new things from
> >>> there, taking .NET specifics into consideration.
> >>> On the other hand, in my case, if Lucene is to be part of a project
> >>> where all warning messages are considered errors which must be
> >>> eliminated, it is beyond my competency to say what can be done to
> >>> achieve that. (Cross-translation of JavaCC-generated code creates a
> >>> lot of them.)
> >>>
> >>> Michael
> >>>
> >>> Ciaran Roarty wrote:
> >>>
> >>>
> >>>> Anthony
> >>>>
> >>>> I too have used Lucene.Net with C# 2.0 to great effect. However, I am
> >>>> discussing the use of .Net 2.0 in the codebase itself; and, if not,
> >>>> the optimisation of the codebase for .Net in general.
> >>>>
> >>>> Ciaran
> >>>>
> >>>>
> >>>> On 26/03/07, tony njedeh <njedeh@yahoo.com> wrote:
> >>>>
> >>>>
> >>>>> I set up Lucene on the .NET 2.0 framework, using VB, and it works
> >>>>> well in that environment.
> >>>>>
> >>>>> Anthony
> >>>>>
> >>>>> Ciaran Roarty <ciaran.roarty@gmail.com> wrote:
> >>>>> George et al
> >>>>>
> >>>>> I have been using Lucene.Net in a proof-of-concept environment for
> >>>>> the last couple of months - with my colleague Guy Steel - and we
> >>>>> wanted to get involved in its development.
> >>>>>
> >>>>> I am a .NET developer for a large consultancy company and would
> >>>>> like to get involved in making Lucene.Net more aligned to .NET and
> >>>>> .NET 2/3 in particular. However, I am not sure if that is something
> >>>>> which is initially planned for Lucene.Net. As I understand it, the
> >>>>> majority of the conversion has been done, initially, using the Java
> >>>>> Language Conversion Assistant. Some of the Java codebase uses
> >>>>> patterns that are not best practice for .NET - such as using
> >>>>> Exceptions for non-exceptional circumstances. This is not to
> >>>>> denigrate Lucene.Net, it is one of the best pieces of software I
> >>>>> have used.
> >>>>>
> >>>>> So, this email should be considered an introduction and a request
> >>>>> to be allowed to get involved. I have never worked on an Open
> >>>>> Source project before so I'll need some guidance but I am willing
> >>>>> to learn. I do have a couple of questions to start with:
> >>>>>
> >>>>> - Is there a roadmap for the product? Is there a roadmap for Lucene
> >>>>> that we will try and follow?
> >>>>> - Is there a preferred version of the .NET Framework that it is
> >>>>> planned to support?
> >>>>>
> >>>>> Enough for now, just wanted to introduce myself and get involved.
> >>>>>
> >>>>> Cheers,
> >>>>> Ciaran
> >>>>>
> >>>>>
> >>>>>
> >>>
> >
> >
>
