lucenenet-dev mailing list archives

From "Ciaran Roarty" <ciaran.roa...@gmail.com>
Subject Re: Lucene.Net project involvement
Date Wed, 28 Mar 2007 22:35:39 GMT
Ayende

In your opinion, would you say that taking Lucene.Net 2.1 as a baseline and
making it 'pure' .NET would be a sensible thing to do?

Ciaran


On 28/03/07, Ayende Rahien <ayende@ayende.com> wrote:
>
> I have some experience with porting projects from Java to C#. Most often,
> the port is done once, similar to the way it is done in Lucene, and
> porting new features is done on a per-case basis, mostly by hand.
> This makes it possible to take greater advantage of the capabilities of
> the .NET platform, as well as to add behavior that may not exist in the
> original platform.
>
> On 3/28/07, Michael Garski <mgarski@mac.com> wrote:
> >
> > Everyone -
> >
> > I feel I have to chip my 2 cents in regarding the 'throw' issue.  The
> > exception throwing inside Lucene, particularly during indexing
> > operations and on a smaller scale when using QueryParser can be safely
> > altered without affecting either of the 2 goals you list - making the
> > index cross compatible with Java and maintaining consistent [external]
> > API.
> >
> > The indexes we maintain are constantly being updated as they contain
> > millions of small documents with relatively volatile data.  Seeing
> > upwards of 8000 exceptions per second while maintaining those indexes
> > prompted us to dig into the internals of Lucene.NET to alter the
> > throws.  We also modified the internal data structures to use generic
> > collections rather than synchronized arraylists and hashtables to cut
> > down on the large amount of small object creation we were seeing in a
> > profiler.  The end result cut the exceptions to 0 and significantly
> > increased performance during index time.  All modifications we have made
> > still result in passing unit tests.
> >
> > I would venture to say that the vast majority of Lucene.NET users would
> > not greatly benefit from these performance improvements unless they are
> > working on a _very_ high-volume application such as we are.  We
> > currently maintain our own branch of Lucene.NET, incorporating any
> > changes made to the Subversion trunk into our branch.  As it appears
> > these changes are not desired in the official Lucene.NET releases, the
> > changes are not difficult for anyone to make on their own should they
> > choose to do so.  One of the advantages of open source.
> >
> > Thanks,
> >
> > Michael
> >
> > PS: if you have experience with Lucene.NET, high volume server
> > applications, live in the Los Angeles area, and are looking for a new
> > job, please email me off the list at mgarski[at]mac[dot]com with a
> > recent resume... we are hiring.
> >
> > George Aroush wrote:
> > > Hi Michael, Ciaran and all,
> > >
> > > Ciaran: welcome aboard to the mailing list and I am glad to see your
> > > email generated some interest; I welcome any help you or anyone can
> > > offer working on Lucene.Net.
> > >
> > > My goals for Lucene.Net are the following:
> > > 1) The index is cross-compatible with Java Lucene, such that you can
> > > read/write to the same index concurrently using C# or Java Lucene.
> > > 2) The APIs are consistent between C# and Java Lucene.  This is why I
> > > use "GetXYZ()" instead of C# properties.
> > >
> > > Up to release 2.0, I kept Lucene.Net on .NET 1.1 because I wanted to
> > > support as many .NET installations as possible.  With the Lucene.Net 2.1
> > > release it's time to move to .NET 2.0 -- I don't think anyone has any
> > > objection to this, but Mono may have some issues.
> > >
> > > As for the code clean up, this may be difficult and it depends on what
> > > clean up you mean.  Take a look at the open JIRA issues against
> > > Lucene.Net and you will see a few about overusing "throw".  Those,
> > > unfortunately, we can't fix.  Why?  Because those "throw" statements are
> > > also present in Java Lucene, and trying to 'fix' them in Lucene.Net may
> > > in effect alter the behavior of Lucene.Net.  That said, any extra code
> > > or "throw" introduced into Lucene.Net due to conversion mistakes should
> > > be fixed.
> > >
> > > As for the warnings, I don't have direct experience looking at them
> > > using VS.NET 2005 (I still use VS.NET 2003).  But in VS.NET 2003, most
> > > of those warnings are from comments -- i.e., the class and API XML
> > > documentation that doesn't get converted correctly from Java to C#.  If
> > > you can think of a tool to clean them up, please let me know.  If it's
> > > something else you are talking about, please let me know.
> > >
> > > Finally, making the Lucene.Net code more compliant with .NET / C#
> > > standards would be, in my opinion, a nice thing to have.  But before we
> > > can do so, we must get the port working and keep in mind my goal #2
> > > above.
> > >
> > > Let's discuss this topic further.  Next week, I expect to release an
> > > early release of Lucene.Net 2.1.  If folks can help to finish off the
> > > conversion, then we can get this out much sooner than the previous
> > > release.
> > >
> > > Regards,
> > >
> > > -- George Aroush
> > >
> > >
> > > -----Original Message-----
> > > From: Michael Mitiaguin [mailto:mitiaguinm@optusnet.com.au]
> > > Sent: Tuesday, March 27, 2007 9:19 PM
> > > To: lucene-net-dev@incubator.apache.org
> > > Subject: Re: Lucene.Net project involvement
> > >
> > > Ciaran,
> > >
> > > What I can't understand is this: if the core of synchronising versions
> > > with Java Lucene is the Java Language Conversion Assistant, how is all
> > > this cleaning up/revising going to work?
> > > Would it be possible to build an automated procedure which preserves all
> > > .NET improvements after conversion from a major Java upgrade?  I am not
> > > sure.
> > > Even if we somehow track only the changed/added Java classes, each such
> > > class still requires merging the new/revised Java functionality with
> > > the previous manual changes made to utilise .NET capabilities.
> > > You used the term component, but Lucene is rather an API with
> > > fine-grained classes, and a simple change may propagate into several
> > > classes (files in Java).
> > > I don't know how George is coping with that and what the plan would be
> > > if, say, Lucene Java 3 were released tomorrow.
> > >
> > > Michael
> > >
> > > Ciaran Roarty wrote:
> > >
> > >
> > >> Michael
> > >>
> > >> I've been in touch with George about getting involved and he said to
> > >> post to the mailing list.
> > >>
> > >> I reckon there's a fair amount of work could be done in changing the
> > >> codebase without affecting the published interface and I reckon that's
> > >> where the bulk of the initial work would take place; as we know, the
> > >> code is not yet optimised for .NET.
> > >>
> > >> Now, balanced against that, in my opinion are the following factors:
> > >>
> > >> - The code currently compiles against 1.1 and 2.0 (albeit with some
> > >> obsolescence); any change to move Lucene.Net to 2.0 would leave the
> > >> 1.1 codebase behind.
> > >> - There are different types of contribution to the codebase: cleaning
> > >> up code and revising methods and classes to benefit from .NET standards
> > >> and capabilities is a good thing. However, Lucene is a powerful IR
> > >> component, and if the core development of those capabilities happens in
> > >> the Java version then we will need to follow that.
> > >>
> > >> Those are my thoughts for the moment. Maybe we could take a specific
> > >> part of the component and revise that. Learning lessons about the
> > >> process and the codebase from that exercise, we can move into the guts
> > >> of the component......
> > >>
> > >> Any thoughts?
> > >>
> > >> Ciaran
> > >>
> > >> On 27/03/07, Michael Mitiaguin <mitiaguinm@optusnet.com.au> wrote:
> > >>
> > >>
> > >>> Ciaran,
> > >>>
> > >>> The only active contributor to the project is George Aroush, and
> > >>> perhaps he is the only person who can give you a definite answer.
> > >>> I am also interested only in the .NET 2/3 codebase. Currently version
> > >>> 2.0.4 still uses VS 2003 projects, and my main concern is the warning
> > >>> messages about deprecated and obsolete methods when compiled under
> > >>> .NET 2. Supposedly it'll be fixed in 2.1.
> > >>> Also, Java Lucene is a more mature project with a lot of people
> > >>> involved, and it would be safer to cross-translate new things from
> > >>> there, taking .NET specifics into consideration.
> > >>> On the other hand, in my case, if Lucene is to be part of a project
> > >>> where all warning messages are considered errors which must be
> > >>> eliminated, it is beyond my competency what can be done to achieve
> > >>> that. (Cross-translation of the JavaCC-generated code creates a lot
> > >>> of them.)
> > >>>
> > >>> Michael
> > >>>
> > >>> Ciaran Roarty wrote:
> > >>>
> > >>>
> > >>>> Anthony
> > >>>>
> > >>>> I too have used Lucene.Net with C# 2.0 to great effect. However, I am
> > >>>> discussing the use of .Net 2.0 in the codebase itself; and, if not,
> > >>>> the optimisation of the codebase for .Net in general.
> > >>>>
> > >>>> Ciaran
> > >>>>
> > >>>>
> > >>>> On 26/03/07, tony njedeh <njedeh@yahoo.com> wrote:
> > >>>>
> > >>>>
> > >>>>> I set up Lucene on the .NET 2.0 framework, using VB, and it works
> > >>>>> well in that environment.
> > >>>>>
> > >>>>> Anthony
> > >>>>>
> > >>>>> Ciaran Roarty <ciaran.roarty@gmail.com> wrote:
> > >>>>> George et al
> > >>>>>
> > >>>>> I have been using Lucene.Net in a proof-of-concept environment for
> > >>>>> the last couple of months - with my colleague Guy Steel - and we
> > >>>>> wanted to get involved in its development.
> > >>>>>
> > >>>>> I am a .NET developer for a large consultancy company and would like
> > >>>>> to get involved in making Lucene.Net more aligned to .NET and .NET 2/3
> > >>>>> in particular. However, I am not sure if that is something which is
> > >>>>> initially planned for Lucene.Net. As I understand it, the majority of
> > >>>>> the conversion has been done, initially, using the Java Language
> > >>>>> Conversion Assistant. Some of the Java codebase uses patterns that are
> > >>>>> not best practice for .NET - such as using Exceptions for
> > >>>>> non-exceptional circumstances. This is not to denigrate Lucene.Net, it
> > >>>>> is one of the best pieces of software I have used.
> > >>>>>
> > >>>>> So, this email should be considered an introduction and a request to
> > >>>>> be allowed to get involved. I have never worked on an Open Source
> > >>>>> project before so I'll need some guidance but I am willing to learn.
> > >>>>> I do have a couple of questions to start with:
> > >>>>>
> > >>>>> - Is there a roadmap for the product? Is there a roadmap for Lucene
> > >>>>> that we will try and follow?
> > >>>>> - Is there a preferred version of the .NET Framework that it is
> > >>>>> planned to support?
> > >>>>>
> > >>>>> Enough for now, just wanted to introduce myself and get involved.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Ciaran
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>
> > >
> > >
> >
>
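
For readers following the 'throw' discussion quoted above (exceptions used for non-exceptional circumstances, and Michael Garski's report of removing them from hot paths), here is a minimal sketch of the general idea. It is not code from Lucene or Lucene.Net: the class name `ThrowVsCheck` and both methods are hypothetical, and it is written in Java only because that is the upstream language. On the .NET side the same shape appears in TryParse-style methods such as `Int32.TryParse`.

```java
import java.util.OptionalInt;

// Hypothetical illustration only; none of these names exist in Lucene.
public class ThrowVsCheck {

    // Exception-driven style: malformed input is reported by throwing,
    // so every bad token pays for creating and unwinding an exception.
    static int parseWithThrow(String token) {
        return Integer.parseInt(token); // throws NumberFormatException on bad input
    }

    // Check-first style: failure is reported through the return value.
    // Deliberately narrow: accepts only 1 to 9 ASCII digits, so the
    // result always fits in an int and no overflow check is needed.
    static OptionalInt tryParse(String token) {
        if (token == null || token.isEmpty() || token.length() > 9) {
            return OptionalInt.empty();
        }
        int value = 0;
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();
            }
            value = value * 10 + (c - '0');
        }
        return OptionalInt.of(value);
    }

    public static void main(String[] args) {
        String[] tokens = {"42", "oops", "7", ""};
        int failuresViaCatch = 0;
        int failuresViaCheck = 0;
        for (String t : tokens) {
            // Same inputs, two ways of learning that a token is invalid.
            try {
                parseWithThrow(t);
            } catch (NumberFormatException e) {
                failuresViaCatch++;
            }
            if (!tryParse(t).isPresent()) {
                failuresViaCheck++;
            }
        }
        System.out.println(failuresViaCatch + " " + failuresViaCheck);
    }
}
```

Both methods reject the same invalid tokens here, and callers of the second never have to catch anything. That is the sense in which internal throws can be changed without touching the index format or the external API, although whether a given throw in Lucene.Net is purely internal has to be checked case by case, as George notes.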
