lucenenet-dev mailing list archives

From "Ayende Rahien" <aye...@ayende.com>
Subject Re: Lucene.Net project involvement
Date Wed, 28 Mar 2007 23:42:04 GMT
I am afraid I am not familiar enough with the internals of Lucene to say.

On 3/29/07, Ciaran Roarty <ciaran.roarty@gmail.com> wrote:
>
> Ayende
>
> In your opinion, would you say that taking Lucene.Net 2.1 as a baseline and
> making it 'pure' .NET would be a sensible thing to do?
>
> Ciaran
>
>
> On 28/03/07, Ayende Rahien <ayende@ayende.com> wrote:
> >
> > I have some experience with porting projects from Java to C#. Most often,
> > the port is done once, similar to the way it is done on Lucene, and new
> > features are then ported on a case-by-case basis, mostly by hand.
> > This makes it possible to take greater advantage of the capabilities of
> > the .Net platform, as well as to add behavior that may not exist in the
> > original platform.
> >
> > On 3/28/07, Michael Garski <mgarski@mac.com> wrote:
> > >
> > > Everyone -
> > >
> > > I feel I have to chip my 2 cents in regarding the 'throw' issue.  The
> > > exception throwing inside Lucene, particularly during indexing
> > > operations and on a smaller scale when using QueryParser can be safely
> > > altered without affecting either of the 2 goals you list - making the
> > > index cross compatible with Java and maintaining consistent [external]
> > > API.
> > >
> > > > The indexes we maintain are constantly being updated as they contain
> > > > millions of small documents with relatively volatile data.  Seeing
> > > > upwards of 8000 exceptions per second while maintaining those indexes
> > > > prompted us to dig into the internals of Lucene.NET to alter the
> > > > throws.  We also modified the internal data structures to use generic
> > > > collections rather than synchronized arraylists and hashtables to cut
> > > > down on the large amount of small object creation we were seeing in a
> > > > profiler.  The end result cut the exceptions to 0 and significantly
> > > > increased performance at index time.  All modifications we have
> > > > made still result in passing unit tests.
> > >
> > > > I would venture to say that the vast majority of Lucene.NET users
> > > > would not greatly benefit from these performance improvements unless
> > > > they are working on a _very_ high-volume application such as we are.
> > > > We currently maintain our own branch of Lucene.NET, incorporating any
> > > > changes made to the Subversion trunk into our branch.  Although it
> > > > appears these changes are not desired in the official Lucene.NET
> > > > releases, they are not difficult for anyone to make on their own
> > > > should they choose to do so.  One of the advantages of open source.
> > >
> > > Thanks,
> > >
> > > Michael
> > >
> > > PS: if you have experience with Lucene.NET, high volume server
> > > applications, live in the Los Angeles area, and are looking for a new
> > > job, please email me off the list at mgarski[at]mac[dot]com with a
> > > recent resume... we are hiring.
> > >
> > > George Aroush wrote:
> > > > Hi Michael, Ciaran and all,
> > > >
> > > > > Ciaran: welcome aboard to the mailing list and I am glad to see your
> > > > > email generated some interest; I welcome any help you or anyone can
> > > > > offer working on Lucene.Net.
> > > >
> > > > > My goals for Lucene.Net are the following:
> > > > > 1) The index is cross compatible with Java's Lucene such that you can
> > > > > read/write to the same index concurrently using C# or Java Lucene.
> > > > > 2) The APIs are consistent between C# and Java Lucene.  This is why I
> > > > > use "GetXYZ()" instead of C# properties.
> > > >
> > > > > Up to release 2.0, I kept Lucene.Net on .NET 1.1 because I wanted to
> > > > > support as many .NET installations as possible.  With the Lucene.Net
> > > > > 2.1 release it's time to move to .NET 2.0 -- I don't think anyone has
> > > > > any objection to this, but Mono may have some issues.
> > > >
> > > > > As for the code clean up, this may be difficult and it depends on
> > > > > what clean up you mean.  Take a look at the open JIRA issues against
> > > > > Lucene.Net and you will see a few about overusing "throw".  Those,
> > > > > unfortunately, we can't fix.  Why?  Because those "throw" statements
> > > > > are also present in Java Lucene and trying to 'fix' them in
> > > > > Lucene.Net may in effect alter the behavior of Lucene.Net.  This
> > > > > said, any extra code or "throw" introduced into Lucene.Net due to
> > > > > conversion mistakes should be fixed.
> > > >
> > > > > As for the warnings, I don't have direct experience looking at them
> > > > > using VS.NET 2005 (I still use VS.NET 2003).  But in VS.NET 2003,
> > > > > most of those warnings are from comments -- i.e., the class and API
> > > > > XML documentation that doesn't get converted correctly from Java to
> > > > > C#.  If you can think of a tool to clean them up, please let me know.
> > > > > If it's something else you are talking about, please let me know.
> > > >
> > > > > Finally, making the Lucene.Net code more compliant with .NET / C#
> > > > > standards would be, in my opinion, a nice thing to have.  But before
> > > > > we can do so, we must get the port working and keep in mind my goal
> > > > > #2 above.
> > > >
> > > > > Let's discuss this topic further.  Next week, I expect to release an
> > > > > early release of Lucene.Net 2.1.  If folks can help to finish off the
> > > > > conversion, then we can get this out much sooner than the previous
> > > > > release.
> > > >
> > > > Regards,
> > > >
> > > > -- George Aroush
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Michael Mitiaguin [mailto:mitiaguinm@optusnet.com.au]
> > > > Sent: Tuesday, March 27, 2007 9:19 PM
> > > > To: lucene-net-dev@incubator.apache.org
> > > > Subject: Re: Lucene.Net project involvement
> > > >
> > > > Ciaran,
> > > >
> > > > > What I can't understand is how all this cleaning up/revising is
> > > > > going to work if the core of synchronising versions with Java Lucene
> > > > > is the Java Language Conversion Assistant.
> > > > > Would it be possible to build an automated procedure which preserves
> > > > > all .Net improvements after conversion from a major Java upgrade?  I
> > > > > am not sure.
> > > > > Even if we somehow track only the changed/added Java classes, each
> > > > > such class still requires merging the new/revised Java functionality
> > > > > with the previous manual changes made to utilise .Net capabilities.
> > > > > You used the term component, but Lucene is rather an API with
> > > > > fine-grained classes, and a simple change may propagate into several
> > > > > classes (files in Java).
> > > > > I don't know how George is coping with that and what the plan would
> > > > > be if, say, Lucene Java 3 were released tomorrow.
> > > >
> > > > Michael
> > > >
> > > > Ciaran Roarty wrote:
> > > >
> > > >
> > > >> Michael
> > > >>
> > > > >> I've been in touch with George about getting involved and he said
> > > > >> to post to the mailing list.
> > > > >>
> > > > >> I reckon there's a fair amount of work that could be done in
> > > > >> changing the codebase without affecting the published interface and
> > > > >> I reckon that's where the bulk of the initial work would take place;
> > > > >> as we know, the code is not yet optimised for .NET.
> > > >>
> > > > >> Now, balanced against that, in my opinion are the following factors:
> > > > >>
> > > > >> - The code currently compiles against 1.1 and 2.0 (albeit with some
> > > > >> obsolescence); any change to move Lucene.Net to 2.0 would leave the
> > > > >> 1.1 codebase behind.
> > > > >> - There are different types of contribution to the codebase: cleaning
> > > > >> up code; revising methods and classes to benefit from .NET standards
> > > > >> and capabilities is a good thing. However, Lucene is a powerful IR
> > > > >> component and if the core development of those capabilities happens
> > > > >> in the Java version then we will need to follow that.
> > > >>
> > > > >> Those are my thoughts for the moment. Maybe we could take a specific
> > > > >> part of the component and revise that. Learning lessons about the
> > > > >> process and the codebase from that exercise, we can move into the
> > > > >> guts of the component......
> > > >>
> > > >> Any thoughts?
> > > >>
> > > >> Ciaran
> > > >>
> > > > >> On 27/03/07, Michael Mitiaguin <mitiaguinm@optusnet.com.au> wrote:
> > > >>
> > > >>
> > > >>> Ciaran,
> > > >>>
> > > > >>> The only active contributor to the project is George Aroush and
> > > > >>> perhaps he is the only person who can give you a definite answer.
> > > > >>> I am also interested only in the .Net 2/3 codebase. Currently
> > > > >>> version 2.0.4 still uses VS 2003 projects and my main concern is the
> > > > >>> warning messages about deprecated and obsolete methods when it is
> > > > >>> compiled under .Net 2. Supposedly it will be fixed in 2.1.
> > > > >>> Also, Java Lucene is a more mature project with a lot of people
> > > > >>> involved and it would be safer to cross-translate new things from
> > > > >>> there, taking .Net specifics into consideration.
> > > > >>> On the other hand, in my case, if Lucene is to be part of a project
> > > > >>> where all warning messages are considered errors which must be
> > > > >>> eliminated, it is beyond my competence to say what can be done to
> > > > >>> achieve that. (Cross-translation of JavaCC-generated code creates a
> > > > >>> lot of them.)
> > > >>>
> > > >>> Michael
> > > >>>
> > > >>> Ciaran Roarty wrote:
> > > >>>
> > > >>>
> > > >>>> Anthony
> > > >>>>
> > > > >>>> I too have used Lucene.Net with C# 2.0 to great effect. However, I
> > > > >>>> am discussing the use of .Net 2.0 in the codebase itself; and, if
> > > > >>>> not, the optimisation of the codebase for .Net in general.
> > > >>>>
> > > >>>> Ciaran
> > > >>>>
> > > >>>>
> > > >>>> On 26/03/07, tony njedeh <njedeh@yahoo.com> wrote:
> > > >>>>
> > > >>>>
> > > > >>>>> I set up my Lucene on the .NET 2.0 framework, using VB, and it
> > > > >>>>> works well in that environment.
> > > >>>>>
> > > >>>>> Anthony
> > > >>>>>
> > > >>>>> Ciaran Roarty <ciaran.roarty@gmail.com> wrote:
> > > >>>>> George et al
> > > >>>>>
> > > > >>>>> I have been using Lucene.Net in a proof-of-concept environment for
> > > > >>>>> the last couple of months - with my colleague Guy Steel - and we
> > > > >>>>> wanted to get involved in its development.
> > > >>>>>
> > > > >>>>> I am a .NET developer for a large consultancy company and would
> > > > >>>>> like to get involved in making Lucene.Net more aligned to .NET and
> > > > >>>>> .NET 2/3 in particular. However, I am not sure if that is something
> > > > >>>>> which is initially planned for Lucene.Net. As I understand it, the
> > > > >>>>> majority of the conversion has been done, initially, using the Java
> > > > >>>>> Language Conversion Assistant. Some of the Java codebase uses
> > > > >>>>> patterns that are not best practice for .NET - such as using
> > > > >>>>> Exceptions for non-exceptional circumstances. This is not to
> > > > >>>>> denigrate Lucene.Net, it is one of the best pieces of software I
> > > > >>>>> have used.
> > > >>>>>
> > > > >>>>> So, this email should be considered an introduction and a request
> > > > >>>>> to be allowed to get involved. I have never worked on an Open
> > > > >>>>> Source project before so I'll need some guidance but I am willing
> > > > >>>>> to learn. I do have a couple of questions to start with:
> > > >>>>>
> > > > >>>>> - Is there a roadmap for the product? Is there a roadmap for Lucene
> > > > >>>>> that we will try and follow?
> > > > >>>>> - Is there a preferred version of the .NET Framework that it is
> > > > >>>>> planned to support?
> > > >>>>>
> > > > >>>>> Enough for now, just wanted to introduce myself and get involved.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Ciaran
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >
> > > >
> > >
> >
>
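
The 'throw' issue raised in the thread (exceptions used for
non-exceptional circumstances) can be illustrated with a minimal Java
sketch. This is not code from Lucene or Lucene.Net: the class ThrowSketch
and the methods parseOrThrow, tryParse and countThrows are hypothetical
names chosen for the example, and integer parsing merely stands in for
whatever operation fails routinely during indexing. As a simplifying
assumption, tryParse accepts at most nine digits so that the value always
fits in an int.

```java
import java.util.OptionalInt;

// Hypothetical illustration; not taken from Lucene or Lucene.Net.
final class ThrowSketch {

    // Exception-driven style: malformed input is reported by throwing.
    static int parseOrThrow(String token) {
        return Integer.parseInt(token); // throws NumberFormatException on bad input
    }

    // Check-first style: malformed input is an expected outcome, so it is
    // reported through an empty result instead of an exception.
    static OptionalInt tryParse(String token) {
        if (token == null || token.isEmpty()) {
            return OptionalInt.empty();
        }
        int start = (token.charAt(0) == '-') ? 1 : 0;
        // Reject a lone "-" and anything longer than nine digits (assumed limit).
        if (start == token.length() || token.length() - start > 9) {
            return OptionalInt.empty();
        }
        for (int i = start; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();
            }
        }
        return OptionalInt.of(Integer.parseInt(token)); // input already validated
    }

    // Counts how many exceptions the exception-driven style raises for a batch.
    static int countThrows(String[] tokens) {
        int thrown = 0;
        for (String t : tokens) {
            try {
                parseOrThrow(t);
            } catch (NumberFormatException e) {
                thrown++;
            }
        }
        return thrown;
    }

    public static void main(String[] args) {
        String[] tokens = {"42", "abc", "-7", "", "12x"};
        int skipped = 0;
        for (String t : tokens) {
            if (!tryParse(t).isPresent()) {
                skipped++;
            }
        }
        System.out.println("exceptions thrown: " + countThrows(tokens));
        System.out.println("tokens skipped without throwing: " + skipped);
    }
}
```

Both styles reject the same inputs; the difference is that the check-first
version does so without constructing and unwinding an exception for each
bad token. Because tryParse is an additional method, the existing
exception-throwing one can stay as it is for callers that rely on it.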
