lucenenet-dev mailing list archives

From "Andy Berryman" <topd...@gmail.com>
Subject Re: Lucene.Net project involvement
Date Wed, 28 Mar 2007 20:15:49 GMT
Actually ... a better idea would be to list all the places where you made
object replacements (and, of course, what they were).  Then, as a group, we
could decide what would be best to incorporate into the main
baseline.

The project that I have been using Lucene for has some highly volatile indexes
as well, and I ran across many different types of performance problems during
implementation.  I bet that we did similar tasks along the way in our
"profiling".  We were able to work through these without making changes to
the baseline though.  One thing that I really believe in with Lucene is the
structures and architecture.  However, I do know that there could be some
vast improvements made on object usage within the .NET Framework based on
what I have seen in the code.

My point being ... I think we can make a lot of improvements as a whole for
the entire user base, whether everyone really needs them or not.  It would
definitely take Lucene.NET to a whole new level as far as the pure .NET code
is concerned.

Andy
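
For readers following the thread: the 'throw' issue in the quoted messages is about using exceptions to report routine, non-exceptional outcomes. Below is a minimal Java sketch of the two styles, assuming a hypothetical parser for non-negative integer fields. The class and method names are illustrative only and are not part of Lucene or Lucene.Net, and this is not the change Michael's team actually made.

```java
import java.util.OptionalInt;

// Hypothetical illustration only; nothing here is Lucene or Lucene.Net API.
final class ThrowVersusCheck {

    // Exception-driven style: malformed input surfaces as a NumberFormatException.
    public static int parseThrowing(String raw) {
        return Integer.parseInt(raw.trim());
    }

    // Check-first style: malformed input is reported through the return value.
    // Assumption: only non-negative decimal values of at most 9 digits are valid,
    // so the final parseInt call cannot overflow or throw.
    public static OptionalInt parseChecked(String raw) {
        String s = raw.trim();
        if (s.isEmpty() || s.length() > 9) {
            return OptionalInt.empty();
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();
            }
        }
        return OptionalInt.of(Integer.parseInt(s));
    }

    // Counts bad inputs; every bad input costs one thrown exception.
    public static int countFailuresThrowing(String[] inputs) {
        int failures = 0;
        for (String in : inputs) {
            try {
                parseThrowing(in);
            } catch (NumberFormatException e) {
                failures++;
            }
        }
        return failures;
    }

    // Counts bad inputs without throwing anything.
    public static int countFailuresChecked(String[] inputs) {
        int failures = 0;
        for (String in : inputs) {
            if (!parseChecked(in).isPresent()) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        String[] inputs = {"12", "abc", " 7 ", "", "3x"};
        System.out.println(countFailuresThrowing(inputs)); // prints 3
        System.out.println(countFailuresChecked(inputs));  // prints 3
    }
}
```

Both methods agree on which of these inputs are bad, but the checked version never throws. The trade-off is that its notion of "valid" is narrower (no sign, at most 9 digits), and the caller has to inspect the result.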


On 3/28/07, Max Metral <max@artsalliancelabs.com> wrote:
>
> I (and others) would sure love to get your modifications...  Can we see
> them somewhere?
>
> -----Original Message-----
> From: Michael Garski [mailto:mgarski@mac.com]
> Sent: Wednesday, March 28, 2007 3:57 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: Lucene.Net project involvement
>
> Everyone -
>
> I feel I have to chip in my 2 cents regarding the 'throw' issue.  The
> exception throwing inside Lucene, particularly during indexing
> operations and, on a smaller scale, when using QueryParser, can be safely
> altered without affecting either of the 2 goals you list - making the
> index cross-compatible with Java and maintaining a consistent [external]
> API.
>
> The indexes we maintain are constantly being updated as they contain
> millions of small documents with relatively volatile data.  Seeing
> upwards of 8000 exceptions per second while maintaining those indexes
> prompted us to dig into the internals of Lucene.NET to alter the
> throws.  We also modified the internal data structures to use generic
> collections rather than synchronized ArrayLists and Hashtables to cut
> down on the large amount of small object creation we were seeing in a
> profiler.  The end result cut the exceptions to 0 and significantly
> increased performance during index time.  All modifications we have made
> still result in passing unit tests.
>
> I would venture to say that the vast majority of Lucene.NET users would
> not greatly benefit from these performance improvements unless they are
> working on a _very_ high-volume application such as we are.  We
> currently maintain our own branch of Lucene.NET, incorporating any
> changes made to the Subversion trunk into our branch.  As it appears
> these changes are not desired in the official Lucene.NET releases, the
> changes are not difficult for anyone to make on their own should they
> choose to do so.  One of the advantages of open source.
>
> Thanks,
>
> Michael
>
> PS: if you have experience with Lucene.NET, high volume server
> applications, live in the Los Angeles area, and are looking for a new
> job, please email me off the list at mgarski[at]mac[dot]com with a
> recent resume... we are hiring.
>
> George Aroush wrote:
> > Hi Michael, Ciaran and all,
> >
> > Ciaran: welcome aboard to the mailing list and I am glad to see your email
> > generated some interest; I welcome any help you or anyone can offer working
> > on Lucene.Net.
> >
> > My goals for Lucene.Net are the following:
> > 1) The index is cross-compatible with Java's Lucene, such that you can
> > read/write to the same index concurrently using C# or Java Lucene.
> > 2) The APIs are consistent between C# and Java Lucene.  This is why I use
> > "GetXYZ()" instead of C# properties.
> >
> > Up to release 2.0, I kept Lucene.Net on .NET 1.1 because I wanted to support
> > as many .NET installations as possible.  With the Lucene.Net 2.1 release it's
> > time to move to .NET 2.0 -- I don't think anyone has any objection to this,
> > but Mono may have some issues.
> >
> > As for the code clean up, this may be difficult and it depends on what clean
> > up you mean.  Take a look at the open JIRA issues against Lucene.Net and you
> > will see a few about overusing "throw".  Those, unfortunately, we can't fix.
> > Why?  Because those "throw" statements are also present in Java Lucene, and
> > trying to 'fix' them in Lucene.Net may in effect alter the behavior of
> > Lucene.Net.  This said, any extra code or "throw" introduced into Lucene.Net
> > due to conversion mistakes should be fixed.
> >
> > As for the warnings, I don't have direct experience looking at them using
> > VS.NET 2005 (I still use VS.NET 2003).  But in VS.NET 2003, most of those
> > warnings are from comments -- i.e., the class and API XML documentation that
> > doesn't get converted correctly from Java to C#.  If you can think of a tool
> > to clean them up, please let me know.  If it's something else you are
> > talking about, please let me know.
> >
> > Finally, making the Lucene.Net code more compliant with .NET / C# standards
> > would be, in my opinion, a nice thing to have.  But before we can do so, we
> > must get the port working and keep in mind my goal #2 above.
> >
> > Let's discuss this topic further.  Next week, I expect to release an early
> > release of Lucene.Net 2.1.  If folks can help to finish off the conversion,
> > then we can get this out much sooner than the previous release.
> >
> > Regards,
> >
> > -- George Aroush
> >
> >
> > -----Original Message-----
> > From: Michael Mitiaguin [mailto:mitiaguinm@optusnet.com.au]
> > Sent: Tuesday, March 27, 2007 9:19 PM
> > To: lucene-net-dev@incubator.apache.org
> > Subject: Re: Lucene.Net project involvement
> >
> > Ciaran,
> >
> > What I can't understand is, if the core of synchronising versions with Java
> > Lucene is the Java Language Conversion Assistant, how all this cleaning
> > up/revising is going to work.
> > Would it be possible to build an automated procedure which preserves all .Net
> > improvements after conversion from a major Java upgrade?  I am not
> > sure.
> > Even if we somehow track only the changed/added Java classes, each such
> > class still requires merging the new/revised Java functionality with the
> > previous manual changes made to utilise .Net capabilities.
> > You used the term component, but Lucene is rather an API with fine-grained
> > classes, and a simple change may propagate into several classes (files in
> > Java).
> > I don't know how George is coping with that and what the plan would be if,
> > say, Lucene Java 3 is released tomorrow.
> >
> > Michael
> >
> > Ciaran Roarty wrote:
> >
> >
> >> Michael
> >>
> >> I've been in touch with George about getting involved and he said to
> >> post to the mailing list.
> >>
> >> I reckon there's a fair amount of work that could be done in changing the
> >> codebase without affecting the published interface, and I reckon that's
> >> where the bulk of the initial work would take place; as we know, the code
> >> is not yet optimised for .NET.
> >>
> >> Now, balanced against that, in my opinion are the following factors:
> >>
> >> - The code currently compiles against 1.1 and 2.0 (albeit with some
> >> obsolescence); any change to move Lucene.Net to 2.0 would leave the
> >> 1.1 codebase behind.
> >> - There are different types of contribution to the codebase: cleaning up
> >> code and revising methods and classes to benefit from .NET standards and
> >> capabilities are good things. However, Lucene is a powerful IR component,
> >> and if the core development of those capabilities happens in the Java
> >> version then we will need to follow that.
> >>
> >> That's my thoughts for the moment. Maybe we could take a specific part of
> >> the component and revise that. Learning lessons about the process and the
> >> codebase from that exercise, we can move into the guts of the
> >> component......
> >>
> >> Any thoughts?
> >>
> >> Ciaran
> >>
> >> On 27/03/07, Michael Mitiaguin <mitiaguinm@optusnet.com.au> wrote:
> >>
> >>
> >>> Ciaran,
> >>>
> >>> The only active contributor to the project is George Aroush and perhaps
> >>> he is the only person who will give you the most definite answer.
> >>> I am also interested only in the .NET 2/3 codebase. Currently version 2.0.4
> >>> still uses VS 2003 projects and my main concern is the warning messages
> >>> about deprecated and obsolete methods when compiled under .NET 2.
> >>> Supposedly it'll be fixed in 2.1.
> >>> Also, Java Lucene is a more mature project with a lot of people involved
> >>> and it would be safer to cross-translate new things from there, taking
> >>> into consideration .NET specifics.
> >>> On the other hand, in my case, if Lucene is to be part of a project where
> >>> all warning messages are considered errors which must be
> >>> eliminated, it is beyond my competency what can be done to achieve
> >>> that. (Cross-translation of JavaCC-generated code creates a lot of them.)
> >>>
> >>> Michael
> >>>
> >>> Ciaran Roarty wrote:
> >>>
> >>>
> >>>> Anthony
> >>>>
> >>>> I too have used Lucene.Net with C# 2.0 to great effect. However, I am
> >>>> discussing the use of .Net 2.0 in the codebase itself; and, if not, the
> >>>> optimisation of the codebase for .Net in general.
> >>>>
> >>>> Ciaran
> >>>>
> >>>>
> >>>> On 26/03/07, tony njedeh <njedeh@yahoo.com> wrote:
> >>>>
> >>>>
> >>>>> I set up my lucene to a .net 2.0 framework, using VB and it works
> >>>>> well in that environment.
> >>>>>
> >>>>> Anthony
> >>>>>
> >>>>> Ciaran Roarty <ciaran.roarty@gmail.com> wrote:
> >>>>> George et al
> >>>>>
> >>>>> I have been using Lucene.Net in a proof-of-concept environment for the
> >>>>> last couple of months - with my colleague Guy Steel - and we wanted to get
> >>>>> involved in its development.
> >>>>>
> >>>>> I am a .NET developer for a large consultancy company and would like to
> >>>>> get involved in making Lucene.Net more aligned to .NET and .NET 2/3 in
> >>>>> particular. However, I am not sure if that is something which is
> >>>>> initially planned for Lucene.Net. As I understand it, the majority of the
> >>>>> conversion has been done, initially, using the Java Language Conversion
> >>>>> Assistant. Some of the Java codebase uses patterns that are not best
> >>>>> practice for .NET - such as using Exceptions for non-exceptional
> >>>>> circumstances. This is not to denigrate Lucene.Net, it is one of the best
> >>>>> pieces of software I have used.
> >>>>>
> >>>>> So, this email should be considered an introduction and a request to be
> >>>>> allowed to get involved. I have never worked on an Open Source project
> >>>>> before so I'll need some guidance but I am willing to learn. I do have a
> >>>>> couple of questions to start with:
> >>>>>
> >>>>> - Is there a roadmap for the product? Is there a roadmap for Lucene that
> >>>>> we will try and follow?
> >>>>> - Is there a preferred version of the .NET Framework that it is
> >>>>> planned to support?
> >>>>>
> >>>>> Enough for now, just wanted to introduce myself and get involved.
> >>>>>
> >>>>> Cheers,
> >>>>> Ciaran
> >>>>>
> >>>>>
> >>>>>
> >>>
> >
> >
>
