lucenenet-dev mailing list archives

From "Jeff Rodenburg" <jeff.rodenb...@gmail.com>
Subject Remote searching with Lucene - forward progress
Date Wed, 13 Sep 2006 15:31:59 GMT
An update on the Remote Searching project I'm bringing forward.  I've
completed the base code for hand-off to the community.  I'm presently
working through a remoting/serialization issue that's popped up recently.
This appears to be something new in the Lucene 2.0 release.  I'm working
through that issue now, but I have no expectation of when that's resolved.

Rather than release a non-working system, I'm going to resolve this problem
first.  Once things are working appropriately, I'll send out a release
message.

Thanks and if you have remoting experience and suggestions, feel free to
ping me.  :-)

cheers,
jeff r.


On 9/7/06, Jeff Rodenburg <jeff.rodenburg@gmail.com> wrote:
>
> All -
>
> Another update on the remote searching application code that's been
> mentioned in this thread.  I'm near completion of the entire collection of
> files that are needed for this project -- libraries, applications, unit
> tests, and documentation.  There's quite a bit to this, and thanks for
> everybody's patience as I assemble the code into something that's less than
> confusing.  There are several working pieces, so I'm packaging it for
> consumption.
>
> I expect to have this available sometime in the next few days, barring
> things like my life and regular job from getting in the way.  Again, I'll
> share an announcement to the list when I've made the files available.
>
> Thanks,
> jeff r.
>
>
>
> On 8/26/06, Jeff Rodenburg <jeff.rodenburg@gmail.com> wrote:
> >
> > As promised, an update to the list.
> >
> > I have code ready for delivery, if I can get svn access to the contrib
> > section.  A request has been made for this but it's going nowhere, so I'm
> > going to find another place to host the files.
> >
> > There's quite a bit of documentation behind this so I'm working
> > diligently to explain how this works.  If anyone has a place to hold the
> > code until the uber-powers at apache decide to grant me access, we would
> > greatly appreciate the assistance.
> >
> > cheers,
> > jeff r.
> >
> >
> >
> > > On 8/23/06, Jeff Rodenburg <jeff.rodenburg@gmail.com> wrote:
> > >
> > > Just a follow-up to everyone on this topic.  I received a lot of
> > > offlist mail about this, so this message has a rather wide distribution.
> > >
> > > > I'm in the process of modifying the code for our distributed search
> > > components so that they're generic enough for general usage and public
> > > consumption.  This is taking a little of my time, but nonetheless I expect
> > > to complete it soon.
> > >
> > > As for distributing the code, it will be located in the contrib
> > > > portion of the Lucene.Net repository at apache.org.  There is some
> > > logistic work involved, but ideally this is moving forward.
> > >
> > > As soon as I have more information to relay, I'll pass it along to the
> > > list.
> > >
> > > cheers,
> > > jeff r.
> > >
> > >
> > >
> > >
> > > > On 8/21/06, Jeff Rodenburg <jeff.rodenburg@gmail.com> wrote:
> > > >
> > > > Hello all -
> > > >
> > > > I've been watching this thread to follow the direction and thought I
> > > > might be able to offer some assistance.  I run a search system that involves
> > > > 4 separate search servers -- 3 serving search objects via RemoteSearchable,
> > > > and a 4th that serves in an index updating role.
> > > >
> > > > The codebase for Lucene.Net provides all the library routines one
> > > > needs to provide distributed search capabilities, but does not provide
> > > > facilities for distributed search operation -- nor should it.  The ideas
> > > > presented here are certainly possible; I've implemented a working operation
> > > > without requiring the changes described here.  I'm confident in our
> > > > implementation; for the calendar year, our uptime/availability of search
> > > > services is 99.99%.  Our only outage was related to network
> > > > hardware, otherwise we're sitting solid at 100%.
> > > >
> > > > I've been authorized to provide our operational code for distributed
> > > > search under Lucene.Net to the community at large.  Some of the code
> > > > is customized to our operation, but for the most part it's rather generic.
> > > > We started the project under Lucene v1.4.3, but the operational
> > > > aspect still applies under v1.9.
> > > >
> > > > The system consists of a LuceneServer, which provides searchability
> > > > against indexes as defined in XML configuration files.  In addition, an
> > > > IndexUpdateServer provides master index updating, master/slave index
> > > > > replication and automated index maintenance.  Integration with our
> > > > > web site ensures the index stays available, updated and current.
> > > > > There's a great
> > > > deal of applied knowledge and learned behavior of many of the underlying
> > > > sub-system components that distributed search under Lucene.Net makes
> > > > use of -- .Net remoting, garbage collection, etc.
> > > >
> > > > If anyone has interest, please reply.  Contributing this code
> > > > > requires a little cleanup of our customization work, so my response may
> > > > > not be immediate, but I would make efforts to release the code in short
> > > > > order.
> > > >
> > > > thanks,
> > > > jeff r.
> > > >
> > > >
> > > >
> > > >
> > > > > On 8/19/06, Robert Boulanger <robert@boulanger.at> wrote:
> > > > >
> > > > > Hi Elena, hi Rest,
> > > > >
> > > > > > Dear All,
> > > > > >
> > > > > > > The application I am working on is intended to make use of the
> > > > > > > distributed search capabilities of the Lucene library. While
> > > > > > > trying to work with Lucene's RemoteSearchable class, I faced some
> > > > > > > problems caused by the current Lucene implementation. In the
> > > > > > > following I'll try to describe them, as well as the possible
> > > > > > > solutions I identified. The most important question for me is
> > > > > > > whether these changes have a chance of being integrated into the
> > > > > > > coming Lucene versions, such that remote searches would really
> > > > > > > become feasible. I would appreciate any feedback.
> > > > >
> > > > > Same problem for me and I found some more issues which I explain
> > > > > below:
> > > > >
> > > > > >
> > > > > > > The first problem concerns the construction of the
> > > > > > > RemoteSearchable object. The .NET framework allows for both server
> > > > > > > and client activation models of remote objects. Currently, the
> > > > > > > RemoteSearchable class has only one constructor, which requires
> > > > > > > knowledge of a local Searchable object:
> > > > > >
> > > > > > public RemoteSearchable(Lucene.Net.Search.Searchable local)
> > > > > >
> > > > > I just added a new constructor to RemoteSearchable
> > > > > public RemoteSearchable(): base()
> > > > > {
> > > > > this.local = this.local;
> > > > > }
> > > > >
> > > > > > Not the finest approach, but it works for me so far.
> > > > >
> > > > > > > Since this "local" object is located on the server, knowledge of
> > > > > > > the server's index paths is needed for its creation. However, there
> > > > > > > are at least some scenarios where only the server, but not the
> > > > > > > client, knows where the indexes are stored on the server side. I
> > > > > > > think this problem could be solved by extending the RemoteSearchable
> > > > > > > class with a standard constructor that reads the names of the
> > > > > > > indexes to be published out of a configuration file on the server
> > > > > > > side.
> > > > > >
> > > > > > My "Server" now implements a class which inherits directly from
> > > > > > RemoteSearchable. In the parameterless constructor there, I read the
> > > > > > server-side config file, which contains the index location, create a
> > > > > > new IndexReader and pass it as an argument to MyBase.New().
> > > > > > See sample below.
> > > > >
> > > > > > 2. Bug in Term construction
> > > > > [snip]
> > > > >
> > > > > > This whole chapter was very useful and I can confirm everything
> > > > > > works fine from there on.
> > > > > >
> > > > > > But there is still a bug in FieldDocSortedHitQueue, line 130 and
> > > > > > below: I figured out that the casts do not work when the system is
> > > > > > running in a non-English globalization context.
> > > > > > The string in docA.fields[i], which might be for example 1.345678, is
> > > > > > converted to 1345678.0, since the decimal sign seems to be
> > > > > > misinterpreted on German systems.
> > > > > > So the cast results in an overflow.
> > > > >
> > > > > So I changed it as follows:
> > > > >
> > > > > case SortField.SCORE:
> > > > > float r1 = (float)Convert.ToSingle(docA.fields[i],
> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
> > > > > > float r2 = (float)Convert.ToSingle(docB.fields[i],
> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
> > > > > if (r1 > r2)
> > > > > c = - 1;
> > > > > if (r1 < r2)
> > > > > c = 1;
> > > > > break;
> > > > >
> > > > > Same in line 172 and 174:
> > > > >
> > > > > float f1 = (float)Convert.ToSingle(docA.fields[i],
> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
> > > > > > //UPGRADE_TODO: The equivalent in .NET for method
> > > > > > //'java.lang.Float.floatValue' may return a different value.
> > > > > > //"ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?index='!DefaultContextWindowIndex'&keyword='jlca1043'"
> > > > > float f2 = (float)Convert.ToSingle(docB.fields[i],
> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
> > > > >
> > > > >
> > > > >
> > > > > > A tiny client/server solution now looks like this (here in VB.NET):
> > > > > >
> > > > > > SERVER:
> > > > > > Imports System.Runtime.Remoting
> > > > > > Imports System.Runtime.Remoting.Channels
> > > > > > Imports System.Runtime.Remoting.Channels.Http
> > > > > > Imports Lucene.Net.Search
> > > > > >
> > > > > > Public Class RemoteQuery
> > > > > >     Inherits RemoteSearchable
> > > > > >     ' Parameterless constructor so the server can activate the object
> > > > > >     Public Sub New()
> > > > > >         MyBase.New(New IndexSearcher("C:\lucene\index"))
> > > > > >     End Sub
> > > > > >     Public Sub New(ByVal local As Searchable)
> > > > > >         MyBase.New(local)
> > > > > >     End Sub
> > > > > > End Class
> > > > > >
> > > > > > Module Module1
> > > > > >     Public Sub Main(ByVal args As System.String())
> > > > > >         Dim chnl As New HttpChannel(8888)
> > > > > >         ChannelServices.RegisterChannel(chnl, False)
> > > > > >         Dim indexName As System.String = Nothing
> > > > > >         RemotingConfiguration.RegisterWellKnownServiceType( _
> > > > > >             GetType(RemoteQuery), "Searchable", _
> > > > > >             WellKnownObjectMode.Singleton)
> > > > > >         ' Keep the server alive until Enter is pressed
> > > > > >         System.Console.ReadLine()
> > > > > >     End Sub
> > > > > > End Module
> > > > > >
> > > > > > CLIENT:
> > > > > > Imports Lucene.Net.Analysis.Standard
> > > > > > Imports Lucene.Net.QueryParsers
> > > > > > Imports Lucene.Net.Search
> > > > > >
> > > > > > Sub Main()
> > > > > >     Dim searchables As Lucene.Net.Search.Searchable() = _
> > > > > >         New Lucene.Net.Search.Searchable() {LookupRemote()}
> > > > > >     Dim searcher As Searcher = New MultiSearcher(searchables)
> > > > > >     Dim sort As New Lucene.Net.Search.Sort
> > > > > >     sort.SetSort(Lucene.Net.Search.SortField.FIELD_SCORE)
> > > > > >     Dim query As Query = QueryParser.Parse("Harry", "body", _
> > > > > >         New StandardAnalyzer())
> > > > > >     Dim result As Hits = searcher.Search(query, sort)
> > > > > > End Sub
> > > > > >
> > > > > > Private Function LookupRemote() As Lucene.Net.Search.Searchable
> > > > > >     Return CType(Activator.GetObject( _
> > > > > >         GetType(Lucene.Net.Search.Searchable), _
> > > > > >         "http://192.168.8.7:8888/Searchable"), _
> > > > > >         Lucene.Net.Search.Searchable)
> > > > > > End Function
> > > > >
> > > > > > Hope this helps you and anybody else who has had problems with
> > > > > > remote search so far.
> > > > > >
> > > > > > BTW: this all refers to version 1.9rc1
> > > > >
> > > > > --Robert Boulanger
> > > > >
> > > >
> > > >
> > >
> >
>
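
For readers following the FieldDocSortedHitQueue discussion quoted above, here is a minimal, self-contained C# sketch of the culture issue Robert describes. It is not Lucene.Net code: the class ScoreParsing and the helpers ToInvariantSingle and CompareScores are hypothetical names made up for illustration, and the only real API calls are Convert.ToSingle with an IFormatProvider and the System.Globalization types. The sketch assumes sort-field values may arrive as strings such as "1.345678" and uses the same comparison shape as the SCORE case quoted above.

```csharp
using System;
using System.Globalization;

public static class ScoreParsing
{
    // Hypothetical helper (not part of Lucene.Net): converts a sort-field
    // value to float using culture-independent number formatting, so "."
    // is always read as the decimal separator.
    public static float ToInvariantSingle(object value)
    {
        return Convert.ToSingle(value, NumberFormatInfo.InvariantInfo);
    }

    // Hypothetical comparer mirroring the SCORE case quoted above:
    // the higher score sorts first (-1), the lower score sorts last (1).
    public static int CompareScores(object a, object b)
    {
        float r1 = ToInvariantSingle(a);
        float r2 = ToInvariantSingle(b);
        if (r1 > r2) return -1;
        if (r1 < r2) return 1;
        return 0;
    }

    public static void Main()
    {
        string raw = "1.345678";

        // Culture-sensitive conversion: with German formatting rules "." is
        // a group separator, which is the misreading reported in the thread.
        CultureInfo german = new CultureInfo("de-DE");
        float cultureSensitive = Convert.ToSingle(raw, german);

        // Culture-independent conversion, as in the proposed fix.
        float invariant = ToInvariantSingle(raw);

        // Print both with invariant formatting so the output is comparable.
        Console.WriteLine(cultureSensitive.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(invariant.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(CompareScores("0.75", "0.5"));
    }
}
```

Passing the format provider explicitly at every conversion, rather than changing the thread's current culture, keeps the behavior local to the comparison code and independent of how the host process is configured.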
