portals-jetspeed-dev mailing list archives

From "Luta, Raphael (VUN)" <Raphael.L...@groupvu.Com>
Subject RE: [Proposal] Lucene Search Service
Date Wed, 21 May 2003 08:15:35 GMT

From: Jeremy Ford [mailto:caius1440@hotmail.com]
> In terms of the service, perhaps there should be a generic SearchService
> interface of which the Lucene service is an implementation. That way, if
> there was some other great search engine someone wanted to use, they could
> swap it out.


> I was thinking that there could be 4 basic methods. These would be
> add(Object o), update(Object o), remove(Object o), and search(query). In
> order to support more than one object type, we could set up the service to
> accept LuceneDocument loaders, which would know how to turn the generic
> object into a Lucene document that can be added to the index. Here's an
> outline.
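
The four-method contract above can be sketched as follows. This is only an illustration in modern Java: the names InMemorySearchService and SearchServiceDemo are hypothetical, and the toy implementation matches queries against toString() instead of calling Lucene. It is there to show that callers depend on the interface alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Engine-neutral contract; the four methods follow the proposal. */
interface SearchService {
    void add(Object o);
    void update(Object o);
    void remove(Object o);
    List<Object> search(String query);
}

/** Toy stand-in for a real engine (assumption: not Lucene-backed). */
class InMemorySearchService implements SearchService {
    // Keyed by the object itself (equals/hashCode), in insertion order.
    private final Map<Object, String> index = new LinkedHashMap<>();

    public void add(Object o) { index.put(o, o.toString().toLowerCase()); }

    public void update(Object o) { remove(o); add(o); }

    public void remove(Object o) { index.remove(o); }

    public List<Object> search(String query) {
        String q = query.toLowerCase();
        List<Object> hits = new ArrayList<>();
        for (Map.Entry<Object, String> e : index.entrySet()) {
            if (e.getValue().contains(q)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }
}

class SearchServiceDemo {
    public static void main(String[] args) {
        // Swapping engines means changing only this line.
        SearchService service = new InMemorySearchService();
        service.add("Weather portlet");
        service.add("Stock quotes portlet");
        service.remove("Stock quotes portlet");
        System.out.println(service.search("portlet"));
    }
}
```

A Lucene-backed class would implement the same interface, so code written against SearchService would not need to change.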

My experience with Verity search engines tells me that usually you don't 
want to have a single index for all your documents:
- some search engines can customize their behavior per index (like
  metadata indexing, language-optimized search algorithms, etc.)
- a single index may very soon become *big*, and that will create
  performance (index update) and administrative issues (index corruption,
  backups, etc.)

So I'd propose that the service use a concept of "Catalog"s (each matched
with an individual index) in which you can store objects/documents.
Jetspeed may have some well-known system catalogs, like "portlet", that can
be used system-wide to access everything available.

I've not really thought in detail of the resulting API but I guess 
it could look somewhat like the Registry interface...
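
As a rough sketch of the catalog idea, assuming nothing about the actual Registry interface: each named catalog owns its own index, and the service only hands out catalogs by name. Catalog, CatalogSearchService and getCatalog are hypothetical names, and a plain map stands in for a real per-catalog index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical catalog: one named collection backed by its own index. */
class Catalog {
    private final String name;
    // id -> text; stands in for a real per-catalog index (assumption).
    private final Map<String, String> documents = new LinkedHashMap<>();

    Catalog(String name) { this.name = name; }

    String getName() { return name; }

    void add(String id, String text) { documents.put(id, text); }

    void remove(String id) { documents.remove(id); }

    /** Returns the ids of documents whose text contains the query. */
    List<String> search(String query) {
        String q = query.toLowerCase();
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : documents.entrySet()) {
            if (e.getValue().toLowerCase().contains(q)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }
}

/** Hypothetical service that keeps every catalog separate. */
class CatalogSearchService {
    private final Map<String, Catalog> catalogs = new HashMap<>();

    /** Returns the named catalog, creating it on first use. */
    Catalog getCatalog(String name) {
        return catalogs.computeIfAbsent(name, Catalog::new);
    }
}

class CatalogDemo {
    public static void main(String[] args) {
        CatalogSearchService service = new CatalogSearchService();
        service.getCatalog("portlet").add("weather", "Weather forecast portlet");
        service.getCatalog("docs").add("guide", "Weather data user guide");
        // Only the "portlet" catalog is searched; "docs" is untouched.
        System.out.println(service.getCatalog("portlet").search("weather"));
    }
}
```

Because each catalog is independent, it could be tuned, rebuilt, or backed up without touching the others.
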
> services.SearchService.classname=org.apache.jetspeed.services.search.lucene.LuceneSearchService
> services.SearchService.index=WEB-INF/index
> services.SearchService.handler.name=ObjectXHandler
> services.SearchService.handler.ObjectXHandler.classname=com.mycompany.lucene.ObjectXToDocument
> services.SearchService.handler.ObjectXHandler.object_type=com.mycompany.ObjectX
> services.SearchService.handler.name=ObjectYHandler
> services.SearchService.handler.ObjectYHandler.classname=com.mycompany.lucene.ObjectYToDocument
> services.SearchService.handler.ObjectYHandler.object_type=com.mycompany.ObjectY
> So, when it comes time to add the object to the index, the service looks up
> the appropriate object handler, uses it to convert the object to a Lucene
> document, and then adds/updates/removes it from the index. In terms of
> searching, this would allow all kinds of different indexed documents to be
> returned from a search. Perhaps a filter could be placed in the search so
> that only certain types of documents that originally came from certain types
> of objects could be returned.
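
The handler lookup and type filter described above might look roughly like this. Everything here is an assumption for illustration: ObjectHandler, HandlerRegistry and TypeFilter are hypothetical names, a plain Map of fields stands in for a Lucene document, and the "object_type" field simply mirrors the property name used in the configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical handler: turns one object type into indexable fields. */
interface ObjectHandler {
    Map<String, String> toDocument(Object o);
}

class HandlerRegistry {
    private final Map<Class<?>, ObjectHandler> handlers = new HashMap<>();

    void register(Class<?> type, ObjectHandler handler) {
        handlers.put(type, handler);
    }

    /** Converts an object with its registered handler and records the source type. */
    Map<String, String> convert(Object o) {
        ObjectHandler handler = handlers.get(o.getClass());
        if (handler == null) {
            throw new IllegalArgumentException("No handler for " + o.getClass().getName());
        }
        Map<String, String> doc = new HashMap<>(handler.toDocument(o));
        doc.put("object_type", o.getClass().getName());
        return doc;
    }
}

class TypeFilter {
    /** Keeps only documents that originated from the given object type. */
    static List<Map<String, String>> byType(List<Map<String, String>> docs, Class<?> type) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            if (type.getName().equals(doc.get("object_type"))) {
                kept.add(doc);
            }
        }
        return kept;
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        // String and Integer stand in for ObjectX and ObjectY.
        registry.register(String.class, o -> Collections.singletonMap("content", (String) o));
        registry.register(Integer.class, o -> Collections.singletonMap("content", o.toString()));

        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(registry.convert("hello"));
        docs.add(registry.convert(42));
        System.out.println(TypeFilter.byType(docs, String.class).size());
    }
}
```

Storing the source type as an ordinary field is one way to make the suggested filter possible; a real engine could apply it as part of the query instead of after the fact.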


Raphaël Luta - raphael@apache.org
Jakarta Jetspeed - Enterprise Portal in Java

