portals-jetspeed-dev mailing list archives

From Paul Spencer <paulspen...@mindspring.com>
Subject Re: [Proposal] Lucene Search Service
Date Wed, 21 May 2003 12:31:46 GMT
Here are additions to the ParsedObject interface that allow the language 
and additional searchable fields to be stored in the index.

   import java.util.Map;

   interface ParsedObject
   {
       String getContent();
       void setContent(String content);
       String getDescription();
       void setDescription(String description);
       String[] getKeywords();
       void setKeywords(String[] keywords);
       // New: language of the content
       String getLanguage();
       void setLanguage(String language);
       String getTitle();
       void setTitle(String title);
       // New: additional searchable fields to store in the index
       Map getFields();
       void setFields(Map fields);
   }
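
To make the additions concrete, here is a minimal sketch of a bean-style 
implementation.  The class name BaseParsedObject and the null-handling 
defaults are my assumptions for illustration; they are not part of the 
proposal or of any existing Jetspeed class.  The interface is repeated 
so the sketch compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

// The interface as proposed above, repeated so this sketch is self-contained.
interface ParsedObject {
    String getContent();
    void setContent(String content);
    String getDescription();
    void setDescription(String description);
    String[] getKeywords();
    void setKeywords(String[] keywords);
    String getLanguage();
    void setLanguage(String language);
    String getTitle();
    void setTitle(String title);
    Map getFields();
    void setFields(Map fields);
}

// Hypothetical implementation: a plain holder that a handler could fill in.
class BaseParsedObject implements ParsedObject {
    private String content;
    private String description;
    private String[] keywords = new String[0];
    private String language;
    private String title;
    private Map fields = new HashMap();

    public String getContent() { return content; }
    public void setContent(String content) { this.content = content; }

    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }

    public String[] getKeywords() { return keywords; }
    // Assumption: a null keyword list is treated as "no keywords".
    public void setKeywords(String[] keywords) {
        this.keywords = (keywords == null) ? new String[0] : keywords;
    }

    public String getLanguage() { return language; }
    public void setLanguage(String language) { this.language = language; }

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public Map getFields() { return fields; }
    // Assumption: a null map is replaced by an empty one so callers can iterate safely.
    public void setFields(Map fields) {
        this.fields = (fields == null) ? new HashMap() : fields;
    }
}
```

A handler would set the language (for example "en") and put any extra 
name/value pairs into the fields Map; the search service could then copy 
each entry into the index as its own searchable field.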

Paul Spencer

Paul Spencer wrote:

> Jeremy,
> In general I am +1 on the proposal, but it is not complete.  Below are 
> suggested additions and comments that should be addressed by the proposal. 
> o The query syntax may be search engine specific, unless you want to 
> define a query language. 
> o The SearchResult class will need to be updated to contain an Object, 
> not a URL.
>
> o The search portlet must be able to "display" the Object.  Currently 
> this is done by passing the
>   URL to the browser via "href=URL".  Thus the "Display" must be 
> pluggable, i.e. createLinkToObject(Object o) or
>   LinkToObjectService(Object o).  The createLinkToObject() does NOT 
> belong in the handler class.
>
> o "handler" and ParsedObject interfaces
>   /**
>    * "handlers" called by the search services MUST implement this 
> interface
>    */
>   interface ObjectHandler
>   {
>       ParsedObject parseObject(Object o);
>   }
>
>    interface ParsedObject
>    {
>          String getContent();
>          void setContent(String content);
>          String getDescription();
>          void setDescription(String description);
>          String[] getKeywords();
>          void setKeywords(String[] keywords);
>          String getTitle();
>          void setTitle(String title);
>    }
>
> o How is the index maintained, including updating the index when the 
> Object changes?
>   The LuceneSearchService was intentionally simple and restricted to 
> indexing URL content.  This was due to time constraints.  The service 
> is, however, viewed as a stepping-stone to a more generalized search 
> portlet.
>
> Paul Spencer
>
> Jeremy Ford wrote:
>
>> I've noticed the new LuceneSearchService and have been giving some 
>> thought as to how I might like to put it to use.  I'll admit to not 
>> having that much experience with Lucene, so if anyone thinks that 
>> this won't work, just let me know.
>>
>> In terms of the service, perhaps there should be a generic 
>> SearchService interface of which the Lucene service is an 
>> implementation.  That way, if there was some other great search 
>> engine someone wanted to use, they could swap it out.
>>
>> I was thinking that there could be 4 basic methods.  These would be 
>> add(Object o), update(Object o), remove(Object o), and 
>> search(query).  In order to support more than one object type, we 
>> could set up the service to accept LuceneDocument loaders, which 
>> would know how to turn the generic object into a Lucene document that 
>> can be added to the index.  Here's an outline.
>>
>> services.SearchService.classname=org.apache.jetspeed.services.search.lucene.LuceneSearchService
>> services.SearchService.index=WEB-INF/index
>>
>> services.SearchService.handler.name=ObjectXHandler
>> services.SearchService.handler.ObjectXHandler.classname=com.mycompany.lucene.ObjectXToDocument
>> services.SearchService.handler.ObjectXHandler.object_type=com.mycompany.ObjectX
>>
>> services.SearchService.handler.name=ObjectYHandler
>> services.SearchService.handler.ObjectYHandler.classname=com.mycompany.lucene.ObjectYToDocument
>> services.SearchService.handler.ObjectYHandler.object_type=com.mycompany.ObjectY
>>
>> So, when it comes time to add the object to the index, the service 
>> looks up the appropriate object handler, uses it to convert the 
>> object to a Lucene document, and then adds/updates/removes it from 
>> the index.  In terms of searching, this would allow all kinds of 
>> different indexed documents to be returned from a search.  Perhaps a 
>> filter could be placed in the search so that only certain types of 
>> documents that originally came from certain types of objects could be 
>> returned.
>>
>> Again, just an idea.  But it strikes me as a powerful one with 
>> respect to a general indexing solution.
>>
>> Jeremy Ford
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: jetspeed-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: jetspeed-dev-help@jakarta.apache.org
>>
>>
>
>
>
>
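
The handler lookup described in the quoted proposal could be sketched 
roughly as follows.  This is only an illustration under stated 
assumptions: HandlerRegistry and its methods are hypothetical names, the 
ParsedObject here is trimmed to two getters for brevity, and the walk up 
the superclass chain is my own choice, not something the proposal 
specifies.  Only ObjectHandler.parseObject(Object) comes from the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Trimmed stand-in for the ParsedObject interface discussed in this thread.
interface ParsedObject {
    String getTitle();
    String getContent();
}

// As proposed in the thread: handlers turn an arbitrary Object into a ParsedObject.
interface ObjectHandler {
    ParsedObject parseObject(Object o);
}

// Hypothetical registry mapping an object_type class name to its handler.
class HandlerRegistry {
    private final Map<String, ObjectHandler> handlersByType =
            new HashMap<String, ObjectHandler>();

    // objectType corresponds to the object_type property in the quoted outline.
    void register(String objectType, ObjectHandler handler) {
        handlersByType.put(objectType, handler);
    }

    // Assumption: if no handler is registered for the exact class,
    // fall back to a handler registered for a superclass.
    ObjectHandler lookup(Object o) {
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            ObjectHandler handler = handlersByType.get(c.getName());
            if (handler != null) {
                return handler;
            }
        }
        return null;
    }

    // Convert the object with its handler, or fail loudly if none is registered.
    ParsedObject parse(Object o) {
        ObjectHandler handler = lookup(o);
        if (handler == null) {
            throw new IllegalArgumentException(
                    "No handler registered for " + o.getClass().getName());
        }
        return handler.parseObject(o);
    }
}
```

With something like this in place, add(Object o), update(Object o) and 
remove(Object o) would each start by calling parse(o) and then hand the 
resulting ParsedObject to the engine-specific code.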



