portals-jetspeed-dev mailing list archives

From "Jeremy Ford" <caius1...@hotmail.com>
Subject Re: [Proposal] Lucene Search Service
Date Thu, 22 May 2003 15:47:57 GMT



>From: Paul Spencer <paulspencer@mindspring.com>
>Reply-To: "Jetspeed Developers List" <jetspeed-dev@jakarta.apache.org>
>To: Jetspeed Developers List <jetspeed-dev@jakarta.apache.org>
>Subject: Re: [Proposal] Lucene Search Service
>Date: Wed, 21 May 2003 08:24:08 -0400
>
>Jeremy,
>In general I am +1 on the proposal, but it is not complete.  Below are 
>suggested additions and comments that should be addressed by the proposal.
>
>o The query syntax may be search engine specific, unless you want to define 
>a query language.

I'm fine with the portlet being search engine specific.

>o The SearchResult class will need to be updated to contain Object not URL

I agree that the SearchResult should probably have an object instead of a 
URL.  What kind of object are we talking about?  Is it the object 
represented by the document placed into the index, or is it the document 
from the index itself?

If it's the object represented by the document, then we will need something 
to reverse the process of converting the object to the document.  Also, if 
it's the object itself, it could affect the next point.
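
To make the "reverse the process" idea concrete, here is a minimal sketch of a handler that maps in both directions.  Every name in it (TwoWayHandler, PortletEntry, PortletEntryHandler, the field names) is hypothetical and not part of Jetspeed or Lucene, and a plain Map of field names to values stands in for the indexed document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-way handler: object -> index fields and index fields -> object. */
interface TwoWayHandler<T> {
    Map<String, String> toFields(T object);
    T fromFields(Map<String, String> fields);
}

/** Hypothetical domain object standing in for something like a registry entry. */
final class PortletEntry {
    final String name;
    final String title;

    PortletEntry(String name, String title) {
        this.name = name;
        this.title = title;
    }
}

/** Stores every field needed to rebuild the object, plus its type. */
final class PortletEntryHandler implements TwoWayHandler<PortletEntry> {
    @Override
    public Map<String, String> toFields(PortletEntry entry) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Recording the type lets a search result be routed back to this handler.
        fields.put("type", PortletEntry.class.getName());
        fields.put("name", entry.name);
        fields.put("title", entry.title);
        return fields;
    }

    @Override
    public PortletEntry fromFields(Map<String, String> fields) {
        return new PortletEntry(fields.get("name"), fields.get("title"));
    }
}

class TwoWayHandlerDemo {
    public static void main(String[] args) {
        TwoWayHandler<PortletEntry> handler = new PortletEntryHandler();
        Map<String, String> stored = handler.toFields(new PortletEntry("weather", "Weather Portlet"));
        PortletEntry restored = handler.fromFields(stored);
        System.out.println(restored.name + " / " + restored.title);
    }
}
```

The round trip only works if every field needed to rebuild the object is stored in the index, which is one cost of returning the original object rather than the indexed document.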


>o The search portlet must be able to "display" the Object.  Currently this 
>is done by passing the URL to the browser via "href=URL".  Thus the 
>"Display" must be pluggable, i.e. createLinkToObject(Object o) or 
>LinkToObjectService(Object o).  The createLinkToObject() does NOT belong 
>in the handler class.

I agree that this is needed for the current implementation.  I was also 
thinking that there could be more specific search portlets that know how to 
handle certain types of objects/documents.  (See above).  Example: if you 
could search the portlet registry for portlets based on title/description, 
you could maybe have a results page that allows you to directly add your 
choices to your psml.
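
A pluggable display could look roughly like the sketch below.  The method name createLinkToObject comes from the suggestion quoted above; LinkRegistry, register and the fallback behaviour are assumptions made for illustration only.  java.net.URI is used here in place of URL, and HTML escaping is left out to keep it short.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry that picks a link renderer based on the result's type. */
final class LinkRegistry {
    private final Map<Class<?>, Function<Object, String>> creators = new LinkedHashMap<>();

    /** Registers a renderer for one object type. */
    <T> void register(Class<T> type, Function<? super T, String> creator) {
        creators.put(type, o -> creator.apply(type.cast(o)));
    }

    /** Renders the first matching type, in registration order. */
    String createLinkToObject(Object o) {
        for (Map.Entry<Class<?>, Function<Object, String>> e : creators.entrySet()) {
            if (e.getKey().isInstance(o)) {
                return e.getValue().apply(o);
            }
        }
        // No renderer registered: fall back to plain text.
        return String.valueOf(o);
    }
}

class LinkRegistryDemo {
    public static void main(String[] args) {
        LinkRegistry links = new LinkRegistry();
        // Same behaviour as the current "href=URL" display, but only for URI results.
        links.register(URI.class, u -> "<a href=\"" + u + "\">" + u + "</a>");
        System.out.println(links.createLinkToObject(URI.create("http://example.org/doc")));
        System.out.println(links.createLinkToObject("plain text result"));
    }
}
```

A more specific search portlet could register its own renderer for its own object type without touching the handler class.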

>o "handler" and ParsedObject interfaces
>   /**
>    * "handlers" called by the search services MUST implement this interface
>    */
>   interface ObjectHandler
>   {
>       ParsedObject parseObject(Object o);
>   }
>
>   interface ParsedObject
>   {
>       String getContent();
>       void setContent(String content);
>       String getDescription();
>       void setDescription(String description);
>       String[] getKeywords();
>       void setKeywords(String[] keywords);
>       String getTitle();
>       void setTitle(String title);
>   }
>
>o How is the index maintained, including updating the index when the Object 
>changes?
>   The LuceneSearchService was intentionally simple and restricted to 
>indexing URL content.  This was due to time constraints, although the 
>service is viewed as a stepping-stone to a more generalized search portlet.

I'm not sure what the best approach for this is, but I have a couple of 
ideas.  If Jetspeed wants to embed the use of the search service within 
itself, you could call the search service directly every time you 
add/modify/remove a portlet/psml/etc.  This would carry the overhead of 
adding this functionality to all necessary actions.

Another approach would be to have a daemon that runs in the background 
updating the indexes.  This daemon would know about all Jetspeed specific 
indexing requirements.
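
The two approaches can also be combined: the actions only record that an object changed, and a background thread does the indexing.  The following is a rough sketch under that assumption.  IndexUpdateDaemon, markChanged and flush are hypothetical names, and the indexer callback stands in for whatever call the search service ends up exposing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical background updater: actions enqueue changes, a daemon thread indexes them. */
final class IndexUpdateDaemon {
    private final BlockingQueue<Object> pending = new LinkedBlockingQueue<>();
    private final Consumer<Object> indexer;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "index-update-daemon");
                t.setDaemon(true); // do not keep the JVM alive on shutdown
                return t;
            });

    IndexUpdateDaemon(Consumer<Object> indexer) {
        this.indexer = indexer;
    }

    /** Cheap call made from add/modify/remove actions. */
    void markChanged(Object o) {
        pending.add(o);
    }

    /** Indexes everything queued so far and returns how many objects were processed. */
    synchronized int flush() {
        List<Object> batch = new ArrayList<>();
        pending.drainTo(batch);
        batch.forEach(indexer);
        return batch.size();
    }

    /** Starts periodic flushing; failures are swallowed so the schedule keeps running. */
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                flush();
            } catch (RuntimeException e) {
                // A real service would log this; an uncaught exception would cancel the task.
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With this split, each action pays only for a queue insert, at the price of the index lagging behind by up to one period.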




>Paul Spencer
>
>Jeremy Ford wrote:
>
>>I've noticed the new LuceneSearchService and have been giving some thought 
>>as to how I might like to put it to use.  I'll admit to not having that 
>>much experience with Lucene, so if anyone thinks that this won't work, 
>>just let me know.
>>
>>In terms of the service, perhaps there should be a generic SearchService 
>>interface of which the Lucene service is an implementation.  That way, if 
>>there was some other great search engine someone wanted to use, they could 
>>swap it out.
>>
>>I was thinking that there could be 4 basic methods.  These would be 
>>add(Object o), update(Object o), remove(Object o), and search(query).  In 
>>order to support more than one object type, we could set up the service to 
>>accept LuceneDocument loaders, which would know how to turn the generic 
>>object into a Lucene document that can be added to the index.  Here's an 
>>outline.
>>
>>services.SearchService.classname=org.apache.jetspeed.services.search.lucene.LuceneSearchService
>>services.SearchService.index=WEB-INF/index
>>
>>services.SearchService.handler.name=ObjectXHandler
>>services.SearchService.handler.ObjectXHandler.classname=com.mycompany.lucene.ObjectXToDocument
>>services.SearchService.handler.ObjectXHandler.object_type=com.mycompany.ObjectX
>>
>>services.SearchService.handler.name=ObjectYHandler
>>services.SearchService.handler.ObjectYHandler.classname=com.mycompany.lucene.ObjectYToDocument
>>services.SearchService.handler.ObjectYHandler.object_type=com.mycompany.ObjectY
>>
>>So, when it comes time to add the object to the index, the service looks 
>>up the appropriate object handler, uses it to convert the object to a 
>>Lucene document, and then adds/updates/removes it from the index.  In 
>>terms of searching, this would allow all kinds of different indexed 
>>documents to be returned from a search.  Perhaps a filter could be placed 
>>in the search so that only certain types of documents that originally came 
>>from certain types of objects could be returned.
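
The service outlined above can be sketched as follows.  This is a toy in-memory version under stated assumptions, not the LuceneSearchService: the four method names come from the proposal, while InMemorySearchService, registerHandler and the type-filtered search overload are hypothetical, and a handler here simply turns an object into searchable text instead of a Lucene document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Generic contract from the proposal: add, update, remove, search. */
interface SearchService {
    void add(Object o);
    void update(Object o);
    void remove(Object o);
    List<Object> search(String query);
}

/** Toy implementation; a Lucene-backed one would replace the map with a real index. */
final class InMemorySearchService implements SearchService {
    private final Map<Class<?>, Function<Object, String>> handlers = new LinkedHashMap<>();
    private final Map<Object, String> index = new LinkedHashMap<>();

    /** Plays the role of a services.SearchService.handler.* entry. */
    <T> void registerHandler(Class<T> type, Function<? super T, String> toText) {
        handlers.put(type, o -> toText.apply(type.cast(o)));
    }

    /** Looks up the handler for the object's exact class and converts the object. */
    private String convert(Object o) {
        Function<Object, String> handler = handlers.get(o.getClass());
        if (handler == null) {
            throw new IllegalArgumentException("No handler for " + o.getClass().getName());
        }
        return handler.apply(o).toLowerCase(Locale.ROOT);
    }

    // Objects are keyed by equals/hashCode, so update simply overwrites the entry.
    @Override public void add(Object o) { index.put(o, convert(o)); }
    @Override public void update(Object o) { index.put(o, convert(o)); }
    @Override public void remove(Object o) { index.remove(o); }

    @Override
    public List<Object> search(String query) {
        return search(query, Object.class);
    }

    /** Case-insensitive substring match, restricted to results of the given type. */
    List<Object> search(String query, Class<?> type) {
        String q = query.toLowerCase(Locale.ROOT);
        List<Object> hits = new ArrayList<>();
        for (Map.Entry<Object, String> e : index.entrySet()) {
            if (type.isInstance(e.getKey()) && e.getValue().contains(q)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }
}
```

Because callers only see SearchService, swapping in a different engine means supplying another implementation and another set of handlers.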
>>
>>Again, just an idea.  But it strikes me as a powerful one with respect to 
>>a general indexing solution.
>>
>>Jeremy Ford
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: jetspeed-dev-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: jetspeed-dev-help@jakarta.apache.org
>>
>>