lucenenet-dev mailing list archives

From Ben Martz <benma...@gmail.com>
Subject Re: using Porter
Date Thu, 18 Feb 2010 19:45:06 GMT
Pamela,

I strongly recommend that anyone planning to work with Lucene.Net follow
Michael's suggestion and read LIA (Lucene in Action) for background. Once
you've done that, here's a summary in just a few lines that may help get you
up and running. I use Snowball (read about the algorithmic justifications
here: http://snowball.tartarus.org/) in my product for stemming. IIRC,
Snowball can be found in the contrib directory. The approach using just the
PorterStemming analyzer is pretty much identical.

Indexing:

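// The last argument is "create": false appends to the existing index at
// indexPath, true would create a new index (overwriting any existing one).
IndexWriter writer = new IndexWriter(indexPath, new SnowballAnalyzer("English"), false);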


Searching:

Analyzer analyzer = new SnowballAnalyzer("English");

QueryParser qp = new QueryParser("Contents", analyzer);

Query query = qp.Parse(inQuery);
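
To see why the same analyzer has to be used on both sides, here is a minimal, self-contained sketch that does not use Lucene.Net at all. Everything in it (StemDemo, ToyStem, BuildIndex, CountHits) is a hypothetical name made up for illustration, and ToyStem is a crude suffix stripper, not the Porter or Snowball algorithm. It only shows how mismatched analysis at index and search time loses hits.

```csharp
using System;
using System.Collections.Generic;

static class StemDemo
{
    // Toy stemmer for illustration only: strips a few common English suffixes.
    // This is NOT the Porter or Snowball algorithm.
    public static string ToyStem(string token)
    {
        string t = token.ToLowerInvariant();
        foreach (string suffix in new[] { "ing", "ed", "s" })
        {
            if (t.Length > suffix.Length + 2 && t.EndsWith(suffix, StringComparison.Ordinal))
                return t.Substring(0, t.Length - suffix.Length);
        }
        return t;
    }

    // Builds a term -> document ids map, optionally stemming each token.
    public static Dictionary<string, HashSet<int>> BuildIndex(string[] docs, bool stem)
    {
        var index = new Dictionary<string, HashSet<int>>();
        for (int id = 0; id < docs.Length; id++)
        {
            foreach (string raw in docs[id].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                string term = stem ? ToyStem(raw) : raw.ToLowerInvariant();
                if (!index.TryGetValue(term, out HashSet<int> ids))
                {
                    ids = new HashSet<int>();
                    index[term] = ids;
                }
                ids.Add(id);
            }
        }
        return index;
    }

    // Looks up a single-word query, optionally stemming it first.
    public static int CountHits(Dictionary<string, HashSet<int>> index, string query, bool stem)
    {
        string term = stem ? ToyStem(query) : query.ToLowerInvariant();
        return index.TryGetValue(term, out HashSet<int> ids) ? ids.Count : 0;
    }

    static void Main()
    {
        string[] docs = { "indexing documents", "the index was rebuilt" };
        var stemmed = BuildIndex(docs, true);
        var unstemmed = BuildIndex(docs, false);

        Console.WriteLine(CountHits(stemmed, "indexed", true));    // stemmed on both sides: 2
        Console.WriteLine(CountHits(unstemmed, "indexed", true));  // stemmed at search only: 1
        Console.WriteLine(CountHits(stemmed, "indexed", false));   // stemmed at index only: 0
    }
}
```

With stemming on both sides, "indexed" matches both documents. Stemming on only one side either misses the document containing "indexing" or finds nothing at all, which is the behaviour Michael describes below.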


Also keep in mind that if you use Highlighter, you will again need to use
the Snowball analyzer when fragmenting the results. (I'm actually reading
the original field contents from a cached file on disk here, don't mind me
:D)

QueryScorer qs = new QueryScorer(query, reader, "Contents");

Highlighter hl = new Highlighter(qs);

hl.SetMaxDocBytesToAnalyze(int.MaxValue);

TokenStream stream = new SnowballAnalyzer("English")
    .TokenStream("Contents", new StringReader(fileContents));

TextFragment[] frag = hl.GetBestTextFragments(stream, fileContents, false,
    inDetailLevel == Int32.MaxValue ? k_MaxQueryFragments : inDetailLevel);

Good luck,
Ben

On Thu, Feb 18, 2010 at 10:43 AM, Michael Garski <mgarski@myspace-inc.com> wrote:

> Pamela,
>
> To get the results you expect using a stemmer, you would need to use it
> at both index and search time.
>
> I don't have any examples on the use of a stemmer, and suggest checking
> out the book Lucene in Action - there is an early access copy of the
> next version available at http://www.manning.com/hatcher3/.  While the
> book details the Java version of Lucene, the same APIs are present in
> Lucene.Net.
>
> Michael
>
> -----Original Message-----
> From: Pamela Foxcroft [mailto:pamelafoxcroft@gmail.com]
> Sent: Thursday, February 18, 2010 10:39 AM
> To: lucene-net-dev@lucene.apache.org
> Subject: using Porter
>
> I am confused about where I use the PorterStemming algorithm. Do I use it
> in the indexer or the searcher?
>
> Also if anyone has any examples that would be great!
>
> Thanks
>
> Pamela
>
>
