lucenenet-dev mailing list archives

From Pamela Foxcroft <pamelafoxcr...@gmail.com>
Subject Re: using Porter
Date Fri, 19 Feb 2010 12:50:54 GMT
Thanks Ben and Michael!

Pam

On Thu, Feb 18, 2010 at 2:45 PM, Ben Martz <benmartz@gmail.com> wrote:

> Pamela,
>
> I strongly recommend that anyone planning to work with Lucene.Net follow
> Michael's suggestion and read LIA for background, but once you've done that,
> here's a summary in just a few lines that may help get you up and running. I
> use Snowball (read about the algorithmic justifications here:
> http://snowball.tartarus.org/) in my product for stemming. IIRC, Snowball
> can be found in the contrib directory. The approach using just the
> PorterStemming analyzer is pretty much identical.
>
> Indexing:
>
> IndexWriter writer = new IndexWriter(indexPath, new SnowballAnalyzer("English"), false);
>
>
> Searching:
>
> Analyzer analyzer = new SnowballAnalyzer("English");
>
> QueryParser qp = new QueryParser("Contents", analyzer);
>
> Query query = qp.Parse(inQuery);
>
>
> Also keep in mind that if you use Highlighter, you will again need to use
> the Snowball analyzer when fragmenting the results (I'm actually reading
> the original field contents from a cached file on disk here, don't mind me
> :D)
>
> QueryScorer qs = new QueryScorer(query, reader, "Contents");
>
> Highlighter hl = new Highlighter(qs);
>
> hl.SetMaxDocBytesToAnalyze(int.MaxValue);
>
> TokenStream stream = new SnowballAnalyzer("English").TokenStream("Contents",
>     new StringReader(fileContents));
>
> TextFragment[] frag = hl.GetBestTextFragments(stream, fileContents, false,
>     inDetailLevel == Int32.MaxValue ? k_MaxQueryFragments : inDetailLevel);
>
> Good luck,
> Ben
>
> On Thu, Feb 18, 2010 at 10:43 AM, Michael Garski <mgarski@myspace-inc.com> wrote:
>
> > Pamela,
> >
> > To get the results you expect using a stemmer, you would need to use it
> > at both index and search time.
> >
> > I don't have any examples on the use of a stemmer, and suggest checking
> > out the book Lucene in Action - there is an early access copy of the
> > next version available at http://www.manning.com/hatcher3/.  While the
> > book details the Java version of Lucene, the same APIs are present in
> > Lucene.Net.
> >
> > Michael
> >
> > -----Original Message-----
> > From: Pamela Foxcroft [mailto:pamelafoxcroft@gmail.com]
> > Sent: Thursday, February 18, 2010 10:39 AM
> > To: lucene-net-dev@lucene.apache.org
> > Subject: using Porter
> >
> > I am confused about where I use the Porter stemming algorithm. Do I use it
> > in the indexer or the searcher?
> >
> > Also, if anyone has any examples, that would be great!
> >
> > Thanks
> >
> > Pamela
> >
> >
>
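
Michael's point that a stemmer has to run at both index and search time can be illustrated without Lucene.Net at all. The sketch below is only an illustration under stated assumptions: ToyStem is a naive suffix stripper invented for this example (it is not the Porter or Snowball algorithm), and StemDemo, BuildIndex, and Search are hypothetical names, not Lucene.Net APIs. It shows that a query normalized the same way as the indexed terms finds its documents, while a raw query against the same index does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StemDemo
{
    // Toy suffix stripper standing in for a real stemmer (assumption: NOT Porter/Snowball).
    public static string ToyStem(string word)
    {
        string w = word.ToLowerInvariant();
        foreach (string suffix in new[] { "ing", "ed", "s" })
        {
            // Only strip when at least three characters of stem would remain.
            if (w.Length > suffix.Length + 2 && w.EndsWith(suffix, StringComparison.Ordinal))
                return w.Substring(0, w.Length - suffix.Length);
        }
        return w;
    }

    // Build a tiny inverted index: normalized term -> ids of documents containing it.
    public static Dictionary<string, HashSet<int>> BuildIndex(
        string[] docs, Func<string, string> normalize)
    {
        var index = new Dictionary<string, HashSet<int>>();
        for (int id = 0; id < docs.Length; id++)
        {
            foreach (string token in docs[id].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                string term = normalize(token);
                if (!index.TryGetValue(term, out HashSet<int> ids))
                {
                    ids = new HashSet<int>();
                    index[term] = ids;
                }
                ids.Add(id);
            }
        }
        return index;
    }

    // Look up a single query term after applying the given normalization.
    public static int[] Search(
        Dictionary<string, HashSet<int>> index, string queryTerm, Func<string, string> normalize)
    {
        return index.TryGetValue(normalize(queryTerm), out HashSet<int> ids)
            ? ids.OrderBy(i => i).ToArray()
            : new int[0];
    }

    public static void Main()
    {
        string[] docs = { "indexing documents", "searched the index" };

        // Index time: every token is stemmed before it is stored.
        var index = BuildIndex(docs, ToyStem);

        // Search time, same stemmer: "indexed" becomes "index" and matches both documents.
        int[] stemmedHits = Search(index, "indexed", ToyStem);
        Console.WriteLine(string.Join(",", stemmedHits)); // prints "0,1"

        // Search time, no stemmer: the raw term "indexed" was never stored, so nothing matches.
        int[] rawHits = Search(index, "indexed", w => w.ToLowerInvariant());
        Console.WriteLine(rawHits.Length); // prints "0"
    }
}
```

In Ben's snippets the same rule shows up as passing a SnowballAnalyzer both to the IndexWriter and to the QueryParser, so the terms stored in the index and the terms produced from the query go through identical normalization.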
