lucenenet-dev mailing list archives

From Pamela Foxcroft <pamelafoxcr...@gmail.com>
Subject Re: using Porter
Date Fri, 19 Feb 2010 13:08:04 GMT
I just compiled it myself. However, I am still curious whether there is a
prebuilt version floating around somewhere.

Pam

On Fri, Feb 19, 2010 at 8:04 AM, Pamela Foxcroft <pamelafoxcroft@gmail.com> wrote:

> Can anyone tell me where I can get the snowball.dll?
>
> There was a lucene.net.dll, but I can't find the snowball one.
>
> Thanks
>
> Pam
>
>
> On Fri, Feb 19, 2010 at 7:50 AM, Pamela Foxcroft <pamelafoxcroft@gmail.com> wrote:
>
>> Thanks Ben and Michael!
>>
>> Pam
>>
>>
>> On Thu, Feb 18, 2010 at 2:45 PM, Ben Martz <benmartz@gmail.com> wrote:
>>
>>> Pamela,
>>>
>>> I strongly recommend that anyone planning to work with Lucene.Net follow
>>> Michael's suggestion and read LIA (Lucene in Action) for background. Once
>>> you've done that, here's a summary in just a few lines that may help get
>>> you up and running. I use Snowball (read about the algorithmic
>>> justifications here: http://snowball.tartarus.org/) in my product for
>>> stemming. IIRC, Snowball can be found in the contrib directory. The
>>> approach using just the PorterStemming analyzer is pretty much identical.
>>>
>>> Indexing:
>>>
>>> // false: open the existing index at indexPath instead of creating a new one
>>> IndexWriter writer = new IndexWriter(indexPath, new SnowballAnalyzer("English"), false);
>>>
>>>
>>> Searching:
>>>
>>> Analyzer analyzer = new SnowballAnalyzer("English");
>>>
>>> QueryParser qp = new QueryParser("Contents", analyzer);
>>>
>>> Query query = qp.Parse(inQuery);
>>>
>>>
>>> Also keep in mind that if you use Highlighter, you will again need to use
>>> the Snowball analyzer when fragmenting the results. (I'm actually reading
>>> the original field contents from a cached file on disk here, don't mind
>>> me :D)
>>>
>>> QueryScorer qs = new QueryScorer(query, reader, "Contents");
>>>
>>> Highlighter hl = new Highlighter(qs);
>>>
>>> hl.SetMaxDocBytesToAnalyze(int.MaxValue);
>>>
>>> TokenStream stream = new SnowballAnalyzer("English").TokenStream(
>>>     "Contents", new StringReader(fileContents));
>>>
>>> TextFragment[] frag = hl.GetBestTextFragments(stream, fileContents, false,
>>>     inDetailLevel == Int32.MaxValue ? k_MaxQueryFragments : inDetailLevel);
>>>
>>> Good luck,
>>> Ben
>>>
>>> On Thu, Feb 18, 2010 at 10:43 AM, Michael Garski <mgarski@myspace-inc.com> wrote:
>>>
>>> > Pamela,
>>> >
>>> > To get the results you expect using a stemmer, you would need to use it
>>> > at both index and search time.
>>> >
>>> > I don't have any examples on the use of a stemmer, and suggest checking
>>> > out the book Lucene in Action - there is an early access copy of the
>>> > next version available at http://www.manning.com/hatcher3/.  While the
>>> > book details the Java version of Lucene, the same APIs are present in
>>> > Lucene.Net.
>>> >
>>> > Michael
>>> >
>>> > -----Original Message-----
>>> > From: Pamela Foxcroft [mailto:pamelafoxcroft@gmail.com]
>>> > Sent: Thursday, February 18, 2010 10:39 AM
>>> > To: lucene-net-dev@lucene.apache.org
>>> > Subject: using Porter
>>> >
>>> > I am confused about where to use the Porter stemming algorithm. Do I use
>>> > it in the indexer or the searcher?
>>> >
>>> > Also, if anyone has any examples, that would be great!
>>> >
>>> > Thanks
>>> >
>>> > Pamela
>>> >
>>> >
>>>
>>
>>
>
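
Michael's point above, that a stemmer has to be applied at both index and
search time, can be sketched without Lucene.Net at all. This is only a toy
illustration under loud assumptions: ToyStem is a hypothetical suffix stripper,
not the Porter or Snowball algorithm, and StemDemo, BuildIndex and CountHits
are made-up names that do not exist in Lucene.Net.

```csharp
using System;
using System.Collections.Generic;

// Toy illustration only: none of these names come from Lucene.Net, and
// ToyStem is a hypothetical stand-in, NOT the Porter or Snowball algorithm.
public static class StemDemo
{
    // Lowercase the term and strip the first matching suffix, if any.
    public static string ToyStem(string term)
    {
        string t = term.ToLowerInvariant();
        foreach (string suffix in new[] { "ing", "ed", "s" })
        {
            if (t.Length > suffix.Length + 2 && t.EndsWith(suffix, StringComparison.Ordinal))
                return t.Substring(0, t.Length - suffix.Length);
        }
        return t;
    }

    // Index time: map each stemmed term to the ids of the documents containing it.
    public static Dictionary<string, HashSet<int>> BuildIndex(string[] docs)
    {
        var index = new Dictionary<string, HashSet<int>>();
        for (int id = 0; id < docs.Length; id++)
        {
            foreach (string word in docs[id].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                string key = ToyStem(word);
                if (!index.TryGetValue(key, out HashSet<int> ids))
                {
                    ids = new HashSet<int>();
                    index[key] = ids;
                }
                ids.Add(id);
            }
        }
        return index;
    }

    // Search time: look up one term, with or without stemming the query.
    public static int CountHits(Dictionary<string, HashSet<int>> index, string queryTerm, bool stemQuery)
    {
        string key = stemQuery ? ToyStem(queryTerm) : queryTerm.ToLowerInvariant();
        return index.TryGetValue(key, out HashSet<int> ids) ? ids.Count : 0;
    }

    public static void Main()
    {
        var index = BuildIndex(new[] { "indexing documents", "indexed fields" });
        Console.WriteLine(CountHits(index, "indexing", true));  // prints 2
        Console.WriteLine(CountHits(index, "indexing", false)); // prints 0
    }
}
```

Because the index only ever stores stems ("index", "document", "field"), a
query that skips the stemmer looks up "indexing" and finds nothing, while the
stemmed query matches both documents. The same reasoning is why the indexing
and searching snippets quoted above both construct the same analyzer.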
