lucenenet-dev mailing list archives

From "immanouel (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENENET-111) lucene score formula
Date Tue, 18 Mar 2008 20:31:24 GMT
lucene score formula
--------------------

                 Key: LUCENENET-111
                 URL: https://issues.apache.org/jira/browse/LUCENENET-111
             Project: Lucene.Net
          Issue Type: Bug
         Environment: ASP.NET   C#
            Reporter: immanouel


I'm working with Lucene.Net in C# and, being quite confused, I have a question:

First I wrote IndexWriter code to add text to the Lucene index, then
an IndexSearcher ...
Here's the index writer code:
/........................................................................................................
        // Open the existing index for appending (create = false).
        IndexWriter writer = new IndexWriter(sIndexPath, new StandardAnalyzer(), false);

        Document doc = new Document();
        // "title" and "link" are stored and indexed as single, unanalyzed terms.
        doc.Add(new Field("title", sHeader, Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.Add(new Field("link", sType, Field.Store.YES, Field.Index.UN_TOKENIZED));
        // "content" is stored and run through the analyzer.
        doc.Add(new Field("content", sContent, Field.Store.YES, Field.Index.TOKENIZED));
        writer.AddDocument(doc);
        writer.Optimize();
        writer.Close();
/......................................................................................................
Here's the IndexSearcher code:
(As you can see, it outputs an explanation for each hit.)
//....................................................................................................................................
        IndexSearcher searcher = new IndexSearcher(IndexLocation());
        QueryParser oParser = new QueryParser("content", new StandardAnalyzer());
        string sSearchQuery = TextBox1.Text;
        // Keep the parsed query so it can also be passed to Explain().
        Query query = oParser.Parse(sSearchQuery);
        Hits oHitColl = searcher.Search(query);

        for (int i = 0; i < oHitColl.Length(); i++)
        {
            // Generate the explanation of a single document for the query.
            Explanation explanation = searcher.Explain(query, oHitColl.Id(i));
            Document oDoc = oHitColl.Doc(i);
            string conteudo = oDoc.Get("content");
            if (conteudo != null)
            {
                Label1.Text = Label1.Text + explanation.ToString() + "<br>"; // output explanation
                Label2.Text = Label2.Text + "</p>-------------<br>";
            }
        }
//....................................................................................................................................
As you can see, the code is nothing special.



Everything went well, except that I don't understand something in the output:



1,356585 = fieldWeight(content:açores in 448), product of:
  1 = tf(termFreq(content:açores)=1)
  2,713169 = idf(docFreq=85)
  0,5 = fieldNorm(field=content, doc=448)
----------
0,4239327 = fieldWeight(content:açores in 253), product of:
  2 = tf(termFreq(content:açores)=4)
  2,713169 = idf(docFreq=85)
  0,078125 = fieldNorm(field=content, doc=253)
----------
0,4153675 = fieldWeight(content:açores in 125), product of:
  2,44949 = tf(termFreq(content:açores)=6)
  2,713169 = idf(docFreq=85)
  0,0625 = fieldNorm(field=content, doc=125)
----------
0,3791769 = fieldWeight(content:açores in 210), product of:
  2,236068 = tf(termFreq(content:açores)=5)
  2,713169 = idf(docFreq=85)
  0,0625 = fieldNorm(field=content, doc=210)
----------
0,3671364 = fieldWeight(content:açores in 259), product of:
  1,732051 = tf(termFreq(content:açores)=3)
  2,713169 = idf(docFreq=85)
  0,078125 = fieldNorm(field=content, doc=259)
----------
0,3634466 = fieldWeight(content:açores in 95), product of:
  2,44949 = tf(termFreq(content:açores)=6)
  2,713169 = idf(docFreq=85)
  0,0546875 = fieldNorm(field=content, doc=95)
etc..
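
To make the "product of" lines concrete, here is a minimal sketch that multiplies the three factors from a few of the entries above. It is only an illustration, not Lucene.Net code: the class FieldWeightCheck and the helper FieldWeight are hypothetical names, and it assumes tf is the square root of termFreq, which is consistent with the tf values printed in the output (2 for termFreq=4, 2,44949 for termFreq=6). The idf and fieldNorm values are copied from the output, not computed.

```csharp
using System;

class FieldWeightCheck
{
    // Hypothetical helper: multiplies the three factors shown by Explain().
    // Assumption: tf = sqrt(termFreq), which matches the tf values in the output.
    static double FieldWeight(int termFreq, double idf, double fieldNorm)
    {
        double tf = Math.Sqrt(termFreq);
        return tf * idf * fieldNorm;
    }

    static void Main()
    {
        // idf(docFreq=85), copied from the output; identical for every hit.
        const double idf = 2.713169;

        // doc 448: termFreq=1, fieldNorm=0.5
        Console.WriteLine(FieldWeight(1, idf, 0.5));
        // doc 253: termFreq=4, fieldNorm=0.078125
        Console.WriteLine(FieldWeight(4, idf, 0.078125));
        // doc 125: termFreq=6, fieldNorm=0.0625
        Console.WriteLine(FieldWeight(6, idf, 0.0625));
    }
}
```

The three printed values should come out close to the 1,356585, 0,4239327 and 0,4153675 reported above (the decimal separator depends on the current culture). Since idf is the same in every entry, the differences come from tf and fieldNorm: doc 448 has the lowest tf but a fieldNorm of 0,5, against 0,0625 for doc 125.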


So here is my question:
Shouldn't the result order be based on termFreq? How can a document with termFreq = 1 be at the
top of the list?
Shouldn't the document with (termFreq(content:açores)=1) be last and the ones with (termFreq(content:açores)=6)
be first?
Am I doing something wrong? Is there something about the Lucene score formula that I should know?

Thanks!

PS - Please respond!!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

