lucenenet-dev mailing list archives

From Elena Demidova <demid...@l3s.de>
Subject Remote searches with Lucene
Date Fri, 11 Aug 2006 10:59:42 GMT
Dear All,

The application I am working on is intended to use the distributed 
search capabilities of the Lucene library. While working with Lucene's 
RemoteSearchable class, I ran into some problems caused by the current 
Lucene implementation. Below I describe them, together with the possible 
solutions I have identified. The most important question for me is 
whether these changes have a chance of being integrated into upcoming 
Lucene versions, so that remote searches become really feasible. I would 
appreciate any feedback.

Best wishes,
Elena Demidova

Now to the problems themselves:

1. Architecture issue

The first problem concerns the construction of the RemoteSearchable 
object. The .NET Framework supports both server and client activation 
models for remote objects. Currently, the RemoteSearchable class has 
only one constructor, and it requires a local Searchable object:

public RemoteSearchable(Lucene.Net.Search.Searchable local)

Since this "local" object is located on the server, creating it requires 
knowledge of the server's index paths. However, there are at least some 
scenarios where only the server, and not the client, knows where the 
indexes are stored. I think this problem could be solved by extending 
the RemoteSearchable class with a default (parameterless) constructor 
that reads the names of the indexes to be published from a configuration 
file on the server side.
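
A minimal sketch of the configuration side of this proposal, under stated 
assumptions: the index locations are kept in a plain text file on the 
server, one path per line, with blank lines and lines starting with '#' 
ignored. The IndexPathConfig class, its Parse method and the file format 
are hypothetical and are not part of Lucene.Net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Lucene.Net: parses a server-side list
// of index paths, one per line.
public static class IndexPathConfig
{
    public static List<string> Parse(IEnumerable<string> lines)
    {
        List<string> paths = new List<string>();
        foreach (string line in lines)
        {
            string trimmed = line.Trim();
            // Skip blank lines and comment lines.
            if (trimmed.Length == 0 || trimmed[0] == '#')
            {
                continue;
            }
            paths.Add(trimmed);
        }
        return paths;
    }
}
```

On the server, a parameterless constructor could call 
IndexPathConfig.Parse(System.IO.File.ReadAllLines(configFile)) and open 
one searcher per returned path, so that clients never need to know where 
the indexes are stored. Here configFile is again a hypothetical name.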

2. Bug in Term construction

Another problem occurs when you call a method on a RemoteSearchable 
object. The only method that really works correctly is MaxDoc(). If you 
ask, for instance, for the document frequency using 
DocFreq(new Term("field", "value")), you will always get 0. The reason 
is that all values passed as arguments to (and returned from) remote 
calls need to be serialized correctly. For DocFreq this argument is the 
Term object, which cannot be correctly reconstructed on the server side. 
The Term constructor performs an additional "intern" operation on the 
field name, and this operation is not performed during default 
deserialization. As a result, the field names in the reconstructed Term 
object are not comparable with those in the index.
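
The effect can be reproduced without Lucene at all. The sketch below 
assumes that the index code compares field names by reference rather 
than by value, which is what the description above implies. InternDemo 
and its methods are illustrative names only.

```csharp
using System;

public static class InternDemo
{
    // Returns a fresh string instance with the same characters, standing
    // in for a field name rebuilt by default deserialization.
    public static string Copy(string s)
    {
        return new string(s.ToCharArray());
    }

    // Reference comparison, as assumed for the field-name checks.
    public static bool SameInstance(string a, string b)
    {
        return Object.ReferenceEquals(a, b);
    }

    public static void Run()
    {
        string indexed = String.Intern("field"); // as stored by the Term constructor
        string received = Copy("field");         // as it arrives on the server

        Console.WriteLine(indexed == received);                             // True
        Console.WriteLine(SameInstance(indexed, received));                 // False
        Console.WriteLine(SameInstance(indexed, String.Intern(received)));  // True
    }
}
```

The two strings have equal contents, but only the interned copy is the 
same object as the one held on the index side.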

This problem can be solved by customizing the serialization of Term 
objects. To do that, the Term class should implement the ISerializable 
interface and its serialization method, GetObjectData. The class itself 
needs to store the "intern" value passed to its constructor, since this 
value is required to reconstruct the object correctly. GetObjectData 
then describes how the object is serialized, and an additional 
deserialization constructor allows the object to be reconstructed 
correctly. Both operations are called automatically during the remote 
call. The necessary code changes in the Term class are as follows:

// Required for ISerializable, SerializationInfo and StreamingContext
using System.Runtime.Serialization;

// Add the ISerializable interface to the class declaration
[Serializable()]
public sealed class Term : System.IComparable, ISerializable
…
// Store the intern flag passed to the constructor
private bool intern;

internal Term(System.String fld, System.String txt, bool intern)
{
    …
    // Store the intern flag
    this.intern = intern;
}

// Serialization function
public void GetObjectData(SerializationInfo info, StreamingContext context)
{
    info.AddValue("field", field);
    info.AddValue("text", text);
    info.AddValue("intern", intern);
}

// Deserialization constructor
public Term(SerializationInfo info, StreamingContext ctxt)
{
    String fld = (String)info.GetValue("field", typeof(String));
    this.intern = (bool)info.GetValue("intern", typeof(bool));
    // Re-apply the intern operation that default deserialization skips
    this.field = intern ? String.Intern(fld) : fld;
    this.text = (String)info.GetValue("text", typeof(String));
}
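
The behaviour of such a deserialization constructor can be checked 
without a remoting channel. The sketch below uses FieldName, a 
simplified, hypothetical stand-in for Term, and fills the 
SerializationInfo by hand instead of going through a formatter. 
SimulateReceive copies the string to imitate the fresh instance that 
arrives on the receiving side; it is an illustration of the idea, not 
the actual remoting code path.

```csharp
using System;
using System.Runtime.Serialization;

// Simplified stand-in for Term (hypothetical): field name and intern flag only.
[Serializable]
public sealed class FieldName : ISerializable
{
    public readonly string Field;
    private readonly bool intern;

    public FieldName(string fld, bool intern)
    {
        this.intern = intern;
        this.Field = intern ? String.Intern(fld) : fld;
    }

    // Deserialization constructor: re-applies the intern step.
    private FieldName(SerializationInfo info, StreamingContext context)
    {
        string fld = info.GetString("field");
        this.intern = info.GetBoolean("intern");
        this.Field = this.intern ? String.Intern(fld) : fld;
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("field", Field);
        info.AddValue("intern", intern);
    }

    // Imitates the receiving side: values arrive as fresh string instances
    // and are handed to the deserialization constructor.
    public static FieldName SimulateReceive(FieldName sent)
    {
        StreamingContext ctx = new StreamingContext(StreamingContextStates.Remoting);
        SerializationInfo info = new SerializationInfo(typeof(FieldName), new FormatterConverter());
        sent.GetObjectData(info, ctx);

        SerializationInfo wire = new SerializationInfo(typeof(FieldName), new FormatterConverter());
        wire.AddValue("field", new string(info.GetString("field").ToCharArray()));
        wire.AddValue("intern", info.GetBoolean("intern"));
        return new FieldName(wire, ctx);
    }
}
```

With the intern flag set, the Field of the object returned by 
SimulateReceive is the same instance as String.Intern("field"); with the 
flag cleared, it stays a separate copy.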
