lucenenet-dev mailing list archives

From Robert Stewart <Robert_Stew...@epam.com>
Subject Re: [Lucene.Net] Lucene Steroids
Date Thu, 07 Jul 2011 11:10:28 GMT
I have built something similar using NTFS hard links and re-using existing local snapshot files.
It has been running in production for 3+ years now with more than 100 million docs, and distributes
new snapshots from master servers every minute.  It does not use rsync; it relies only on the
unique file names in Lucene: it copies only files that do not already exist on the slaves, and uses
NTFS hard links to "copy" existing local files into the new snapshot directory.  On the masters,
it likewise uses NTFS hard links to create a new "snapshot" of the master index, and the slaves
just look for new snapshot directories on the master servers.  When a new directory shows up,
a slave compares it with its existing local snapshot to see which files are new on the master
(or have been deleted by the master), and then copies only the new files.  It does not need to
send any explicit commit operations, and there is no explicit communication between masters and
slaves (slaves just look in a remote directory for new snapshot sub-directories).  This has worked
great with no problems at all.  All this was built before SOLR was available on Windows.  Going
forward we are transitioning to Java and SOLR on Linux (it is just too hard to keep up with
improvements otherwise IMO).
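
For illustration only, the scheme described above could be sketched roughly as follows. This is not
the production code; the function names and directory layout are hypothetical, and it assumes every
index file is write-once (a file rewritten in place under the same name would have to be copied on
every pull instead of hard-linked).

```python
import os
import shutil


def create_master_snapshot(index_dir, snapshot_dir):
    """Master side: hard-link every file of the live index into a new snapshot directory."""
    os.makedirs(snapshot_dir)
    for name in os.listdir(index_dir):
        src = os.path.join(index_dir, name)
        if os.path.isfile(src):
            # A hard link costs no extra disk space and no data copy.
            os.link(src, os.path.join(snapshot_dir, name))


def pull_snapshot(master_snapshot, prev_local, new_local):
    """Slave side: build new_local so that it mirrors master_snapshot.

    Files already present in prev_local are hard-linked locally; only files
    that are new on the master are copied. Files the master has deleted are
    simply never linked into the new snapshot.
    Returns (linked, copied) lists of file names.
    """
    tmp = new_local + ".tmp"  # build under a temporary name, publish with a rename
    os.makedirs(tmp)
    linked, copied = [], []
    for name in sorted(os.listdir(master_snapshot)):
        local_prev = os.path.join(prev_local, name) if prev_local else None
        if local_prev and os.path.isfile(local_prev):
            os.link(local_prev, os.path.join(tmp, name))
            linked.append(name)
        else:
            shutil.copy2(os.path.join(master_snapshot, name), os.path.join(tmp, name))
            copied.append(name)
    os.rename(tmp, new_local)
    return linked, copied
```

A slave would run something like pull_snapshot() whenever a new snapshot sub-directory appears in
the master's shared directory, then reopen its searcher on the new local directory. Building under
a temporary name and renaming at the end is an addition of this sketch, so that a half-copied
snapshot is never mistaken for a complete one.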



On Jul 6, 2011, at 8:22 PM, Guilherme Balena Versiani wrote:

> Hi,
> 
> I am working on a derived work of Solr for .NET. The purpose is to obtain a replication solution
> for Lucene similar to the one available in Solr, but without the need to port all of the Solr code.
> 
> There is a SnapShooter, a SnapPuller and a SnapInstaller. The SnapShooter does work similar
> to the Solr script. The SnapPuller uses cwRsync to replicate the database between machines,
> but without storing the snapshot.current.MACHINENAME files on the master, as cwRsync does not
> support sync with the server. The SnapInstaller tries to substitute the Lucene database files
> "in-place" -- the Lucene application should use a "SteroidsFSDirectory" that creates a special
> "SteroidsFSIndexInput" that permits renaming files that are in use; after that, SnapInstaller
> sends a "commit" operation through a Windows named pipe to the application to reset its
> current IndexSearcher instance.
> 
> This solution has the "suggestive" name of Lucene Steroids, and is hosted on BitBucket.org.
> What is the best way to continue to distribute it? Should I continue to maintain it on BitBucket.org
> or should I apply to the Lucene.NET project (I don't know how) to have it included in the Contrib modules?
> 
> The current code is available at http://bitbucket.org/guibv/lucene.steroids. The work
> is incomplete; the first stable version should be available in the next few days.
> 
> Best regards,
> Guilherme Balena Versiani.

