lucenenet-dev mailing list archives

From "Simon Svensson (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (LUCENENET-600) Creating an IndexWriter with a RAMDirectory causes two exceptions to be thrown
Date Mon, 21 Jan 2019 08:14:00 GMT

     [ https://issues.apache.org/jira/browse/LUCENENET-600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Svensson closed LUCENENET-600.
------------------------------------
    Resolution: Won't Fix

I'm closing this as a Won't Fix since our current behavior matches that of Lucene 4.8. We
may revisit this issue in the future when we're looking into making the code look more like
.NET rather than a Java port.

> Creating an IndexWriter with a RAMDirectory causes two exceptions to be thrown
> ------------------------------------------------------------------------------
>
>                 Key: LUCENENET-600
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-600
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Howard van Rooijen
>            Priority: Minor
>
> I have a document scoring algorithm built on top of Lucene. I've just upgraded it to the 4.8.0-beta00005 packages (great job by the way).
> We essentially create an in-memory index for a single document in order to do some parsing / processing / scoring / classification.
> While running our test suite, I noticed that the CPU was spiking and that a large number of first-chance exceptions were being generated by these two lines of code:
> {{var directory = new RAMDirectory();}}
> {{var indexWriter = new IndexWriter(directory, new IndexWriterConfig(LuceneVersion.LUCENE_48, new ScorableDocumentAnalyzer(LuceneVersion.LUCENE_48)));}}
> The first exception is:
> {{'System.IO.FileNotFoundException' in Lucene.Net.dll ("segments.gen").}}
> The second exception is:
> {{'Lucene.Net.Index.IndexNotFoundException' in Lucene.Net.dll ("no segments* file found in RAMDirectory@21af1a5 lockFactory=Lucene.Net.Store.SingleInstanceLockFactory:}}
> Based on my reading and research, I believe this is because the RAMDirectory is initialised to be null, and when the IndexWriter is created it tries to query the RAMDirectory and a FileNotFoundException is thrown.
> Is it possible to initialise it as empty rather than null, i.e. so that reading the directory would not throw an exception? This might involve adding a "segments.gen" entry and a matching "segments_n" segmentinfo entry. Alternatively, is it possible not to throw an exception in this use case?
> Or do you have a suggestion for how it would be possible to manually initialise the RAMDirectory before passing it to the IndexWriter?
> Because these two lines are called per request, we're seeing 2 exceptions per request, which seems like an expensive way of initialising an IndexWriter. We've already had to replace QueryParser with SimpleQueryParser because QueryParser was throwing 50+ exceptions internally when being instantiated.
> If anyone can point me in the right direction, I'd be more than happy to try and create a fix / PR. But since RAMDirectory is often used for unit testing scenarios, I'm wondering: does anyone have any deep knowledge about why this behaviour is the default?
> Many Thanks,
> Howard
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
