lucenenet-dev mailing list archives

From "Simon Svensson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENENET-537) Memory leak while indexing more than 500.000 documents.
Date Mon, 24 Feb 2014 16:19:19 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910441#comment-13910441 ]

Simon Svensson commented on LUCENENET-537:
------------------------------------------

Hi,

The dump can be compressed effectively with your favorite archive utility, but it will
probably still be several hundred megabytes, perhaps even over a gigabyte. It would be
best if you could publish it somewhere and share an http/ftp link. Keep in mind that the
memory dump may contain private information: it will contain whatever your process had
loaded at that moment (credentials, database connection strings, urls, ...).

Do you have a stack trace of the exception? Or any performance counter readings from when
it happens? Some relevant counters would be "Gen 0 heap size", "Gen 1 heap size", "Gen 2
heap size", "Large Object Heap size", and "% Time in GC", all found under .NET CLR Memory.

Can this problem be reliably reproduced? Does it occur every time you execute the code
snippet? Perhaps it would be easier to index the documents in batches of a few thousand,
calling IndexWriter.Flush between batches? What does your CreateIndexWriter do? Are you
calling SetRAMBufferSizeMB or SetMaxBufferedDocs with impossibly large values?
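
For illustration, here is a rough sketch of that batching idea, reusing the names from the
snippet quoted below. The 5000-document batch size and the 48 MB buffer are placeholder
values, not your actual configuration; Commit is used as the flush point since it is
definitely available on IndexWriter.

    using (IndexWriter writer = this.CreateIndexWriter(_directory))
    {
        // Keep the in-memory buffer modest; a huge value here would defeat
        // the point of flushing. 48 MB is a placeholder, not a recommendation.
        writer.SetRAMBufferSizeMB(48.0);
        writer.DeleteAll();

        int pending = 0;
        foreach (Product product in _service.GetAllProducts())
        {
            writer.AddDocument(CreateDocument(product));
            if (++pending >= 5000)
            {
                // Commit flushes the buffered documents and deletes to the
                // directory; IndexWriter.Flush would serve the same purpose.
                writer.Commit();
                pending = 0;
            }
        }

        writer.Commit();
        writer.Optimize();
    }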

// Simon

> Memory leak while indexing more than 500.000 documents.
> -------------------------------------------------------
>
>                 Key: LUCENENET-537
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-537
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 3.0.3
>         Environment: IIS
>            Reporter: Jörg Hubacher
>            Priority: Blocker
>
> When I'm reindexing a huge number of documents, I get an out-of-memory exception
after some hours.
>                 using (IndexWriter writer = this.CreateIndexWriter(_directory))
>                 {
>                     writer.DeleteAll();
>                     foreach (Product product in _service.GetAllProducts())
>                     {
>                         Document doc = CreateDocument(product);
>                         writer.AddDocument(doc);
>                     }
>                     writer.Commit();
>                     writer.Optimize();
>                 }
> This came up with version 3.0.3; it was fine before.



