lucenenet-dev mailing list archives

From Andy Pook <andy.p...@gmail.com>
Subject Segments files
Date Tue, 30 Apr 2013 18:56:02 GMT
I'm having some trouble understanding when the "segments_*" and
"segments.gen" files are in sync with the segment files.

New segment files are created when limits such as RAMBufferSizeMB are
exceeded and when merges happen (i.e. when Flush() is called). All good.
However, the corresponding segments_x and segments.gen files are only
updated when a Commit() is performed.

This means that there are valid segment files on disk that are not
referenced by the current segments_x file. So if the process hosting the
IndexWriter dies (e.g. a machine power failure), a large number of segments
will be deleted on restart because they are not referenced via the
segments.gen and segments_x files.

Looking at the code this seems to be "by design". But my naive perspective
suggests that these should be kept in sync with the actual segments written
to disk.

Flush() will write the segment files, but only Commit() will write the
segments.gen and segments_x files.
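
The behaviour described above can be sketched as a toy model. This is not
Lucene.NET code: ToyIndexDirectory and its methods are hypothetical names
invented for illustration, the file names are only examples, and the model
assumes that only the most recent segments_x file is kept.

```python
# Toy model only: hypothetical names, not the Lucene.NET API.
class ToyIndexDirectory:
    def __init__(self):
        self.files = set()    # every file currently "on disk"
        self.generation = 0   # the x in the latest segments_x file
        self.committed = {}   # generation -> segment files listed in segments_x

    def flush(self, segment_name):
        """Write a new segment file without touching segments_x."""
        self.files.add(segment_name)

    def commit(self):
        """Write a new segments_x that references every segment on disk."""
        segments = {f for f in self.files if not f.startswith("segments")}
        self.generation += 1
        self.committed[self.generation] = segments
        self.files.add("segments_%d" % self.generation)
        self.files.add("segments.gen")

    def recover_after_crash(self):
        """Keep only what the latest segments_x references; delete the rest."""
        keep = set(self.committed.get(self.generation, set()))
        if self.generation:
            keep |= {"segments_%d" % self.generation, "segments.gen"}
        deleted = self.files - keep
        self.files = keep
        return sorted(deleted)
```

In this sketch, flushing _0.cfs, committing, then flushing _1.cfs and
_2.cfs before a crash leaves only _0.cfs referenced; recover_after_crash()
removes the two later segments even though they were fully written.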

Can anyone give some background on this? (My Google-fu doesn't seem to be
working today.)

Thanks,
  Andy
