lucenenet-dev mailing list archives

From GitBox <>
Subject [GitHub] [lucenenet] NightOwl888 commented on issue #403: How to use HIGH_COMPRESSION in Lucene.Net 4.8
Date Fri, 22 Jan 2021 01:08:56 GMT

NightOwl888 commented on issue #403:

   @rclabo is partially correct. This requires a custom codec configuration, and the implementation
he provided is what I would recommend. However, when *reading* an index, the codec must also be
registered so that Lucene.NET can instantiate it.
   When *writing* the index, the codec can be specified as:

```csharp
IndexWriterConfig indexConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, standardAnalyzer);
indexConfig.Codec = new Lucene46HighCompressionCodec();
```

   Or, when properly registered (see registration steps below):

```csharp
IndexWriterConfig indexConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, standardAnalyzer);
indexConfig.Codec = Codec.ForName("Lucene46HighCompression");
```

   ## Codec Registration
   When *reading*, the codec is instantiated based on its text name (in this case `"Lucene46HighCompression"`)
which is stored in the index file. Codecs are supplied to Lucene.NET using an abstract factory
pattern (which fits nicely with dependency injection if you are using it).
   In your application's startup path you will need to add the following line to register
your custom codec with high compression:

```csharp
Codec.SetCodecFactory(new DefaultCodecFactory {
    CustomCodecTypes = new Type[] { typeof(Lucene46HighCompressionCodec) }
});
```

   > Alternatively, see the Lucene.NET codec documentation for examples of using
`Microsoft.Extensions.DependencyInjection` to register codecs. Registering
with other dependency injection containers is similar.
   > **NOTE:** Codecs are registered with a singleton lifestyle. That is, once a codec is instantiated,
the same instance with that name is used for all read/write operations.
   The `SetCodecFactory` call tells Lucene.NET which type to load. The `CodecNameAttribute` that
decorates the custom codec is optional and specifies the name of the codec.
   > **IMPORTANT:** Since high compression is binary incompatible with normal compression,
you should not reuse the name `"Lucene46"` for your custom codec. Reusing an existing
name is allowed, and the last registration wins, but it only works if the codec has the
exact same binary format.
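
   For reference, a codec of the kind discussed in this issue might look roughly like the following. This is a minimal sketch, not necessarily the exact implementation @rclabo posted: it assumes the Lucene.NET 4.8 `FilterCodec`, `CompressingStoredFieldsFormat`, and `CodecNameAttribute` types, and the 4096-byte chunk size is an illustrative assumption rather than a recommended value.

```csharp
using Lucene.Net.Codecs;
using Lucene.Net.Codecs.Compressing;
using Lucene.Net.Codecs.Lucene46;

// The name given here is what gets written to the index and is later
// passed to Codec.ForName(). It is deliberately different from "Lucene46".
[CodecName("Lucene46HighCompression")]
public sealed class Lucene46HighCompressionCodec : FilterCodec
{
    private readonly StoredFieldsFormat storedFieldsFormat;

    // A public parameterless constructor, so the codec can be created
    // directly with `new` or instantiated by the codec factory.
    public Lucene46HighCompressionCodec()
        : this(4096) // assumed chunk size, for illustration only
    {
    }

    public Lucene46HighCompressionCodec(int chunkSize)
        : base(new Lucene46Codec()) // delegate every other format to Lucene46
    {
        // Only stored fields are switched to high compression.
        storedFieldsFormat = new CompressingStoredFieldsFormat(
            "Lucene46HighCompression",
            CompressionMode.HIGH_COMPRESSION,
            chunkSize);
    }

    public override StoredFieldsFormat StoredFieldsFormat => storedFieldsFormat;
}
```

   Because everything except the stored fields format is delegated to `Lucene46Codec`, the rest of the index keeps the default format. The stored fields are not readable by the stock `"Lucene46"` codec, which is why the class carries its own name.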

