lucenenet-dev mailing list archives

From Shad Storhaug <s...@shadstorhaug.com>
Subject RE: Worsening of indexing performance with Lucene .Net 4.8_beta005 when compared with equivalent Java version and Lucene .Net 3.0.3
Date Fri, 15 Jun 2018 03:59:18 GMT
Juliya,

Thanks for the report.

Could you put together an integration test or a console application that demonstrates how
you have Lucene.Net configured to see these performance issues when compared with Lucene.Net
3.0.3 and Lucene 4.8.0? There are many factors at play including which Codec, which Analyzer(s),
and which data types you are using that may have less than optimal code, so we would need
more information to narrow it down.

Do note that the IndexWriter was designed to throw and catch exceptions at various
levels in the execution stack as a way to control the flow of the application, which is sure
to make Lucene.Net somewhat slower than Lucene, especially when debugging. This probably
doesn't explain all of the difference you are seeing, though.

Thanks,
Shad Storhaug (NightOwl888)


-----Original Message-----
From: juliya james [mailto:juliyamj@yahoo.co.in.INVALID] 
Sent: Monday, June 4, 2018 4:56 PM
To: dev@lucenenet.apache.org; mikemccand@apache.org; ehatcher@apache.org
Subject: Re: Worsening of indexing performance with Lucene .Net 4.8_beta005 when compared
with equivalent Java version and Lucene .Net 3.0.3

Hi All,
Any information on the performance problem with Lucene .Net 4.8.0 mentioned in the previous email
would be appreciated.
Thanks & Regards,
Juliya
 

    On Wednesday, April 4, 2018 6:21 PM, juliya james <juliyamj@yahoo.co.in> wrote:
 

Hi,
We are observing a substantial drop in indexing performance with Lucene .Net 4.8_beta005
compared with the equivalent Java version and with Lucene .Net 3.0.3.
The table below compares indexing time across Lucene versions for different index sizes.
As you can see, indexing time has almost doubled with .Net 4.8_beta005, especially at
larger index sizes.
Are there any known issues related to indexing performance with Lucene .Net 4.8? Or is there
any explanation for this behavior?

| # | Lucene .Net 4.8_beta005 index size (MB) | Lucene .Net 4.8_beta005 indexing time (s) | Lucene Java 4.8.0 index size (MB) | Lucene Java 4.8.0 indexing time (s) | Lucene .Net 3.0.3 index size (MB) | Lucene .Net 3.0.3 indexing time (s) |
| 1 | 5.4 | 3 | 5 | 2 | 8.2 | 1 |
| 2 | 27.46 | 14 | 25 | 8 | 40.6 | 8 |
| 3 | 41.32 | 21 | 32 | 13 | 58.49 | 12 |
| 4 | 47.66 | 32 | 45 | 15 | 78.85 | 16 |
| 5 | 95.3 | 60 | 90 | 25 | 157.3 | 33 |
| 6 | 238.14 | 143 | 221 | 62 | 388.15 | 82 |
| 7 | 476.4 | 282 | 385 | 140 | 771.09 | 169 |


Note:
- The quantity of input data given for indexing is the same for all measurements shown
  in a row; only the Lucene version was changed. The data was split into several documents,
  and each document may hold ~1 MB of data. Most of the data was indexed with the field
  properties [Field.Store.NO, Field.Index.ANALYZED_NO_NORMS].
- The "Index Size" column shows the size of the generated index (in MB), and the
  "Indexing Time" column shows the time taken to build it (in seconds).
Thanks & Regards,
Juliya
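
The "almost x2" figure in the report can be checked row by row from the table. The following is a minimal sketch in Python, not part of the original message: the numbers are copied from the table, while the names ROWS and slowdowns are made up for illustration. It divides the Lucene .Net 4.8_beta005 indexing time by the Lucene Java 4.8.0 and Lucene .Net 3.0.3 times for the same input.

```python
# Each tuple is one table row:
# (row, .Net 4.8_beta005 size MB, .Net 4.8_beta005 time s,
#  Java 4.8.0 size MB, Java 4.8.0 time s,
#  .Net 3.0.3 size MB, .Net 3.0.3 time s)
ROWS = [
    (1, 5.4, 3, 5, 2, 8.2, 1),
    (2, 27.46, 14, 25, 8, 40.6, 8),
    (3, 41.32, 21, 32, 13, 58.49, 12),
    (4, 47.66, 32, 45, 15, 78.85, 16),
    (5, 95.3, 60, 90, 25, 157.3, 33),
    (6, 238.14, 143, 221, 62, 388.15, 82),
    (7, 476.4, 282, 385, 140, 771.09, 169),
]


def slowdowns(rows):
    """Return (row, time ratio vs Java 4.8.0, time ratio vs .Net 3.0.3) per row."""
    return [
        (n, t_48 / t_java, t_48 / t_303)
        for n, _, t_48, _, t_java, _, t_303 in rows
    ]


if __name__ == "__main__":
    for n, vs_java, vs_303 in slowdowns(ROWS):
        print(f"row {n}: {vs_java:.2f}x vs Java 4.8.0, {vs_303:.2f}x vs .Net 3.0.3")
```

For these rows the ratio against Lucene Java 4.8.0 ranges from 1.5 to 2.4, and the largest input (row 7) comes out at about 2.0 against Java and about 1.7 against Lucene .Net 3.0.3. The timings are reported in whole seconds, so the ratios for the smallest inputs are coarse.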






   