lucenenet-dev mailing list archives

From Venkadesh Thangavel <Venkadesh.Thanga...@genesys.com>
Subject RE: HINDI LANGUAGE ANALYZER FROM LUCENE
Date Thu, 09 Apr 2020 12:40:31 GMT
Hello Shad,

Thanks for your assistance.

> Is there some reason you are not using the NuGet package instead of using the source code
> inside of your project?

We are already using the existing (older) Lucene.NET source code in our solution, which is
why we tried to incorporate the Hindi analyzer alongside the existing languages in our project.

I will check internally and get in touch if we need any assistance.

Thanks & Regards,
Venkadesh T

-----Original Message-----
From: Shad Storhaug <shad@shadstorhaug.com> 
Sent: 09 April 2020 18:02
To: Venkadesh Thangavel <Venkadesh.Thangavel@genesys.com>; dev@lucenenet.apache.org
Cc: Gal Ferrera <Gal.Ferrera@genesys.com>; Tidhar Israel <tidhar.israel@genesys.com>;
Subramaniyan Hariharan <Subramaniyan.Hariharan@genesys.com>; michael@bongohr.org
Subject: RE: HINDI LANGUAGE ANALYZER FROM LUCENE

Hi Venkadesh,

Please see the update to TokenStream where we implement the .NET dispose pattern:

https://github.com/apache/lucenenet/blob/master/src/Lucene.Net/Analysis/TokenStream.cs#L189-L198

The Close() method was replaced with Dispose(bool) consistently throughout the source code.
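
As a rough illustration of that pattern, here is a minimal sketch. The class names `SketchStream` and `SketchTokenizer` are hypothetical stand-ins, not the actual Lucene.NET types; see the linked TokenStream.cs for the real implementation.

```csharp
using System;

// Hypothetical stand-in for a stream-like base class (NOT the real TokenStream).
public abstract class SketchStream : IDisposable
{
    // Public entry point; callers use this (or a using block) where Java code called close().
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Subclasses override this method where older code overrode Close().
    protected virtual void Dispose(bool disposing)
    {
    }
}

// Hypothetical subclass showing where cleanup code would go.
public sealed class SketchTokenizer : SketchStream
{
    public bool IsDisposed { get; private set; }

    protected override void Dispose(bool disposing)
    {
        if (!IsDisposed)
        {
            if (disposing)
            {
                // Release managed resources here (for example, an underlying reader).
            }
            IsDisposed = true;
        }
        base.Dispose(disposing);
    }
}
```

Under these assumptions, code that previously overrode Close() in a subclass would move its cleanup logic into the Dispose(bool) override.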

Is there some reason you are not using the NuGet package instead of using the source code
inside of your project? If you cannot use the NuGet package, then I suggest bringing all of
the source code into your project from a specific release to account for all of the
API changes, rather than doing it piece by piece.


Regards,
Shad Storhaug (NightOwl888)
Project Chairperson - Apache Lucene.NET


-----Original Message-----
From: Venkadesh Thangavel <Venkadesh.Thangavel@genesys.com> 
Sent: Thursday, April 9, 2020 6:29 PM
To: Shad Storhaug <shad@shadstorhaug.com>; dev@lucenenet.apache.org; michael@bongohr.org
Cc: Gal Ferrera <Gal.Ferrera@genesys.com>; Tidhar Israel <tidhar.israel@genesys.com>;
Subramaniyan Hariharan <Subramaniyan.Hariharan@genesys.com>
Subject: RE: HINDI LANGUAGE ANALYZER FROM LUCENE

Hello Shad/Michael,

Below is the GitHub path we used to download the source code for the Hindi analyzer.

https://github.com/apache/lucenenet/tree/master/src/Lucene.Net.Analysis.Common/Analysis

Yes, it is early source code of Lucene.NET 4.8.0.

Thanks & Regards,
Venkadesh T

-----Original Message-----
From: Shad Storhaug <shad@shadstorhaug.com> 
Sent: 09 April 2020 14:09
To: dev@lucenenet.apache.org; Venkadesh Thangavel <Venkadesh.Thangavel@genesys.com>;
michael@bongohr.org
Cc: Gal Ferrera <Gal.Ferrera@genesys.com>; Tidhar Israel <tidhar.israel@genesys.com>;
Subramaniyan Hariharan <Subramaniyan.Hariharan@genesys.com>
Subject: RE: HINDI LANGUAGE ANALYZER FROM LUCENE

Hi Venkadesh,

Are these changes being copied into a Lucene.NET 3.x application or to early source code of
Lucene.NET 4.8.0 prior to the release on NuGet?

To conform with .NET conventions, we have renamed "close()" in Java to the corresponding method
in .NET, "Dispose()" in Lucene.NET 4.8.0. In Java, there is a Closeable interface that has
been replaced by the nearest .NET match, IDisposable. As such, we wouldn't want to use "Close()"
in .NET unless we are not using "Dispose()" according to .NET best practices (for example,
if it is expected that the object will be "opened" again).
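
To make the calling convention concrete, the sketch below shows how a caller would typically consume an IDisposable in C#, in place of an explicit close() call as in Java. `SketchResource` and `SketchUsage` are hypothetical names used only for illustration and are not part of Lucene.NET.

```csharp
using System;

// Hypothetical resource, used only to show the calling convention.
public sealed class SketchResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class SketchUsage
{
    // Returns true if the using block disposed the resource on exit.
    public static bool Run()
    {
        var resource = new SketchResource();
        using (resource)
        {
            // Work with the resource here; Dispose() runs when the block exits,
            // even if an exception is thrown.
        }
        return resource.IsDisposed;
    }
}
```

An object that is meant to be reopened after closing would not fit this convention well, which is the exception described above.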

Hi Michael,

We have an open JIRA ticket to review these changes to ensure we are following .NET best practices
(https://issues.apache.org/jira/projects/LUCENENET/issues/LUCENENET-626), and another that
seemed to point in the direction that TokenStreams were being reopened (https://issues.apache.org/jira/projects/LUCENENET/issues/LUCENENET-611),
thus might be better off using "Close()" instead of "Dispose()" in that case. However, a bug
has since been patched in the TestFramework that may have been the culprit causing this
"reopen" to occur.

We need to do a full assessment from a usability point of view to be sure of which to use,
and in what cases, but in general we should use "Dispose()" unless we have a good reason not
to.


Regards,
Shad Storhaug (NightOwl888)
Project Chairperson - Apache Lucene.NET


-----Original Message-----
From: michael@bongohr.org <michael@bongohr.org> 
Sent: Thursday, April 9, 2020 2:08 PM
To: 'Venkadesh Thangavel' <Venkadesh.Thangavel@genesys.com>; dev@lucenenet.apache.org
Cc: 'Gal Ferrera' <Gal.Ferrera@genesys.com>; 'Tidhar Israel' <tidhar.israel@genesys.com>;
'Subramaniyan Hariharan' <Subramaniyan.Hariharan@genesys.com>
Subject: RE: HINDI LANGUAGE ANALYZER FROM LUCENE

Dear Venkadesh

Thanks so much for your follow up. If necessary, would it be possible for me to incorporate
your changes to the source code to benefit other users?

Kind regards
Michael

-----Original Message-----
From: Venkadesh Thangavel <Venkadesh.Thangavel@genesys.com> 
Sent: Thursday, April 2, 2020 2:03 PM
To: michael@bongohr.org
Cc: Gal Ferrera <Gal.Ferrera@genesys.com>; Tidhar Israel <tidhar.israel@genesys.com>;
Subramaniyan Hariharan <Subramaniyan.Hariharan@genesys.com>
Subject: RE: HINDI LANGUAGE ANALYZER FROM LUCENE

Hello Michael,

Thanks for your response.

Please find the detailed steps below.

1. We downloaded the latest Lucene.NET source code for the Hindi analyzer from the path below:
https://github.com/apache/lucenenet/tree/master/src/Lucene.Net.Analysis.Common/Analysis/Hi

2. We already had a Lucene.NET project supporting other languages, and we did not want to
disturb the older code.
Hence we took only the Hindi analyzer from the latest Lucene.NET and added it to our existing
Lucene.NET project.

3. We found that some classes were missing, so we added them separately for the Hindi analyzer
and its dependent classes, without disturbing the existing ones, and compiled the source code.

4. Compilation then failed with the error "close() method is missing", raised in the latest
TokenStream.cs file.

5. We added a virtual method as shown below.

public virtual void Close()
{
}

6. This method was then overridden in the following derived classes.

ClassicTokenizer.cs
StandardTokenizer.cs
UAX29URLEmailTokenizer.cs
CharTokenizer.cs
FilteringTokenFilter.cs
TokenFilter.cs

With that, the problem was resolved and the code runs successfully.
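
For comparison only, and not what was done in the steps above: a Close() shim could also forward to Dispose() rather than having an empty body, so that older callers and the newer dispose pattern share one cleanup path. This is a hedged sketch with hypothetical class names (`ShimStream`, `ShimTokenizer`); it is not taken from the Lucene.NET source.

```csharp
using System;

// Hypothetical base class (NOT the real TokenStream).
public abstract class ShimStream : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Single place where cleanup happens; subclasses override this.
    protected virtual void Dispose(bool disposing)
    {
        IsDisposed = true;
    }

    // Compatibility shim (an assumption for illustration): older callers that
    // still invoke Close() end up in the same cleanup path as Dispose().
    public void Close()
    {
        Dispose();
    }
}

// Hypothetical subclass with no extra resources of its own.
public sealed class ShimTokenizer : ShimStream
{
}
```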

Thanks in advance.

Kindly let us know if you require any further input.

Thanks & Regards,
Venkadesh T

-----Original Message-----
From: Gal Ferrera <Gal.Ferrera@genesys.com>
Sent: 31 March 2020 19:30
To: Venkadesh Thangavel <Venkadesh.Thangavel@genesys.com>
Subject: Re: HINDI LANGUAGE ANALYZER FROM LUCENE

Did you notice?



On 31/03/2020, 16:31, "Michael Condillac on behalf of dev@bongohr.org" <michael@bongohr.org
on behalf of dev@bongohr.org> wrote:

    Hi Venkadesh,
    
    I am new to the project but if you can give me some more technical details
    and specific errors you are seeing I can try and reproduce your issues. 
    
    Thanks
    Michael
    
    -----Original Message-----
    From: Venkadesh Thangavel <Venkadesh.Thangavel@genesys.com> 
    Sent: Tuesday, March 31, 2020 7:15 PM
    To: dev@lucenenet.apache.org; user@lucenenet.apache.org
    Cc: Gal Ferrera <Gal.Ferrera@genesys.com>
    Subject: FW: HINDI LANGUAGE ANALYZER FROM LUCENE
    
    Hi,
    
    I have sent email to
    user@lucenenet.apache.org<mailto:user@lucenenet.apache.org> about the issue
    in LUCENE.NET but didn't receive any response.
    
    Can you please comment on my issue ?
    
    Thanks & Regards,
    Venkadesh T
    From: Venkadesh Thangavel
    Sent: 19 March 2020 16:40
    To: user@lucenenet.apache.org
    Subject: HINDI LANGUAGE ANALYZER FROM LUCENE
    
    Hello,
    
    I downloaded the Lucene.NET source code from GitHub in the middle of last
    year for Hindi language support and used the Hindi analyzer.
    
    I ran into compile errors regarding a missing Close method used by the
    analyzer, so I implemented that method in Lucene.NET myself.
    
    It then compiled successfully and we are using it.
    
    Is this the right approach, or did something go wrong?
    
    Kindly suggest.
    
    Thanks & Regards,
    Venkadesh T
    
    
    
    

