lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shad Storhaug (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>
Date Wed, 09 Aug 2017 20:41:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120627#comment-16120627
] 

Shad Storhaug commented on LUCENENET-469:
-----------------------------------------

{quote}It's this Enum class I was calling the "collection" type 'cos it kind of holds the
collection state.

What I was proposing was to make the Enum also look like an IEnumerable so that you can foreach
over it.
Which means it needs a GetEnumerable() to return an IEnumerator. This new enumerator is the
EnumEnumerator type from in previous messages.

So, I'm taking an Enum, adding an IEnumerable facade, to return a new IEnumerable type which
pokes the Enum to move forward.{quote}

That is an interesting idea.

I am not sure if the state is actually being "iterated over" so much as it is being calculated
on the fly. But then, {{IEnumerable<T>}} is just an interface - it doesn't actually
define any behavior. In fact, I had to redefine what the key and value {{ICollection<T>}}
s do in CharArrayMap's KeyCollection and ValueCollection, so maybe this is also possible.

> NB: some thought would be needed to define what the "item" class (representing Current)
should look like in each case.
> NB: Calling a Current changing method (ie one of the Seek...() methods) from inside a
foreach would result in undefined behavior.

So then we end up with some behavior that doesn't fit the mold.

I thought for a moment that it could be possible these SeekCeil and SeekExact methods are
only defined here because there are no such thing as extension methods in Java, but since
they are overridden in subclasses, that is probably not a valid assumption.

Speaking of which, I suggest you have a look at how they are used in the TermsEnum overloads
in the codecs (for example [Lucene40TermVectorsReader](https://github.com/apache/lucenenet/blob/468199e3fa95c7e1f77b14e7f00aeaafd7c2f8b9/src/Lucene.Net/Codecs/Lucene40/Lucene40TermVectorsReader.cs#L446))
or [Lucene45DocValuesProducer.TermsEnumAnonymousInnerClassHelper](https://github.com/apache/lucenenet/blob/468199e3fa95c7e1f77b14e7f00aeaafd7c2f8b9/src/Lucene.Net/Codecs/Lucene45/Lucene45DocValuesProducer.cs#L1090).

As you can see, the complexity of these enums becomes quite high (lots of internal state),
and it seems to me it would be simpler to port future changes from Lucene if the general structure
of the "extras" (that is, the properties and Seek* methods) were left alone (or at least left
intact and used by some other type of facade). The DocsAndPositionsEnum s are even more complex
than this.

The only thing I was hoping to get out of this was the added ability to use a foreach loop
on Terms (and possibly the DocIdSetIterator). Sure we could refactor the TermsEnum data structure
to be more .NET-like, but then where would we be when we need to connect the dots on the next
Lucene upgrade? I am not necessarily against it, but finding bugs in the codec code is no
picnic, so we would need some clear way to ensure future codecs behavior can be mapped onto
the new structure without causing too much pain. Keep in mind, the codecs change in every
major (and nearly every minor) version.

{quote}Phew... Did that make sense; fit with your view of the universe?{quote}

Wow, do I sound so closed minded? I think I need to work on my delivery :). 

> Convert Java Iterator classes to implement IEnumerable<T>
> ---------------------------------------------------------
>
>                 Key: LUCENENET-469
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-469
>             Project: Lucene.Net
>          Issue Type: Sub-task
>          Components: Lucene.Net Contrib, Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3, Lucene.Net
4.8.0
>         Environment: all
>            Reporter: Christopher Currens
>             Fix For: Lucene.Net 4.8.0
>
>
> The Iterator pattern in Java is equivalent to IEnumerable in .NET.  Classes that were
directly ported in Java using the Iterator pattern, cannot be used with Linq or foreach blocks
in .NET.
> {{Next()}} would be equivalent to .NET's {{MoveNext()}}, and in the below case, {{Term()}}
would be as .NET's {{Current}} property.  In cases as below, it will require {{TermEnum}}
to become an abstract class with {{Term}} and {{DocFreq}} properties, which would be returned
from another class or method that implemented {{IEnumerable<TermEnum>}}.
> {noformat} 
> 	public abstract class TermEnum : IDisposable
> 	{
> 		public abstract bool Next();
> 		public abstract Term Term();
> 		public abstract int DocFreq();
> 		public abstract void  Close();
> 	        public abstract void Dispose();
> 	}
> {noformat} 
> would instead look something like:
> {noformat} 
> 	public class TermFreq
> 	{
> 		public abstract Term { get; }
> 		public abstract int { get; }
> 	}
>         public abstract class TermEnum : IEnumerable<TermFreq>, IDisposable
>         {
>                 // ...
>         }
> {noformat}
> Keep in mind that it is important that if the class being converted implements {{IDisposable}},
the class that is enumerating the terms (in this case {{TermEnum}}) should inherit from both
{{IEnumerable<T>}} *and* {{IDisposable}}.  This won't be any change to the user, as
the compiler automatically calls {{IDisposable}} when used in a {{foreach}} loop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message