lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shad Storhaug (JIRA)" <>
Subject [jira] [Commented] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>
Date Wed, 09 Aug 2017 14:23:01 GMT


Shad Storhaug commented on LUCENENET-469:

No problem. I am just a bit surprised given the choice for some prime green field development
vs this headache that you would choose this path. But nonetheless, my aching head thanks you

By now you have probably worked out that {{IBytesRefIterator}} should inherit {{IEnumerator<T>}}
and that implementing that interface isn't much more complicated than dividing the 1 {{BytesRef
Next();}} method into {{bool MoveNext();}} and {{BytesRef Current \{ get;\}}}. The other fun
stuff such as {{SeekCeil()}} and {{SeekExact()}} most likely won't be affected. And yes, this
means the {{Dispose()}} method should be a no-op in our case, but who knows, maybe it won't
be in a subclass somewhere - just make sure to put in the dispose pattern so we are covered.

The only real trick is ensuring the other methods on {{TermsEnum}} and its subclasses are
exposed when implementing the {{IEnumerable<BytesRef>}} on {{Terms}} (and anybody else
using {{TermsEnum}}). As I mentioned before, I think overloading {{GetEnumerator()}} with
a method that returns the {{TermsEnum}} type is the solution there. But, that probably doesn't
take into account any subclasses of {{Terms}} that have a different type of iterator returned
from their {{GetIterator()}} method than the abstract class {{TermsEnum}}.

{quote}Lastly, there's the observation that the while vs foreach examples you gave are not
equivalent as the while version does not call Dispose(). {quote}

That is just because {{TermsEnum}} doesn't implement {{IDisposable}} yet. Of course, that
all changes when implementing {{IEnumerator<T>}}...

And there are several places throughout the codebase where enumerators are utilized without
a using block or calling {{Dispose()}}. The tests aren't such a big concern, but we really
shouldn't be doing this in the production code. I am sure there are some places where you
can put your own implementation with its own unmanaged resources and watch the resource leaks
pile up because we aren't doing our job and calling {{Dispose()}}. The real trick is locating
them. I really wish Microsoft would put a feature in Visual Studio that searches for code
that should exist but doesn't :).

Then again, I haven't been using Code Analysis. Maybe that (or Resharper?) can be used to
quickly detect all of the {{Dispose()}} violations. If you can find them, maybe that is a
PR that should go ahead of the refactoring.

{quote}It's odd for a type to be disposing it's "parent" (though I know some StreamWriter/Reader
classes do, it's still odd/annoying).{quote}

Agreed - when using {{Dispose()}} in conjunction with the decorator pattern things get odd.
But how does that apply to {{TermsEnum}}? Disposing a superclass is not quite the same thing
(and is rather expected using the dispose pattern).

{quote}All very Tricky...

I think there's some combination of

# isolating the "item" type
# using the collection object behaviors from inside the loop
# implementing the enumerator such that it is ok about the cursor being changed from outside

It's going to be very hard to get the semantics of each type in a sane shape such that it's
hard for a dev to get tripped up.{quote}

In Java the "item" type is usually the one that is returned from the {{Next()}} method. But
the real issue is all of the "extras" that come along with these iterators and how to expose
them without casting the result of {{GetEnumerator()}}. Keep in mind the "extras" (ala LINQ)
are usually part of the {{IEnumerable<T>}} not part of the {{IEnumerator<T>}}.
Maybe trying to expose the "extras" is the wrong approach altogether. Maybe those "extras"
are crying out to be refactored into services of their own that are separate from the enumerator.

If you go that route, note that because we are a port we had to redefine what "maintainable"
means. So, when refactoring a class into many, the typical approach I follow is to drop all
of that code into the same code file that is being refactored (in the same order as much as
possible). This will make it easy to figure out where that code is when upgrading to the next
Lucene version. The standard approach of putting each type into a file on its own would be
chaos for a port version upgrade.

But you seem to have a handle on things. If there is a solution to be found here, I am sure
you will be able to find it. If you need a sounding board to run your ideas by, I am here.

{quote}Which is why this have been open for 5+ years {quote}

Yea, I bet the decision to wait until the "next release" was a decision that came with some
regret. Unfortunately, Lucene.Net struggled to get over the hump of getting a 4.x release
because the project grew by a factor of 10. We are almost there now, and the path to catch
up to Lucene 6.x is now clear considering how much smaller the change set is between here
and there than the monster change set between 3.x and 4.x.

> Convert Java Iterator classes to implement IEnumerable<T>
> ---------------------------------------------------------
>                 Key: LUCENENET-469
>                 URL:
>             Project: Lucene.Net
>          Issue Type: Sub-task
>          Components: Lucene.Net Contrib, Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3, Lucene.Net
>         Environment: all
>            Reporter: Christopher Currens
>             Fix For: Lucene.Net 4.8.0
> The Iterator pattern in Java is equivalent to IEnumerable in .NET.  Classes that were
directly ported in Java using the Iterator pattern, cannot be used with Linq or foreach blocks
in .NET.
> {{Next()}} would be equivalent to .NET's {{MoveNext()}}, and in the below case, {{Term()}}
would be as .NET's {{Current}} property.  In cases as below, it will require {{TermEnum}}
to become an abstract class with {{Term}} and {{DocFreq}} properties, which would be returned
from another class or method that implemented {{IEnumerable<TermEnum>}}.
> {noformat} 
> 	public abstract class TermEnum : IDisposable
> 	{
> 		public abstract bool Next();
> 		public abstract Term Term();
> 		public abstract int DocFreq();
> 		public abstract void  Close();
> 	        public abstract void Dispose();
> 	}
> {noformat} 
> would instead look something like:
> {noformat} 
> 	public class TermFreq
> 	{
> 		public abstract Term { get; }
> 		public abstract int { get; }
> 	}
>         public abstract class TermEnum : IEnumerable<TermFreq>, IDisposable
>         {
>                 // ...
>         }
> {noformat}
> Keep in mind that it is important that if the class being converted implements {{IDisposable}},
the class that is enumerating the terms (in this case {{TermEnum}}) should inherit from both
{{IEnumerable<T>}} *and* {{IDisposable}}.  This won't be any change to the user, as
the compiler automatically calls {{IDisposable}} when used in a {{foreach}} loop.

This message was sent by Atlassian JIRA

View raw message