ant-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Loren Kratzke <LKrat...@blueorigin.com>
Subject RE: Possible Ivy bug (and suggested fix) in ChainResolver
Date Wed, 25 Mar 2015 18:16:08 GMT
I found the root cause of the fishiness. It was not a bug (and was not actually in ChainResolver)
but is in my opinion a rather unfortunate and poorly chosen default setting that is to blame
(the latest-strategy).

During resolution in the chain resolver, the current sub-resolver tries to determine whether
it should be skipped and the previous artifact be selected over the current artifact. Assuming
that the "force" and "dynamic" tests do not result in an early rejection of the current artifact,
the "latest" of the two artifacts is selected by sorting the artifacts in a List using a Comparator
suited to the current latest-strategy (latest-revision, latest-time, etc).

The problem here is quite simple. The default latest-strategy is "latest-revision". So when
a second artifact has been resolved, and it is of the same revision as the first, nothing
gets sorted. The result is that you will get the first artifact found based upon the ordering
of your repositories in the chain instead of the newer of the two artifacts.

I think that either latest-time needs to be the default strategy or the latest revision comparator
needs to do a secondary sort to sort by lastModified time.

Possibly allow configuration of this behavior (should there be a secondary sort by time to
avoid stale artifacts, or not so that repository order breaks the tie). This is important
(critical) behavior and should be configurable.

Defaulting to latest-revision will not only deliver undesired stale artifacts, but it is unclear
to the user why they are getting stale artifacts or how to make it stop happening. The latest-time
strategy will give you the latest revision 99% of the time and the latest artifact 100% of
the time. But the latest-revision strategy will give you the latest artifact 100%, 50%, 33%,
or 25% of the time when the revision numbers are the same, depending upon how many resolvers
you have (1/n), and assuming that any repository may contain the latest artifact.

Furthermore, the docs do not give a great description of "force". I learned much about the
actual behavior of this attribute while debugging. First thing I learned was that it is not
a good name.

What force actually does is it allows a resolver to be considered when a previous resolver
has found an artifact. After the first artifact has been found, only repositories with force=true
will have a chance of competing (for instance, they might have a newer version of 1.0.0-SNAPSHOT).
Otherwise, they are discarded immediately and no date comparison is attempted.

Force should actually be named "considerAlways" or "considerAnyway". That seems to be a more
suitable name. No action requested here, but pointing out that this attribute has a misleading
name.

Summary:
1 - Please reconsider changing the default latest-strategy to be latest-time.

2 - Please consider adding a secondary "lastModified" sort to LatestRevisionStrategy.ArtifactInfoComparator
whether or not you change the default latest-strategy to latest-time or not.

3 - Please document, illustrate, and demonstrate in one place the behaviors of chain resolver
in combination with force, returnFirst, defaultLatestStrategy, ivy.resolver.default.check.modified,
useOrigin, and other settings and attributes that affect resolution behavior. (I am working
on this document now.)

4 - Please consider creating independent caches by default for each repository. I have not
drilled down on this issue yet, but I suspect that it fixes serious cache collision issues
that I think I saw while debugging (found and selected local repo artifact, checked cache
before delivery, ended up delivering cached stale artifact that came from a totally different
repo :( ).

Thanks,

L.K.

From: Loren Kratzke
Sent: Tuesday, March 24, 2015 1:09 PM
To: 'dev@ant.apache.org'
Subject: Possible Ivy bug (and suggested fix) in ChainResolver

I have a some observations about how the chain resolver selects a dependency. I think this
may be a bug but I am not sure because the intent of the source code is not entirely clear.
It reads one way, but behaves in a different way. I have pinpointed the exact spots in code
where this happens.

Here is my simple test setup used to debug this issue. I have two resolvers (Filesystem and
URL) configured in a ChainResolver in that order. I publish to one resolver and then the other
repeatedly and consume the result in another project. I use checkModified=true and changingPattern=".*"
on both resolvers.

My artifact is simply a text file with the current date and time so it is easy to see whether
you get fresh or stale artifacts from the repos.

When I consume the published artifact from the other project, I will get the artifact from
the first configured resolver in the chain (Filesystem in this case). But I know from debugging
that the second resolver is also evaluated. So as an experiment, I added force="true" on the
second resolver to see if I could force Ivy to ignore the first result and favor an artifact
returned by the second resolver. Instead, Ivy returned the artifact from the first resolver
even though the second artifact was newer AND the second resolver had force="true".

When I debugged this to see why the first artifact was chosen over the second artifact, I
found something very fishy.

ChainResolver.getDependency() iterates over each resolver in the chain. First it found the
Filesystem resolver and the artifact and next it found the URL resolver and artifact. Next
it calls BasicResolver.getDependency() which will compare the previously resolved artifact
with the current artifact.

This is where it gets very fishy. At the end of the getDependency() method it calls AbstractResolver.checkLatest()
which I assume is intended to return the latest of the two artifacts. But that comparison
never happens. AbstractResolver.isAfter is invoked with two artifacts to be compared and a
null Date. Since the date is null, the two artifacts are never compared and no matter what,
the first artifact will be returned and the second one discarded and a verbose message will
be emitted stating that the second artifact is older than the first artifact, every time.
The message is on line 533 of AbstractResolver. (I am looking at Ivy-2.3.0 so if that line
does not make sense on trunk then let me know.)

    Message.debug("\tmodule revision kept as younger: " + newModuleDesc);
    saveModuleRevisionIfNeeded(dd, newModuleFound);
    return newModuleFound;

The message is not true. The artifact that was kept was the older of the two and a comparison
of lastModified never happened (and never can happen in the current code as far as I can tell).

So the actual logic in AbstractResolver.checkLatest() simply returns the first artifact found.
While this is not a bad behavior, it does not seem like it is the intended behavior. I mean,
why go through all the trouble of pretending to compare two artifacts using date methods when
the logic never executes because the passed in Date object is null. And why emit a message
stating that one was determined to be older than the other. That is super fishy.

Furthermore, the next line in ChainResolver.getDependency() after resolver.getDependency()
is called (ChainResolver line105) references isReturnFirst(). That is fishy because none of
that matters any more. The current artifact was rejected on the previous line of code and
the previous (aka first) artifact is now the current artifact and is the one that will be
returned (without a date comparison, and for the arbitrary reason that is was found before
the other one).

I think that the intent of the null Date object is to compare each artifact to a static Date
configured elsewhere (I have no idea where), but if the code were to actually compare the
lastModified dates of the two artifacts, a useful result would happen - Ivy would return the
latest artifact from across multiple repositories.

That is huge because I have never been able to get Ivy to do this. I have never seen anybody
get Ivy to search multiple repositories and return the latest artifact. This is useful for
local development when you publish locally to consume locally modified artifacts. It would
be nice to have the option of picking up newer artifacts from a central repo when those occur
without having to blow away a local repository and its cache.

(By the way, giving my local repo its own cache seems to have solved some other strange issues
I was having. I recommend this to everybody and I think it should be a default in Ivy, but
that is debatable and would need some more research and concensus.)

I think that this is a good feature and should be configurable. I think possibly it was intended
to be configured via ChainResolver.returnFirst="true|false" but that code executed when it
was too late and the decision had already been made. If I were to make this a feature, and
make it configurable, I would configure this using an attribute named returnFirst because
that is the exact facet of functionality that we are talking about here.

Thanks for your attention. Hope I am helping here. I am considering coding this to see if
it works as expected. I would be happy to report my results and provide a patch if anybody
is interested in evaluating this.

L.K.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message