ant-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Antoine Levy Lambert <anto...@gmx.de>
Subject Re: Possible Ivy bug (and suggested fix) in ChainResolver
Date Mon, 30 Mar 2015 02:35:16 GMT
Hello Loren,

thanks for digging into all this.

It might not be a good thing for softwares like ivy which have a lot of users in the wild
to change defaults - because people like to be able to upgrade and still have an old behavior,
but making the needed behaviors possible at least should be done.

Your point 2 makes sense to me though, if two artifacts have as version 1.0-SNAPSHOT we should
compare their timestamp.

Point 3 your help will be welcome.

Best regards,

Antoine



On Mar 25, 2015, at 3:00 PM, Martin Gainty <mgainty@hotmail.com> wrote:

> 
> 
> 
>> From: LKratzke@blueorigin.com
>> To: dev@ant.apache.org
>> Subject: RE: Possible Ivy bug (and suggested fix) in ChainResolver
>> Date: Wed, 25 Mar 2015 18:16:08 +0000
>> 
>> I found the root cause of the fishiness. It was not a bug (and was not actually in
ChainResolver) but is in my opinion a rather unfortunate and poorly chosen default setting
that is to blame (the latest-strategy).
>> 
>> During resolution in the chain resolver, the current sub-resolver tries to determine
whether it should be skipped and the previous artifact be selected over the current artifact.
Assuming that the "force" and "dynamic" tests do not result in an early rejection of the current
artifact, the "latest" of the two artifacts is selected by sorting the artifacts in a List
using a Comparator suited to the current latest-strategy (latest-revision, latest-time, etc).
>> 
>> The problem here is quite simple. The default latest-strategy is "latest-revision".
So when a second artifact has been resolved, and it is of the same revision as the first,
nothing gets sorted. The result is that you will get the first artifact found based upon the
ordering of your repositories in the chain instead of the newer of the two artifacts.
>> 
>> I think that either latest-time needs to be the default strategy or the latest revision
comparator needs to do a secondary sort to sort by lastModified time.
>> 
>> Possibly allow configuration of this behavior (should there be a secondary sort by
time to avoid stale artifacts, or not so that repository order breaks the tie). This is important
(critical) behavior and should be configurable.
>> 
>> Defaulting to latest-revision will not only deliver undesired stale artifacts, but
it is unclear to the user why they are getting stale artifacts or how to make it stop happening.
The latest-time strategy will give you the latest revision 99% of the time and the latest
artifact 100% of the time. But the latest-revision strategy will give you the latest artifact
100%, 50%, 33%, or 25% of the time when the revision numbers are the same, depending upon
how many resolvers you have (1/n), and assuming that any repository may contain the latest
artifact.
>> 
>> Furthermore, the docs do not give a great description of "force". I learned much
about the actual behavior of this attribute while debugging. First thing I learned was that
it is not a good name.
>> 
>> What force actually does is it allows a resolver to be considered when a previous
resolver has found an artifact. After the first artifact has been found, only repositories
with force=true will have a chance of competing (for instance, they might have a newer version
of 1.0.0-SNAPSHOT). Otherwise, they are discarded immediately and no date comparison is attempted.
>> 
>> Force should actually be named "considerAlways" or "considerAnyway". That seems to
be a more suitable name. No action requested here, but pointing out that this attribute has
a misleading name.
>> 
>> Summary:
>> 1 - Please reconsider changing the default latest-strategy to be latest-time.
>> 
>> 2 - Please consider adding a secondary "lastModified" sort to LatestRevisionStrategy.ArtifactInfoComparator
whether or not you change the default latest-strategy to latest-time or not.
>> 
>> 3 - Please document, illustrate, and demonstrate in one place the behaviors of chain
resolver in combination with force, returnFirst, defaultLatestStrategy, ivy.resolver.default.check.modified,
useOrigin, and other settings and attributes that affect resolution behavior. (I am working
on this document now.)
>> 
>> 4 - Please consider creating independent caches by default for each repository. I
have not drilled down on this issue yet, but I suspect that it fixes serious cache collision
issues that I think I saw while debugging (found and selected local repo artifact, checked
cache before delivery, ended up delivering cached stale artifact that came from a totally
different repo :( ).
>> 
>> Thanks,
>> 
>> L.K.
> 
> MG>i think we can take hints from maven brothers on a tested strategy
> MG>to referencing dev jars during development..their solution is to employ SNAPSHOT
version during CI cycles
> MG>SNAPSHOTs are available until the jar is promoted to RELEASE at which point a tag
is assigned to version
> MG>SNAPSHOT delivers ${project.id}-YYYYMMDD.hhmmss.jar so unless you have multiple
machines able to gen
> MG>deployables within a second the last second is the arbiter which clearly identifies
the latest jar
> MG>Snapshot versions are ephemeral until the next snapshot build so remote lookup
would not be implemented
> http://books.sonatype.com/mvnref-book/reference/pom-relationships-sect-pom-syntax.html
> 
> MG>repository caches when stored within a regular Nexus Repository are typed as Proxy/Hosted/Virtual
> MG>ProxyApache,ProxyCentral or ProxyCodehaus
> MG>Hosted3rdParty,HostedRelease,HostedSnapshot
> MG>VirtualRepo (Virtual repos are generally for OSGI bundles)
> MG>Once you know the general type Proxy or Hosted or Virtual you can then select sub-type
(such as Hosted3rdParty,HostedRelease,HostedSnapshot)
> https://books.sonatype.com/nexus-book/reference/confignx-sect-manage-repo.html
> MG>thank you for taking the necessary time to think this through
> 
>> 
>> From: Loren Kratzke
>> Sent: Tuesday, March 24, 2015 1:09 PM
>> To: 'dev@ant.apache.org'
>> Subject: Possible Ivy bug (and suggested fix) in ChainResolver
>> 
>> I have a some observations about how the chain resolver selects a dependency. I think
this may be a bug but I am not sure because the intent of the source code is not entirely
clear. It reads one way, but behaves in a different way. I have pinpointed the exact spots
in code where this happens.
>> 
>> Here is my simple test setup used to debug this issue. I have two resolvers (Filesystem
and URL) configured in a ChainResolver in that order. I publish to one resolver and then the
other repeatedly and consume the result in another project. I use checkModified=true and changingPattern=".*"
on both resolvers.
>> 
>> My artifact is simply a text file with the current date and time so it is easy to
see whether you get fresh or stale artifacts from the repos.
>> 
>> When I consume the published artifact from the other project, I will get the artifact
from the first configured resolver in the chain (Filesystem in this case). But I know from
debugging that the second resolver is also evaluated. So as an experiment, I added force="true"
on the second resolver to see if I could force Ivy to ignore the first result and favor an
artifact returned by the second resolver. Instead, Ivy returned the artifact from the first
resolver even though the second artifact was newer AND the second resolver had force="true".
>> 
>> When I debugged this to see why the first artifact was chosen over the second artifact,
I found something very fishy.
>> 
>> ChainResolver.getDependency() iterates over each resolver in the chain. First it
found the Filesystem resolver and the artifact and next it found the URL resolver and artifact.
Next it calls BasicResolver.getDependency() which will compare the previously resolved artifact
with the current artifact.
>> 
>> This is where it gets very fishy. At the end of the getDependency() method it calls
AbstractResolver.checkLatest() which I assume is intended to return the latest of the two
artifacts. But that comparison never happens. AbstractResolver.isAfter is invoked with two
artifacts to be compared and a null Date. Since the date is null, the two artifacts are never
compared and no matter what, the first artifact will be returned and the second one discarded
and a verbose message will be emitted stating that the second artifact is older than the first
artifact, every time. The message is on line 533 of AbstractResolver. (I am looking at Ivy-2.3.0
so if that line does not make sense on trunk then let me know.)
>> 
>>    Message.debug("\tmodule revision kept as younger: " + newModuleDesc);
>>    saveModuleRevisionIfNeeded(dd, newModuleFound);
>>    return newModuleFound;
>> 
>> The message is not true. The artifact that was kept was the older of the two and
a comparison of lastModified never happened (and never can happen in the current code as far
as I can tell).
>> 
>> So the actual logic in AbstractResolver.checkLatest() simply returns the first artifact
found. While this is not a bad behavior, it does not seem like it is the intended behavior.
I mean, why go through all the trouble of pretending to compare two artifacts using date methods
when the logic never executes because the passed in Date object is null. And why emit a message
stating that one was determined to be older than the other. That is super fishy.
>> 
>> Furthermore, the next line in ChainResolver.getDependency() after resolver.getDependency()
is called (ChainResolver line105) references isReturnFirst(). That is fishy because none of
that matters any more. The current artifact was rejected on the previous line of code and
the previous (aka first) artifact is now the current artifact and is the one that will be
returned (without a date comparison, and for the arbitrary reason that is was found before
the other one).
>> 
>> I think that the intent of the null Date object is to compare each artifact to a
static Date configured elsewhere (I have no idea where), but if the code were to actually
compare the lastModified dates of the two artifacts, a useful result would happen - Ivy would
return the latest artifact from across multiple repositories.
>> 
>> That is huge because I have never been able to get Ivy to do this. I have never seen
anybody get Ivy to search multiple repositories and return the latest artifact. This is useful
for local development when you publish locally to consume locally modified artifacts. It would
be nice to have the option of picking up newer artifacts from a central repo when those occur
without having to blow away a local repository and its cache.
>> 
>> (By the way, giving my local repo its own cache seems to have solved some other strange
issues I was having. I recommend this to everybody and I think it should be a default in Ivy,
but that is debatable and would need some more research and concensus.)
>> 
>> I think that this is a good feature and should be configurable. I think possibly
it was intended to be configured via ChainResolver.returnFirst="true|false" but that code
executed when it was too late and the decision had already been made. If I were to make this
a feature, and make it configurable, I would configure this using an attribute named returnFirst
because that is the exact facet of functionality that we are talking about here.
>> 
>> Thanks for your attention. Hope I am helping here. I am considering coding this to
see if it works as expected. I would be happy to report my results and provide a patch if
anybody is interested in evaluating this.
>> 
>> L.K.


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message