ant-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Eric Siegerman <>
Subject [REVIEW/SUBMIT] Pattern-matching optimization
Date Tue, 27 Mar 2001 02:46:14 GMT
For our project, we want to explicitly specify which java files
are compiled, rather than trust ant's usual catchall
"src/**/*.java" approach.  So I made a manifest file listing all
the desired files (250 or so), and referred to it in:
        <javac includesfile=.../>

... only to discover that ant took agonizingly long to figure out
which files to recompile.  This turned out to be because of the
expensive matchPath() calls on all of those 250 non-patterns.

Taken together, the two attached patches solve this problem, by
doing a simple string-compare when the pattern is a simple
string, ie. when it contains no pattern metacharacters.

I've split this into two patches for readability.  Patch 1 is a
refactoring; it extracts the pattern-matching code out of
DirectoryScanner into a new Pattern class.  Patch 2 adds the
simple-string-compare optimization to Pattern.  There should be
no user-visible change (except speed, of course).

The savings are dramatic.  Here are some timings (taken with no
compilations needed, so that most of the work was ant overhead).
The "ant 1.3" test was a default build, ie. including the "jars"
and "dist-lite" targets.  The "our project" test includes a
<depend> task that takes several seconds.

                        ant       our
Use                     1.3     project
---                     ---     -------
Vanilla ant 1.3         25"       71"
Patched ant 1.3         15"        8"
Savings                 40%       89%

Even on ant's build.xml itself, which doesn't use simple-filename
"patterns" nearly as heavily as our project, the optimization
saves 40%.

I've included the two patches (against the Release 1.3 sources),
and a new with a bezillion test cases.

PROBLEMS: The code seems to run fine, but the JUnit tests have
two issues:

 1. The PatternTest class runs fine from the command line (except
    for a few failed tests; see below), but if I run it using
    "ant run-tests", *every* test case throws the following

        Testcase: test000 took 0.134 sec
            Caused an ERROR
        try to access class org/apache/tools/ant/Pattern from class org/apache/tools/ant/PatternTest
        java.lang.IllegalAccessError: try to access class org/apache/tools/ant/Pattern from
class org/apache/tools/ant/PatternTe

            at java.lang.reflect.Method.invoke(Native Method)
            at junit.framework.TestCase.runTest(
            at junit.framework.TestCase.runBare(
            at junit.framework.TestResult$1.protect(
            at junit.framework.TestResult.runProtected(
            at junit.framework.TestSuite.runTest(

    I haven't a clue why this could be.  It may well be a simple
    error -- I've never used JUnit before.  (I saw a similar
    report in another context on the ant-dev archive, with a
    suggested fix of instantiating AntClassLoader objects with a
    systemFirst argument of "false" -- but <junit> already does


 2. Several tests fail.  The obvious approach of checking how
    they worked before my change is problematic because:
      - The way I wrote the test cases, I can't do this -- they
        depend on the Pattern class's having been factored out.
      - It wouldn't be helpful anyway.  Quite possibly, the bugs
        (if any) were already there in the preexisting code.

    I need some feedback from people who've been working with ant
    longer than I have (which wouldn't be hard; I've only been
    here a week).  My question is, what are the *expected*
    results from the tests in question?  Before I go "fixing" the
    code, I want to make sure it's broken...

    Here are the cases, and what I *think* the expected results
    should be.  The "method" column shows the method being tested
    (in ant 1.3, these are in DirectoryScanner, but my Patch 1
    moves them to a new Pattern class).

                                             Candidate       Expected
    test #   Method              Pattern     Filename        Result
    ======   =================   =========   ==========      ========
    # Empty(ish)-filename tests
    003      matchPath           "/*"        "/"             true
    004      matchPath           "*"         ""              true
    2000     matchPatternStart   "/aa/bb"    "/"             true
    2001     matchPatternStart   "aa/bb"     ""              true

    # Filename ending in "/".  In all of these cases, the
    # trailing "/" has no effect; my expected "false" assumes it
    # should have one.  Should it?
    1041     matchPath           "/aa/*/cc"  "/aa/bb/cc/"    false
    2041     matchPatternStart   "/aa/*/cc"  "/aa/bb/cc/"    false
    2043     matchPatternStart   "/aa/*/cc"  "/aa/"          false
    2063     matchPatternStart   "/aa/?/cc"  "/aa/b/cc/"     false

    # Pattern ending in "/".  I think I recall reading that "/aa/" is
    # supposed to be equivalent to "/aa/**"; is this correct?
    1160     matchPath           "/aa/"      "/aa/cc"        true
    2160     matchPatternStart   "/aa/"      "/aa/cc"        true

Thanks much!


|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.
|  |  /
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea.
        - RFC 1925 (quoting an unnamed source)

View raw message