ant-dev mailing list archives

From Kev Jackson
Subject OutOfMemory error when testing AppFuse
Date Mon, 17 Apr 2006 05:01:56 GMT

Matt Raible (Spring Live, AppFuse, etc.) mentioned on his blog that 
while testing AppFuse with Ant, he experienced an OutOfMemoryError.

After we bounced a couple of emails back and forth, he thinks it may 
occur in the Copy task, as his target uses Copy extensively and he has a 
lot of files to copy.  Looking at Copy, there are two parts: 1) build up 
a collection of files to copy, and 2) copy them.  The source mentions 
that this is done for performance reasons, as a file-by-file copy would 
take too long, so the files are batched and copied later (that's my 
reading of the comments anyway).

I have a few questions/suggestions after looking at the code.

1) When we construct a String with +, e.g. ("Copying " + 
fileCopyMap.size() + " file" + (fileCopyMap.size() == 1 ? "" : "s") + 
" to " + destDir.getAbsolutePath()), we use a lot of temporary objects 
(which will be sized based on the length of the path, plus an overhead).  
When will these be released for gc?  I've always thought that these 
temporary strings will be released after the method has exited, when 
there are no more references to them.
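
As a rough illustration of what that expression involves: javac of this 
vintage typically turns a chain of + into appends on a single 
StringBuffer followed by one toString() call, so the temporaries are the 
buffer, its internal char array and the resulting String.  The sketch 
below writes both forms side by side.  The class and method names are 
made up for the example and are not taken from Copy.

```java
public class ConcatDemo {

    // The message as it would be written with the + operator.
    static String withPlus(int count, String destDir) {
        return "Copying " + count + " file" + (count == 1 ? "" : "s")
            + " to " + destDir;
    }

    // Hand-written equivalent: one buffer, a series of appends, and a
    // single String created at the end by toString().
    static String withBuffer(int count, String destDir) {
        StringBuffer sb = new StringBuffer();
        sb.append("Copying ").append(count).append(" file")
          .append(count == 1 ? "" : "s").append(" to ").append(destDir);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(1, "/tmp/out"));
        System.out.println(withBuffer(3, "/tmp/out"));
    }
}
```

Both methods produce the same text for the same arguments, so any 
difference between them is in allocation behaviour, not in output.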

Now consider this code (from doFileOperations):

            Enumeration e = fileCopyMap.keys();
            while (e.hasMoreElements()) {
                String fromFile = (String) e.nextElement();
                String[] toFiles = (String[]) fileCopyMap.get(fromFile);

                for (int i = 0; i < toFiles.length; i++) {
                    String toFile = toFiles[i];

                    if (fromFile.equals(toFile)) {
                        log("Skipping self-copy of " + fromFile, verbosity);
                        continue;
                    }
                    try {
                        // this creates a lot of temporary objects for
                        // each file copied
                        log("Copying " + fromFile + " to " + toFile,
                            verbosity);
                        // ... (rest of the loop omitted)

We create a lot of cruft for logging purposes, but if the verbosity is 
set too low (I'm not sure what the correct terminology is: when we set 
the verbosity to DEBUG we get a load of output, and when we set it to 
QUIET we get none, so think of a QUIET-ish level of logging), then we 
create this cruft in memory without it ever being used; it's never 
written to the log.  For a large number of files, this cruft will 
gradually eat up memory without going out of scope (as the method won't 
exit until all the files are processed), and it won't be eligible for gc.
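
The part about building strings that are never written can be shown 
with a small stand-alone sketch.  Everything in it is hypothetical: the 
class, the counters and the log method are stand-ins, and the level 
constants are only loosely modelled on Ant's Project.MSG_* values.  It 
demonstrates only that the message argument is evaluated before log() 
gets to look at the level; it says nothing about when the gc reclaims 
the result.

```java
import java.util.ArrayList;
import java.util.List;

public class EagerLogDemo {

    // Hypothetical level constants, loosely modelled on Project.MSG_*.
    static final int MSG_INFO = 2;
    static final int MSG_VERBOSE = 3;

    static int threshold = MSG_INFO;  // messages above this level are dropped
    static int built = 0;             // how many message strings were built
    static final List<String> written = new ArrayList<String>();

    // Builds the message and counts how often that happens.
    static String describe(String from, String to) {
        built++;
        return "Copying " + from + " to " + to;
    }

    // Stand-in logger: keeps the message only if the level passes.
    static void log(String message, int level) {
        if (level <= threshold) {
            written.add(message);
        }
    }

    static void copyAll(String[] files, int verbosity) {
        for (int i = 0; i < files.length; i++) {
            // The argument is fully evaluated before log() is entered.
            log(describe(files[i], "dest/" + files[i]), verbosity);
        }
    }

    public static void main(String[] args) {
        copyAll(new String[] {"a.txt", "b.txt", "c.txt"}, MSG_VERBOSE);
        System.out.println("built=" + built + " written=" + written.size());
    }
}
```

With the threshold at INFO and the messages logged at VERBOSE, this 
prints built=3 written=0: three strings are built and none is written.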

Forgive me if my understanding of the way this is going to interact with 
the gc system is incorrect (and therefore this entire post is 
incorrect), but I think that this will use up memory unnecessarily, and 
may cause problems related to the OutOfMemoryError mentioned previously.

So what are the possible solutions (if indeed this is a problem)?

1) StringBuffer - the old favourite for Java programmers, use 
StringBuffer to get a mutable string, therefore using up less memory as 
less temp objects get assigned to the heap (although the majority of the 
temp objects will be in Eden, right?  But when Eden is full, a minor gc 
starts reclaiming unreferenced objects and copying still-referenced ones 
into the survivor spaces.  I think with a long enough running loop 
inside this single method, this code will eventually fill up the 
survivor spaces + Eden, and the amount of memory that is not really 
referenced (through objects), but is still in scope (of the running 
method), will cause this OOM error)
2) MessageFormat.format - not sure if this will save memory or not.  The 
API docs I can find only go as far back as 1.3.1, so if this wasn't 
present in Java 1.2, we can't use it and retain compilability on 
JDK 1.2.  The work being done inside the format method (source) looks 
like a lot, so my guess is that it will actually cost more memory to 
implement logging using MessageFormat.
3) static final strings, more complex logic to only build the message if 
the verbosity is set above a threshold.  This will save memory by not 
creating the string for the log unless the user has selected to run in 
verbose mode, but it complicates the code, and it kinda subverts the 
verbosity flag of the log method.
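
Options 1 and 3 could be combined roughly as follows.  This is a sketch 
under stated assumptions, not a patch against Copy: isLoggable, 
threshold and logCopy are hypothetical names, and a real task may have 
no direct way to ask the logger for its threshold, which is exactly the 
"subverts the verbosity flag" concern in option 3.

```java
import java.util.ArrayList;
import java.util.List;

public class GuardedLogDemo {

    // Hypothetical level constants, loosely modelled on Project.MSG_*.
    static final int MSG_INFO = 2;
    static final int MSG_VERBOSE = 3;

    // Option 3: the fixed parts of the message as static final strings.
    static final String COPYING = "Copying ";
    static final String TO = " to ";

    static int threshold = MSG_INFO;  // assumed to be visible to the task
    static int built = 0;             // how many message strings were built
    static final List<String> written = new ArrayList<String>();

    // Hypothetical check; the level is tested before any string is built.
    static boolean isLoggable(int level) {
        return level <= threshold;
    }

    static void logCopy(String from, String to, int level) {
        if (!isLoggable(level)) {
            return;  // nothing is allocated for a message nobody will see
        }
        // Option 1: a single StringBuffer, pre-sized so it never regrows.
        StringBuffer sb = new StringBuffer(
            COPYING.length() + from.length() + TO.length() + to.length());
        sb.append(COPYING).append(from).append(TO).append(to);
        built++;
        written.add(sb.toString());
    }

    public static void main(String[] args) {
        String[] files = {"a.txt", "b.txt", "c.txt"};
        for (int i = 0; i < files.length; i++) {
            logCopy(files[i], "dest/" + files[i], MSG_VERBOSE);
        }
        System.out.println("quiet run: built=" + built);

        threshold = MSG_VERBOSE;
        for (int i = 0; i < files.length; i++) {
            logCopy(files[i], "dest/" + files[i], MSG_VERBOSE);
        }
        System.out.println("verbose run: built=" + built);
    }
}
```

The first loop builds no messages and the second builds three, so the 
cost of the message is only paid when it will actually be written.  The 
price is that the level check now lives in the caller as well as in the 
logger.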

I could spend some time hacking up a revised Copy task with some tweaks 
to try to reduce the memory consumption for copies involving a large 
number of files, but I'd rather get input from the rest of you about the 
possible consequences before starting anything.  There are other places 
I'd like to change things (slightly) within the Copy task, but this log 
+ String concatenation looks like it is a space-inefficient 
implementation, although it only manifests itself for large filesets.

