xmlgraphics-fop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dario Laera <la...@cs.unibo.it>
Subject Active node tree pruning in FOP trunk
Date Tue, 21 Oct 2008 14:36:16 GMT
Hi all,

after noticing that the layout tree pruning in the prototype [1] was  
improving performance I decided to implement it in FOP trunk to see  
how it was behaving in the real world. The concept I've applied to the  
general breaking algorithm can be defined as a "last-n-lines fit"  
algorithm; if n=5, when the algorithm find the first node for the  
sixth line it choose the final active node for the first line that  
have the best active node as child (this is a correction from the  
prototype implementation). Other nodes for line 1 gets discarded with  
relative children. When the algorithm determine a line break for line  
7 the pruning is performed for line 2. This is equivalent to say that  
line 1 is chosen as the best in the context of lines 1-5, line 2 in  
the context of lines 2-6, and so on. While this algorithm won't give  
you the best possible layout (well, not always, depending on n value)  
as total fit do, it can improve performance and shrink memory usage by  
reducing the number of active node in some cases. Anyway it gets  
"nice" layouts, surely better than best fit, and by choosing the value  
of 'n' we can set the trade off between performance and quality.
In the tests I've done I saw a sensible improvement when two  
conditions were satisfied:
1) the layout constraints was leading to an high number of active  
nodes (short lines, hyphenation enabled, non justified alignment);
2) the paragraph was long enough to produce many lines (if n is  
greater than the total number lines the pruning doesn't even get  

As worst case I considered an A4 page with a single column with  
justified text split in many paragraphs: no performance improvements.  
Even if I process a single paragraph that is 8 page long the  
processing time is the same as if pruning was disabled, while the  
layout result is slightly different (but not bad).
As best case I considered an A4 page with two columns, hyphenation  
enabled, left aligning for a single paragraph 8 page long. Here the  
performance boost is embarrassing:

          | cpu t | mem
No prune | 2'20" | 570 MB
n = 30   | 0'12" | 35 MB
n = 1    | 0'07" | 35 MB

Well, this is a really extreme test that unlikely will represent a use  
case, let's look at more tests:

  Just | 1 col | 1 par || no pru | n=30 |  n=1
   X   |   X   |   X   ||    6"7 |  6"6 |  6"6
   X   |   X   |       ||    6"6 |  6"6 |  6"7
   X   |       |   X   ||   14"4 |  7"1 |  6"9
   X   |       |       ||    7"9 |  7"3 |  6"9
       |   X   |   X   ||   32"6 |  9"0 |  6"7
       |   X   |       ||    9"9 |  9"0 |  6"6
       |       |   X   || 2'20"2 | 12"0 |  7"0
       |       |       ||   21"0 | 13"0 |  7"0

The test was done with the same text modified as follows: justified or  
left aligned, 1 or 2 columns, within a single block or split in 9  
blocks of the same size.

I've attached the patch to be applied to trunk: to enable pruning you  
can set the system property from the command line with -DtreeDepth=1  
or any other number you choose for 'n'. If you don't define treeDepth  
the pruning doesn't get enabled. I've also attached the test fo files  
so you can check the resulting layout for the tests I reported above.


[1] http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200810.mbox/%3C236664AE-3F70-48F2-8D82-C26BA58BC92A@cs.unibo.it%3E

View raw message