xmlgraphics-fop-dev mailing list archives

From Andreas L Delmelle <a_l.delme...@pandora.be>
Subject Re: FOP Memory issues (fwd from fop-users)
Date Thu, 11 Jan 2007 22:24:36 GMT
On Jan 11, 2007, at 22:31, J.Pietschmann wrote:

> Quite some time ago I did some statistics on number of children
> of FOs, using the FOP examples and FO files from bug reports.
> The breakdown was roughly the following
>  ~50% no children, mostly FOText nodes and FOs like region-body
>     and page-number-citation
>  ~40% one child, mostly blocks and inlines (fo:wrapper) having
>     exactly one FOText node as child
>  <10% 2..10 children
>  <<1% more than 10 children, mostly fo:flow, table and table-body
>     and a few blocks, usually wrapping other blocks.
>
> Real world documents with more tables and inline formatting might
> have more multi-child FOs.

Interesting figures...

>
> I haven't checked whether FOText still inherits the children field
> on trunk. If so, it is certainly a good idea to get rid of this
> (in the maintenance branch, this had widespread implications).
> The case of exactly one child might be worth optimizing too.

This was indeed altered; I don't know exactly when or by whom
(Glen or Finn, IIRC).
Anyway, the hierarchy is currently:

FONode
|->FOText
|->FObj

and only FObj has a protected childNodes instance member, which is a  
generic ArrayList (and as I hinted, they are all created with: new  
java.util.ArrayList(), which defaults to an initial backing Object[10]).

An FONode only holds the reference to its parent in the tree, and the
FObj.childNodes list is only created when FObj.addChildNode() is
called, so if there is no child element, this reference will always
be null.
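
To make that concrete, here is a minimal sketch of the arrangement,
using simplified stand-ins rather than the actual trunk classes. The
constructors and getChildCount() are made up for illustration, and I
use generics only for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the FOP classes; NOT the actual trunk code.
abstract class FONode {
    // Every node only knows its parent.
    protected final FONode parent;

    FONode(FONode parent) {
        this.parent = parent;
    }
}

class FOText extends FONode {
    // No children field at all on text nodes.
    FOText(FONode parent) {
        super(parent);
    }
}

class FObj extends FONode {
    // Stays null until the first child is added.
    protected List<FONode> childNodes;

    FObj(FONode parent) {
        super(parent);
    }

    void addChildNode(FONode child) {
        if (childNodes == null) {
            // No-arg constructor: default capacity of 10, as noted above.
            childNodes = new ArrayList<FONode>();
        }
        childNodes.add(child);
    }

    // Hypothetical helper, only here so the sketch can be exercised.
    int getChildCount() {
        return childNodes == null ? 0 : childNodes.size();
    }
}

class LazyChildrenDemo {
    public static void main(String[] args) {
        FObj block = new FObj(null);
        System.out.println(block.getChildCount()); // no list allocated yet
        block.addChildNode(new FOText(block));
        System.out.println(block.getChildCount());
    }
}
```

For the ~40% one-child case, passing an explicit capacity to the
ArrayList constructor would be one cheap way to shrink the backing
array, though I have not measured what that buys us.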

> Two possible solutions:
> A) all FO node implement a FOContainer interface, for example
>  FONode childAt(int)
>  int numberOfChildren()
> where FOText for example would hardcode return values of null and 0.
> B) Use a FOChildrenIterator interface with specific implementations
> for FO nodes which can have none or exactly one child.
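
Just to picture option A, a rough sketch. Only childAt() and
numberOfChildren() come from the proposal; SingleChildFObj and
countNodes() are names I invented here, and the classes are bare
stand-ins, not the real ones.

```java
// Hypothetical sketch of option A; not existing FOP code.
interface FOContainer {
    FONode childAt(int index);
    int numberOfChildren();
}

abstract class FONode implements FOContainer {
}

class FOText extends FONode {
    // Text nodes never have children, so the answers are hardcoded.
    public FONode childAt(int index) {
        return null;
    }

    public int numberOfChildren() {
        return 0;
    }
}

// Assumed name: a node specialised for exactly one child, no list needed.
class SingleChildFObj extends FONode {
    private final FONode child;

    SingleChildFObj(FONode child) {
        this.child = child;
    }

    public FONode childAt(int index) {
        return index == 0 ? child : null;
    }

    public int numberOfChildren() {
        return 1;
    }
}

class FOContainerDemo {
    // A generic recursive algorithm only needs the interface.
    static int countNodes(FOContainer node) {
        int count = 1;
        for (int i = 0; i < node.numberOfChildren(); i++) {
            count += countNodes(node.childAt(i));
        }
        return count;
    }

    public static void main(String[] args) {
        FONode block = new SingleChildFObj(new FOText());
        System.out.println(countNodes(block));
    }
}
```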

I was already thinking along the lines of creating a subclass of  
Vector, or an implementation of List, but I'm beginning to wonder if  
it wouldn't be worth it to create a link between the children...?  
Instead of holding a reference to an ArrayList, each FObj would have  
three references: parent, firstChild, nextSibling. Add that  
FOContainer interface or a FONodeIterator to navigate through it...  
could work out. 8)

The benefit would be that the average size of an FObj becomes much  
more transparent: three references, the properties and a handful of  
private helpers.
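
A quick sketch of what I mean, under my own assumptions: the three
references sit on one simplified node class, the name field is only
there for the demo, and childIterator() stands in for whatever the
FONodeIterator would end up looking like.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified stand-in; NOT the actual FOP class.
class FONode {
    final String name;   // demo only
    FONode parent;
    FONode firstChild;
    FONode nextSibling;

    FONode(String name) {
        this.name = name;
    }

    // Appends by walking the sibling chain, so this is O(n) per call.
    void addChildNode(FONode child) {
        child.parent = this;
        if (firstChild == null) {
            firstChild = child;
            return;
        }
        FONode last = firstChild;
        while (last.nextSibling != null) {
            last = last.nextSibling;
        }
        last.nextSibling = child;
    }

    // Hypothetical navigation helper over the linked children.
    Iterator<FONode> childIterator() {
        return new Iterator<FONode>() {
            private FONode next = firstChild;

            public boolean hasNext() {
                return next != null;
            }

            public FONode next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                FONode current = next;
                next = next.nextSibling;
                return current;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}

class LinkedChildrenDemo {
    public static void main(String[] args) {
        FONode flow = new FONode("flow");
        flow.addChildNode(new FONode("block-a"));
        flow.addChildNode(new FONode("block-b"));
        for (Iterator<FONode> it = flow.childIterator(); it.hasNext();) {
            System.out.println(it.next().name);
        }
    }
}
```

Appending is linear in the number of siblings here. A fourth
reference (say, lastChild) would presumably fix that for fo:flow and
the large tables, at the cost of one more field per node.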

> Furthermore, in the maintenance branch most of the more specific
> FOs copied children from the generic children list into properly
> typed fields before starting layout, in many cases the generic
> children list could have been deleted afterwards if this wouldn't
> have broken a few generic recursive algorithms like the one adjusting
> available space due to footnotes. The discussion then had Keiron said
> he'd even get rid of the generic children list in favour for properly
> typed fields, thereby giving up some flexibility needed for
> extensions.

In the trunk, extensions are treated very differently, from what I can
tell, mainly thanks to Jeremias, IIRC, who made extensive changes when
implementing XMP metadata. They are stored in a separate collection
(ExtensionAttachments), which adds its own flexibility, but also
difficulty. I'm still not sure how one would write an arbitrary FO
extension that is supposed to influence the flow of the layout
algorithm without needing access to the deeper layout API.
Maybe we'd need a sort of generic ExtensionLayoutManager, too...?


Andreas
