xmlgraphics-fop-dev mailing list archives

From Andreas L Delmelle <a_l.delme...@pandora.be>
Subject Re: Initial soft hyphen support
Date Mon, 15 Jan 2007 21:20:46 GMT
On Jan 15, 2007, at 21:25, J.Pietschmann wrote:

> Andreas L Delmelle wrote:
>> BTW: I took a very quick look, and does anyone know if there is a  
>> good reason why Hyphenation.word is a String?
> The hyphenator interface goes through several wrapping layers,
> probably due to the usual "take working code and wrap it to fit
> the caller" method.

Looks that way...
Traced it down, and in TextLM.getWordChars() we get

   sbChars.append(new String(textArray, ai.iStartIndex,
                           ai.iBreakIndex - ai.iStartIndex));

Not really sure what would be most efficient:
- a void method appending to a parameter StringBuffer
- a method returning a copy of the char[] from index to index...

Seeing that every String ultimately has a backing char[](*) anyway, I'd
say that we can safely return the copy, and remove the overhead of

StringBuffer.append(new String(char[])).toString().toCharArray()

Hmmm... Put it like that, and this would almost be one for the Daily  
WTF! 8-)
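
A minimal sketch of the two alternatives listed above, for comparison. The class, the method names and the AreaInfo stand-in are hypothetical; only the field names iStartIndex and iBreakIndex are taken from the snippet, and this is not the actual TextLM code.

```java
// Hypothetical sketch; names other than iStartIndex/iBreakIndex are assumptions.
final class WordCharsSketch {

    // Stand-in for the area info used in the snippet above.
    static final class AreaInfo {
        final int iStartIndex;
        final int iBreakIndex;

        AreaInfo(int iStartIndex, int iBreakIndex) {
            this.iStartIndex = iStartIndex;
            this.iBreakIndex = iBreakIndex;
        }
    }

    // Option 1: append straight into the caller's buffer,
    // without creating an intermediate String.
    static void appendWordChars(StringBuffer sb, char[] textArray, AreaInfo ai) {
        sb.append(textArray, ai.iStartIndex, ai.iBreakIndex - ai.iStartIndex);
    }

    // Option 2: return a copy of the chars from start index to break index.
    static char[] copyWordChars(char[] textArray, AreaInfo ai) {
        int length = ai.iBreakIndex - ai.iStartIndex;
        char[] copy = new char[length];
        System.arraycopy(textArray, ai.iStartIndex, copy, 0, length);
        return copy;
    }
}
```

Either variant skips the String round trip; which one fits better probably depends on whether the caller needs to accumulate several areas into one word before hyphenating.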

(*) which, BTW, answers the question about the number of char[]
instances being twice the number of text-nodes in the document in the
snapshot posted by Richard earlier on in the thread about memory
issues. Sure, there are some 39K text-nodes in the document, but there
are most likely at least as many non-interned property values (cf. the
number of String instances)...

> This always seemed overly complicated to me. I tried
> to come up with a comprehensive API for hyphenation (which would
> also be applicable to spelling and other similar tasks). Unfortunately,
> there doesn't seem to be any usable standard, all APIs I've seen
> are very specific or simply horrible. Any simplification is certainly
> welcome.

A quick-and-dirty hack to make the Hyphenator return a Hyphenation as  
I described earlier on --hyph-point for the SHY and the rest as two  
separate hyphenated words-- doesn't seem too hard to pull off, but it  
would be an exception for the SHY only. For a more comprehensive  
approach, I currently don't know enough about hyphenation basics, I'm  


