xmlgraphics-fop-dev mailing list archives

From Vincent Hennebert <vincent.henneb...@anyware-tech.com>
Subject Re: Unicode soft hyphen and hyphenation
Date Sat, 13 Jan 2007 10:57:28 GMT
Jeremias Maerki wrote:
> On 12.01.2007 09:25:59 Vincent Hennebert wrote:
>> Jeremias Maerki wrote:
>>> Good to see that happen! Here's my take:
>>> On 11.01.2007 13:24:16 Manuel Mall wrote:
>>>> Hi,
>>>> when I implemented the UAX#14 line breaking I noticed that FOP doesn't
>>>> currently support the Unicode soft hyphen (SHY).
>>>> I am thinking of adding support for this character to the line breaking
>>>> but am unsure of its correct behaviour in an XSL-FO environment. So I
>>>> have a few questions related to the treatment of the SHY:
>>>> 1) If hyphenation is not enabled should a SHY still produce a valid 
>>>> break opportunity or should it be ignored?
>>> I think it should represent a valid break opportunity.
>> Well, I don't agree. See the description of SHY in section 15.2 of the
>> Unicode standard: SHY is used as a hint for automatic hyphenators and
>> overrides their behavior. I would typically use it for nicely rendering
>> veryLongProgramVariablesLikeWeCanFindInJava in e.g. a portion of text
>> describing them in some documentation. Here I obviously want to force
>> hyphenation to occur between the words that make the variable name
>> (Long-Program-Variables instead of LongPro-gramVar-iables or whatever).
>> So, as a hint for hyphenators, SHY should be ignored when hyphenation is
>> disabled, and when enabled have the priority over automatic hyphenation.
> Hmm, I'm used to different behaviour in word processors and I don't read

Except that I wouldn't trust any word processor when it comes to
high-quality typography :-P
Does anyone know what InDesign is supposed to do?

> the UCD spec like you do. Section 5.3 of UAX#14 also doesn't give me the
> impression that a SHY is only active when hyphenation is enabled. It
> says: "The action of a hyphenation algorithm is equivalent to the
> insertion of a SHY. However, when a word contains an explicit SHY, it is
> customarily treated as overriding the action of the hyphenator for that
> word." I read this as: "SHY is the basic operator to add additional
> break points and a hyphenator can be added to do that task automatically."

Still don't agree. Overriding is not adding hyphenation points. The
following sentence in the description of SHY is pretty clear to me:
"The use of SHY is generally limited to situations where users need to
override the behavior of [an automatic] hyphenator."

> Interesting but moot point I think. FOP is the automatic hyphenator in
> this case and the hyphenate property could be argued to control which
> hyphenation algorithm FOP is using. If hyphenate="true" FOP is allowed
> to add its own hyphenation breaks. If hyphenate="false" it uses only
> user specified hyphenation breaks (= soft hyphens).

Well, again, the description of the "hyphenate" property (§7.9.4) sounds
clear to me: when false, "Hyphenation may not be used in the
line-breaking algorithm".


To summarize, my opinion is that:
- if "hyphenate" = false, no automatic hyphenation is performed, and
  soft hyphens are discarded
- if "hyphenate" = true, automatic hyphenation is performed, except for
  any word that contains soft hyphens, in which case the soft hyphens
  are used to create legal breakpoints.
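
As a rough sketch of the two rules above (this is not FOP code: the class
name, the method name and the Result holder are hypothetical, and the
automatic hyphenator is simply passed in as a function returning break
offsets), the decision could look like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in FOP.
final class SoftHyphenPolicy {
    static final char SHY = '\u00AD'; // Unicode SOFT HYPHEN

    /** The word with soft hyphens removed, plus legal break offsets into it. */
    static final class Result {
        final String text;
        final List<Integer> breaks;

        Result(String text, List<Integer> breaks) {
            this.text = text;
            this.breaks = breaks;
        }
    }

    static Result breakPoints(String word, boolean hyphenate,
                              Function<String, List<Integer>> autoHyphenator) {
        StringBuilder stripped = new StringBuilder();
        List<Integer> shyBreaks = new ArrayList<>();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c == SHY) {
                int offset = stripped.length();
                // Ignore a leading SHY and runs of consecutive SHYs.
                if (offset > 0 && !shyBreaks.contains(offset)) {
                    shyBreaks.add(offset);
                }
            } else {
                stripped.append(c);
            }
        }
        String text = stripped.toString();
        // A trailing SHY would give a break after the last character; drop it.
        shyBreaks.removeIf(b -> b >= text.length());

        if (!hyphenate) {
            // hyphenate = false: no hyphenation at all, soft hyphens discarded.
            return new Result(text, new ArrayList<>());
        }
        if (!shyBreaks.isEmpty()) {
            // hyphenate = true and the word has soft hyphens: they override
            // the automatic hyphenator for this word.
            return new Result(text, shyBreaks);
        }
        // hyphenate = true and no soft hyphens: automatic hyphenation.
        return new Result(text, new ArrayList<>(autoHyphenator.apply(text)));
    }
}
```

With "very", "Long" and "Name" joined by soft hyphens, this yields no
breaks when hyphenate is false, and breaks only at the two soft hyphen
positions when it is true, whatever the automatic hyphenator would have
suggested.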

Now if the majority is against me, I'll shut up right now so as not to
hold things up.

