Issues for Braille Translation of Hyphenated Items

There are a large number of semantically distinct print items that properly use hyphens. It is also fairly common when transcribing print documents to run across cases where hyphens are used incorrectly. Both situations create a problem for braille translation.

The proper use of hyphens in print typography involves complex issues, some of which are still debated in the print world. Jukka Korpela has made a heroic attempt to summarize this situation. The latest version of Unicode now specifies some 24 distinct dashes and hyphens, more than appear in Korpela's article. The latest list appears in Table 6.3 of The Unicode Standard 5.2.0 (Electronic Edition), Chapter 6, Section 6.2 General Punctuation, Dashes and Hyphens. A six-dot braille code cannot possibly handle this many distinctions.

American contracted English braille has translation rules specific to more than a dozen semantically distinct hyphenated items consisting only of letters and hyphens. (There are other special rules, not addressed here, for hyphenated items containing numbers.) Hopefully the present article, which uses examples from American literary braille, will be useful to developers of translators for other braille systems.

[References to specific rules refer to English Braille, American Edition, 1994; Revised 2002 (EBAE), which may be obtained from the Braille Authority of North America, the National Federation of the Blind, or the American Printing House for the Blind.]

Introduction

Hyphenated items consisting only of letters are conveniently divided into three broad syntactical categories. These are described in turn.

  1. Items with two or more adjacent hyphens
  2. Items with a leading or trailing hyphen
  3. Items with one or more embedded hyphens
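
As a rough illustration, the three-way split above can be sketched in Python. The function name, the category labels, and the order in which the tests are applied (adjacent hyphens first, then leading or trailing, then embedded) are assumptions made for this sketch and are not taken from EBAE.

```python
import re

# Category labels are illustrative names, not EBAE terminology.
ADJACENT = "adjacent hyphens"
LEADING_TRAILING = "leading or trailing hyphen"
EMBEDDED = "embedded hyphens"


def classify_hyphenated_item(item):
    """Return the syntactic category of an item made only of letters and hyphens.

    Returns None when the item is outside the scope of this article
    (no hyphen, no letter, or characters other than letters and hyphens).
    """
    if not re.fullmatch(r"[A-Za-z-]+", item):
        return None
    if "-" not in item or not re.search(r"[A-Za-z]", item):
        return None
    # Assumed precedence: a run of hyphens wins over position in the item.
    if "--" in item:
        return ADJACENT
    if item.startswith("-") or item.endswith("-"):
        return LEADING_TRAILING
    return EMBEDDED
```

For example, classify_hyphenated_item("five-") falls in the second category and classify_hyphenated_item("self-knowledge") in the third. An item that fits more than one description is assigned by the assumed precedence only.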

1. Items with two or more adjacent hyphens

There isn't any correct usage for multiple hyphens in modern print typography. The use of a hyphen sequence as a surrogate to represent a dash character is an anachronism from the time of mechanical typewriters. Remember that braille translates a dash character as two braille hyphens. Here we discuss the use of multiple hyphens as surrogates and to signify missing letters.

Two or more hyphens used to represent some sort of dash

The use of a hyphen sequence as a surrogate to represent a dash character is incorrect. The print dashes such as the em dash are actually unique print characters. It would be preferable if braille transcribers were to prepare documents using the correct characters for dashes and related symbols and avoid multiple adjacent hyphens. However, if documents do contain sequences of two or more print hyphens, they are usually translated by an equivalent number of braille hyphens. Note that on backtranslation two hyphens are normally backtranslated as an em dash whereas sequences of more than two braille hyphens are backtranslated by the equivalent number of print hyphens.
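
The behavior described above can be sketched as follows. This is only a minimal illustration: it handles hyphens and the em dash and passes every other character through unchanged, it represents the braille hyphen (dots 3-6) with the Unicode braille pattern U+2824, and the function names are hypothetical.

```python
import re

BRAILLE_HYPHEN = "\u2824"  # Unicode braille pattern for dots 3-6
EM_DASH = "\u2014"


def translate_hyphen_runs(print_text):
    """Translate an em dash as two braille hyphens and each print hyphen as one."""
    text = print_text.replace(EM_DASH, BRAILLE_HYPHEN * 2)
    return text.replace("-", BRAILLE_HYPHEN)


def backtranslate_hyphen_runs(braille_text):
    """Backtranslate runs of braille hyphens using the usual convention."""
    def replace_run(match):
        count = len(match.group(0))
        # Exactly two braille hyphens are normally read as an em dash;
        # any other run length maps to the same number of print hyphens.
        return EM_DASH if count == 2 else "-" * count

    return re.sub(re.escape(BRAILLE_HYPHEN) + "+", replace_run, braille_text)
```

Note that a print source containing exactly two hyphens does not survive the round trip in this sketch: it comes back as an em dash, whereas runs of three or more hyphens come back unchanged.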

Two or more hyphens used to signify missing letters

Print documents sometimes use hyphens to indicate letters that are missing for discretionary purposes such as to avoid spelling out swear words. Again this is an anachronism. Braille rules are based on the assumption that there are the same number of hyphens as missing letters although this is not correct typographical style. A recent style guide states that a "two-em (four-hyphen) dash is used to show [all the adjacent] missing letters in a word."

The translation rule requires a sequence of missing letters to be translated by an equivalent number of braille hyphens, whether leading, embedded, or trailing. This can be ambiguous for backtranslation, since once again two braille hyphens are normally backtranslated as an em dash whereas sequences of more than two braille hyphens are backtranslated by the equivalent number of print hyphens.
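
One way a backtranslator might expose this ambiguity is to return every candidate reading for a run of braille hyphens instead of a single answer. The sketch below assumes only the two readings discussed in this section (a dash, or one print hyphen per missing letter); the function name is hypothetical, and the braille hyphen is again represented by U+2824.

```python
BRAILLE_HYPHEN = "\u2824"  # Unicode braille pattern for dots 3-6
EM_DASH = "\u2014"


def possible_print_readings(braille_run):
    """List the print strings that a run of braille hyphens could stand for.

    The first entry is the reading a backtranslator would normally choose.
    """
    if not braille_run or set(braille_run) != {BRAILLE_HYPHEN}:
        raise ValueError("expected a non-empty run of braille hyphens")
    count = len(braille_run)
    readings = ["-" * count]
    if count == 2:
        # Two braille hyphens may be a dash or two missing letters.
        readings.insert(0, EM_DASH)
    return readings
```

Only the run of length two yields more than one reading here, which is where a missing-letter item such as d--n becomes indistinguishable from a dash.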

2. Items with a leading or trailing hyphen

A leading or trailing hyphen is used in disconnected compound words and when referring to a prefix or suffix.

Disconnected compound words

The most common use for leading or trailing hyphens is in disconnected compound words. A disconnected compound word is either the first or second component part of a two-part compound word standing alone with the remaining part implied from context. Examples with leading hyphens are the phrases state-owned and -operated and mid-May or -June; an example with a trailing hyphen is the phrase five- or six-pointed.

The hyphen in a disconnected compound word is translated by a braille hyphen. A leading hyphen can cause difficulty for automated backtranslation because it may be confused with the contraction for com, which uses the same braille cell. However, this interpretation can often be ruled out because it leads to a nonsense word.
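
The nonsense-word check can be sketched as a dictionary lookup. In this illustration the remainder of the word is assumed to have been backtranslated to print letters already, SAMPLE_LEXICON is a hypothetical stand-in for a real word list, and the function name is invented for the example.

```python
BRAILLE_HYPHEN = "\u2824"  # dots 3-6: the hyphen and the "com" contraction share this cell

# Hypothetical stand-in for a real dictionary lookup.
SAMPLE_LEXICON = {"compare", "complete", "operated", "june"}


def leading_cell_readings(braille_word, lexicon=SAMPLE_LEXICON):
    """Return plausible print readings for a word that starts with dots 3-6.

    The hyphen reading is always kept; the "com" reading is kept only
    when it produces a word found in the lexicon.
    """
    if not braille_word.startswith(BRAILLE_HYPHEN):
        raise ValueError("word does not start with the dots 3-6 cell")
    rest = braille_word[1:]
    readings = ["-" + rest]
    as_com = "com" + rest
    if as_com.lower() in lexicon:
        readings.append(as_com)
    return readings
```

With this sample lexicon, the braille for -operated has a single reading because "comoperated" is not a word, while a cell followed by "pare" remains ambiguous between "-pare" and "compare" and would need context to resolve.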

References to prefixes and suffixes

A leading or trailing hyphen is occasionally used when referring to a prefix or suffix as in the following sentences. "The prefix un- is used for negation." "Some plurals use -en rather than s."

3. Items with one or more embedded hyphens

The most difficult category of hyphenated items to translate correctly into braille is the one with embedded hyphens. EBAE recognizes ten semantically distinct items in this category and uses a number of different translation rules for these items:

  1. Ranges of individual letters such as a-z require letter signs before both letters per Rule II.12.a.(4)
  2. Spelled-out words such as c-h-e-e-s-e are transcribed character-by-character without letter signs per Rule II.13.b
  3. Abbreviated spellings such as V-J are transcribed using the same procedure as spelled-out words
  4. Speech hesitations such as we-e-ek are transcribed using the same procedure as spelled-out words
  5. Vocal sounds such as br-r and hm-m-m are transcribed using the same procedure as spelled-out words
  6. Letter words such as e-mail and Circle-K require a letter sign before the letter but the "word" part is translated as it would be if part of a normal word per Rule II.12.a.(4)
  7. Stammered words such as w-w-will, th-these, and we-e-ek follow a special algorithm per Rule II.13.a
  8. Syllabified words such as in-form and will-ing-ness use a limited set of contractions per Rule II.13.d
  9. Genuine compound words such as self-knowledge, whip-poor-will and Bit-O-Honey are transcribed per Rule XI.36.a as though each component part is a stand-alone whole word with the exception that, per Rule XIII.44, the contraction for com may not be used at the start of the non-leading parts
  10. Arbitrary sequences of letters and hyphens used as part numbers, serial numbers, and other types of identifiers such as Ar-bddl-eee are transcribed character-by-character
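
As one example, the compound-word procedure in item 9 can be sketched as follows. This is a toy under stated assumptions: toy_translate_part is a hypothetical stand-in for a real whole-word translator and applies only the com contraction, leaving all other letters as print letters, and both the hyphen and the com contraction are represented by the dots 3-6 cell (U+2824).

```python
BRAILLE_HYPHEN = "\u2824"  # dots 3-6
COM_CELL = "\u2824"        # the "com" contraction uses the same cell


def toy_translate_part(part, allow_com):
    """Toy whole-word translator: only the "com" contraction is applied."""
    if allow_com and part.lower().startswith("com"):
        return COM_CELL + part[3:]
    return part


def translate_compound(item):
    """Translate each component as a stand-alone word, joined by braille hyphens.

    The "com" contraction is allowed only in the leading component.
    """
    parts = item.split("-")
    cells = [
        toy_translate_part(part, allow_com=(index == 0))
        for index, part in enumerate(parts)
    ]
    return BRAILLE_HYPHEN.join(cells)
```

In this representation, contracting com in a non-leading part would place two dots 3-6 cells side by side, so self-command would contain the same two cells that are used for a dash. Restricting the contraction to the leading part avoids that sequence.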

It should also be noted for completeness that entry words or word fragments, alphabetic divisions, etc. that may involve hyphens are transcribed in various ways per Braille Formats, Rule 19.


For questions contact info at dotlessbraille.org.

Posted June 22, 2010.

Slightly corrected version re-posted June 22, 2010.