The Requirement for Multiple Translation Algorithms
to Produce Accurate Braille

Introduction

The basis of both the liblouis and Duxbury applications is the translation table algorithm. This algorithm has the significant advantage that a single translation engine can support internationalization by simply using different input tables for different braille systems. On the other hand, use of this algorithm may make it difficult to implement corrections that introduce additional translation algorithms in order to eliminate braille translation errors.

Some time ago I tested older versions of both liblouis and Duxbury, and neither tool produced correct translations for all of the special cases noted below. I have no data on how the latest versions of these tools behave, nor on whether braille systems other than English Braille American Edition (EBAE) exhibit similar issues with these tools.

The Translation Table Algorithm

For completeness, here's a brief summary of the basic translation table algorithm as applied to contracted braille systems for literary braille. The algorithm translates an item from left to right (or more correctly from start to end) by replacing one or more print characters with the longest eligible braille string that translates those characters. This algorithm employs special logic as necessary to determine the current eligibility context and then obtains the longest eligible replacement from a table where potential replacements are associated with a flag or opcode that identifies those contexts where that particular replacement is eligible for use.
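The summary above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table entries, the context names ("always", "whole", "not_begin"), and the bracketed placeholders such as [THE] that stand in for braille cells. It is not the table format or opcode set of liblouis or Duxbury.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    print_text: str  # print characters to be replaced
    braille: str     # placeholder standing in for the braille string
    context: str     # hypothetical eligibility flag (not a real opcode)


# Toy table for illustration only; not an actual EBAE table.
TABLE: List[Rule] = [
    Rule("the", "[THE]", "always"),
    Rule("th", "[TH]", "always"),
    Rule("st", "[ST]", "always"),
    Rule("in", "[IN]", "always"),
    Rule("ing", "[ING]", "not_begin"),  # not eligible at the start of an item
    Rule("but", "[BUT]", "whole"),      # eligible only as the whole item
]


def eligible(rule: Rule, item: str, pos: int) -> bool:
    """Decide whether a rule may be used at position pos of item."""
    end = pos + len(rule.print_text)
    if rule.context == "always":
        return True
    if rule.context == "whole":
        return pos == 0 and end == len(item)
    if rule.context == "not_begin":
        return pos > 0
    return False


def translate(item: str) -> str:
    """Scan from start to end, taking the longest eligible replacement."""
    out = []
    pos = 0
    while pos < len(item):
        candidates = [
            r for r in TABLE
            if item.startswith(r.print_text, pos) and eligible(r, item, pos)
        ]
        if candidates:
            best = max(candidates, key=lambda r: len(r.print_text))
            out.append(best.braille)
            pos += len(best.print_text)
        else:
            out.append(item[pos])  # no rule applies: copy the character
            pos += 1
    return "".join(out)


print(translate("thing"))   # [TH][ING]
print(translate("ingot"))   # [IN]got
print(translate("butter"))  # butter
```

Note that all of the eligibility logic here depends only on the position of the match within the item, which is the kind of local context the special cases discussed below go beyond.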

[Note that what is now generally referred to as the translation table algorithm was originally proposed in 1970 by Jonathan K. Millen of MITRE Corporation and is documented in an internal report titled "DOTSYS II: Finite-State Syntax-Directed Braille Translation." The implementation of the algorithm in DOTSYS III, the precursor to Duxbury's current software, is documented in a fascinating and very detailed 1975 MITRE report. A scan of the latter report can be obtained from the Education Resources Information Center (ERIC) by doing a Title search on the term DOTSYS III.]

Examples from EBAE of the Need for Multiple Translation Algorithms

American literary braille (EBAE) has a number of special translation rules that require either awkward extensions to the basic translation table algorithm or separate algorithms; it seems likely that other braille systems do as well.

A simple example of a special translation rule is the requirement to place a lettersign before the letter in letter words such as e-mail.1 This requirement is intended to avoid any possible confusion with the use of the whole-word single-letter contractions in true compound words. Note, however, that information about local context is not sufficient to ensure correct translation of letter words since neither spelled-out words (where all the letters are separated by hyphens) nor arbitrary identifiers (such as e-abcd) employ lettersigns before the letters.

Another example is the prescribed method for translating electronic addresses.2 This method does not use standard EBAE translation but, rather, embeds the entire item between the appropriate Computer Braille Code (CBC) switch indicators, translates the item using the separate CBC translation table, and formats the item according to CBC formatting specifications.

A third example is the EBAE rule for translating stammered words such as occur in both fiction and technical material.3 In this case, if the stammer involves a portion of a contraction, the rule states, in part, that the associated contraction may not be used anywhere in the translation of the stammer. Thus the stammer g-ghost must not use the contraction for "gh" but must use the contraction for "st".
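To show why this rule does not fit neatly into a single table lookup, here is a minimal Python sketch of a separate stammer translator. It covers only the part of the rule quoted above, and it rests on several assumptions: the toy contraction table, the bracketed placeholders standing in for braille, and the simplified test that treats a contraction as "involved" when a stammered fragment is a proper prefix of a contraction that begins the full word.

```python
from typing import Iterable, List, Set

# Toy contraction table; placeholders stand in for braille cells.
CONTRACTIONS = {"gh": "[GH]", "st": "[ST]", "th": "[TH]", "the": "[THE]"}


def translate_part(text: str, blocked: Set[str]) -> str:
    """Longest-match translation that skips any blocked contraction."""
    out: List[str] = []
    pos = 0
    while pos < len(text):
        matches = [
            c for c in CONTRACTIONS
            if c not in blocked and text.startswith(c, pos)
        ]
        if matches:
            best = max(matches, key=len)
            out.append(CONTRACTIONS[best])
            pos += len(best)
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


def blocked_contractions(fragments: Iterable[str], word: str) -> Set[str]:
    """Contractions at the start of word that a stammered fragment cuts into."""
    blocked: Set[str] = set()
    for frag in fragments:
        for c in CONTRACTIONS:
            if word.startswith(c) and c.startswith(frag) and len(frag) < len(c):
                blocked.add(c)
    return blocked


def translate_stammer(item: str) -> str:
    """Translate an item such as g-ghost; the last part is the full word."""
    *fragments, word = item.split("-")
    blocked = blocked_contractions(fragments, word)
    return "-".join(translate_part(p, blocked) for p in [*fragments, word])


print(translate_stammer("g-ghost"))  # g-gho[ST]
print(translate_stammer("ghost"))    # [GH]o[ST]
```

The blocked set has to be computed from the whole item before any part of it is translated, so the decision cannot be made from the characters immediately around the match.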

Other examples of items which require specialized treatment include syllabified words, proper names, true compound words, homographs, acronyms, and abbreviations.

Implementing Multiple Translation Algorithms

There are, of course, various hacks and tricks that might allow different algorithms to be incorporated directly into the basic translation table algorithm. However, it is difficult to justify such a convoluted approach in a modern software development environment where refactoring is easy. The software engineering disadvantages of poor modularization are well known: computational inefficiency coupled with code that is unnecessarily difficult and expensive to maintain and extend.

One can envision a modular process where a recognizer is followed by a switcher which is followed by one of several available translators. The recognizer would use semantics and/or syntax to flag items that require special treatment. (I use the term "item" to avoid getting into details as to whether an item is a word, a word plus attached punctuation, a sequence of words, etc.) The switcher would then pass the item and any required context information to an appropriate translator. The translator would translate the item.
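The recognizer, switcher, and translator stages might be sketched in Python as follows. This is only one possible arrangement under stated assumptions: the category names, the small lexicon of letter words, the regular expression for electronic addresses, the prefix test for stammers, and the stub translators (which merely label the item) are all hypothetical and stand in for real recognition and translation logic.

```python
import re
from enum import Enum, auto
from typing import Callable, Dict


class Kind(Enum):
    DEFAULT = auto()
    LETTER_WORD = auto()
    ELECTRONIC_ADDRESS = auto()
    STAMMER = auto()


# Hypothetical lexicon; a real recognizer would need a much larger one.
LETTER_WORDS = {"e-mail", "x-ray", "t-shirt"}

# Rough heuristic for electronic addresses, for illustration only.
ADDRESS_RE = re.compile(r"^(\S+@\S+\.\S+|https?://\S+|www\.\S+)$", re.IGNORECASE)


def recognize(item: str) -> Kind:
    """Recognizer: flag items that need special treatment."""
    lowered = item.lower()
    if ADDRESS_RE.match(item):
        return Kind.ELECTRONIC_ADDRESS
    if lowered in LETTER_WORDS:
        return Kind.LETTER_WORD
    *fragments, word = lowered.split("-")
    # Heuristic: every fragment is a non-empty proper prefix of the last part.
    if fragments and all(f and f != word and word.startswith(f) for f in fragments):
        return Kind.STAMMER
    return Kind.DEFAULT


def stub(label: str) -> Callable[[str], str]:
    """Build a placeholder translator that only labels the item."""
    return lambda item: f"{label}({item})"


TRANSLATORS: Dict[Kind, Callable[[str], str]] = {
    Kind.DEFAULT: stub("table"),
    Kind.LETTER_WORD: stub("lettersign"),
    Kind.ELECTRONIC_ADDRESS: stub("cbc"),
    Kind.STAMMER: stub("stammer"),
}


def translate(item: str) -> str:
    """Switcher: pass the item to the translator chosen by the recognizer."""
    return TRANSLATORS[recognize(item)](item)


for sample in ["e-mail", "e-abcd", "c-a-t", "g-ghost", "info@example.org"]:
    print(sample, "->", translate(sample))
```

With this arrangement, supporting another special case means adding a category, a recognition test, and a translator, without touching the table-driven translator used for ordinary items.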


1 English Braille American Edition 1994, Revised 2002, 2007 Update, Rule II. 12.a.(4).
2 Ibid. Appendix C. 3.
3 Ibid. Rule II. 13.a. and Rule II. 13.b.



First posted June 20, 2010
Contact author: info at dotlessbraille dot org