An Analysis of UEB with Respect to BANA's Report on the Needs of Braille Users
(May 3, 2012 Draft)

Susan Jolly, Invited Expert, DAISY Pipeline Braille Working Group
Contact: info@dotlessbraille.org

Executive Summary

Many problems faced by U.S. braille readers are described in a recent three-part article published by the Braille Authority of North America (BANA). The article suggests that the choice to adopt UEB (Unified English Braille) could mitigate the problems caused by the inability of BANA's present codes "to keep up with current trends in publishing and technology" and could provide a basis for BANA to be more responsive in the future.

A study of the likelihood that adopting UEB would mitigate the problems currently faced by braille readers was carried out as follows. First the most significant problems described in the BANA article were grouped into eight sets. However, analysis revealed that the causes of two of the reported problem sets are not related to the choice of braille code. (The analysis of these two problems, which occur when using refreshable braille, is included below for the convenience of persons who may have encountered similar problems.) The remaining six problem sets were then matched against UEB rules. The conclusion of this study was that the current UEB does a poor job of mitigating these problems.

The study found that UEB fails to resolve two of the six remaining problem sets:

  1. Email addresses and similar items are ambiguous due to the UEB rule requiring these items to be in contracted braille when embedded in other text.
  2. UEB does not solve the problem of erroneous translation of apostrophes and single quotes caused by a complex interaction between braille translation rules and a quirky print situation.

The study determined that UEB provides at best partial solutions to three problem sets:

  1. UEB translations are highly likely to be awkward for refreshable braille display (RBD) users who need correct braille symbols under their fingers since UEB transcriptions often require the reader to consult a separate symbols page and separate Transcriber Notes to determine the meaning of its braille equivalents.
  2. While the UEB does provide various mechanisms to unambiguously represent any print character, it does this in an unnecessarily awkward and clumsy manner. UEB's extensive use of indicators, its lack of good mnemonics for all symbols, its use of transcriber-defined symbols, and its insertion of braille enclosure symbols not present in print are all problematic. Most significantly, it fails to provide an option to directly represent Unicode character codes even though Unicode, which already specifies more than 100,000 characters, is the accepted international solution to the character problem for today and for the future.
  3. UEB does meet BANA's requirement of an extensible mechanism that provides multiple typeform indicators for representing the myriad of print character styling found in modern print documents. However, this is only a partial solution since the use of typeform indicators often fails to address the real issue of conveying to the braille reader the same information and benefits that character styling conveys to the print reader.

The study showed that UEB solves one problem set:

  1. UEB reduces some of the limiting assumptions of standard contracted braille by eliminating nine contractions and by ending the practice of omitting spaces between certain words. (It should, however, be pointed out that many experienced braille readers have objected to these changes which are necessitated by the design of UEB but optional in NUBS because it uses a different strategy to overcome these limiting assumptions.)

Finally, given the disappointing results as far as the ability of UEB to solve the current problems of U.S. braille readers as described in the BANA article together with additional problematic UEB features noted during the study, it seems essential that BANA not choose to adopt UEB.

Introduction

The Braille Authority of North America (BANA) has recently published an important three-part article titled The Evolution of Braille: Can the Past Help Plan the Future? (http://www.brailleauthority.org/article/evolution_of_braille-full.pdf), which describes a number of problems currently encountered by U.S. braille readers. The concluding section, titled "At a Crossroads," states that "braille in the United States must change to keep up with current trends in publishing and technology" and presents "several choices as to how to proceed." One choice is to "adopt UEB" (Unified English Braille).

This recent BANA article makes a persuasive case that braille must change to meet the needs of braille readers and I am in complete agreement that braille must change. Braille must change for the reasons BANA describes and, even more importantly, so as to better anticipate the growing use of ebooks for leisure reading and of digital textbooks.

Even though I agree that braille must change, I disagree that UEB would be a positive change for braille. I have written negatively about UEB off and on for the past ten years and was disappointed that BANA would present "adopt UEB" as a viable choice. On the other hand, given that my previous focus has been on technical materials but that the focus of the BANA article is on issues affecting all braille users, it seemed only fair to review the latest version of the UEB from this perspective.

UEB Analysis: Summary

A careful review of the UEB rules with respect to a number of problems extracted from the BANA article and cited below has made my opinion of UEB even more negative than it had previously been. UEB does not actually deliver what BANA hopes to provide the general braille reader. BANA would fail in its obligations to all braille readers were it to waste the resources needed to "keep up with current trends" by adopting UEB. This conclusion is documented in detailed sections linked to these summaries.

  1. UEB cannot solve the problem of characters not displaying properly on a refreshable braille display (RBD) when this is a software-hardware interface problem rather than a braille code problem.

  2. UEB cannot solve the problem of incorrect translation on an RBD when this is a result of poor quality translation software included in a screenreader application or display driver. Similarly, UEB cannot improve the potential accuracy of braille translation software by totally avoiding braille rules based on semantics which, while more difficult to automate than rules based on syntax, make braille translations more accurate and/or easier to read.

  3. UEB does not provide an entirely satisfactory technical basis for supplying correct symbols via "on-the-fly" braille or braille plus speech as used with RBDs. UEB was designed primarily for transcribing braille documents where the reader consults a separate symbols page and separate Transcriber Notes. RBD users, by contrast, need each braille symbol under their fingers to not only "automatically and correctly display" but to be immediately recognizable.

  4. UEB exacerbates the problem of ambiguous email addresses sometimes accidentally displayed in contracted braille since the use of contracted braille for embedded email addresses (and similar items) is required in UEB.

  5. UEB does not simplify the longstanding translation problems related to the various representations of single quotes and apostrophes in print and does not reduce the likelihood of transcription errors caused by this issue.

  6. UEB does solve the problem of outdated assumptions about print character placement implicit in certain rules of contracted braille by eliminating nine contractions and ending the practice of omitting spaces between certain words. (This same tactic could, of course, be applied to the present BANA codes as well.)

  7. UEB does not provide a comprehensive solution to the problem of rendering letters and symbols correctly given that the Unicode Consortium identifies more than 100,000 print characters. UEB does specify exact braille symbols for a greater number of print characters than does EBAE although the rationale for which print characters have UEB braille symbols and which don't is not apparent. UEB does specify additional braille symbols that transcribers can use in conjunction with transcriber notes or a symbol list "for any print symbol which has no UEB equivalent" and does permit symbols to be used unambiguously in any position.

  8. UEB does provide four sets of pre-defined typeform indicators and five sets of typeform indicators that can be transcriber-defined. These indicators, together with any required Transcriber's Notes, can be used to fully characterize print character styling. Possible problems with the acceptability of the design of the UEB typeform indicator symbols are discussed in the separate article titled Acceptability and Extensibility of Typeform Indicators in UEB and in NUBS. General issues militating against the use of typeform indicators are discussed in the separate article titled Problem: Reflecting Print Styling in Braille Transcriptions. The latter article demonstrates that the suggested use of typeform indicators in the UEB Rulebook has the potential to mislead the braille reader by failing to provide semantics. It also discusses the need for alternatives to alleviate the difficulties for braille readers of using typeform indicators to obtain information easily available to sighted readers via highlighting or special print styling.

UEB Analysis: Details

Each of the eight sections below presents a detailed analysis of one of the problem sets extracted from the BANA three-part article. The first subsection of each section provides the relevant quotation(s) from the BANA article which characterize the problem set. The last subsection of each section makes suggestions as to a path forward. Since there is a lot of information here the intent has been to use sufficient cross references so the eight sections can be read in any order. Technical information about UEB is based on the PDF version of "The Rules of Unified English Braille, June 2010" obtained from the International Council on English Braille (ICEB) UEB Project Page.

1. Problem: Different print characters show as the same braille cells on a refreshable braille display (RBD)

Different brands and models of braille embossers and of refreshable braille displays require different electronic braille inputs in order to display correct braille.

1.1 Problem Statement from BANA article

Using this “on-the-fly” translation [on a refreshable braille display] without transcriber intervention, the texts are often displayed incorrectly. .... For example, both the tilde and the caret display as dots 4-5. The underline character displays as dots 4-6, no matter where it is, creating confusion with the print “dot” that appears in virtually every electronic address. These ambiguities can make for garbled translations and incorrect information to the reader.

1.2 Problem Source

One cause of incorrect display on RBDs is an inconsistency between the output format of the translation software and the input format of the braille display; this problem is not a braille code problem. The problem can arise because it is common for the output format of translation software to be a formatted braille file targeted for an embosser and encoded in 6-dot North American ASCII Braille while the typical input format expected by braille displays is 8-dot computer braille.

For example, standard 6-dot ASCII Braille uses the ASCII caret to transliterate the dots 4-5 cell but doesn't use the tilde. However the default 8-dot computer braille table built into a BrailleLite associates dots 4-5 with the ASCII tilde and dots 4-5-7 with the ASCII caret in its direct or translate-off mode. (Different brands of RBDs use a variety of different computer braille tables.) In this case the braille cells will appear to be the same if the user has disabled the bottom row of dots (dots 7-8) on the display.
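The tilde and caret case just described can be sketched in a few lines of Python. This is only an illustration: the two-entry table below is a hypothetical fragment built from the two assignments mentioned above, not a complete or authoritative computer braille table, and the function names are my own. Cells are rendered with the Unicode Braille Patterns block, where dot n corresponds to bit n-1 above U+2800.

```python
# Hypothetical fragment of an 8-dot computer braille table, containing only
# the two assignments described in the text (tilde: dots 4-5; caret: dots 4-5-7).
COMPUTER_BRAILLE = {"~": {4, 5}, "^": {4, 5, 7}}


def cell(dots):
    """Return the Unicode Braille Patterns character for a set of dot numbers."""
    mask = 0
    for d in dots:
        mask |= 1 << (d - 1)  # dot 1 is bit 0, dot 8 is bit 7
    return chr(0x2800 + mask)


def shown(char, bottom_row_enabled):
    """Return the cell a reader would feel, with or without dots 7 and 8."""
    dots = COMPUTER_BRAILLE[char]
    if not bottom_row_enabled:
        dots = dots - {7, 8}  # the user has disabled the bottom row
    return cell(dots)


if __name__ == "__main__":
    for enabled in (True, False):
        print(enabled, shown("~", enabled), shown("^", enabled))
```

With the bottom row enabled the two cells differ; with it disabled both reduce to dots 4-5, which is the collision BANA reports.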

One way for braille display users to determine whether an interface issue is the source of problems they've encountered with their particular braille display is to make the following experiment with the display hooked up to a computer. The experiment requires the display driver to be set to translate-off mode. The user then enters characters from the computer keyboard and notes which braille cells are displayed when the bottom-row dots (dots 7 and 8) are enabled and which ones are displayed when the bottom-row dots are turned off. The following 10 print characters are usually the ones causing problems: left square bracket, reverse solidus, right square bracket, circumflex accent (caret), low line, grave accent, left curly bracket, vertical line, right curly bracket, and tilde. Once the user understands the nature of the problem it may be necessary to consult with the appropriate experts to find a solution.
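A user who has a sighted or scripting-capable helper could record the results of this experiment and check them for collisions automatically. The sketch below assumes the observations are written down as sets of dot numbers per character; the sample observations cover only the tilde and caret case described earlier and are illustrative, not measurements from any particular display.

```python
from collections import defaultdict

# The ten print characters listed above, in the same order.
PROBLEM_CHARS = "[\\]^_`{|}~"


def collisions(observed):
    """Group characters whose observed braille cells are identical.

    `observed` maps a print character to the set of dot numbers that
    were felt on the display for that character.
    """
    groups = defaultdict(list)
    for ch, dots in observed.items():
        groups[frozenset(dots)].append(ch)
    return [sorted(chars) for chars in groups.values() if len(chars) > 1]


if __name__ == "__main__":
    # Illustrative observations only (hypothetical data for two characters).
    bottom_row_on = {"~": {4, 5}, "^": {4, 5, 7}}
    bottom_row_off = {"~": {4, 5}, "^": {4, 5}}
    print(collisions(bottom_row_on))   # no colliding groups
    print(collisions(bottom_row_off))  # tilde and caret collide
```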

1.3 Suggested Solution

The new Portable Embosser Format (PEF) is intended to resolve braille encoding issues. However, PEF has not yet been widely adopted.

2. Problem: Bad automated translation

When braille users read pre-transcribed braille on their braille displays, the braille is likely to be more accurate than when braille is translated "on-the-fly." This difference is typically a result of using different transcribing software. Software used to produce downloadable electronic braille books is often much better designed and tested and more comprehensive than is the typical translation software included in a multi-functional screenreader or built into a display driver. This is fundamentally a market problem; this problem is not a braille code problem.

Persons interested in a more technical introduction to the intrinsic issues that can limit the accuracy of automated braille transcription might wish to consult the article "Current Issues for Automated Conversion of Print to Braille". This article includes information about the history of automated braille transcription and presents a detailed example of the most commonly-used translation algorithm for contracted braille. The article has important information about braille transcribing that needs to be taken into account by anyone hoping to make a positive contribution to the development of a path forward.

2.1 Problem Statement from BANA article

Because the existing technology makes it possible to produce braille more easily, it is often used in cash-strapped education settings by people who are not necessarily knowledgeable about braille itself. On the other hand, the work of knowledgeable transcribers, still extremely important, can be far more efficient with the use of this technology. Translation software and braille embossers, combined with the ability to scan documents and the availability of electronic source files from publishers, has created the potential to greatly speed the transcription of braille books.

2.2 How braille rules affect the potential accuracy of automated braille translation

Many articles about unified braille imply that a single comprehensive braille code is easier to automate and easier to understand than several separate codes. This claim is akin to the false idea that a text printed as a single thick book is easier to understand than the same text printed as several separate volumes. It is true that a human might envision that what EBAE refers to as "Entering and Exiting Computer Braille Code" is conceptually different from the use of different rules in different modes in a single "unified" code like UEB; however, it's all the same to a computer. In other words, there is no technical reason why it should be any more or less difficult to implement accurate transcribing software in either case.

One technical area where transcribing applications do have some excuse for not following the rules is when the rules depend on the semantics of an item. Automatic identification of semantics can be difficult for an automated process. An example is the apostrophe versus single quote issue described in Section 5.

Another example where semantics plays a role is that both EBAE and UEB limit the use of shortform contractions in proper names. (EBAE does not allow shortform contractions in any proper names whereas UEB Rule 10.9.3 allows certain shortform contractions in proper names under certain conditions. It is not clear whether UEB Rule 10.9.3 has the potential to cause problems in the future.) In the sentence "Friendship abounds at Friendship Manor" a human can easily tell that the first occurrence of Friendship is not a proper name but a computer might find this difficult.
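To make the difficulty concrete, here is a deliberately naive sketch of two capitalization-based heuristics that a translator might try. Neither is taken from any actual transcribing software; the function names are hypothetical and the point is only that rules based purely on syntax misjudge one sentence or another.

```python
def capitalized_words(sentence):
    """Heuristic A: treat every capitalized word as part of a proper name."""
    return [w for w in sentence.split() if w[0].isupper()]


def capitalized_words_after_first(sentence):
    """Heuristic B: like A, but ignore the sentence-initial word."""
    return [w for w in sentence.split()[1:] if w[0].isupper()]


if __name__ == "__main__":
    s1 = "Friendship abounds at Friendship Manor"
    s2 = "Friendship Manor abounds with friendship"
    # Heuristic A wrongly flags the first word of s1 as a proper name.
    print(capitalized_words(s1))
    # Heuristic B handles s1 but misses "Friendship" in the name in s2.
    print(capitalized_words_after_first(s1))
    print(capitalized_words_after_first(s2))
```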

A third example of semantics-based rules that occur in both EBAE and UEB is the pair of contracted braille rules for translating genuine hyphenated compound words and for translating words shown in syllables. Alphabetic contractions are used in genuine hyphenated compound words such as whip-poor-will but are not permitted in words shown in syllables such as will-ing-ness. Again, while a human can usually differentiate a hyphenated compound word from a word where hyphens are used to separate the syllables, this can be harder for an automated process. Cf. Issues for Braille Translation of Hyphenated Items.

2.3 The need for translation rules based on semantics

It is not always reasonable for braille codes to reduce the potential for translation errors by avoiding all rules based on semantic distinctions—as UEB does by eliminating the rule that embedded email addresses be shown in uncontracted braille and (per its Rule 10.12.16) by eliminating EBAE's special rule for stammered words—since that tactic can result in braille translations that are less accurate and/or more difficult to understand.

2.4 A modern solution

There are often sophisticated algorithms that can be quite accurate at detecting semantics but these algorithms tend not to be used directly even in high-end commercial transcribing software. However, there is a growing use of semantic markup in electronic source files based on formats such as DAISY. These source files are designed to be used for multiple purposes so braille transcribing software that uses these source files as input should be able to produce more accurate braille.

3. Problem: Braille Symbols Should Be Correct in All Cases

One of BANA's major goals is to ensure that the braille rules make it possible for automated translation to at the very least present the correct symbols to the braille reader. [Here what is meant by "correct" is the braille equivalent of the print symbols in the print source file. It is, of course, always possible that the print symbols are incorrect.] Getting the correct symbols is especially important to the RBD user who is accessing a variety of print sources in real time.

[Section 1 points out that one possible cause of incorrect symbols on RBDs is not a deficiency in braille translation rules but some sort of miscommunication between the electronic braille output format of transcribing software and the expected electronic braille input format of the display. Section 2 includes a reminder that incorrect translation can result from implementation errors in transcribing software. The rest of this analysis focusses on problems inherent in braille rules, not on artifacts related to braille displays or particular software.]

UEB does provide explicit braille equivalents for a large number of print characters and also specifies a number of symbols, indicators, and constructs that a transcriber can assign as necessary when UEB does not provide an explicit symbol. [The print character corresponding to a transcriber assignment must, of course, be shown on a symbols page or in a transcriber's note.] UEB translation rules are described in detail in Section 7.

3.1 Problem Statements from BANA article

Using this “on-the-fly” translation [on a refreshable braille display] without transcriber intervention, the texts are often displayed incorrectly.
Since a human transcriber is not always part of the equation, it becomes increasingly important for our translation software to at least be able to render the words and symbols correctly. That need factors strongly into the code changes as well and will become an increasingly pressing necessity as print continues to evolve.
...A transcriber [can] include a transcriber's note indicating [what character] is shown ... in print. Of course, this solution is clear, but it requires the involvement of a transcriber rather than the name automatically and correctly displaying on a braille device.
Mainstreamed students and employed blind people are expected to be able to produce print similar to that of fellow students or colleagues at work. Their textbooks need to help them prepare for this.

3.2 Background Information on Unicode

The modern standard for print characters is Unicode. The Unicode Consortium is an international organization with the goal of identifying every character in every one of the world's current writing systems as well as some ancient writing systems such as cuneiform. So far they've identified more than 100,000 characters.

Unicode labels every character with a unique official name and also with a unique official numerical character code. The Consortium provides downloadable print charts of the different sets of characters such as the ASCII characters and the Braille Patterns. These charts, which are designed for sighted persons, have the names, numbers, and typical visual appearances of the characters.

Unicode also provides two search capabilities for characters. The chart that includes a particular character can be located by entering its Unicode character code into the search box on the charts page. Alternatively both the character code and the corresponding chart can be located by using a browser find option to search the Unicode Character Name Index.
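The same two lookups (from a character to its code and name, and from a name back to the character) are also available programmatically. The sketch below uses Python's standard unicodedata module purely as an illustration of the name and code pairing; it is not part of the Unicode Consortium's own search tools, and the helper names are my own.

```python
import unicodedata


def code_and_name(ch):
    """Return the Unicode character code (as U+XXXX) and official name."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


def find_by_name(name):
    """Return the character that carries the given official Unicode name."""
    return unicodedata.lookup(name)


if __name__ == "__main__":
    print(code_and_name("$"))
    print(code_and_name("~"))
    # The Braille Patterns chart is searchable the same way.
    print(code_and_name(find_by_name("BRAILLE PATTERN DOTS-45")))
```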

3.3 UEB Approach

UEB addresses the issue of correct symbols with two techniques as described in detail in Section 7. First it provides UEB equivalents for hundreds of print characters. Second it provides transcriber-defined symbols, indicators, and constructs that can be used in representing any print characters that don't have an explicit UEB equivalent. Moreover, all UEB special symbols are defined in such a way that they can be used in any position relative to other items without ambiguity.

A UEB rule that might be surprising to UEB proponents who point to UEB's always being consistent with print is this rule:

7.6.1 Use one-cell (nonspecific) quotation marks 8 and 0 for the predominant quotation marks in the text in all instances where the specific form of the quotation marks ("double", "single", "Italian" or "nondirectional") has no significance, that is, in the great majority of cases. Indicate the print form of the nonspecific quotation marks on the symbols page or in a transcriber's note.

Rule 7.6.1 is apparently intended to facilitate the sharing of texts between countries that use double quotation marks for outer quotation marks and those that use single quotation marks for outer quotation marks. [This rule of course does nothing about the numerous different spelling conventions in these countries nor about different systems of measurement.]

3.4 Commentary

Unfortunately even though UEB can provide unambiguous braille equivalents for print characters and symbols, its approach does not meet BANA's goal of having the braille equivalent "automatically and correctly [display] on a braille device" and is unlikely to provide a satisfactory solution for RBD users who need each braille symbol under their fingers to be immediately recognizable. First, it may be hard for even experienced braille readers to remember UEB's large number of new multi-cell print equivalents, especially those with little or no mnemonic basis, without the use of some sort of auxiliary symbol table. Second, the use of transcriber-defined equivalents presents even more serious challenges than the use of explicit symbols. The reader not only has to locate the symbols page or the appropriate Transcriber's Note to identify the print character but is likely to have to contend with the use of the same transcriber-defined symbol to represent different print characters in different documents and possibly even within the same document.

A third problem is that the UEB approach does not appear designed to ensure that a braille reader will always know the exact identity of a print character that has been translated to a UEB braille equivalent. Examples where there can be uncertainty include the problem of determining the correct name of a letter equivalent that uses a UEB modifier indicator in its braille symbol and the use of transcriber-defined shapes to represent Unicode characters.

3.5 Suggested Solution

It can depend on the situation what a braille reader needs to know about a particular braille symbol. It may be that the reader simply needs to understand what the symbol means in order to understand what is being read in braille. In this case UEB is likely adequate.

On the other hand, the reader may need to know the exact identity of the corresponding print character, that is, either its Unicode name or its Unicode numerical code. The BANA problem statement at the start of this section states that braille users need "to be able to produce print similar to that of [sighted peers]." The most foolproof method that anyone can use to ensure the correct character when that character cannot be entered via a simple keypress from a computer keyboard is to utilize its Unicode numerical code. This numerical code can be entered directly into a wordprocessor to display the correct character and can be referenced in an electronic file format such as HTML to ensure later rendering of the correct character.
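As a minimal sketch of the HTML case, a character's Unicode numerical code can be written as a hexadecimal numeric character reference, which a browser later renders as the intended character. The helper name below is hypothetical; html.unescape is the Python standard library function that decodes such references, used here only to show the round trip.

```python
import html


def numeric_reference(ch):
    """Return the HTML hexadecimal numeric character reference for ch."""
    return f"&#x{ord(ch):X};"


if __name__ == "__main__":
    for ch in ("$", "\u00e9"):  # dollar sign, e with acute accent
        ref = numeric_reference(ch)
        # Decoding the reference gives back exactly the original character.
        print(ref, html.unescape(ref) == ch)
```

The advantage for a braille user is that the reference consists only of ordinary ASCII characters, so it can be typed and proofread without needing to see or feel the special character itself.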

See the associated article titled "The Problem of the Braille Representation of Print Characters" for more information about Unicode and how braille users might take advantage of this standard.

4. Problem: E-mail addresses are ambiguous when displayed in contracted literary braille

Contracted English braille was only ever intended for translating print sequences such as ordinary English words where the reader could be certain that the uncontracted equivalent of a contraction is identical to the original print. This is why EBAE does not permit shortform contractions to be used in proper names or foreign words where they could be ambiguous and why Computer Braille Code (CBC) is required for brailling electronic addresses and similar information.

4.1 Problem Statement by BANA

Yet, when reading in contracted refreshable braille from a computer screen, an e-mail address will display in contracted literary braille, making the characters ambiguous. The user can take steps to view the address with no translation applied, but then the surrounding text is also displayed in uncoded characters.

4.2 Problem Source

Here the likely cause of the problem is deficient "on-the-fly" software that is inconsistent with the EBAE requirement to avoid ambiguity by brailling email addresses and similar information in embedded Computer Braille Code (CBC). The cause could also be the difficulty for an automated process to always correctly identify computer items or other items that cannot be distinguished only by syntax. However, there is a growing appreciation in the print world of the value in enriching electronic source files by tagging or marking up so as to support interoperability and reusability as well as future durability so the latter cause will continue to have less and less impact.

Note that the use of contracted braille for email addresses is correct UEB according to this rule:

10.12.3 Use contractions in computer material, such as email addresses, web sites, URLs, and filenames when it is embedded in regular text.
[From UEB Rules, p. 155.]

4.3 Suggested Solution

Ensure that any software used for "on-the-fly" translation is adequate. Retain EBAE+CBC approach or other separate computer braille approach. Do not adopt UEB.

5. Problem: Inconsistencies in the braille translation of single quotes and apostrophes

Correct print always uses the same character for a right single quote and an apostrophe whereas braille uses different braille symbols depending on semantics. The translation problem is further complicated by the use of different characters in informal and typeset print and by frequent inconsistencies and errors in print source documents.
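The following sketch shows why a purely syntactic rule cannot settle the question. It assumes typeset print in which U+2019 (RIGHT SINGLE QUOTATION MARK) serves as both apostrophe and closing single quote, and it applies one naive rule: the mark is an apostrophe only when it sits between two letters. The rule and the function name are hypothetical illustrations, not the behavior of any particular translator.

```python
RSQ = "\u2019"  # RIGHT SINGLE QUOTATION MARK, also used as the apostrophe


def classify(text):
    """Label each U+2019 in text using a naive letter-on-both-sides rule."""
    labels = []
    for i, ch in enumerate(text):
        if ch != RSQ:
            continue
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        if before.isalpha() and after.isalpha():
            labels.append("apostrophe")
        else:
            labels.append("closing quote")
    return labels


if __name__ == "__main__":
    print(classify("don\u2019t"))           # correct: an apostrophe
    print(classify("the dogs\u2019 bowls"))  # wrong: a possessive apostrophe
```

The plural possessive is mislabeled as a closing quote because nothing in the surrounding characters distinguishes the two uses; only the meaning does.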

5.1 Problem Statement by BANA

There is often a great deal of confusion among single quotation marks, apostrophes, and accent marks. Because of the various ways these symbols are used in print, sometimes inner quotation marks display in refreshable braille as apostrophes (dot 3), and sometimes a mark that is intended as an apostrophe or accent mark is shown as an opening inner quotation mark (dots 6, 2-3-6).

5.2 Problem Source

The root cause of the problem is a complex interaction between traditional but problematic braille rules and a quirky print situation. These factors are explained in great detail in a separate background article which explains why the problem persists in UEB.

5.3 Suggested Solution

BANA should first have the above-referenced background article reviewed for accuracy and edited as necessary by the appropriate experts, including print publishers and persons familiar with English literary braille, so it can be used as a factual basis for a solution. BANA should then have a working group of braille experts determine an appropriate solution in view of the background article.

6. Problem: Outdated assumptions about print character placement limit the potential usefulness of braille transcriptions

Contracted braille rules are context-dependent since this tactic can save space by permitting more contractions. However these rules can be inflexible if they make assumptions about the structure of print that are subject to change. In practice it is only necessary to delete a few standard contracted braille rules to greatly minimize inflexibility from this source.

6.1 Problem Statements by BANA

Unlike the print dollar sign, the braille symbol is dependent upon its placement for its meaning; in other contexts, dots 2-5-6 has numerous possible meanings. How, then, should we handle the name of the pop music sensation that is pronounced "Kesha," but who uses a dollar sign instead of an S in the middle of her name?

...A transcriber encountering this name may spell it Kesha, but include a transcriber's note indicating that the s is shown as a dollar sign in print. Of course, this solution is clear, but it requires the involvement of a transcriber rather than the name automatically and correctly displaying on a braille device.

6.2 Problem Source

The BANA problem statements actually cover two somewhat related problems. One problem is the set of assumptions about the structure of print that were built into the rules for contractions in order to make possible a greater number of contractions. Examples include the assumptions that a hyphen never occurs at the start of a word-like item and that certain individual capital letters are never embedded in a word.

The second problem is that some EBAE braille symbols for print characters, including the dollar sign, are not allowed in certain relative positions because they would be ambiguous.

6.3 Suggested Solution

The problem of assumptions about print built into the rules for contractions was solved in UEB by eliminating nine contractions including those for "com", "ation", and "dd" and by eliminating sequencing or the omission of spaces between certain contractions that represent whole words. These same changes could also be applied to EBAE, to the current Nemeth code, or to NUBS. It is, however, important to note that many facile braille readers are opposed to any changes to the rules for contractions. One advantage of NUBS over UEB is that these changes would be optional in NUBS but are mandatory in UEB due to its design.

The problem of ensuring that the interpretation of brailled "special symbols" is independent of their relative position is part of the general problem of accurately representing print characters. This problem is discussed in the next section.

7. Problem: The large number of print characters

Section 3 discusses some reasons why the UEB approach to representing print characters might be inconvenient for someone reading braille on a braille display. This section presents a more general perspective on the UEB approach to representing print characters.

The overriding issue in designing a braille code is that there are more than 100,000 different characters that can appear in print documents. Braille readers (as well as print readers) are likely to encounter a growing number of unfamiliar characters because of globalization and because of the increasing presence of technical symbols in non-technical materials.

Given the large number of print characters, the best a particular braille code can hope to accomplish is to provide convenient braille equivalents for a reasonable number of the characters that the braille reader using that code is most likely to encounter. (The arguments that any braille code that uses the same characters for the decimal digits and for ten letters of the alphabet is not convenient are well known and are not addressed here.) Unfortunately the developers of UEB have not (to my knowledge) presented a rationale for which print characters have UEB braille symbols and which do not.

UEB represents a greater number of letters and common print symbols than does EBAE and the corresponding braille symbols retain their meaning independently of their relative position although this latter feature necessitates a greater use of mode indicators. The number of braille symbols is, of course, still only a small fraction of the total number of existing print characters. And, obviously, given that no six-dot braille code can use more than 63 braille cells, the more print characters that a braille code represents, the more mode indicators and symbols comprised of three or more braille cells are required.
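The 63-cell limit mentioned above follows from simple counting: each of the six dots is either raised or not, giving 2^6 = 64 patterns, one of which is the blank cell. The following is a minimal sketch of that arithmetic. It assumes the Unicode Braille Patterns block (six-dot cells at U+2800 through U+283F) as a convenient way to enumerate the cells; the function names are illustrative only, and the count of multi-cell symbols is a loose upper bound that ignores every restriction a real braille code imposes.

```python
# Count the distinct six-dot braille cells.
# Assumption: cells are enumerated via the Unicode Braille Patterns block,
# where U+2800 is the blank cell and dots 1-6 map to the low six bits.

BRAILLE_BASE = 0x2800


def six_dot_cells():
    """Return every non-blank six-dot cell as a Unicode character."""
    return [chr(BRAILLE_BASE + mask) for mask in range(1, 2 ** 6)]


def max_symbols(n_cells):
    """Loose upper bound on distinct symbols made of exactly n_cells cells.

    Real braille codes reserve cells for letters, indicators, and
    contractions, so the usable number is far smaller.
    """
    return len(six_dot_cells()) ** n_cells


if __name__ == "__main__":
    print(len(six_dot_cells()))      # non-blank one-cell patterns
    print(max_symbols(2))            # bound for two-cell symbols
    print("".join(six_dot_cells()))  # the cells themselves
```

Even the generous two-cell bound is tiny compared with the more than 100,000 print characters, which is why a code that represents more characters must rely on mode indicators and longer symbols.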

UEB does provide some flexibility and extensibility through the use of transcriber-assigned characters. However, this approach has practical drawbacks. The reader not only has to locate the symbols page or the appropriate transcriber's note to identify the assigned print character but is likely to have to contend with the use of the same transcriber-defined symbol to represent different print characters in different documents and possibly even within the same document. It might well be that the option to provide the Unicode character code in hexadecimal would better serve the reader who needs to be sure of the identity of a print character.

See the article "The Problem of the Braille Representation of Print Characters" for a non-technical introduction to this issue.

7.1 Problem Statements from BANA article

The literary braille code instructs the transcriber to substitute a word for symbols such as + (the plus sign), - (the minus sign), and < (greater than) that are shown in print.

Since a human transcriber is not always part of the equation, it becomes increasingly important for our translation software to at least be able to render the words and symbols correctly. That need factors strongly into the code changes as well and will become an increasingly pressing necessity as print continues to evolve.

If a company uses nonstandard symbols in its name and a blind person misspells the company name on a cover letter for a job application because she did not get accurate information from the braille, what are the chances that person will get the job? Should she have to check the spelling using audio or relying on a sighted person to tell her how it is spelled or should braille, the primary literacy tool for people who are blind, be capable of giving the most accurate information?

7.2 UEB symbol construction

UEB uses the same prefix-root method to construct symbols as is used in the Nemeth code. Both UEB and the revised Nemeth code called NUBS adhere strictly to this approach.

Presenting all of the UEB rules for symbols is not within the scope of the present article which focusses on some of the less familiar aspects of UEB. Persons interested in more information will need to consult the previously referenced "The Rules of Unified English Braille, June 2010" and other documents available from the same source.

A supposed advantage of UEB is that the braille symbols for print characters retain their meaning no matter their relative location. Unfortunately, in braille codes as elsewhere "there's no such thing as a free lunch!" One of the costs of the symbols for characters retaining their meanings is that the interpretation of the UEB indicators can depend on the presence of mode indicators.

7.2.1 Some UEB two-cell symbols

UEB uses two-cell prefix-root symbols to represent common characters, which has the advantage that these are probably the easiest multi-cell symbols for braille readers. However, the design of UEB, including its use of upper numbers, means that the average transcription contains a greater number of two-cell symbols relative to one-cell ones than would an EBAE+CBC or Nemeth transcription of the same source.

Dot-4 is one of the prefixes used in two-cell symbols. UEB uses dot-4 as an indicator before nine letters to represent monetary and other symbols. Thus dot-4 before the cell for the letter a is the UEB symbol for the Commercial At Sign, before e is the symbol for the Euro Sign, before s is the symbol for the dollar sign (i.e. the same as in Nemeth and NUBS), etc.

These dot-4 symbols, like other UEB braille equivalents for print symbols, have the same meaning no matter what their relative position. The UEB dot-4 dots-234 symbol for the dollar sign can thus be used to spell Ke$ha or Micro$oft. [It needs to be pointed out here that UEB pays a perhaps undesirable price for this capability. The design of UEB is such that the use of the generic accent indicator is no longer an option, i.e. accented letters must be represented by cumbersome multi-cell symbols at every occurrence. On the other hand, the design of NUBS allows for both alternatives. With NUBS the reader who wishes more information on encountering a generic accent indicator knows to check the symbol table. As for the use of an embedded dollar sign in NUBS, this is flagged by preceding a word like Ke$ha with the dots-56 notational indicator.]
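The prefix-root construction of the dot-4 symbols can be sketched in a few lines. The example below is only an illustration: it assumes the Unicode Braille Patterns block as the output encoding (dot n corresponds to bit n-1 above U+2800), the helper name cell and the table name DOT4_SYMBOLS are hypothetical, and the table lists only the three symbols named in the text above rather than the full UEB set.

```python
# Build two-cell dot-4 symbols from dot numbers.
# Assumption: output uses Unicode braille patterns, where U+2800 is the
# blank cell and dot n sets bit (n - 1).

BRAILLE_BASE = 0x2800


def cell(*dots):
    """Return the Unicode braille character for the given dot numbers (1-6)."""
    mask = 0
    for d in dots:
        mask |= 1 << (d - 1)
    return chr(BRAILLE_BASE + mask)


DOT_4 = cell(4)  # the prefix

# Root cells are the ordinary braille letters a, e, and s.
DOT4_SYMBOLS = {
    "@": DOT_4 + cell(1),          # dot-4 before a: Commercial At Sign
    "\u20ac": DOT_4 + cell(1, 5),  # dot-4 before e: Euro Sign
    "$": DOT_4 + cell(2, 3, 4),    # dot-4 before s: dollar sign
}

if __name__ == "__main__":
    for print_char, braille in DOT4_SYMBOLS.items():
        print(print_char, braille)
```

Because each entry is a fixed two-cell sequence, a translator can emit it wherever the print character occurs, including in the middle of a word such as Ke$ha.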

7.2.2 UEB equivalents for Latin-1 Supplement and Latin Extended Letters

Since UEB uses dot-4 as a prefix with single letters to represent special symbols, it can't follow the EBAE practice of using dot-4 before a letter to indicate a "foreign letter," actually a letter which resembles a Latin letter that has been marked with some sort of accent or other decoration, when it appears in English texts. Moreover, the EBAE practice does not provide sufficient information for the reader to identify the foreign letter unless the reader happens to be sure of the word in which the letter appears. On the other hand, UEB has no way to indicate a generic "accent" for a leisure reader who doesn't want to be slowed down by details he or she doesn't care about.

UEB represents marked or accented letters using a set of twelve new two-cell indicators or modifiers. These indicators are used to create three-cell symbols for small or capital Latin letters that have been "modified" with UEB-defined items or decorations such as a grave accent above or cedilla below. ("Modified letters" is UEB terminology for unique characters that resemble small or capital Latin letters but with some sort of added decoration.) The UEB modifiers must be placed immediately before the corresponding Latin letter. (Any capitalization indicator is placed before the modifier.) UEB also provides for transcriber-defined modifiers.

Note UEB Rule 4.2.7 for the use and non-use of modifiers.

Use the modifiers listed above only in foreign language words and phrases in English context intended primarily for leisure reading, in English words or in anglicised words and phrases.

Where a significant knowledge of a foreign language is presupposed or is being taught, use signs from the indigenous foreign language braille code.

One potential problem with the use of modified letters is that the user may fail to understand that the modified Latin letters are actually unique single characters in print. They are not analogous to cases where superscripts or overscripts are used to specify a layout that shows a particular relationship among separate characters.

A second potential problem with the use of modified letters is that the reader may find it difficult to determine the correct name of the corresponding print character. For example one modifier is called "breve above the following letter" but the formal Unicode character name of, say, a small letter a to which that modifier is applied is "LATIN SMALL LETTER A WITH BREVE," not "LATIN SMALL LETTER A WITH BREVE ABOVE."
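The naming point above can be checked directly against the Unicode character database. This short sketch uses Python's standard unicodedata module; the variable name is illustrative. It shows that the small letter a with breve is a single character (U+0103) and that its formal name does not include the word "ABOVE".

```python
import unicodedata

# U+0103 is one character in print and in Unicode, not "a" plus a mark.
a_breve = "\u0103"

# Formal Unicode character name, looked up from the character database.
formal_name = unicodedata.name(a_breve)

if __name__ == "__main__":
    print(len(a_breve))   # number of characters in the string
    print(formal_name)
```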

The continued focus in English braille of providing symbols for letters that occur primarily in a small set of languages might be considered rather old-fashioned since these Latin-like letters are only a very small fraction of the letters in all of the world's scripts.

7.2.3 UEB transcriber-assigned symbols

UEB's rules for representing print characters are to some extent extensible in that it provides for transcriber-assigned braille equivalents. Their use does, of course, require the reader to locate either the associated transcriber's note or the symbol page to determine what print character they are intended to represent. Moreover, it could be difficult to ensure that these symbols are used consistently.

7.2.3.1 Transcriber-assigned symbols

Per Rule 3.25, UEB provides seven new symbols which transcribers can define to represent any characters that don't have explicit UEB symbols. These assignments must, of course, be listed on a separate symbols page or explained in a separate transcriber's note which has the consequence that they are less useful to the braille reader using an RBD for instant access to print than to the braille reader reading embossed paper braille.

7.2.3.2 Transcriber-defined shapes

Another technique UEB provides for representing icons and other symbols which have no UEB equivalent is as a shape. The transcriber defines a short abbreviation, such as "sf" for "smiling face ... used as an icon", and creates a new symbol by preceding the abbreviation with the UEB symbol indicator. This technique is described in Section 14.2 in the UEB "Guidelines for Technical Material" and, of course, once again the assignment must be listed in a transcriber's note or on the symbols page.

The use of transcriber-defined shapes has the potential to mislead the reader in cases where these braille equivalents are used to represent standard Unicode symbol characters. Referring back to the example of a "smiling face" it turns out that Unicode specifies 60 emoticons of which more than 40 are faces including these nine distinct smiling faces:

  1. SMILING FACE WITH OPEN MOUTH 😃
  2. SMILING FACE WITH OPEN MOUTH AND SMILING EYES 😄
  3. SMILING FACE WITH OPEN MOUTH AND COLD SWEAT 😅
  4. SMILING FACE WITH OPEN MOUTH AND TIGHTLY-CLOSED EYES 😆
  5. SMILING FACE WITH HALO 😇
  6. SMILING FACE WITH HORNS 😈
  7. SMILING FACE WITH SMILING EYES 😊
  8. SMILING FACE WITH HEART-SHAPED EYES 😍
  9. SMILING FACE WITH SUNGLASSES 😎

Given the growing use of emoticons in email and text messages, recognizing and being able to use the correct smiling face character is arguably as important as knowing how to spell Ke$ha with an embedded dollar sign. [If the smiling faces don't display correctly for sighted readers you will need to install the Quivira TrueType Unicode-based font which can be downloaded from the Quivira site.]
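The distinction among these nine characters is carried entirely by their Unicode code points, which a single abbreviation such as "sf" cannot preserve. As a minimal sketch, the following uses Python's standard unicodedata module to list the code point and formal name of each of the nine faces shown above; the names SMILING_FACES and describe are illustrative only.

```python
import unicodedata

# Code points of the nine smiling faces listed above.
SMILING_FACES = [
    0x1F603, 0x1F604, 0x1F605, 0x1F606, 0x1F607,
    0x1F608, 0x1F60A, 0x1F60D, 0x1F60E,
]


def describe(code_point):
    """Return 'U+XXXX NAME' for a code point, using the formal Unicode name."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


if __name__ == "__main__":
    for cp in SMILING_FACES:
        print(describe(cp))
```

Every line of output begins with a different code point, so a representation based on the code point identifies the exact character without a transcriber's note.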

7.2.3.3 Transcriber-defined letter modifiers

UEB provides three transcriber-defined three-cell indicators that can be used to create four-cell symbols for small or capital letters which are modified with transcriber-defined decorations. (These indicators have a similar functionality to the corresponding explicit two-cell modifier indicators.)

7.2.3.4 Generic quotation marks

A UEB rule that might be surprising to UEB proponents who point to UEB's always being consistent with print is this one:

7.6.1 Use one-cell (nonspecific) quotation marks 8 and 0 for the predominant quotation marks in the text in all instances where the specific form of the quotation marks ("double", "single", "Italian" or "nondirectional") has no significance, that is, in the great majority of cases. Indicate the print form of the nonspecific quotation marks on the symbols page or in a transcriber's note.

This rule is, of course, convenient for the reader accustomed to EBAE since it is consistent with EBAE's symbols for directional double quotation marks. However, it is not consistent with BANA's desire that "translation software ... at least be able to render the ... symbols correctly."

7.3 Braille-specific ("phantom") symbols

One of BANA's continuing concerns is that braille not mislead braille readers as to the nature of the print source. Anachronistic braille rules such as placing a per cent sign before a number or inserting apostrophes where none are present in print were eliminated quite some time ago.

Unfortunately, UEB seems to have taken a step backwards with respect to the possibility of misleading braille readers. UEB's required use of braille grouping symbols reintroduces a significant difference between braille and print that could be confusing for braille readers. The Nemeth code showed more than 40 years ago that it is possible to design an unambiguous braille code that does not require the use of enclosure symbols not present in print.

7.4 Analysis of the UEB approach to representing print characters

It appears from an examination of the UEB rules that UEB can potentially represent any print character unambiguously although, as several of the previous examples are intended to point out, the UEB Rulebook does not require that transcribers always do this.

Given that the UEB has the potential to represent any print character, the question becomes whether the mechanisms used to achieve this have any drawbacks. The answer is that there are a number of drawbacks. In the long run the most serious drawback may be UEB's failure to leverage Unicode, which is how the print world ensures an unambiguous representation of each of the growing number of print characters. Drawbacks that can affect the braille reader directly include UEB's reliance on separate symbol lists and transcriber's notes, its heavy use of mode indicators, and its numerous multi-cell braille symbols.

[Arguments against UEB's use of upper numbers, i.e. the use of the same characters for the decimal digits as for ten letters, are well-known and will not be repeated here.]

7.5 Suggestions

An accompanying article titled "The Problem of the Braille Representation of Print Characters" provides a non-technical introduction to the character problem and some suggestions for how braille codes can deal with this problem.

The only reasonable way that a braille code can be designed to be future-proof as far as representing characters in a manner useful to the braille reader is to ensure that the braille code has the option of providing a direct representation of any Unicode character code when providing an indirect braille equivalent would be awkward or infeasible. (Unicode formal character names are too long for this purpose.) All this would require would be for the braille code to specify the necessary indicator(s). (The HTML standard is to preface a Unicode hexadecimal character code with the sequence &#x and follow it with a single semicolon character.)
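The HTML convention mentioned above can serve as a model for what a direct representation might look like. The sketch below is an illustration of the HTML form only, not of any existing braille rule: it converts a print character to its hexadecimal character reference and back using Python's standard html module. The function name to_hex_reference is hypothetical, and a braille code would need to define its own indicators in place of &#x and the semicolon.

```python
import html


def to_hex_reference(char):
    """Return the HTML hexadecimal character reference for one character."""
    return f"&#x{ord(char):X};"


if __name__ == "__main__":
    for ch in ["$", "\u0103", "\U0001F603"]:
        ref = to_hex_reference(ch)
        # html.unescape recovers the original character from the reference.
        print(ref, html.unescape(ref) == ch)
```

The reference is short (at most six hexadecimal digits plus the delimiters) and identifies the character exactly, which is the property the suggestion above depends on.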

Unicode can represent braille. What will be our answer when asked if braille can represent Unicode?

Anyone not convinced of the importance of Unicode should take the time to read the wonderful quotations from persons all over the world which have been collected on the Acclaim for Unicode page.

The development of Unicode has underscored the Internet’s truly global character. The recorded history of every nation and culture can travel in its natural form across Cyberspace for the use of anyone, anywhere. Through the power of Unicode, a worldwide audience is finally able to share in the breadth of human creativity.
Brendan Kehoe, Zen and the Art of the Internet
Unicode marks the most significant advance in writing systems since the Phoenicians.
James J. O’Donnell, Provost, Georgetown University

8. Problem: The wide variety of print character styling in educational materials and other print documents

The growing use of a wide variety of print character styling presents considerable difficulty to braille transcribers and negatively impacts the possibilities for accurate automated braille transcription. One solution is for braille systems to provide a greater number of typeform indicators including ones where the meaning can be assigned by transcribers. However, the use of typeform indicators can present multiple drawbacks for the braille reader.

8.1 Problem Statement by BANA

Literary braille provides only one way to indicate a change in font showing emphasis. The one indicator, the italic sign, has to represent italic, boldface, underlined, or colored type. The Formats guidelines allow for italic, boldface, and various colors. These are needed when a textbook gives an instruction such as: “Copy the new vocabulary words (shown in italic type) into your notebook and study the review words (shown in boldface type).”

8.2 Problem Analysis

Because of the complexity of the issues surrounding the representation of print character styling in braille transcriptions, these issues have been addressed in two accompanying articles. The implementation of the UEB typeform indicators, along with possible problems with the acceptability of their design, is discussed in the article titled "Acceptability and Extensibility of Typeform Indicators in UEB and in NUBS." General issues militating against the use of typeform indicators rather than alternatives are discussed in the article titled "Problem: Reflecting Print Styling in Braille Transcriptions."

8.3 Suggested Solutions

The first article referenced above demonstrates that the implementations of typeform indicators in both UEB and NUBS are extensible. However, since the NUBS implementation is likely to be more acceptable to braille readers, that choice is to be preferred. The second article presents several suggestions, which are listed in that article's Summary.

Conclusion

A careful examination of the UEB Rulebook in light of the problems that BANA hopes the UEB will address shows that the UEB fails in numerous respects to address current problems and to provide an extensible framework for the future. BANA needs to find another solution.



A DRAFT version of this article was first posted February 10, 2012.
An updated DRAFT version was posted April 19, 2012.
Some typos fixed in version posted May 3, 2012.
Contact author with feedback and corrections: info at dotlessbraille dot org