The Problem of Apostrophes and Single Quotes in Braille

Executive Summary. The Braille Authority of North America (BANA) has recently published a three-part article that points out specific issues with braille translation of apostrophes and single quotes.

There is often a great deal of confusion among single quotation marks, apostrophes, and accent marks. Because of the various ways these symbols are used in print, sometimes inner quotation marks display in refreshable braille as apostrophes (dot 3), and sometimes a mark that is intended as an apostrophe or accent mark is shown as an opening inner quotation mark (dots 6, 2-3-6).

Solving the apostrophe and single quote problem will be difficult since it involves the interaction of three factors:

  1. a quirky print situation,
  2. braille tradition, and
  3. the changing needs of braille readers.

Note that the proposed UEB does not solve this problem.

The quirky print situation has two aspects. One aspect is that informal print and typeset print handle apostrophes and single quotes completely differently. Informal print uses Unicode character 0027 hex ['] for three different purposes: as the apostrophe and as both the left and right single quotes. Properly typeset print uses Unicode character 2019 hex [’] for two different purposes: as the apostrophe and as the right single quote. (It uses character 2018 [‘] for the left single quote.)

Braille tradition complicates the problem because this is one of the few cases where braille systems use separate braille symbols for the same print character depending on semantics. Braille systems use different symbols for an apostrophe and for single quotes. This means that the semantic purpose of the print character to be translated has to be identified in order for the braille transcription to correctly follow braille rules. The semantic purpose can be either indicated by a human editor or inferred by heuristics built into transcription software. However, it is difficult for an automated process to always determine the proper semantics and thus produce correct braille.

The third part of this problem is that in addition to knowing print semantics, braille readers sometimes need to know print syntax, i.e. exactly which print character is used in a print source document. BANA's CBC code addresses this issue as far as uniquely identifying the ASCII keyboard characters. However, none of the current BANA codes provide a general method to identify particular non-ASCII characters as would be necessary to distinguish Unicode character ’ from Unicode character '.

Braille readers need a better solution to the apostrophe-quote problem than is available either in the current BANA codes or in the proposed UEB. UEB still requires a difficult semantic distinction to be made and it still doesn't provide any way for the braille reader to be certain of the identity of the translated print character. It is my hope that the background information in this article will help contribute to a solution.

The apostrophe and single quotes in print

There are various ways of representing apostrophes and single quotes in print that cause confusion even to print readers and that can negatively impact the accuracy of braille transcriptions.

Typeset documents

In properly typeset American English documents the same character is used to represent the apostrophe and the right single quote but not the left single quote. This character, ’, does not appear on the standard U.S. keyboard. Its Unicode name is Right Single Quotation Mark and its Unicode character code is 2019 hexadecimal. Typical fonts render this character as a raised curved mark bowed to the right.

Informal documents

In informal American English documents the same character is used to represent the apostrophe, the left single quote, and the right single quote. This character, ', appears on the standard U.S. keyboard. Its Unicode name is Apostrophe and its Unicode character code is 0027 hexadecimal. Typical fonts render this character as a raised vertical mark.
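The names and character codes given above can be checked against the Unicode character database. The following is a minimal Python sketch that uses only the standard library `unicodedata` module; the names `MARKS` and `describe` are illustrative and are not part of any braille software.

```python
import unicodedata

# The three print characters discussed in this article.
MARKS = ["\u0027", "\u2018", "\u2019"]

def describe(ch):
    """Return the code point (as U+XXXX) and the official Unicode name of ch."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

for ch in MARKS:
    code, name = describe(ch)
    print(code, name)
# U+0027 APOSTROPHE
# U+2018 LEFT SINGLE QUOTATION MARK
# U+2019 RIGHT SINGLE QUOTATION MARK
```

Note that the Unicode names describe the characters, not their use: the character named Right Single Quotation Mark is also the typeset apostrophe.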

Problem summary

In summary, both typeset print documents and informal print documents use the same character for at least two different semantic purposes: as an apostrophe and as a (right) single quotation mark, while braille transcriptions traditionally use separate symbols depending on the semantic purpose of the character. The need to make this distinction can impact the cost of braille production and/or the accuracy of a braille transcription. An automated solution is thus desirable.

Although humans can usually correctly identify the intended semantics, a Google search was unable to locate heuristics that would ensure that automated processes can do as well as humans in this regard. (A similar problem occurs when using the “smart quotes” option in Microsoft Word. Its heuristics sometimes supply a left single quotation mark where the author intends an apostrophe.)
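To make the difficulty concrete, here is a deliberately naive heuristic written in Python. It is only a sketch: the rule set, the role labels, and the function names `guess_role` and `roles` are hypothetical and are not taken from Microsoft Word or from any braille translator. It looks only at the characters on either side of each U+0027 mark.

```python
def guess_role(text, i):
    """Guess the role of the mark at text[i] from its immediate neighbors.

    Hypothetical rule set, for illustration only:
      letter or digit on both sides         -> "apostrophe"
      letter or digit after but not before  -> "opening_single_quote"
      anything else                         -> "closing_single_quote"
    """
    before = text[i - 1] if i > 0 else " "
    after = text[i + 1] if i + 1 < len(text) else " "
    if before.isalnum() and after.isalnum():
        return "apostrophe"
    if not before.isalnum() and after.isalnum():
        return "opening_single_quote"
    return "closing_single_quote"

def roles(text):
    """Return the guessed role of every U+0027 character in text, in order."""
    return [guess_role(text, i) for i, ch in enumerate(text) if ch == "'"]

print(roles("don't"))                # ['apostrophe']
print(roles("He said 'yes' today"))  # ['opening_single_quote', 'closing_single_quote']
print(roles("'tis the season"))      # ['opening_single_quote']  (really an apostrophe)
print(roles("the dogs' bones"))      # ['closing_single_quote']  (really an apostrophe)
```

The first two results are right and the last two are wrong. A leading apostrophe and a plural possessive look locally identical to an opening and a closing single quote, so neighbor-based rules alone cannot separate them.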

A recent article published by BANA summarizes the effect of this issue on the user of refreshable braille as follows.

There is often a great deal of confusion among single quotation marks, apostrophes, and accent marks. Because of the various ways these symbols are used in print, sometimes inner quotation marks display in refreshable braille as apostrophes (dot 3), and sometimes a mark that is intended as an apostrophe or accent mark is shown as an opening inner quotation mark (dots 6, 2-3-6).

The apostrophe and single quotes in braille

Braille codes make a semantic distinction between apostrophe and single quote characters that is not present in print. EBAE, CBC, and UEB each handle this issue in different ways, but none of these braille codes provides a fully satisfactory way either to prevent this issue from causing errors in automated braille transcription or to present accurate character information to the braille reader. (Some braille transcribing software apparently has built-in heuristics to address this issue. However, this is not an optimal solution as the heuristics are not error-free, are usually not described in the software documentation, and are not under the control of the user.)

EBAE

English Braille American Edition (EBAE) has a symbol represented by braille cell dot-3 that is called apostrophe. This symbol is apparently intended to be used to translate characters with the semantic role of an apostrophe. Thus the EBAE apostrophe translates two different characters: the Unicode Apostrophe when used as an apostrophe and the Unicode Right Single Quotation Mark when used as an apostrophe.

EBAE has a two-cell symbol, dots-356, dot-3, called closing single quotation mark, that is apparently intended to be used to translate characters with the semantic role of a right single quotation mark. Thus the EBAE closing single quotation mark translates two different characters: the Unicode Apostrophe when used as a closing single quotation mark and the Unicode Right Single Quotation Mark when used as a closing single quotation mark.

EBAE has a two-cell symbol, dot-6, dots-236, called opening single quotation mark, that is apparently intended to be used to translate characters with the semantic role of a left single quotation mark. Thus the EBAE opening single quotation mark translates two different characters: the Unicode Apostrophe when used as an opening single quotation mark and the Unicode Left Single Quotation Mark in all situations.
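The three EBAE symbols described above can be written out as a lookup table. The Python sketch below is an illustration of this article's description, not an excerpt from any translator. The role labels and the names `cell`, `EBAE`, and `print_sources` are assumptions. Braille cells are built from their dot numbers using the Unicode braille patterns block, in which dot n corresponds to bit n-1 above U+2800.

```python
def cell(*dots):
    """Return the Unicode braille pattern for one cell given its dot numbers."""
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))

APOSTROPHE = cell(3)                      # dot-3
OPENING_SINGLE = cell(6) + cell(2, 3, 6)  # dot-6, dots-236
CLOSING_SINGLE = cell(3, 5, 6) + cell(3)  # dots-356, dot-3

# (print character, semantic role) -> EBAE symbol, as described above.
EBAE = {
    ("\u0027", "apostrophe"): APOSTROPHE,
    ("\u2019", "apostrophe"): APOSTROPHE,
    ("\u0027", "closing_single_quote"): CLOSING_SINGLE,
    ("\u2019", "closing_single_quote"): CLOSING_SINGLE,
    ("\u0027", "opening_single_quote"): OPENING_SINGLE,
    ("\u2018", "opening_single_quote"): OPENING_SINGLE,
}

def print_sources(braille):
    """Return the set of print characters that can produce the given symbol."""
    return {ch for (ch, _role), cells in EBAE.items() if cells == braille}

print(print_sources(APOSTROPHE) == {"\u0027", "\u2019"})  # True
```

The table shows both halves of the problem. The lookup key needs a semantic role that print does not supply, and every braille symbol maps back to two possible print characters, so the reader cannot recover which one was used.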

CBC

Computer Braille Code (CBC) has a symbol represented by braille cell dot-3 that is called apostrophe that is intended to represent the ' ASCII keyboard character no matter what its semantic purpose. CBC does not define braille symbols for non-ASCII characters.

UEB

The proposed Unified English Braille (UEB) system specifies a symbol, represented by braille cell dot-3, that is called apostrophe, nondirectional single quotation mark. This braille symbol is apparently intended to represent the ' keyboard character no matter what its semantic purpose and also any other characters with the semantic purpose of an apostrophe. UEB specifies a separate symbol, represented by the two-cell braille sequence dot-6, dots-356, that is called closing single quotation mark. This braille symbol is apparently intended to represent the ’ character only when used as a closing single quotation mark.
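The reading of UEB given in the paragraph above can be sketched in Python as follows. This is not an implementation of the UEB rules. It encodes only the two symbols named above, under the assumption that the paragraph's "apparently intended" reading is correct, and it ignores the finer distinctions in the rules quoted below. The role labels and the function name `ueb_symbol` are hypothetical.

```python
def cell(*dots):
    """Return the Unicode braille pattern for one cell given its dot numbers."""
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))

UEB_APOSTROPHE = cell(3)                      # dot-3
UEB_CLOSING_SINGLE = cell(6) + cell(3, 5, 6)  # dot-6, dots-356

def ueb_symbol(ch, role=None):
    """Pick a symbol following this article's reading of UEB.

    role is a hypothetical label ("apostrophe" or "closing_single_quote")
    supplied by an editor or a heuristic; None means the role is unknown.
    """
    if ch == "\u0027":
        return UEB_APOSTROPHE  # keyboard character: the role is not consulted
    if role == "apostrophe":
        return UEB_APOSTROPHE
    if ch == "\u2019" and role == "closing_single_quote":
        return UEB_CLOSING_SINGLE
    raise ValueError("a semantic role is needed to translate " + repr(ch))

print(ueb_symbol("\u0027") == ueb_symbol("\u2019", "apostrophe"))  # True
```

Under this reading the keyboard character needs no semantic decision, but the typeset character U+2019 still cannot be translated until its role is known, and dot-3 still stands for more than one print character.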

The June 2010 Rules of Unified English Braille document as obtained from the International Council on English Braille (ICEB) Project Page states:

7.6.1 Use one-cell (nonspecific) quotation marks [dots-236] and [dots-356] for the predominant quotation marks in the text in all instances where the specific form of the quotation marks ("double", "single", "Italian" or "nondirectional") has no significance, that is, in the great majority of cases. Indicate the print form of the nonspecific quotation marks on the symbols page or in a transcriber's note. [Note that EBAE uses the same symbols as the UEB nonspecific paired quotation marks to represent the specific paired double quotation marks.]

...

7.6.5 Use one-cell (nonspecific) quotation marks when apostrophes are used as the predominant quotation marks in print. Use specific single quotation marks when apostrophes are used as the secondary or inner quotation marks in print. However, when in doubt as to whether a mark is an apostrophe or a single quotation mark, treat it as an apostrophe.

7.6.6 Use nondirectional double [dot-6, dots-2356] or single [dot-3] quotation marks (that is quotation marks without any slant or curl to convey "opening" or "closing") only in the following relatively rare cases:

Otherwise use directional quotation marks.

Although the UEB is supposed to make braille transcription simpler and braille more readable, these rules and guidelines seem (to me at least) to be more complex and to require more human intervention than BANA's current rules and guidelines.

Concluding thoughts and issues for consideration

I can only guess why the designers of EBAE chose to use different braille symbols for an apostrophe and for a right single quote even though they are the same in print. The right single quote is rarely used in American print and is thus represented in EBAE with a two-cell rather than a one-cell symbol, i.e., dots-356, dot-3. It may be that the designers felt that the simpler dot-3 symbol for an apostrophe made tactile braille more readable. Given that braille was at one time transcribed by sighted persons rather than by automated processes, the semantic problem of distinguishing an apostrophe from a single quotation mark was probably not considered significant. Moreover, prior to the widespread use of electronic data, it is likely that only a few experts had much interest in the syntax or particular characters used in a document.

It is difficult to know how to resolve braille's apostrophe versus single quote issue especially given that print sometimes uses Unicode character 0027 hex for the apostrophe and also for both single quotes and sometimes uses Unicode character 2019 hex for the apostrophe and also for the right single quote. On the one hand, given the natural desire to avoid unnecessary changes to braille, it seems undesirable to advocate following print and using the same symbol for an apostrophe and a right single quote so as to avoid part of the semantic problem that occurs with automated braille transcription. (Note that this strategy would not solve the semantic problem where Unicode character 0027 hex is used for the left single quote.) On the other hand, using the same braille symbol independently of semantics would still do nothing as far as aiding the braille reader who has a need to know which characters are used in a print source document.

The problem of knowing which Unicode character has been used to represent an apostrophe or a single quote is a specific example of the more general problem of knowing exactly which of the more than 100,000 Unicode characters have been used in a print document. Although most sighted persons can visually distinguish a printed neutral apostrophe from a printed smart apostrophe, it is not true that a sighted person can visually identify any arbitrary printed character. It is thus reasonable to restrict any general solution to the situation where the print source is a properly encoded electronic document. It is only possible to know for certain which character is intended when an electronic document correctly identifies the syntax of each character using a Unicode-based convention. In some cases the only reasonable solution may be to make the electronic source document available to the reader. Alternatively, braille rules could be extended so as to provide a standard method for presenting character information such that the braille reader can determine the Unicode identity.
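When the source is a properly encoded electronic document, software can at least extract the character identity exactly. The Python sketch below shows one possible way to do so with the standard library `unicodedata` module. The function name `identify_non_ascii` and the report format are assumptions, and how such information should be presented in braille is a separate question that this sketch does not address.

```python
import unicodedata

def identify_non_ascii(text):
    """List each non-ASCII character in text with its position, code point, and name."""
    report = []
    for index, ch in enumerate(text):
        if ord(ch) > 0x7F:
            # Fall back to a placeholder for code points that have no name.
            name = unicodedata.name(ch, "<unnamed>")
            report.append((index, f"U+{ord(ch):04X}", name))
    return report

print(identify_non_ascii("don\u2019t"))
# [(3, 'U+2019', 'RIGHT SINGLE QUOTATION MARK')]
print(identify_non_ascii("don't"))
# []
```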


This article first posted February 3, 2012.
Slightly revised version posted February 10, 2012.
Contact author: info at dotlessbraille dot org