Braille Representation of Print Styling

8. Problem: Reflecting Print Styling in Braille Transcriptions (4/17/2012 Draft)

Executive Summary
Introduction
8.1 Problem Statement and Expert's Response
- 8.1.1 Problem Statement by BANA
- 8.1.2 Response from DAISY expert
8.2 Background information
8.3 Standard text-level semantic elements
8.4 UEB Typeform Indicators
8.5 An example from the UEB Rulebook where a typeform indicator is a poor solution
8.6 The mark text-level semantic element
8.7 Summary

Executive Summary

UEB provides four sets of explicit typeform indicators and five sets of transcriber-defined typeform indicators, all of which are to be used to indicate the print styling of characters, words, and phrases. (NUBS has a similar capability.) The implementation of these indicators and the rules for their default use are detailed in an accompanying article.

The present article focuses on a more fundamental issue concerning the use of typeform indicators: cases where use of these indicators can result in poor braille transcriptions. The basic problem is that print documents often use special styling to distinguish text-level semantics, and what the braille reader needs to know is not the details of the styling but the implied semantics. Some of these cases are already addressed in current braille systems, such as the use in English braille of a letter sign rather than an italics typeform indicator to show that a letter italicized in print is a symbol and not an alphabetic wordsign. However, there are many other cases that are yet to be addressed.

This article includes a list of 17 different text-level semantic elements that are typically distinguished with special print styling but for which braille typeform indicators may not provide the information the braille reader needs. A particular example of this problem, taken from the UEB Rulebook, is used to show how the suggested use of a transcriber-defined typeform indicator could, in practice, mislead the braille reader.

The semantic element which currently seems to be of most concern to BANA is the mark or highlight element. This semantic element is a run of text to which special print styling has been applied in order to call attention to the text for some auxiliary purpose. BANA's concern stems from the growing use of this element in educational materials, such as the highlighting of key words in colored type. BANA has been hoping that the greater number of typeform indicators available in braille systems like UEB and NUBS could be used in braille transcriptions to more accurately represent the growing use and variety of print highlighting. However, arguments based on the work of braille researchers Dr. Susanna Millar and Dr. Cheryl Kamei-Hannan and on the principles of Universal Design for Learning (UDL) suggest that the use of typeform indicators for this purpose has multiple drawbacks and that braille readers may need a more general solution than typeform indicators for obtaining the information that highlighting provides to the sighted reader.

Introduction

Braille systems provide for two types of formatting or styling: direct formatting shown with whitespace and character formatting shown with the use of braille indicators. Direct formatting is used for structural elements while character formatting is used for runs of text where the visual appearance is made distinctive by using different fonts, font sizes, colors, and styles such as italic or bold.

Braille systems typically treat at least three types of document elements as structural elements:

elements for designating sections of a document, e.g. headers;
elements for grouping contents within sections, e.g. paragraphs; and
tabular data, e.g. text laid out as a table.

Braille formatting of structural elements is not addressed in the UEB Rulebook and is thus not addressed here. The remaining references to print styling in this article are specific to character formatting.

The problem of reflecting print styling in a braille transcription is much more complex than it might first appear. One way to appreciate this complexity is to consider the various reasons why a braille reader might need to be aware of print styling. The braille reader might be doing research on ancient manuscripts or the particulars of texts as printed in old books and need the detail provided to the field of Digital Humanities by the Text Encoding Initiative (TEI). The braille reader might need a facsimile transcription for some scholarly or job-related purpose. The braille reader might need to understand the how and why of designing and using stylesheets and templates so as to produce properly typeset print documents or PowerPoint slides for sighted colleagues or teachers. The braille reader, including a student, might simply need to get the equivalent information that a sighted reader acquires from print styling.

The UEB Rulebook provides numerous typeform indicators which are supposed to be used as the braille equivalent of print styling. The rules for the use of these typeform indicators are summarized in Section 8.4. However, as the previous paragraph suggests and the following discussion will make clear, typeform indicators can only address part of the problem.

8.1 Problem Statement and Expert's Response

There is a growing concern among braille producers as to how to deal with the myriad of distinctive formatting in print textbooks. A typical example of this concern as extracted from BANA's three-part article is followed by the response of a DAISY expert.

8.1.1 Problem Statement by BANA

Literary braille provides only one way to indicate a change in font showing emphasis. The one indicator, the italic sign, has to represent italic, boldface, underlined, or colored type. The Formats guidelines allow for italic, boldface, and various colors. These are needed when a textbook gives an instruction such as: “Copy the new vocabulary words (shown in italic type) into your notebook and study the review words (shown in boldface type).” [Cf. mark element.]

8.1.2 Response from DAISY expert

I perfectly understand your mainstream teacher example. This is probably a debatable issue, and is interdependent with how to mark up the book. It's also a chicken and egg situation where the book markup may depend on how teachers communicate, and eventually the way teachers communicate may be influenced by the markup.

For instance, I personally believe that a teacher having a VI student should not point to "blue definitions", but "theorems definitions, in blue" and also know both the print version and digital edition, to ensure the instructions can be understood by all students. It's definitely not simple, and probably involves a lot of training. I personally do not believe that the formatting should be used to convey the type of information being discussed. Although I understand that in a transition period, the original formatting information may need to appear in the accessible version. [Lightly edited private communication from leading DAISY expert in reply to query from the author.]

8.2 Background information

Before analyzing UEB's treatment of character formatting, it is essential to consider the latest information as to how the print world handles the related issue. This information is especially important for persons planning the future of braille in light of BANA's goal of helping braille users function better in the print world both now and in the future.

Character formatting can either be purely decorative or be used to provide a visual indication of text-level semantics. Braille transcriptions typically ignore decorative formatting. Of course this tactic requires the transcriber to distinguish decorative formatting from other formatting. One practical way of doing this is to first identify all the formatting that does supply information the braille reader needs to know.

Non-decorative formatting is easiest to identify when one has access to an electronic source document that is properly tagged or marked up, so that text items are marked with semantic tags tied to a separate style sheet rather than with explicit style information. Nonetheless, sighted transcribers are still often called upon to transcribe rendered documents where the styling is applied (or appears to have been applied) directly to the text. [Transcribers of technical material need to be aware that many important mathematical symbols, while they may appear to be ordinary symbols to which some sort of styling has been applied, are actually unique print characters and need to be designated as such.] In the case of a rendered document, the transcriber often has the task of determining the purposes of the different styles, although information as to style conventions is sometimes included explicitly in the preface to a book or other document.

If braille systems are to handle the use of informative print character styling in an intelligent manner, they need to give consideration to the various purposes or functions of informative character styling. In some cases it may be appropriate to use braille indicators to indicate such print styling. However, there are other cases, such as when print styling indicates "user input," where the use of special translation rules such as BANA's Computer Braille Code (CBC) is a better solution. The next section uses modern terminology to list a number of common purposes for which print texts use informative, non-decorative styling.

8.3 Standard text-level semantic elements

HTML specifications are an especially useful source of modern terminology since they represent the consensus of a large number of experts. The latest version of HTML5 identifies 28 distinct types of text-level semantic elements or phrase tags which are often distinguished by informative print styling. However, only 17 of these elements are typically found in ordinary English texts. The following list quotes the HTML terminology to describe these 17 semantic elements which English braille systems are most likely to be called upon to address:

The mark element "represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context." (This element is listed first here because the growing use in textbooks of visual styling for a variety of mark elements is one of the concerns BANA is hoping to address. A particular publisher's use of a distinctive style to mark vocabulary words is one example. Cf. Section 8.6.)
The em element is "used for stress emphasis"
The strong element "represents strong importance"
The cite element "represents the title of a work"
The q element "represents some phrasing content quoted from another source"
The dfn element "represents the defining instance of a term"
The abbr element "represents an abbreviation or acronym"
The var element "represents a variable" (i.e. what in contracted braille is called an alphabetic letter representing itself)
The code element "represents a fragment of computer code"
The samp element "represents sample output from a computer program"
The kbd element "represents user input"
The a element is "used for hyperlinks and hyperlink placeholders"
The sub and sup elements are used to mark up the use of these typographical conventions except where MathML markup is more appropriate
The i element "represents a span of text in an alternate voice or mood"
The b element "represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede"
The small element "represents side comments including attributions"
The s element represents contents that are out-of-date, e.g. the original price of an item now on sale
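When a properly tagged electronic source is available, a transcriber could start by taking an inventory of which of these elements actually occur in it. The following is only a minimal sketch of that idea using Python's standard html.parser module; the sample fragment, the class name TagInventory, and the function name inventory are made up for illustration and do not come from any braille production tool.

```python
from collections import Counter
from html.parser import HTMLParser

# Tag names for the 17 elements listed above (sub and sup share one entry,
# so there are 18 tag names).
TEXT_LEVEL_TAGS = {
    "mark", "em", "strong", "cite", "q", "dfn", "abbr", "var", "code",
    "samp", "kbd", "a", "sub", "sup", "i", "b", "small", "s",
}


class TagInventory(HTMLParser):
    """Count how often each listed text-level element is opened."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        if tag in TEXT_LEVEL_TAGS:
            self.counts[tag] += 1


def inventory(html):
    """Return a Counter of the listed elements found in an HTML string."""
    parser = TagInventory()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical sample fragment, invented for this sketch.
SAMPLE = ("<p>Press <kbd>Enter</kbd> to run <code>ls</code>. "
          "The <dfn>shell</dfn> prints <samp>README</samp>; "
          "see <cite>The Manual</cite> and press <kbd>q</kbd>.</p>")

for tag, n in sorted(inventory(SAMPLE).items()):
    print(tag, n)
```

An inventory like this only says which semantic elements are present; deciding how each should be rendered in braille remains the transcriber's judgment.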

8.4 UEB Typeform Indicators

UEB's typeform indicators make it possible to indicate considerably more variation in character formatting than does EBAE and are thus more in line with Braille Formats. UEB provides four sets of two-cell prefix-root symbols which transcribers can use to represent italic, boldface, underlined, and script typeforms. ("The prefix designates the typeform and the root determines its extent.") Each set consists of three typeform start indicators (for symbols, words, and passages, respectively) and one typeform terminator to terminate the corresponding passage start indicator.

A possible disadvantage to the UEB typeform indicators is that the braille reader could have trouble simply remembering what they mean. The italic typeform indicator prefix is the traditional dots-46 while the ones for boldface, underlined, and script are dots-45, dots-456, and dot-4. [The italic and script indicator prefixes are the same as in the Nemeth code but the other two are different.]

UEB also supplies five additional sets of three-cell prefix-root symbols which are not assigned to particular typeforms. Transcribers can assign these symbols to represent any print typeforms that don't have assigned UEB typeform indicators. These assignments must, of course, be stated in a transcriber's note or symbol list.

The transcriber-defined typeform indicators have the potential to present more difficulties for the reader than do the explicit ones, both because they are three-cell symbols and because they may not be used consistently in different transcriptions. The second cell of the two-cell prefix for the transcriber-defined typeform indicators is always dots-3456, which is more commonly used as the number sign. The first cells of the five prefixes are dot-4, dots-45, dots-456, dot-5, and dots-46, respectively. [The third or root cell, which indicates the scope or extent, is the same as is used in the corresponding explicit typeform indicators.]
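For readers who want to see the prefix cells described in this section, the dot numbers can be turned into Unicode braille characters. The sketch below relies on the fact that the Unicode braille patterns block starts at U+2800 and that dot n sets bit n-1 of the offset; the choice of Unicode for display and the names cell, EXPLICIT_PREFIXES, and TRANSCRIBER_FIRST_CELLS are assumptions of this illustration, not part of UEB. Only the prefix cells named above are shown; root cells are omitted.

```python
def cell(dots: str) -> str:
    """Return the Unicode braille character for dot numbers such as "46"."""
    mask = 0
    for d in dots:
        mask |= 1 << (int(d) - 1)  # dot n corresponds to bit n-1
    return chr(0x2800 + mask)


# Prefix cells of the four explicit typeform indicator sets.
EXPLICIT_PREFIXES = {
    "italic": "46",
    "boldface": "45",
    "underlined": "456",
    "script": "4",
}

# First cells of the five transcriber-defined prefixes, in order;
# the second cell is always dots-3456.
TRANSCRIBER_FIRST_CELLS = ["4", "45", "456", "5", "46"]
SECOND_CELL = "3456"

for name, dots in EXPLICIT_PREFIXES.items():
    print(f"{name:<11} dots-{dots:<4} {cell(dots)}")

for n, dots in enumerate(TRANSCRIBER_FIRST_CELLS, start=1):
    print(f"transcriber-defined {n}: {cell(dots)}{cell(SECOND_CELL)}")
```

Printed side by side, the prefixes make the memory burden discussed above easier to appreciate: the transcriber-defined sets reuse the same first cells as the explicit sets and differ only by the added second cell.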

Cf. accompanying article for more details on the implementation of these indicators and the rules for their use.

8.5 An example from the UEB Rulebook where a typeform indicator is a poor solution

As the long list of text-level semantic elements in Section 8.3 makes abundantly clear, modern print documents often use styled text to convey important and/or useful information, not just to make the text more visually attractive to the sighted reader. If braille transcriptions are to capture the information conveyed by print styling they will sometimes need to use more sophisticated braille solutions than typeform indicators.

The following example of the use of a transcriber-defined typeform indicator, extracted from the UEB Rulebook, is a good illustration of the misuse of typeform indicators.

Print:
In response to the prompt Insert the CD-ROM in drive E:, you put the compact disk in drive E, and press Enter.

Suggested UEB braille translation:
,9 response to ! prompt @#7,9s]t
! ;,,cd-,,rom 9 drive ;,e31@#'
y put ! compact disk 9 drive
;,e1 & press ,5t]4

[The UEB braille translation of the phrase Insert the CD-ROM in drive E: shown in a monospace or typewriter font in print is shown in colored simulated braille for convenience here. It is not shown in colored type in the Rulebook.]

The UEB braille translation, shown in red, of the phrase Insert the CD-ROM in drive E: uses contracted braille. It is immediately preceded by the first UEB transcriber-defined typeform passage indicator and it—plus the comma purposely not shown in monospace in print—is immediately followed by the first UEB transcriber-defined typeform terminator. [Both typeform indicators are shown in green here.] The Rulebook states that in the example "the first transcriber-defined typeform is used to indicate a change to a Courier New font." This is apparently the information the Rulebook intends to be in a Transcriber's Note or symbol list in an actual transcription.

The phrase Insert the CD-ROM in drive E: is shown in a monospace font in print because that is a standard convention for showing what a user is supposed to enter verbatim. In an electronic source file this phrase should be marked with the "kbd" semantic element.

The suggested UEB transcription for this example is useless; it would not provide the reader the information the reader needs. Even a sighted reader is unlikely to recognize or care which of several possible monospace fonts is in use. Moreover, it simply does not make sense to translate to contracted braille something a user is supposed to enter exactly, since the contracted braille translation is not what the user is supposed to enter and backtranslation can be wrong for computer items. The right transcription would use something equivalent to BANA's CBC. [By the way, if braille display users had the option for such items to be shown in eight-dot computer braille, they could simply copy-and-paste the information directly from the displayed braille transcription.]
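To make the alternative concrete, suppose the print example were available as a tagged source with the phrase marked up as kbd, as described above. A translation front end could then split the text into runs and route the verbatim run to a computer-braille translation instead of contracted braille. The sketch below shows only the splitting step, using Python's standard html.parser module; the mode labels "literary" and "verbatim", the class name RunSplitter, and the decision to treat samp and code like kbd are assumptions made for this illustration, and no actual braille translation is performed.

```python
from html.parser import HTMLParser

# Elements whose content should be kept verbatim rather than contracted.
# Treating samp and code like kbd is an assumption of this sketch.
VERBATIM_ELEMENTS = {"kbd", "samp", "code"}


class RunSplitter(HTMLParser):
    """Split tagged text into (mode, text) runs."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth inside verbatim elements
        self.runs = []

    def handle_starttag(self, tag, attrs):
        if tag in VERBATIM_ELEMENTS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VERBATIM_ELEMENTS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        mode = "verbatim" if self.depth else "literary"
        self.runs.append((mode, data))


def split_runs(html):
    """Return the list of (mode, text) runs for an HTML string."""
    parser = RunSplitter()
    parser.feed(html)
    parser.close()
    return parser.runs


# The print example from this section, with the phrase tagged as kbd.
SOURCE = ("In response to the prompt <kbd>Insert the CD-ROM in drive E:</kbd>, "
          "you put the compact disk in drive E, and press Enter.")

for mode, text in split_runs(SOURCE):
    print(f"{mode:>8}: {text!r}")
```

Each run could then be handed to a different translation table, so that the reader receives the exact characters of the verbatim run rather than a note about which font was used in print.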

8.6 The `mark` text-level semantic element

Braille authorities have for some time been concerned with the problem of mainstreamed braille-using students who have a need to understand direct references to particular print styling such as when told to locate vocabulary words marked or highlighted in a distinctive style. This is a particular example of a general tactic HTML5 addresses with its new mark text-level semantic element. The intended purpose of this element is to call attention to text which is useful to the reader who is trying to accomplish some task (such as finding vocabulary words) or to call attention to text that is relevant in a different context.

An example of the use of the mark element to call attention to text that is relevant in a different context could occur in a class on literature. A textbook might include an extract from a novel to illustrate the effect of an author's use of slang words. The textbook might then mark the slang words in the extract by printing them in a distinctive type even though they are printed in the same manner as the rest of the text in the original book.

BANA's interest in the UEB suggests that BANA has determined that the best solution to the use of marking is simply to mimic print by using braille typeform indicators to represent the particular print styles used to mark texts for purposes such as those just outlined. This is not necessarily a problem best solved with braille translation rules. There is a bigger long-term issue here that needs more thought.

8.6.1 General concerns about "marking" or highlighting

There are a number of general issues that militate against thinking that the ability for braille transcriptions to mimic print marking is such a major concern that it justifies a wholesale overhaul of BANA's braille codes. In the first place, while the use of print styling for the purpose of simply calling attention is a fairly common tactic, its use is quite ad hoc and doesn't conform to any particular standard or even custom. There isn't, for example, a standard such as always using blue type for vocabulary words.

In the second place, reference to a particular distinctive style is likely to become outdated. Many software applications already include a style configurator function by which the user can customize highlighting and other aspects of visual styling. Browsers let users choose default fonts, sizes, and colors. Webpage developers sometimes let users choose a stylesheet for viewing the page. Even standalone eBook readers allow users to customize styles. It's not hard to imagine growing support for customization of digital textbooks.

In the third place, marking can be dynamic as well as static. Software applications make use of dynamic or non-persistent highlighting for various purposes such as supporting the use of a find or search function in a browser or wordprocessor. Occurrences of whatever the user is trying to locate are typically highlighted in color for the convenience of a sighted user. Screenreaders should be able to make use of a dynamic mark tag to present similar information to their users while braille displays can use the bottom row of dots to provide dynamic marking.

Finally, the focus for braille authorities should be to design methods to replicate in an accessible manner the function of print styling used for marking, not the styling itself. Luckily it's easy to think of alternatives including those suggested above for accessible dynamic marking. For example, scattered vocabulary words marked in an electronic print source could be extracted into a list.
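The last suggestion, extracting scattered vocabulary words into a list, is straightforward when the electronic source uses the mark element. The following is a minimal sketch using Python's standard html.parser module; the sample sentence and the names MarkCollector and extract_marked are invented for illustration, and a real textbook source may well mark vocabulary words in some other way.

```python
from html.parser import HTMLParser


class MarkCollector(HTMLParser):
    """Collect the text of each top-level mark element, in document order."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside mark elements
        self.current = []    # text pieces of the mark being read
        self.marked = []     # finished marked runs

    def handle_starttag(self, tag, attrs):
        if tag == "mark":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "mark" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.marked.append("".join(self.current).strip())
                self.current = []

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract_marked(html):
    """Return the list of marked runs found in an HTML string."""
    collector = MarkCollector()
    collector.feed(html)
    collector.close()
    return collector.marked


# Hypothetical textbook fragment, invented for this sketch.
SAMPLE = ("<p>A <mark>habitat</mark> is where an organism lives. "
          "Each <mark>species</mark> is adapted to its habitat.</p>")

print(extract_marked(SAMPLE))
```

A list produced this way could be placed before or after the passage in the braille transcription, giving the reader the function of the highlighting without requiring a scan of the running text.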

8.6.2 Drawbacks to using typeform indicators to represent "marking" in braille transcriptions

In addition to the general issues described in the previous section, there are some braille-specific issues that militate against the reliance on typeform indicators as the only or even preferable solution to representing print styling used for marking.

BANA has recently supported some research as to how braille learners might be best taught to take advantage of marking and which proposed braille indicators some young braille readers found most convenient for this purpose. This research is summarized in the article Alternate methods for transcribing words with emphasis by Cheryl Kamei-Hannan, Ph.D. which begins with the observation:

Locating words written in bold, italics, underlined, or in colors is a more difficult task for a student who reads in braille, as compared with a student who is sighted.... The addition of the composition symbol is often confused with letters or other braille contractions, especially for beginning readers and those who are learning the braille code.

The article goes on to explain why this is the case and to suggest some helpful strategies including the use of typeform indicators that differ from those in UEB.

Dr. Kamei-Hannan's article makes the important point that:

Teachers reported that students were not familiar with “scanning” and that they had to explain that scanning and reading text were different tasks.

Dr. Susanna Millar's research on the nature of tactile reading demonstrates a related issue:

Fluent braille readers actually use quite different finger movements in reading for comprehension than when they are asked to look for a particular letter in the same text. ["Literacy through Braille and Perception by Touch and Movement" in Language and Visualisation, Ed. Eriksson and Holmqvist, 2004, pp. 97-130.]

Dr. Millar's research suggests the possibility that educating braille readers to look for all the words marked with a specific typeform indicator (i.e. to "scan") could actually interfere with the acquisition of the optimal finger movements for fluent reading.

8.6.3 A need for further research

The results of the research carried out by Kamei-Hannan and by Millar suggest that, while braille-using students probably need to understand the method and purpose of scanning, having to rely on it for acquiring information has the potential to negatively impact their braille reading skills and that, for this and related reasons, the use of typeform indicators is not an optimal solution for handling marking in braille transcriptions. This situation is not surprising given the findings of the National Center On Universal Design for Learning (UDL):

Within the UDL framework, the hallmark of materials is their variability and flexibility. For conveying conceptual knowledge, UDL materials offer multiple media and embedded, just-in-time supports such as hyperlinked glossaries, background information, and on-screen coaching. For strategic learning and expression of knowledge, UDL materials offer tools and supports needed to access, analyze, organize, synthesize, and demonstrate understanding in varied ways.

Simply adopting a new braille system will not address these fundamental concerns. BANA needs to facilitate more research into this area.

8.7 Summary

Using typeform indicators to duplicate print styling should only be used as a fallback position when there is not a better option.

The basic problem is that distinctive print styling is often used to indicate semantics: 17 specific text-level semantic elements commonly indicated with distinctive styling in print are listed in Section 8.3. What the braille reader typically needs for semantic elements is an explicit indication of the semantics or a custom translation such as Computer Braille Code, not a typeform indicator. This is not a new idea; explicit indication of semantics already occurs in English braille when a letter sign rather than an italics indicator is used to indicate that a brailled letter shown in italic type in print is a symbol and not an alphabetic wordsign.

A particular issue that needs further consideration as to a solution is the growing use of distinctive print styling to mark or highlight text not for a specific semantic purpose but to "show its relevance in another context" as might be done by styling vocabulary words scattered through a block of text or temporarily highlighting the results of an online word search. Although scanning text to locate highlighted items is a simple visual skill—sighted children much too young to read can easily locate words printed in a distinctive color—it is not something visually-impaired persons should be called upon to do on a routine basis. In other words, this is not an issue easily addressed by a new braille system.

Information available from the National Center On UDL makes it clear that the U.S. educational system is moving away from fixed curricula towards flexible curricula "designed to be adjustable from the beginning, so it can adapt to the needs of diverse learners without significant add-ons." BANA's solutions to the problems of braille readers should take this development into account.

A DRAFT version of this article was first posted April 17, 2012.
Please send corrections and feedback to the author: info at dotlessbraille dot org
