Positive Impacts of EPUB 3: MathML and Braille Mathematics

Executive Summary. This article introduces MathML to readers who have at least some familiarity with braille and braille mathematics. With the advent of EPUB 3 as the digital publishing standard, there is no longer any doubt that MathML will be the basis for accessible mathematics. Braille experts don't need to be able to read or write MathML, but they do need to understand its practical and conceptual underpinnings, at the very least to the extent covered in this article. They need to know enough about MathML to communicate with MathML experts, since it is unlikely that MathML experts will learn very much about braille math.

If braille experts are able to articulate the similarities and differences between braille math and MathML to developers, they can help ensure the development of better MathML-based tools for producing and/or using braille mathematics. It may also be the case that similarities or differences between a particular braille math code and MathML will turn out to be important. For example, it could be difficult to communicate to a MathML expert why UEB uses the same symbols for numbers and for certain letters since MathML treats numbers, variables, and text as three separate Token Elements. And it might be more difficult to adapt a MathML-based tool that demonstrates the steps in a computation to UEB math than to NUBS math. The potential difficulty is that UEB has different ways of representing the same layouts (or what MathML calls schemata) depending on the surrounding context and on the nature of the expressions involved. (See Section 3.3.1 for an example where a simple cut-and-paste in a UEB fraction could result in braille errors.)

1. Background

If there has ever been any doubt that MathML will be the basis for accessible mathematics, that doubt was completely erased by the October 11, 2011 announcement of EPUB 3 as the digital publishing standard.

EPUB 3, developed by the International Digital Publishing Forum (IDPF), is the open, royalty-free standard for the new generation of digital books. ... EPUB was built from the ground up with accessibility for blind and print-disabled readers in mind. Experts in accessibility and publishing technology from the DAISY Consortium worked shoulder-to-shoulder with the large tech companies and the publishers to deliver a format with astounding capabilities.
...
The study of mathematics should get a real boost from the inclusion of MathML in the digital book.

George Kerscher, "The Future of Digital Publishing: An Optimist's View," Future Reflections, Vol. 31 (2).

Those interested in the history of MathML should read Stephen Wolfram's series on Mathematical Notation: Past and Future. The series also explains why LaTeX is not an alternative to MathML. To oversimplify a bit, LaTeX is a typesetting language and was never intended to provide a basis for either archiving or manipulating mathematics, which are two of the goals of MathML.

The purpose of the present article is to provide a brief introduction to MathML and then to explain the relationship between MathML and various options for braille mathematics. The intent here is to provide a basis for appreciating the potential of MathML, not to teach the reader, who is assumed to be familiar with braille and braille mathematics, to generate or read MathML.

As for generating MathML, it is not something that a person would typically enter directly. Rather there are user-friendly WYSIWYG math entry systems and math Optical Character Recognition applications that automatically create MathML. Moreover, with the rapidly growing availability of high-quality EPUB 3 documents, it is becoming less likely that producers of special media such as braille will be responsible for producing electronic source documents whether or not mathematics is involved.* The reason for the availability of these high-quality source documents is that publishers are starting to appreciate the commercial value of having experts produce semantically rich source documents which can be used for multiple purposes.

As for reading MathML, there are numerous applications, including browser plug-ins, that can render MathML to properly typeset regular and large print. These same applications can often generate spoken math as well. Applications to produce on-the-fly translation of MathML to braille math are not yet available as of June 2012 but are likely to become available in the near future.

Finally, as to the potential of MathML source as the basis for conversion to braille, one factor that is easy to appreciate is its use of Unicode character codes for symbols that don't appear on a standard keyboard. This means that transcribers can simply consult the source and will no longer have to puzzle over a glyph and try to guess which character it represents. Of course, given that Unicode specifies thousands of mathematical symbols, braille codes will require some sort of official fallback mechanism for representing symbols for which the code does not provide explicit braille equivalents.

2. Introduction to MathML

MathML 3.0 is an application of XML (Extensible Markup Language) that is used as a specialist notation for mathematical expressions; it is not intended for general text contents. MathML thus functions as what is sometimes referred to as a "mode" in braille terminology. XML documents that include mathematical expressions typically also include other text which needs to be represented with an appropriate markup language such as EPUB 3 or HTML5. Any mathematical expressions represented by MathML are then embedded within the other text as "math islands" which are entered and exited via math start and end tags.
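The idea of "math islands" can be illustrated with a minimal Python sketch that uses only the standard library. The XHTML fragment below is a made-up example, not taken from any real EPUB 3 publication, and the helper names find_math_islands and local_name are hypothetical. The sketch simply locates each math start tag by its MathML namespace and lists the token elements found inside.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical content document: ordinary text with two math islands.
DOCUMENT = f"""
<html xmlns="{XHTML_NS}">
  <body>
    <p>The equation
      <math xmlns="{MATHML_NS}"><mi>x</mi><mo>=</mo><mn>1</mn></math>
      has one solution.</p>
    <p>So does
      <math xmlns="{MATHML_NS}"><mi>y</mi><mo>=</mo><mn>2</mn></math>.</p>
  </body>
</html>
"""


def find_math_islands(xml_text):
    """Return every <math> element in document order."""
    root = ET.fromstring(xml_text)
    return list(root.iter(f"{{{MATHML_NS}}}math"))


def local_name(element):
    """Strip the namespace that ElementTree puts in front of a tag name."""
    return element.tag.split("}", 1)[-1]


islands = find_math_islands(DOCUMENT)
for island in islands:
    print([(local_name(child), child.text) for child in island])
```

Everything outside the islands is ordinary text, which a braille translator would handle with its literary rules. Only the contents of each island would be passed to a braille math translator.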

There are two forms of MathML. Presentation MathML, which is addressed here, is designed to represent mathematical notation and structure. Content MathML, which is less widely used, is designed to represent the meaning or content of mathematical notation.

There are two basic types of presentation markup in MathML: token elements and schemata. According to the MathML documentation token elements "are broadly intended to represent the smallest units of mathematical notation which carry meaning." Schemata are notations, including the one for fractions, that represent layouts used to show the relationships among token elements and/or other layouts.

2.1 MathML Token Elements

There are separate MathML token elements for representing identifiers, numbers, operators, text, strings, and spacing. These items are described very briefly in the following sections; the MathML documentation provides considerably more detail and explains numerous options not mentioned here. Section 2.1.7 gives an example of a simple algebraic equation represented with token elements.

2.1.1 Identifiers

Identifiers include variables, function names, and constants. To use braille terminology, variables are typically letters or letter sequences that stand for themselves and are thus not the same as words. Function names are standard names for mathematical functions such as sin. Constants are named numbers, such as pi.

Identifiers are tagged with the <mi> tag. Examples are <mi>x</mi>, <mi>sin</mi> and <mi>&#x03C0;</mi>. The last example represents the Greek small letter pi or π by using its Unicode character code.

2.1.2 Numbers

Numbers are numeric literals and are typically a digit or sequence of digits possibly including a decimal point. Numbers are tagged with the <mn> tag. Examples are <mn>1</mn> and <mn>2.3</mn>.

2.1.3 Operators

Presentation MathML treats a large number of symbols as operators. These include ordinary binary operators such as the plus sign, so-called "fences" including parentheses, separators such as a comma, and special mathematical "accents" such as a bar over a symbol. There are also special symbols for "invisible operators" which are operators that may only be implied in print. For example, the algebraic expression "2x" typically means two times x. MathML invisible operators are used to ensure that an application processing MathML doesn't have any doubt as to which operator is intended.

Operators are tagged with the <mo> tag. Examples are <mo>+</mo> and <mo>&InvisibleTimes;</mo>.

2.1.4 Text

The text element is used for arbitrary text appearing within mathematics. This would typically be some sort of commentary or a label such as "Theorem 1." Text is tagged with the <mtext> tag. (It is sometimes more appropriate to include text elements in a document's other text contents rather than to include text elements in math islands.)

2.1.5 Spacing

The spacing element, which is tagged with the <mspace> tag, can be used to represent a particular size of blank space. It can also be used to suggest linebreaking opportunities. This latter use might be relevant to braille codes with specific rules as to where mathematical expressions can be broken between lines.

2.1.6 Strings

This element, which is tagged with the <ms> tag, can be used to represent strings which are to be interpreted by special processors.

2.1.7 An Example Using Token Elements

It is sometimes possible to represent simple mathematical expressions using only token elements. An example is the following MathML source for the algebraic equation 2b = 4.

<math>
<mn>2</mn><mo>&InvisibleTimes;</mo><mi>b</mi><mo>=</mo><mn>4</mn>
</math>
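As a rough illustration of how a processor sees these token elements, the following Python sketch parses the same equation and reports the kind of each token. It is only a sketch under stated assumptions: the tag names for text, spacing, and strings (mtext, mspace, ms) come from the MathML specification rather than from the example above, the function name list_tokens is hypothetical, and the named entity has been replaced by the numeric character reference for the same invisible times character (U+2062) because a plain XML parser cannot resolve the named entity without a DTD.

```python
import xml.etree.ElementTree as ET

# &#x2062; is the numeric form of &InvisibleTimes; (assumed substitution).
SOURCE = (
    "<math>"
    "<mn>2</mn><mo>&#x2062;</mo><mi>b</mi><mo>=</mo><mn>4</mn>"
    "</math>"
)

# Token element tags mapped to the names used in Section 2.1.
TOKEN_KINDS = {
    "mi": "identifier",
    "mn": "number",
    "mo": "operator",
    "mtext": "text",
    "mspace": "spacing",
    "ms": "string",
}


def list_tokens(mathml):
    """Return (kind, content) pairs for every token element, in order."""
    root = ET.fromstring(mathml)
    return [
        (TOKEN_KINDS[element.tag], element.text)
        for element in root.iter()
        if element.tag in TOKEN_KINDS
    ]


tokens = list_tokens(SOURCE)
for kind, content in tokens:
    print(kind, repr(content))
```

The point for braille is that the source already says which items are numbers, which are identifiers, and which are operators, including the multiplication that print leaves invisible.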

2.2 MathML Schemata

MathML specifies several different classes of schemata. The four described here are general layouts, script and limit schemata, tabular math, and elementary math. In all cases the elements of a schema can be token elements, as in the case of a simple numerical fraction, more complex expressions including those that themselves use schemata, or both. Again, this section is only intended to provide a brief overview.

2.2.1 General Layout Schemata

General layout schemata include methods for grouping subexpressions. There are also two specific notations: fractions and radicals.

2.2.1.1 Rows

The MathML <mrow> tag is the simplest way to group subexpressions. This tag can always be used simply for convenience but is sometimes necessary to avoid ambiguity or to conform with MathML schemata layouts.

2.2.1.2 Fractions

The fraction schema is just what its name implies. It is a method of designating a pair of token elements or other expressions as the numerator and the denominator of a fraction. Fraction schemata can be nested to produce what the Nemeth code refers to as complex fractions, i.e., fractions where the numerator or denominator or both contain one or more fractions, as well as more complicated structures.

The MathML syntax for the fraction schema uses the "mfrac" tag and the following simple layout:
<mfrac> numerator denominator </mfrac>.
Both the "numerator" and the "denominator" items can be any MathML expression including fractions. However, each item must be encoded as a single MathML element so items that are not single elements will need to be grouped into a single "outer" element such as by using the <mrow> tag. For example, one way to represent a+b all over c in MathML is <mfrac><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow><mi>c</mi></mfrac>.

2.2.1.3 Radicals

Radicals include the square root and the indexed root. Additional attributes can be used to control differences in visual appearance for radicals shown inline and for radicals displayed on separate lines.

The MathML layout for the square root is simply <msqrt> base </msqrt>. The MathML layout for an indexed root is <mroot> base index </mroot>. As is the case for a fraction, complex expressions for either the "base" or "index" of an indexed root will need to be grouped into a single "outer" element such as an <mrow> element.

2.2.2 Script and Limit Schemata

Scripts and limits refer to various arrangements of elements around a base element. Here only the subscript, superscript, underscript, and overscript elements are described. MathML specifies additional elements for prescripts and tensors.

2.2.2.1 Subscripts and Superscripts

The MathML msub, msup, and msubsup elements are used to represent layouts where the typical visual display has the script elements located below and/or above the baseline of the base element but to the right side of the base element rather than directly under or over it.

The MathML syntax for subscripts uses the following simple layout:
<msub> base subscript </msub>.

The MathML syntax for superscripts uses the following simple layout:
<msup> base superscript </msup>.

The MathML syntax where the base element has both a subscript and a superscript uses the following layout:
<msubsup> base subscript superscript </msubsup>.

As with other layouts, these layouts can be nested. Also any complex expressions used for a base or script need to be grouped into a single element.

2.2.2.2 Underscripts and Overscripts

The MathML munder, mover, and munderover are used to represent layouts where the typical visual display has the script elements located directly below and/or above the base element.

The MathML syntax for underscripts uses the following simple layout:
<munder> base underscript </munder>.

The MathML syntax for overscripts uses the following simple layout:
<mover> base overscript </mover>.

The MathML syntax where a base element has both an underscript and an overscript uses the following layout:
<munderover> base underscript overscript </munderover>.

As with other layouts, these layouts can be nested. Also, any complex expressions used for a base or script need to be grouped into a single element.
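The grouping requirement that recurs in the layouts above (each base, script, numerator, or denominator must be a single element) lends itself to a mechanical check. The following Python sketch is one hedged way to do it: the table of child counts reflects the layouts described in this section, while the function name arity_errors and both test expressions are made up for illustration. The square root is deliberately left out of the table because it is not described above as requiring a fixed number of children.

```python
import xml.etree.ElementTree as ET

# Number of child elements each layout described above expects.
REQUIRED_CHILDREN = {
    "mfrac": 2,
    "mroot": 2,
    "msub": 2,
    "msup": 2,
    "msubsup": 3,
    "munder": 2,
    "mover": 2,
    "munderover": 3,
}


def arity_errors(mathml):
    """Return (tag, expected, found) for each layout with the wrong child count."""
    root = ET.fromstring(mathml)
    errors = []
    for element in root.iter():
        expected = REQUIRED_CHILDREN.get(element.tag)
        if expected is not None and len(element) != expected:
            errors.append((element.tag, expected, len(element)))
    return errors


# a+b all over c, with the numerator grouped by <mrow> as in Section 2.2.1.2.
GROUPED = (
    "<math><mfrac>"
    "<mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow><mi>c</mi>"
    "</mfrac></math>"
)
# The same tokens without the <mrow>: the fraction now has four children.
UNGROUPED = (
    "<math><mfrac>"
    "<mi>a</mi><mo>+</mo><mi>b</mi><mi>c</mi>"
    "</mfrac></math>"
)

print(arity_errors(GROUPED))
print(arity_errors(UNGROUPED))
```

The grouped version passes, while the ungrouped one is reported because a processor would have no way to tell where the numerator ends.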

2.2.3 Tabular Math

Tabular MathML markup is used for tables and matrices and includes a large number of features to make it possible to represent many different layouts that fit into this general category. In fact, MathML markup for tables is so complete that it should make it possible to develop processors that can automatically transcribe tables represented by MathML according to the rules of Braille Formats. (In some cases, such as when there is a need to specify shortened versions of column headers, automation may require an iterative process with some human involvement similar to that used by spelling checkers.)

2.2.4 Elementary Math

MathML elementary math markup makes it possible to indicate various special notations used for lower grade mathematics including addition, multiplication, and long division. The markup includes provision for special structures used in educational environments such as intermediate states and partial forms.

This markup is one of the newer features of MathML and has probably not yet been fully exploited. However it offers great potential as a basis for developing custom educational materials and interactive tools for lower grade mathematics.

3. MathML and Braille Mathematics

The Nemeth Braille Code is currently used for braille mathematics and other technical material in the United States. The Braille Authority of North America (BANA) is considering replacing the use of the Nemeth Code with a unified code for both literary and technical material. The two unified codes currently under consideration are the Nemeth Uniform Braille System (NUBS) and Unified English Braille (UEB). The following description of NUBS rules is based on the full NUBS documentation. The description of the UEB rules is based on The Rules of Unified English Braille, June 2010 (UEB Rulebook) and The Guidelines for Technical Material, both of which can be obtained from the International Council on English Braille (ICEB) Project Page.

The next three sections compare Nemeth, NUBS, and UEB with MathML. We are not concerned primarily with attempting to estimate the relative difficulty of translating MathML to one or the other of these braille math codes. After all, MathML has been designed to provide full information about mathematical expressions such that it should be possible to develop a processor for automatically converting MathML to any other appropriate system for representing mathematics.

However, what is important to the braille community are the practical and conceptual underpinnings of MathML. MathML reflects the general thinking of numerous experts. From a practical standpoint, MathML source is a linear system for mathematics just like braille math is. Of course, MathML source is much too verbose to be convenient for humans to either read or write. Nonetheless, a careful study of the surprisingly easy-to-read official MathML documentation should be a useful exercise for those interested in braille math. First, it is a reminder of why a linear system for mathematics which, like braille math, does need to be convenient for humans is not going to be able to handle every nuance of modern mathematical notation. Second, it is a reminder of the many issues which need to be considered when designing or evaluating a braille math system.

With the growing use of MathML to represent math in electronic source documents and as the basis for digital tools for learning mathematics, MathML is quickly becoming the lingua franca for those with any sort of interest in electronic math. This means that it is of growing importance for braille experts interested in better tools for producing or using braille mathematics to have at least some conceptual appreciation of MathML and understanding of MathML terminology. It is simply not reasonable to expect that many MathML experts will learn much about braille mathematics so it will be up to braille experts to be able to explain the similarities and differences. It may also be the case that similarities between braille math and MathML can be exploited both by easing communication and by increasing the likelihood that new MathML-based tools can be adapted for braille users.

A primary problem for braille mathematics is the same as it is for any linear system for mathematics: the need to devise some way to represent the planar layouts used to build expressions in standard mathematical notation. As shown in Section 2.2, MathML does this with layout schemata. Braille mathematics represents planar layouts with braille indicators specific to common layouts such as fractions. Although MathML and braille mathematics have similar functionality for planar layouts, their design requirements are very different because MathML is intended for machine processing while braille mathematics is intended to be read by humans.

3.1 MathML and the Nemeth Code

Anyone familiar with the Nemeth code should have no trouble appreciating its parallels with MathML.

As for token elements, Nemeth has separate symbols for numbers, identifiers, and operators which map cleanly to MathML's number, identifier, and operator elements. Since Nemeth mathematics uses lower numbers and typically assumes uncontracted braille there is generally no need for a processor converting MathML math to Nemeth to insert number signs and other semantic indicators simply in order to avoid ambiguity.

The parallels between MathML schemata and Nemeth rules are nothing short of remarkable. The 1972 Revision of the Nemeth manual even has four separate chapters titled Fractions, Radicals, Superscript and Subscripts, and Modifiers. (Modifier is Nemeth terminology for a script which typically occupies a position directly under or directly over the sign to which it applies. Some of the Nemeth rules for modifiers have the same function as the MathML underscript and overscript schemata.)

Since NUBS is quite similar to the Nemeth code, further technical discussion is deferred to the following section on NUBS.

3.2 MathML and NUBS

NUBS is a modernized version of the Nemeth Code. Some of the changes in NUBS have made it possible for NUBS to be fully compatible with EBAE contracted braille whereas the Nemeth Code requires a few changes to EBAE. Another change from the Nemeth Code is that NUBS provides explicit indication of mathematical or notational items. This latter change facilitates the reader's interpretation of technical material and simplifies automatic backtranslation. NUBS has also added extensible mechanisms for characters, modifiers, and typeform indicators that are very similar to the corresponding mechanisms in UEB.

NUBS like Nemeth of course provides braille equivalents for numbers, identifiers, and many operators or symbols which map cleanly to MathML's number, identifier, and operator token elements. Further discussion of the NUBS token elements is outside the scope of this article other than a reminder that NUBS uses lower numbers which don't require semantic indicators as they are always used in notational mode.

The following four sections are devoted to the NUBS representation of individual layouts. They include an explanation of how NUBS, again like Nemeth, has been designed to make it easier for the reader to understand the necessarily linear notation for complex layouts. The information in these four sections assumes that NUBS notational mode has already been invoked appropriately depending on whether the mathematical expression is inline or embedded.

3.2.1 MathML and NUBS Representations for Fractions

The MathML schema for fractions is described in Section 2.2.1.2. The corresponding NUBS syntax is
?numerator/denominator#.
NUBS, like Nemeth, uses a single dots 1-4-5-6 braille cell instead of the "mfrac" start tag as a start fraction indicator and a single dots 3-4-5-6 cell instead of the "mfrac" end tag as an end fraction indicator. NUBS, again like Nemeth, adds an explicit separator, dots 3-4, to separate the numerator from the denominator. This reduces clutter for human readers because it avoids the need to use extra symbols, tags, or "phantom enclosures" as is necessary in MathML to separate the two terms.

The primary difference between MathML and NUBS for fractions is that NUBS adds explicit indication of nested fractions, such as when either the numerator or denominator of a fraction contains another fraction. This is accomplished by preceding the indicators and the corresponding separator of any outer fraction with one or more dot 6 cells that indicate its order. The innermost fraction is considered to be order zero, so its indicators don't require preceding dot 6 cells. These order indicators, whose need an automated process can determine by using standard XML technology, aren't necessary to avoid ambiguity but make it easier for the braille reader to decode the linear expression. Order indicators let the braille reader know what to expect and also help the reader determine which separator belongs to which fraction. Otherwise the reader has to keep more in memory in order to distinguish a layout such as ((a/b)/c) from (a/(b/c)).
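To make the claim about automation concrete, here is a minimal Python sketch that walks a MathML fraction and emits the NUBS layout with order indicators. It is an illustration under several assumptions rather than a NUBS translator: the comma is assumed to be the ASCII-braille character for a dot 6 cell (matching the ASCII notation of the layout shown above), token contents are copied through untranslated as placeholders, the order of a fraction is taken to be the deepest level of fractions found anywhere inside it, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

DOT_6 = ","  # assumption: ASCII-braille character for a dot 6 cell


def to_nubs(element):
    """Return (linear text, depth), where depth counts nested fraction levels."""
    if element.tag == "mfrac":
        numerator, num_depth = to_nubs(element[0])
        denominator, den_depth = to_nubs(element[1])
        order = max(num_depth, den_depth)  # 0 for an innermost fraction
        mark = DOT_6 * order
        text = f"{mark}?{numerator}{mark}/{denominator}{mark}#"
        return text, order + 1
    if len(element) == 0:
        # Token element: content is passed through as a placeholder.
        return element.text or "", 0
    parts = [to_nubs(child) for child in element]
    return "".join(text for text, _ in parts), max(depth for _, depth in parts)


def fraction_to_nubs(mathml):
    text, _ = to_nubs(ET.fromstring(mathml))
    return text


# (a/b)/c
LEFT = "<math><mfrac><mfrac><mi>a</mi><mi>b</mi></mfrac><mi>c</mi></mfrac></math>"
# a/(b/c)
RIGHT = "<math><mfrac><mi>a</mi><mfrac><mi>b</mi><mi>c</mi></mfrac></mfrac></math>"

print(fraction_to_nubs(LEFT))   # ,??a/b#,/c,#
print(fraction_to_nubs(RIGHT))  # ,?a,/?b/c#,#
```

In both outputs the marked separator belongs to the outer fraction and the unmarked one to the inner fraction, which is the cue that lets a reader tell the two layouts apart.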

3.2.2 MathML and NUBS Representations for Radicals

The MathML schemata for radicals are described in Section 2.2.1.3. The NUBS layout for the square root is
>base[
where dots 3-4-5 is the NUBS radical sign or square root start indicator and dots 2-4-6 is the termination indicator for the layout.

The NUBS indexed radical layout is similar to the one for the square root. It is
>^index"base[.

Here dots 3-4-5 is again the start indicator and the dots 4-5 indicates that the next element is the index. (This notation, which is one of the few significant differences between the original Nemeth code and NUBS, may be viewed as a bit awkward as dots 4-5 is also the NUBS superscript indicator.) Again, as for the NUBS representation of a fraction, an explicit separator, in this case the dot 5 baseline indicator, is used to separate the index from the base so as to avoid the need to use extra symbols for that purpose.

Here, as for fractions, NUBS uses the dot 6 order indicator to provide the reader explicit indication of nested radicals such as when the base of a square root or indexed root contains another radical. The order indicators are placed before the outer radical signs and the corresponding terminators. They are not needed before the separator after an index due to the simplicity of the notation for an index.
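The radical layouts can be sketched in the same hedged way. The Python below assumes, as in the layouts above, that the ASCII characters >, ^, ", and [ stand for the braille cells named in this section and that the comma stands for a dot 6 cell. It counts only radicals nested in the base, leaves radicals inside an index unmarked, and copies token contents through untranslated. None of this is taken from the NUBS documentation beyond what is stated here, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

DOT_6 = ","  # assumption: ASCII-braille character for a dot 6 cell


def to_nubs(element):
    """Return (linear text, depth), where depth counts nested radical levels."""
    if element.tag == "msqrt":
        parts = [to_nubs(child) for child in element]
        base = "".join(text for text, _ in parts)
        order = max((depth for _, depth in parts), default=0)
        mark = DOT_6 * order
        return f"{mark}>{base}{mark}[", order + 1
    if element.tag == "mroot":
        base, order = to_nubs(element[0])  # first child is the base
        index, _ = to_nubs(element[1])     # second child is the index
        mark = DOT_6 * order
        # No order indicator before the dot 5 separator after the index.
        return f'{mark}>^{index}"{base}{mark}[', order + 1
    if len(element) == 0:
        return element.text or "", 0
    parts = [to_nubs(child) for child in element]
    return "".join(text for text, _ in parts), max(depth for _, depth in parts)


def radical_to_nubs(mathml):
    text, _ = to_nubs(ET.fromstring(mathml))
    return text


SIMPLE = "<math><msqrt><mi>x</mi></msqrt></math>"
NESTED = "<math><msqrt><mi>x</mi><mo>+</mo><msqrt><mi>y</mi></msqrt></msqrt></math>"
CUBE = "<math><mroot><mi>x</mi><mn>3</mn></mroot></math>"

print(radical_to_nubs(SIMPLE))  # >x[
print(radical_to_nubs(NESTED))  # ,>x+>y[,[
print(radical_to_nubs(CUBE))    # >^3"x[
```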

3.2.3 MathML and NUBS Representations for Subscripts and Superscripts

The MathML schemata for subscripts and superscripts are described in Section 2.2.2.1. It is outside the scope of this article to give a complete presentation of the NUBS rules for subscripts and superscripts. However, since many persons reading this article are likely familiar with the Nemeth Code, it should be pointed out that the NUBS rules are the same except for the use of a different indicator to avoid the numeric subscript convention.

The NUBS layout for a first-level non-numerical subscript is
base;subscript"
where dots 5-6 is the NUBS first-level subscript indicator and dot 5 is the return to baseline indicator which can be omitted if either the layout is followed by a space or it ends an expression.

NUBS, like Nemeth, has a special notation for a first-order numeric subscript following a letter. (This is because numbers following letters in technical material are almost always subscripts so this special notation provides considerable simplification for the more common case.) In this case both the subscript and baseline indicators are omitted as being understood. A dots 3-4-5-6 number sign between a letter and a following number is used to escape this default.

The NUBS layout for a first-level superscript is
base^superscript"
where dots 4-5 is the NUBS first-level superscript indicator and dot 5 is the return to baseline indicator which can be omitted if either the layout is followed by a space or it ends an expression. Just as for the fraction and radical layouts, there is no need for special enclosure symbols to delimit either the base, subscript, or superscript as these are automatically delimited by the indicators or, in the case of the numerical subscript, by syntax.

NUBS uses the same simple mechanism as the Nemeth code for higher-level subscripts and superscripts. That is, the relevant subscript and/or superscript indicators are repeated according to the level. This mechanism avoids the need for extra enclosures while providing helpful information to the reader.
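The first-level layouts and the numeric subscript convention described in this section can be sketched as follows. This Python is an illustration only, with several simplifying assumptions: it handles a single level of script and refuses anything deeper, it always writes the dot 5 baseline indicator even where the rules allow it to be omitted, it treats any one-character identifier as a "letter", it does not insert the number sign escape, and it copies token contents through untranslated. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

SCRIPT_TAGS = {"msub", "msup", "msubsup"}


def is_single_letter(element):
    return element.tag == "mi" and len(element.text or "") == 1


def is_whole_number(element):
    return element.tag == "mn" and (element.text or "").isdigit()


def to_nubs(element):
    """Linearize first-level subscripts and superscripts only."""
    if element.tag == "msubsup":
        raise NotImplementedError("combined scripts are not sketched here")
    if element.tag in ("msub", "msup"):
        base, script = element[0], element[1]
        if any(inner.tag in SCRIPT_TAGS for inner in script.iter()):
            raise NotImplementedError("only first-level scripts are sketched here")
        if (element.tag == "msub" and is_single_letter(base)
                and is_whole_number(script)):
            # Numeric subscript convention: both indicators are omitted.
            return to_nubs(base) + script.text
        indicator = ";" if element.tag == "msub" else "^"
        return f'{to_nubs(base)}{indicator}{to_nubs(script)}"'
    if len(element) == 0:
        return element.text or ""
    return "".join(to_nubs(child) for child in element)


def script_to_nubs(mathml):
    return to_nubs(ET.fromstring(mathml))


print(script_to_nubs("<math><msub><mi>x</mi><mn>1</mn></msub></math>"))  # x1
print(script_to_nubs("<math><msub><mi>x</mi><mi>n</mi></msub></math>"))  # x;n"
print(script_to_nubs("<math><msup><mi>x</mi><mn>2</mn></msup></math>"))  # x^2"
```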

3.2.4 MathML and NUBS Representations for Underscripts and Overscripts

The MathML schemata for underscripts and overscripts are described in Section 2.2.2.2. NUBS is even more aligned with MathML than is the Nemeth Code for these elements in that the Nemeth Code uses the same notation for underscripts and overscripts as it does for modifiers while NUBS does not. NUBS has a new notation for underscripts and overscripts which is described in this section. It is outside the scope of this article to present the NUBS rules for modifiers which are used for items such as accent marks. However, since many persons reading this article are likely familiar with the Nemeth Code, it should be pointed out that the NUBS rule for modifiers is somewhat simpler than the corresponding Nemeth one.

The NUBS layout for a first-level underscript is
base;&underscript[
where dots 5-6, 1-2-3-4-6 is the NUBS first-level underscript indicator and dots 2-4-6 is the termination indicator.

The NUBS layout for a first-level overscript is
base^&overscript[
where dots 4-5, 1-2-3-4-6 is the NUBS first-level overscript indicator and dots 2-4-6 is the termination indicator.

The NUBS layout for an item with both a first-level underscript and a first-level overscript is
base;&underscript ^&overscript[
where dots 5-6, 1-2-3-4-6 is the NUBS first-level underscript indicator, dots 4-5, 1-2-3-4-6 is the NUBS first-level overscript indicator, and dots 2-4-6 is the single termination indicator for both scripts. Note that as with other NUBS schemata there is no need to use extra enclosure symbols since the indicators also function as separators.

In the case of multiple underscripts, they are arranged according to their vertical order starting with the "top" one directly under the base being considered the first-order underscript. The indicators for the higher-order underscripts simply repeat the leading dots 5-6 cell of the first-level underscript indicator. Multiple overscripts are handled similarly with the vertical order starting with the "bottom" one directly above the base being the first-order overscript. Again the indicators for the higher-order overscripts simply repeat the leading dots 4-5 cell of the first-level overscript indicator.

3.2.5 Summary of MathML and NUBS

NUBS, just like its Nemeth Code predecessor, is well-aligned with MathML. With the exception of its optional special convention for numerical subscripts, NUBS has only one way to represent each of the four most common MathML schemata or layouts. NUBS does include an extra indicator to designate the order when a layout is nested within another layout of the same type. However, determining such nesting for XML structures such as MathML is extremely easy using standard tools for processing XML. (And, as far as student-generated math is concerned, leaving out the level indicators does not create ambiguity.)

3.3 MathML and UEB

UEB is a unified code being considered for adoption in the United States. A great deal of information about UEB has already been published. A link to the two UEB rulebooks used as references in preparing this information about UEB is given at the start of Section 3. However it should be noted that the "Technical Guidelines" were published in 2008 with the intent that they would be updated so what is written here may be subject to change. Links to other articles and UEB information may be found on this page.

UEB of course provides braille equivalents for numbers, identifiers, and many operators or symbols which can be mapped to MathML's number, identifier, and operator token elements. Further discussion of the UEB token elements is outside the scope of this article other than to emphasize that since UEB uses upper numbers, numbers always require semantic indicators. Moreover, since UEB doesn't always use a distinct mode for math, it requires greater use of semantic indicators embedded in mathematical expressions than NUBS does, and the usage rules for UEB semantic indicators are more likely to depend on context.

3.3.1 MathML and UEB Representations for Fractions

The MathML schema for fractions is described in Section 2.2.1.2 and the corresponding NUBS fraction layout is described in Section 3.2.1.

UEB has two different schemata for fractions. The first one is only used for simple numeric fractions where both the numerator and denominator are numbers:
#number/number
The dots 3-4-5-6 together with the first cell of the numerator functions as a Numeric Indicator. The advantage of this special notation is that the effect of the Numeric Indicator persists after the simple numeric fraction line, dots 3-4, so the Numeric Indicator doesn't have to be repeated as it would be if the general fraction layout were to be used for a simple numeric fraction.

The second UEB fraction layout is
(numerator./denominator)
where dots 1-2-3-5-6 is the general fraction open indicator, dots 4-6, 3-4 is the separator or general fraction line, and dots 2-3-4-5-6 is the general fraction close indicator. UEB doesn't have any special notation for nested fractions like NUBS does so the reader has more to keep track of.

One problem with using two different fraction layouts could occur in algebra when a student is working with an equation such as 1/2 equals x/4 that uses both a numeric fraction and a general fraction. The student can't simply cut the x and paste in the digit two once they solve the equation because they also need to change the layout. Related problems could occur when designing tutorials.
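The cut-and-paste problem can be seen in a small Python sketch that chooses between the two UEB fraction layouts. The sketch makes several assumptions that should be checked against the UEB rulebooks: a "simple numeric fraction" is taken to mean that both parts are single number tokens made only of digits, every number inside a general fraction is given its own Numeric Indicator, and digits and letters are copied through as placeholders rather than translated to braille cells. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_plain_number(element):
    return element.tag == "mn" and (element.text or "").isdigit()


def ueb_fraction(fraction):
    numerator, denominator = fraction[0], fraction[1]
    if is_plain_number(numerator) and is_plain_number(denominator):
        # Simple numeric fraction: one Numeric Indicator covers both parts.
        return f"#{numerator.text}/{denominator.text}"
    # General fraction: open indicator, fraction line, close indicator.
    return f"({ueb_item(numerator)}./{ueb_item(denominator)})"


def ueb_item(element):
    if element.tag == "mfrac":
        return ueb_fraction(element)
    if element.tag == "mn":
        return "#" + (element.text or "")  # numbers carry their own indicator
    if len(element) == 0:
        return element.text or ""
    return "".join(ueb_item(child) for child in element)


def fraction_to_ueb(mathml):
    return ueb_item(ET.fromstring(mathml))


HALF = "<math><mfrac><mn>1</mn><mn>2</mn></mfrac></math>"
X_OVER_4 = "<math><mfrac><mi>x</mi><mn>4</mn></mfrac></math>"
TWO_OVER_4 = "<math><mfrac><mn>2</mn><mn>4</mn></mfrac></math>"

print(fraction_to_ueb(HALF))        # #1/2
print(fraction_to_ueb(X_OVER_4))    # (x./#4)
print(fraction_to_ueb(TWO_OVER_4))  # #2/4
```

In the MathML source, solving for x changes one token. In this sketch of the braille, the same change removes both enclosure indicators, replaces the general fraction line with the simple one, and moves the Numeric Indicator.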

3.3.2 MathML and UEB Representations for Radicals

The MathML schemata for radicals is described in Section 2.2.1.3 and the NUBS layouts for radicals are described in Section 3.2.2.

The UEB layout for the square root is
%base+
where dots 1-4-6 is the UEB radical sign or square root start indicator and dots 3-4-6 is the termination indicator for the layout.

UEB has two different layouts for an indexed radical. If the index is a single token or other simple item this layout is used:
%9index base+.

If the index is not a simple item, then the index must be enclosed in a pair of braille grouping symbols, dots 1-2-6 and dots 3-4-5, as in this next layout:
%9<index>base+.

Just as in the case of fractions, UEB doesn't provide any special mechanism for nested radicals as NUBS does, so the user again has more to keep track of.
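
The choice among the three radical layouts in this section can be sketched in the same spirit. This Python sketch assumes namespace-free MathML input and, as a simplification, treats an index as "simple" only when it is a single mn or mi token; the article's "other simple item" is not modeled. The function name ueb_radical_layout is hypothetical, and the returned strings are the layout templates copied verbatim from this section.

```python
import xml.etree.ElementTree as ET

# Token elements treated as a "simple" index (an assumption of this sketch).
SIMPLE_TOKENS = {"mn", "mi"}


def ueb_radical_layout(radical_xml: str) -> str:
    """Return this section's layout template for an <msqrt> or <mroot>."""
    radical = ET.fromstring(radical_xml)
    if radical.tag == "msqrt":
        return "%base+"
    if radical.tag != "mroot" or len(radical) != 2:
        raise ValueError("expected <msqrt>, or <mroot> with two children")
    index = radical[1]  # MathML child order in <mroot> is base, then index
    if index.tag in SIMPLE_TOKENS:
        return "%9index base+"
    return "%9<index>base+"  # index needs braille grouping symbols


square = ueb_radical_layout("<msqrt><mi>x</mi></msqrt>")
cube = ueb_radical_layout("<mroot><mi>x</mi><mn>3</mn></mroot>")
compound = ueb_radical_layout(
    "<mroot><mi>x</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></mroot>"
)
print(square, cube, compound, sep="\n")
```

MathML uses the same mroot element for both indexed cases, so the decision to add grouping symbols has to be made by the braille side of any tool.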

3.3.3 MathML and UEB Representations for Subscripts and Superscripts

The MathML schemata for subscripts and superscripts are described in Section 2.2.2.1. The corresponding NUBS layouts are described in Section 3.2.3.

The UEB strategy for expressions using subscripts and/or superscripts that are embedded in a contracted braille passage is to insert a grade 1 symbol indicator immediately before the dots 2-6 subscript indicator and/or the dots 3-5 superscript indicator, unless grade 1 mode is already in effect for the expression. Also, the subscript or superscript expression will sometimes need to be enclosed in braille grouping indicators to avoid ambiguity, but at other times this is not necessary.

The four possible layouts for a simple subscript in UEB are thus
base5subscript
base;5subscript
base5<subscript>
base;5<subscript>

The four possible layouts for a simple superscript in UEB are thus
base9superscript
base;9superscript
base9<superscript>
base;9<superscript>

The four possible layouts for an item with both a simple subscript and a simple superscript in UEB are thus
base5subscript9superscript
base;5subscript;9superscript
base5<subscript>9<superscript>
base;5<subscript>;9<superscript>

As for multiple levels, such as superscripts to superscripts, the subscript and superscript expressions may need to be enclosed in braille grouping indicators to avoid ambiguity even where this wouldn't be the case for single level subscripts or superscripts.

Just as in the case of fractions and radicals, UEB doesn't provide any special mechanism for multiple levels of subscripts and superscripts as NUBS does, so the UEB user again has more to keep track of.
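
The four single-script layouts listed in this section come from two independent yes/no choices: whether a grade 1 symbol indicator is needed and whether grouping indicators are needed. The following Python sketch makes that structure explicit. The function name ueb_script and its parameters are hypothetical, and the sketch does not decide when either choice applies, since that depends on the surrounding braille context.

```python
# Braille ASCII characters as they appear in the layouts of this section.
INDICATORS = {"sub": "5", "super": "9"}  # dots 2-6 and dots 3-5
GRADE1 = ";"                             # grade 1 symbol indicator


def ueb_script(base: str, script: str, kind: str,
               grade1: bool = False, grouped: bool = False) -> str:
    """Assemble one single-level script layout from placeholder strings.

    grade1:  True when a grade 1 symbol indicator must precede the script
             indicator (grade 1 mode is not already in effect).
    grouped: True when the script must be enclosed in grouping indicators.
    """
    body = f"<{script}>" if grouped else script
    prefix = GRADE1 if grade1 else ""
    return f"{base}{prefix}{INDICATORS[kind]}{body}"


# Enumerate the four subscript layouts in the order used in this section.
subscript_layouts = [
    ueb_script("base", "subscript", "sub", grade1, grouped)
    for grouped in (False, True)
    for grade1 in (False, True)
]
print("\n".join(subscript_layouts))
```

MathML represents all four cases with a single msub element, so a tool that converts MathML to UEB has to supply both decisions from context.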

3.3.4 MathML and UEB Representations for Underscripts and Overscripts

The MathML schemata for underscripts and overscripts are described in Section 2.2.2.2. The corresponding NUBS layouts are described in Section 3.2.4.

UEB uses the same approach to underscripts and overscripts as it does for subscripts and superscripts, so there is no need to go into detail. The only difference is the insertion of a dots 4-6 cell before the symbols used for the subscript and superscript indicators. The directly below indicator is thus dots 4-6, 2-6 and the directly above indicator is dots 4-6, 3-5.
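
As a small illustration, the two indicators can be derived from the subscript and superscript indicators by prefixing the dots 4-6 cell. The Python sketch below writes the cells in braille ASCII, as in the layouts of Sections 3.3.1 and 3.3.3; the dictionary names and the pairing with the MathML element names munder and mover are assumptions made for this example.

```python
# Braille ASCII cells used elsewhere in this article's layouts.
DOTS_46 = "."       # dots 4-6
SUBSCRIPT = "5"     # dots 2-6
SUPERSCRIPT = "9"   # dots 3-5

# Hypothetical lookup from MathML element name to UEB script indicator.
UEB_INDICATOR = {
    "msub": SUBSCRIPT,
    "msup": SUPERSCRIPT,
    "munder": DOTS_46 + SUBSCRIPT,    # directly below: dots 4-6, 2-6
    "mover": DOTS_46 + SUPERSCRIPT,   # directly above: dots 4-6, 3-5
}

for element, cells in UEB_INDICATOR.items():
    print(element, cells)
```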

3.3.5 Summary of MathML and UEB

In general, UEB has more than one way of representing the same layout, depending on the details of the expressions involved. UEB can also sometimes require insertion of braille-specific indicators and symbols into the middle of a layout.

Specific examples include the special layout used for simple numerical fractions. Braille grouping symbols, i.e. grouping symbols not required in print, have to be inserted into the layout for certain indexed radicals depending on the nature of the index. The layouts for subscripts, superscripts, underscripts and overscripts also sometimes do and sometimes do not require grouping symbols. It can even be the case that an expression that doesn't require grouping symbols when used as a single-level script will require grouping symbols when it carries other scripts.

The variations in layouts that depend on the nature of the expressions used in the layouts have the potential to make algebraic manipulations more difficult, since the layouts may need to be changed when the expressions are changed. In other words, it is not always possible to simply copy and paste an expression from one context to another, since the expression may have to be changed in the new context. A related problem would occur in tutorials and worked-out examples, which could make it more difficult for the student to understand the mathematics itself.

4. Conclusion

The acceptance of EPUB 3 as a digital publishing standard and especially the inclusion of MathML to represent mathematics in EPUB 3 documents is an extremely positive development for accessible math in general and for braille math in particular. Consideration of the potential of this circumstance to have a positive impact on braille users would suggest that now is not an opportune time to make major structural changes to the braille systems used in the United States for representing technical material. However, even those who might be uncertain about the significance of this consideration should have no difficulty in evaluating the significance of the disconnect between UEB and MathML as explained here.


*Another thing that will change with the growing use of rich electronic source files is that transcribers will no longer be transcribing from rendered print sources. This means that Braille Transcribing Manuals that focus on the visual appearance of the print are going to be out-of-date sooner than some might anticipate.


First DRAFT posted June 5, 2012.
Updated draft posted June 12, 2012.