3 The Document Type Definition
 3.1 What is a DTD?
 3.2 Overall Document Structure
 3.3 Sectioning Elements
 3.4 ManSection–a special kind of subsection
 3.5 Cross Referencing and Citations
 3.6 Structural Elements like Lists
 3.7 Types of Text
 3.8 Elements for Mathematical Formulae
 3.9 Everything else

3 The Document Type Definition

In this chapter we first explain what a "document type definition" is and then describe gapdoc.dtd in detail. That file, together with the current chapter, defines what a GAPDoc document has to look like. It can be found in the main directory of the GAPDoc package and is reproduced in Appendix B.

We do not give many examples in this chapter, which is intended more as a formal reference for all GAPDoc elements. Instead, we provide a separate help book, see ???. It uses all the constructs introduced in this chapter, and you can easily compare the source code with how it looks in the different output formats. Furthermore, recall that many basic things about XML markup were already explained by example in the introductory chapter 1.

3.1 What is a DTD?

A document type definition (DTD) is a formal declaration of how an XML document has to be structured. It is itself structured such that programs that handle documents can read it and treat the documents accordingly. There are for example parsers and validity checkers that use the DTD to validate an XML document, see 2.1-14.

The main thing a DTD does is to specify which elements may occur in documents of a certain document type, how they can be nested, and what attributes they can or must have. So, for each element there is a rule.

Note that a DTD cannot ensure that a document which is "valid" also makes sense to the converters! It only says something about the formal structure of the document.

For the remaining part of this chapter we have divided the elements of GAPDoc documents into several subsets, each of which will be discussed in one of the next sections.

See the following three subsections to learn by example how a DTD works. We do not want to be too formal here, but just enable the reader to understand the declarations in gapdoc.dtd. For precise descriptions of the syntax of DTDs see again the official standard in:

  http://www.xml.com/axml/axml.html

3.2 Overall Document Structure

A GAPDoc document contains on its top level exactly one element with name Book. This element is declared in the DTD as follows:

3.2-1 <Book>
<!ELEMENT Book (TitlePage,
                TableOfContents?,
                Body,
                Appendix*,
                Bibliography?,
                TheIndex?)>
<!ATTLIST Book Name CDATA #REQUIRED>

After the keyword ELEMENT and the name Book there is a list in parentheses. This is a comma separated list of names of elements which can occur (in the given order) in the content of a Book element. Each name in such a list can be followed by one of the characters "?", "*" or "+", meaning that the corresponding element can occur zero or one time, an arbitrary number of times, or at least once, respectively. Without such an extra character the corresponding element must occur exactly once. Instead of a single name, an entry of this list can also be a list of element names separated by "|" characters; this denotes any element with one of these names (i.e., "|" means "or").

So, the Book element must contain first a TitlePage element, then an optional TableOfContents element, then a Body element, then zero or more elements of type Appendix, then an optional Bibliography element, and finally an optional element of type TheIndex.

Note that only these elements are allowed in the content of the Book element. No other elements or text are allowed in between. An exception to this is that there may be whitespace between the end tag of one element and the start tag of the next; this should be ignored when the document is processed to some output format. An element like this is called an element with "element content".

The second declaration starts with the keyword ATTLIST and the element name Book. After that there is a triple of whitespace separated parameters (in general an arbitrary number of such triples, one for each allowed attribute name). The first (Name) is the name of an attribute for a Book element. The second (CDATA) is always the same for all of our declarations, it means that the value of the attribute consists of "character data". The third parameter #REQUIRED means that this attribute must be specified with any Book element. Later we will also see optional attributes which are declared as #IMPLIED.
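
As a rough illustration of the two declarations above, a minimal document skeleton might look as follows. This is only a sketch: the book name MyBook, the titles, the author name and the database name mybib are made up for this example, and the DOCTYPE line assumes that gapdoc.dtd can be found under that system identifier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book SYSTEM "gapdoc.dtd">
<Book Name="MyBook">                  <!-- Name is #REQUIRED -->
  <TitlePage>                         <!-- exactly once, must come first -->
    <Title>My Example Book</Title>
    <Author>Jane Doe</Author>
  </TitlePage>
  <TableOfContents/>                  <!-- optional (?) -->
  <Body>                              <!-- exactly once -->
    <Chapter><Heading>First Chapter</Heading>
      Some text.
    </Chapter>
  </Body>
  <Appendix><Heading>Extra Material</Heading>   <!-- zero or more (*) -->
    More text.
  </Appendix>
  <Bibliography Databases="mybib"/>   <!-- optional (?) -->
  <TheIndex/>                         <!-- optional (?) -->
</Book>
```

Leaving out the Name attribute, or placing, say, TheIndex before Body, would make the document invalid with respect to the declarations above.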

3.2-2 <TitlePage>
<!ELEMENT TitlePage (Title, Subtitle?, Version?, TitleComment?, 
                     Author+, Date?, Address?, Abstract?, Copyright?, 
                     Acknowledgements? , Colophon? )>

Within this element information for the title page is collected. Note that more than one author can be specified. The elements must appear in this order because there is no sensible way to specify in a DTD something like "the following elements may occur in any order but each exactly once".
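
The required order can be seen in the following sketch of a title page. All names, addresses and dates are invented for illustration; the individual elements are described in the subsections below.

```xml
<TitlePage>
  <Title>My Example Book</Title>
  <Subtitle>A made-up package manual</Subtitle>
  <Version>Version 0.1</Version>
  <!-- Author may be repeated (Author+) -->
  <Author>Jane Doe
    <Email>jane.doe@example.org</Email>
    <Homepage>https://example.org/~jdoe</Homepage>
  </Author>
  <Author>John Roe</Author>
  <Date>1 January 2000</Date>
  <Abstract>This book is a placeholder that shows the element order.</Abstract>
  <Copyright>&copyright; 2000 by the authors.</Copyright>
</TitlePage>
```

The optional elements TitleComment, Address, Acknowledgements and Colophon are simply left out here; if present, they would have to appear at the positions given in the declaration.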

Before going on with the other elements inside the Book element we explain the elements for the title page.

3.2-3 <Title>
<!ELEMENT Title (%Text;)*>

Here is the last construct you need to understand for reading gapdoc.dtd. The expression "%Text;" is a so-called "parameter entity". It is something like a macro within the DTD. It is defined as follows:

<!ENTITY % Text "%InnerText; | List | Enum | Table">

This means that every occurrence of "%Text;" in the DTD is replaced by the expression

%InnerText; | List | Enum | Table

which is then expanded further because of the following definition:

<!ENTITY % InnerText "#PCDATA |
                      Alt |
                      Emph | E |
                      Par | P | Br |
                      Keyword | K | Arg | A | Quoted | Q | Code | C | 
                      File | F | Button | B | Package |
                      M | Math | Display | 
                      Example | Listing | Log | Verb |
                      URL | Email | Homepage | Address | Cite | Label | 
                      Ref | Index" > 

These are the only two parameter entities we are using. They expand to lists of element names which are explained in the sequel and the keyword #PCDATA (concatenated with the "or" character "|").

So, the Title element has so-called "mixed content": It can contain parsed character data which does not contain further markup (#PCDATA) or any of the other above mentioned elements. Mixed content must always have the asterisk qualifier (as in Title), so that any sequence of elements (from the above list) and character data can be contained in a Title element.

The %Text; parameter entity is used in all places in the DTD where "normal text" should be allowed, including lists, enumerations, and tables, but no sectioning elements.

The %InnerText; parameter entity is used in all places in the DTD where "inner text" should be allowed. This means that no structures like lists, enumerations, and tables are allowed. This is used for example in headings.
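
The difference between the two parameter entities can be sketched with the following fragments. The text content is made up; the content models of Heading and Abstract are the ones declared in this chapter.

```xml
<!-- allowed: Heading contains %InnerText;, e.g. text and Emph -->
<Heading>A <Emph>very</Emph> short heading</Heading>

<!-- allowed: Abstract contains %Text;, so a List may occur -->
<Abstract>This book covers:
  <List>
    <Item>the first topic,</Item>
    <Item>the second topic.</Item>
  </List>
</Abstract>

<!-- not valid: List is not part of %InnerText; -->
<!-- <Heading>Topics <List><Item>one</Item></List></Heading> -->
```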

3.2-4 <Subtitle>
<!ELEMENT Subtitle (%Text;)*>

Contains the subtitle of the document.

3.2-5 <Version>
<!ELEMENT Version (#PCDATA|Alt)*>

Note that the version can only contain character data and no further markup elements (except for Alt, which is necessary to resolve the entities described in 2.2-3). The converters will not put the word "Version" in front of the text in this element.

3.2-6 <TitleComment>
<!ELEMENT TitleComment (%Text;)*>

Sometimes a title and subtitle are not sufficient to give a rough idea about the content of a package. In this case use this optional element to specify an additional text for the front page of the book. This text should be short; use the Abstract element (see 3.2-10) for longer explanations.

3.2-7 <Author>
<!ELEMENT Author (%Text;)*>    <!-- There may be more than one Author! -->

As noted in the comment there may be more than one element of this type. This element should contain the name of an author and possibly an Email and/or a Homepage element for this author, see 3.5-6 and 3.5-7. You can also specify an individual postal address here, instead of using the Address element described below, see 3.2-9.

3.2-8 <Date>
<!ELEMENT Date (#PCDATA)>

Only character data is allowed in this element which gives a date for the document. No automatic formatting is done.

3.2-9 <Address>
<!ELEMENT Address (#PCDATA|Alt|Br)*>

This optional element can be used to specify a postal address of the author or the authors. If there are several authors with different addresses then put the Address elements inside the Author elements.

Use the Br element (see 3.9-3) to mark the line breaks in the usual formatting of the address on a letter.

Note that often it is not necessary to use this element because a postal address is easy to find via a link to a personal web page.

3.2-10 <Abstract>
<!ELEMENT Abstract (%Text;)*>

This element contains an abstract of the whole book.

3.2-11 <Copyright>
<!ELEMENT Copyright (%Text;)*>

This element is used for the copyright notice. Note the &copyright; entity as described in section 2.2-3.

3.2-12 <Acknowledgements>
<!ELEMENT Acknowledgements (%Text;)*>

This element contains the acknowledgements.

3.2-13 <Colophon>
<!ELEMENT Colophon (%Text;)*>

The "colophon" page is used to say something about the history of a document.

3.2-14 <TableOfContents>
<!ELEMENT TableOfContents EMPTY>

This element may occur in the Book element after the TitlePage element. If it is present, a table of contents is generated and inserted into the document. Note that because this element is declared to be EMPTY one can use the abbreviation

<TableOfContents/>

to denote this empty element.

3.2-15 <Bibliography>
<!ELEMENT Bibliography EMPTY>
<!ATTLIST Bibliography Databases CDATA #REQUIRED
                       Style CDATA #IMPLIED>

This element may occur in the Book element after the last Appendix element. If it is present, a bibliography section is generated and inserted into the document. The attribute Databases must be specified; it contains the names of one or more data files, separated by commas.

Two kinds of files can be specified in Databases: The first are BibTeX files as defined in [Lam85, Appendix B]. Such files must have a name with extension .bib, and in Databases the name must be given without this extension. Note that such .bib-files should be in latin1-encoding (or ASCII-encoding). The second are files in BibXMLext format as defined in Section 7.2. These files must have an extension .xml and in Databases the full name must be specified.

We suggest using the BibXMLext format because it makes it possible to produce potentially nicer bibliography entries in text and HTML documents.

A bibliography style may be specified with the optional Style attribute (used for the LaTeX output of the document). Like the names of BibTeX files, it must be specified without its extension, here .bst (the default is alpha). See also section 3.5-3 for a description of the Cite element which is used to include bibliography references in the text.
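
For illustration, a Bibliography element that uses one database of each kind could look like the following sketch. The file names mybib.bib and refs.xml are hypothetical.

```xml
<!-- mybib    : BibTeX file mybib.bib, given without its extension -->
<!-- refs.xml : BibXMLext file, given with its full file name      -->
<Bibliography Databases="mybib,refs.xml" Style="alpha"/>
```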

3.2-16 <TheIndex>
<!ELEMENT TheIndex EMPTY>

This element may occur in the Book element after the Bibliography element. If it is present, an index is generated and inserted into the document. There are elements in GAPDoc which implicitly generate index entries (e.g., Func (3.4-2)) and there is an element Index (3.5-4) for explicitly adding index entries.

3.3 Sectioning Elements

A GAPDoc book is divided into chapters, sections, and subsections. The idea is of course that a chapter consists of sections, which in turn consist of subsections. However, for the sake of flexibility, the rules are not too restrictive. Firstly, text is allowed everywhere in the body of the document (and not only within sections). Secondly, the chapter level may be omitted. The exact rules are described below.

Appendices are a flavor of chapters, occurring after all regular chapters. There is a special type of subsection called "ManSection". This is a subsection devoted to the description of a function, operation or variable. It is analogous to a manpage in the UNIX environment. Usually each function, operation, method, and so on should have its own ManSection.

Cross referencing is done on the level of Subsections, respectively ManSections. The topics in GAP's online help also point to subsections. So, they should not be too long.

We start our description of the sectioning elements "top-down":

3.3-1 <Body>

The Body element marks the main part of the document. It must occur after the TableOfContents element. There is a big difference between inside and outside of this element: Whereas regular text is allowed nearly everywhere in the Body element and its subelements, this is not true for the outside. This also has implications for the handling of whitespace. Outside, superfluous whitespace is usually ignored when it occurs between elements. Inside the Body element whitespace matters because character data is allowed nearly everywhere. Here is the definition in the DTD:

<!ELEMENT Body  ( %Text;| Chapter | Section )*>

The fact that Chapter and Section elements are allowed here makes it possible to omit the chapter level entirely in the document. For a description of %Text; see 3.2-3.

(Remark: The purpose of this element is to make sure that a valid GAPDoc document has a correct overall structure, which is only possible when the top element Book has element content.)

3.3-2 <Chapter>
<!ELEMENT Chapter (%Text;| Heading | Section)*>
<!ATTLIST Chapter Label CDATA #IMPLIED>    <!-- For reference purposes -->

A Chapter element can have a Label attribute, such that this chapter can be referenced later on with a Ref element (see section 3.5-1). Note that you have to specify a label to reference the chapter as there is no automatic labelling!

Chapter elements can contain text (for a description of %Text; see 3.2-3), Section elements, and Heading elements.

The following additional rule cannot be stated in the DTD because we want a Chapter element to have mixed content. There must be exactly one Heading element in the Chapter element, containing the heading of the chapter. Here is its definition:

3.3-3 <Heading>
<!ELEMENT Heading (%InnerText;)*>

This element is used for headings in Chapter, Section, Subsection, and Appendix elements. It may only contain %InnerText; (for a description see 3.2-3).

Each of the mentioned sectioning elements must contain exactly one direct Heading element (i.e., one which is not contained in another sectioning element).

3.3-4 <Appendix>
<!ELEMENT Appendix (%Text;| Heading | Section)*>
<!ATTLIST Appendix Label CDATA #IMPLIED>   <!-- For reference purposes -->

The Appendix element behaves exactly like a Chapter element (see 3.3-2) except for the position within the document and the numbering. While chapters are counted with numbers (1., 2., 3., ...) the appendices are counted with capital letters (A., B., ...).

Again there is an optional Label attribute used for references.

3.3-5 <Section>
<!ELEMENT Section (%Text;| Heading | Subsection | ManSection)*>
<!ATTLIST Section Label CDATA #IMPLIED>    <!-- For reference purposes -->

A Section element can have a Label attribute, such that this section can be referenced later on with a Ref element (see section 3.5-1). Note that you have to specify a label to reference the section as there is no automatic labelling!

Section elements can contain text (for a description of %Text; see 3.2-3), Heading elements, and subsections.

There must be exactly one direct Heading element in a Section element, containing the heading of the section.

Note that a subsection is either a Subsection element or a ManSection element.

3.3-6 <Subsection>
<!ELEMENT Subsection (%Text;| Heading)*>
<!ATTLIST Subsection Label CDATA #IMPLIED> <!-- For reference purposes -->

The Subsection element can have a Label attribute, such that this subsection can be referenced later on with a Ref element (see section 3.5-1). Note that you have to specify a label to reference the subsection as there is no automatic labelling!

Subsection elements can contain text (for a description of %Text; see 3.2-3) and Heading elements.

There must be exactly one Heading element in a Subsection element, containing the heading of the subsection.
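
The nesting of the sectioning elements described so far can be sketched as follows. The headings and label values (chap:basics and so on) are invented for this example; any unique strings would do.

```xml
<Chapter Label="chap:basics"><Heading>Basics</Heading>
  Introductory text of the chapter.
  <Section Label="sec:setup"><Heading>Setup</Heading>
    Text of the section.
    <Subsection Label="subsec:install"><Heading>Installation</Heading>
      Text of the subsection.
    </Subsection>
  </Section>
</Chapter>

<!-- elsewhere in the document: references via the labels -->
See Chapter <Ref Chap="chap:basics"/>, in particular
<Ref Subsect="subsec:install"/>.
```

Each sectioning element contains exactly one direct Heading, and the labels are needed only because the units are referenced later.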

Another type of subsection is a ManSection, explained now:

3.4 ManSection–a special kind of subsection

A ManSection is intended to describe a function, operation, method, variable, or some other technical instance. It is analogous to a manpage in the UNIX environment.

3.4-1 <ManSection>
<!ELEMENT ManSection ( Heading?, 
                      ((Func, Returns?) | (Oper, Returns?) | 
                       (Meth, Returns?) | (Filt, Returns?) | 
                       (Prop, Returns?) | (Attr, Returns?) |
                       (Constr, Returns?) |
                       Var | Fam | InfoClass)+, Description )>
<!ATTLIST ManSection Label CDATA #IMPLIED> <!-- For reference purposes -->

<!ELEMENT Returns (%Text;)*>
<!ELEMENT Description (%Text;)*>

The ManSection element can have a Label attribute, such that this subsection can be referenced later on with a Ref element (see section 3.5-1). But this is probably rarely necessary because the elements Func and so on (explained below) automatically generate labels for cross referencing.

The content of a ManSection element is one or more elements describing certain items in GAP, each of them optionally followed by a Returns element, followed by a Description element, which contains %Text; (see 3.2-3) describing it. (Remember to include examples in the description as often as possible, see 3.7-10). The classes of items GAPDoc knows of are: functions (Func), operations (Oper), constructors (Constr), methods (Meth), filters (Filt), properties (Prop), attributes (Attr), variables (Var), families (Fam), and info classes (InfoClass). A single ManSection should describe several such items only when they are very closely related.

Each element for an item corresponding to a GAP function can be followed by a Returns element. In output versions of the document the string "Returns: " will be put in front of the content text. The text in the Returns element should usually be a short hint about the type of object returned by the function. This is intended to give a good mnemonic for the use of a function (together with a good choice of names for the formal arguments).

ManSections are also sectioning elements which count as subsections. Usually there should be no Heading element in a ManSection; in that case a heading is generated automatically from the first Func-like element. Sometimes this default behaviour does not look appropriate, for example when there are several Func-like elements. For such cases an optional Heading is allowed.
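
A typical ManSection might look like the following sketch. The function name MyFunc, its arguments and the descriptive text are made up; the elements Func and A are explained in 3.4-2 and in the section on types of text, respectively.

```xml
<ManSection>
  <Func Name="MyFunc" Arg="grp[, elm]"/>
  <Returns>a list of integers</Returns>
  <Description>
    Text describing what is computed from <A>grp</A> and the
    optional argument <A>elm</A>.
  </Description>
</ManSection>
```

No Heading is given here, so the heading would be generated from the Func element.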

3.4-2 <Func>
<!ELEMENT Func EMPTY>
<!ATTLIST Func Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of a function. The Name attribute is required and its value is the name of the function. The value of the Arg attribute (also required) contains the full list of arguments including optional parts, which are denoted by square brackets. The argument names can be separated by whitespace, commas or the square brackets for the optional arguments, like "grp[, elm]" or "xx[y[z] ]". If GAP options are used, this can be followed by a colon : and one or more assignments, like "n[, r]: tries := 100".

The name of the function is also used as label for cross referencing. When the name of the function appears in the text of the document it should always be written with the Ref element, see 3.5-1. This makes it possible to use a unique typesetting style for function names and automatic cross referencing.

If the optional Label attribute is given, it is appended (with a colon : in between) to the name of the function for cross referencing purposes. The text of the label can also appear in the document text. So, it should be a kind of short explanation.

<Func Arg="x[, y]" Name="LibFunc" Label="for my objects"/>

The optional Comm attribute should be a short description of the function, usually at most one line long (this is currently not used anywhere).

This element automatically produces an index entry with the name of the function and, if present, the text of the Label attribute as subentry (see also 3.2-16 and 3.5-4).

3.4-3 <Oper>
<!ELEMENT Oper EMPTY>
<!ATTLIST Oper Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of an operation. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

Note that multiple descriptions of the same operation may occur in a document because there may be several declarations in GAP. Furthermore there may be several ManSections for methods of this operation (see 3.4-5) which also use the same name. For reference purposes these must be distinguished by different Label attributes.

3.4-4 <Constr>
<!ELEMENT Constr EMPTY>
<!ATTLIST Constr Name  CDATA #REQUIRED
                 Label CDATA #IMPLIED
                 Arg   CDATA #REQUIRED
                 Comm  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of a constructor. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

Note that multiple descriptions of the same constructor may occur in a document because there may be several declarations in GAP. Furthermore there may be several ManSections for methods of this constructor (see 3.4-5) which also use the same name. For reference purposes these must be distinguished by different Label attributes.

3.4-5 <Meth>
<!ELEMENT Meth EMPTY>
<!ATTLIST Meth Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of a method. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

Frequently, an operation is implemented by several different methods, and it can be useful to document them independently. This is possible by using the same method name in different ManSections. It is however required that these subsections and those describing the corresponding operation are distinguished by different Label attributes.
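
As a sketch of this situation, an operation and one of its methods could be documented as follows. The name MyOper, the argument names and the label texts are hypothetical.

```xml
<ManSection>
  <Oper Name="MyOper" Arg="obj" Label="for arbitrary objects"/>
  <Description>
    General description of the operation.
  </Description>
</ManSection>

<ManSection>
  <Meth Name="MyOper" Arg="grp" Label="for a permutation group"/>
  <Description>
    Description of the special method for permutation groups.
  </Description>
</ManSection>
```

Both ManSections use the same Name; only the different Label values make it possible to refer to each of them unambiguously.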

3.4-6 <Filt>
<!ELEMENT Filt EMPTY>
<!ATTLIST Filt Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #IMPLIED
               Comm  CDATA #IMPLIED
               Type  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of a filter. The first four attributes are used in the same way as in the Func element (see 3.4-2), except that the Arg attribute is optional.

The Type attribute can be any string, but it is thought to be something like "Category" or "Representation".
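
For example, a category could be documented along the following lines. The filter name IsMyObject and the text are invented for illustration.

```xml
<ManSection>
  <Filt Name="IsMyObject" Arg="obj" Type="Category"/>
  <Description>
    The category of all objects of the made-up kind described
    in this example.
  </Description>
</ManSection>
```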

3.4-7 <Prop>
<!ELEMENT Prop EMPTY>
<!ATTLIST Prop Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of a property. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

3.4-8 <Attr>
<!ELEMENT Attr EMPTY>
<!ATTLIST Attr Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to specify the usage of an attribute (in GAP). The attributes are used exactly in the same way as in the Func element (see 3.4-2).

3.4-9 <Var>
<!ELEMENT Var  EMPTY>
<!ATTLIST Var  Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to document a global variable. The attributes are used exactly in the same way as in the Func element (see 3.4-2) except that there is no Arg attribute.

3.4-10 <Fam>
<!ELEMENT Fam  EMPTY>
<!ATTLIST Fam  Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Comm  CDATA #IMPLIED>

This element is used within a ManSection element to document a family. The attributes are used exactly in the same way as in the Func element (see 3.4-2) except that there is no Arg attribute.

3.4-11 <InfoClass>
<!ELEMENT InfoClass EMPTY>
<!ATTLIST InfoClass Name  CDATA #REQUIRED
                    Label CDATA #IMPLIED
                    Comm  CDATA #IMPLIED>

This element is used within a ManSection element to document an info class. The attributes are used exactly in the same way as in the Func element (see 3.4-2) except that there is no Arg attribute.

3.5 Cross Referencing and Citations

Cross referencing in the GAPDoc system differs from the usual LaTeX cross referencing in that a reference knows "which type of object" it is referencing. For example, a "reference to a function" is distinguished from a "reference to a chapter". The idea is that the markup must contain this information so that the converters can produce better output. For example, the HTML converter can typeset a function reference as the name of the function with a link to its description, and a chapter reference as a number with a link.

Referencing is done with the Ref element:

3.5-1 <Ref>
<!ELEMENT Ref EMPTY>
<!ATTLIST Ref Func      CDATA #IMPLIED
              Oper      CDATA #IMPLIED
              Constr    CDATA #IMPLIED
              Meth      CDATA #IMPLIED
              Filt      CDATA #IMPLIED
              Prop      CDATA #IMPLIED
              Attr      CDATA #IMPLIED
              Var       CDATA #IMPLIED
              Fam       CDATA #IMPLIED
              InfoClass CDATA #IMPLIED
              Chap      CDATA #IMPLIED
              Sect      CDATA #IMPLIED
              Subsect   CDATA #IMPLIED
              Appendix  CDATA #IMPLIED
              Text      CDATA #IMPLIED

              Label     CDATA #IMPLIED
              BookName  CDATA #IMPLIED
              Style (Text | Number) #IMPLIED>  <!-- normally automatic -->

The Ref element is defined to be EMPTY. If one of the attributes Func, Oper, Constr, Meth, Filt, Prop, Attr, Var, Fam, InfoClass, Chap, Sect, Subsect, Appendix is given then there must be exactly one of these, making the reference one to the corresponding object. The Label attribute can be specified in addition to make the reference unique, for example if more than one method with a given name is present. (Note that there is no way to specify in the DTD that exactly one of the first listed attributes must be given; this is an additional rule.)

A reference to a Label element defined below (see 3.5-2) is done by giving the Label attribute and optionally the Text attribute. If the Text attribute is present its value is typeset in place of the Ref element, if linking is possible (for example in HTML). If this is not possible, the section number is typeset. This type of reference is also used for references to tables (see 3.6-5).

An external reference into another book can be specified by using the BookName attribute. In this case the Label attribute or, if this is not given, the function or section like attribute, is used to resolve the reference. The generated reference points to the first hit when asking "?book name: label" inside GAP.

The optional attribute Style can take only the values Text and Number. It can be used with references to sectioning units, and it gives a hint to the converter programs whether an explicit section number or text is generated. Normally all references to sections generate numbers and references to a GAP object generate the name of the corresponding object with some additional link or sectioning information, which is the behavior of Style="Text". With Style="Number" an explicit section number is generated in all cases. So

<Ref Subsect="Func" Style="Text"/> described in section 
<Ref Subsect="Func" Style="Number"/>

produces: <Func> described in section 3.4-2.
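
Some further typical uses of Ref are sketched below. The names MyFunc, MyOper, SomeFunc and the book name otherbook are hypothetical and only serve as placeholders.

```xml
<!-- reference to a function documented in the same book -->
See <Ref Func="MyFunc"/> for details.

<!-- several items share a name: Label selects one of them -->
See <Ref Meth="MyOper" Label="for a permutation group"/>.

<!-- external reference into another book; resolved like the
     query "?otherbook: SomeFunc" inside GAP -->
See <Ref Func="SomeFunc" BookName="otherbook"/>.
```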

3.5-2 <Label>
<!ELEMENT Label EMPTY>
<!ATTLIST Label Name CDATA #REQUIRED>

This element is used to define a label for referencing a certain position in the document, if this is possible. If an exact reference is not possible (like in a printed version of the document) a reference to the corresponding subsection is generated. The value of the Name attribute must be unique among all Label elements.
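
A label and a reference to it could be written as in the following sketch; the label name main-example and the surrounding text are made up.

```xml
<Label Name="main-example"/>
This is the paragraph we want to point to.

<!-- later in the document -->
As shown in <Ref Label="main-example" Text="the main example"/>,
the construction works in general.
```

In output formats that support links the value of Text is shown as link text; otherwise the section number is typeset, as described in 3.5-1.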

3.5-3 <Cite>
<!ELEMENT Cite EMPTY>
<!ATTLIST Cite Key CDATA #REQUIRED
               Where CDATA #IMPLIED>

This element is for bibliography citations. It is EMPTY by definition. The attribute Key is the key for a lookup in a BibTeX database that has to be specified in the Bibliography element (see 3.2-15). The value of the Where attribute specifies the position in the document as in the corresponding LaTeX syntax \cite[Where value]{Key value}.
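
For instance, assuming that one of the databases given in the Bibliography element contains an entry with key Lam85, a citation with a position could be written like this:

```xml
The BibTeX format is described in
<Cite Key="Lam85" Where="Appendix B"/>.
<!-- corresponds to \cite[Appendix B]{Lam85} in LaTeX -->
```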

3.5-4 <Index>
<!ELEMENT Index (%InnerText;|Subkey)*>
<!ATTLIST Index Key    CDATA #IMPLIED
                Subkey CDATA #IMPLIED>
<!ELEMENT Subkey (%InnerText;)*>

This element generates an index entry. The content of the element is typeset in the index. It can optionally contain a Subkey element. If one or both of the attributes Key and Subkey are given, then the attribute values are used for sorting the index entries. Otherwise the content itself is used for sorting. The attributes should be used when the content contains markup. Note that all Func and similar elements automatically generate index entries. If the TheIndex element (3.2-16) is not present in the document all Index elements are ignored.
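
The following fragments sketch the three variants described above. The index terms and the package name MyPkg are chosen only for illustration.

```xml
<!-- plain entry, sorted by its content -->
<Index>normal subgroup</Index>

<!-- entry with a subentry given as Subkey element -->
<Index>subgroup<Subkey>normal</Subkey></Index>

<!-- content contains markup, so Key supplies the sort string -->
<Index Key="MyPkg"><Package>MyPkg</Package></Index>
```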

3.5-5 <URL>
<!ELEMENT URL (#PCDATA|Alt|Link|LinkText)*>  <!-- Link, LinkText
     variant for case where text needs further markup -->
<!ATTLIST URL Text CDATA #IMPLIED>   <!-- This is for output formats
                                          that have links like HTML -->
<!ELEMENT Link     (%InnerText;)*> <!-- the URL -->
<!ELEMENT LinkText (%InnerText;)*> <!-- text for links, can contain markup -->

This element is for references to resources on the internet. It specifies a URL and optionally a text which can be used for a link (like in HTML or PDF versions of the document). This can be specified in two ways: Either the URL is given as element content and the text is given in the optional Text attribute (in this case the text cannot contain further markup), or the element contains the two elements Link and LinkText which in turn contain the URL and the text, respectively. The default value for the text is the URL itself.
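
The two ways of writing a URL element can be compared in the following sketch, which uses the address of the XML standard quoted in 3.1; the link texts are made up.

```xml
<!-- variant 1: URL as content, plain link text as attribute -->
<URL Text="the annotated XML standard">http://www.xml.com/axml/axml.html</URL>

<!-- variant 2: link text with further markup -->
<URL>
  <Link>http://www.xml.com/axml/axml.html</Link>
  <LinkText>the <E>annotated</E> XML standard</LinkText>
</URL>

<!-- no text given: the URL itself is used as text -->
<URL>http://www.xml.com/axml/axml.html</URL>
```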

3.5-6 <Email>
<!ELEMENT Email (#PCDATA|Alt|Link|LinkText)*>

This element type is the special case of a URL specifying an email address. The content of the element should be the email address without any prefix like "mailto:". This address is typeset by all converters, also without any prefix. In the case of an output document format like HTML the converter can produce a link with a "mailto:" prefix.

3.5-7 <Homepage>
<!ELEMENT Homepage (#PCDATA|Alt|Link|LinkText)*>

This element type is the special case of a URL specifying a WWW-homepage.

3.6 Structural Elements like Lists

The GAPDoc system offers some limited access to structural elements like lists, enumerations, and tables. Although it is possible to use all LaTeX constructs one always has to think about other output formats. The elements in this section are guaranteed to produce something reasonable in all output formats.

3.6-1 <List>
<!ELEMENT List ( ((Mark,Item)|Item)+ )>
<!ATTLIST List Only CDATA #IMPLIED
               Not  CDATA #IMPLIED>

This element produces a list. Each item in the list corresponds to an Item element. Every Item element is optionally preceded by a Mark element. The content of this is used as a marker for the item. Note that this marker can be a whole word or even a sentence. It will be typeset in some emphasized fashion and most converters will provide some indentation for the rest of the item.

The Only and Not attributes can be used to specify that the list is included in the output by only one type of converter (Only) or by all but one type of converter (Not). At most one of the two attributes may occur in one element. The following values are currently allowed: "LaTeX", "HTML", and "Text". See also the Alt element in 3.9-1 for more about text alternatives for certain converters.
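
For illustration, a list with marks and a list restricted to one converter might be written as in the following sketch; the item texts are invented for this example.

```xml
<!-- Every Item is preceded by a Mark here; marks may also be omitted -->
<List>
  <Mark>Text</Mark>
  <Item>Output for display in a terminal.</Item>
  <Mark>HTML</Mark>
  <Item>Output for a web browser.</Item>
</List>

<!-- This list is only included by the LaTeX converter -->
<List Only="LaTeX">
  <Item>An item without a mark.</Item>
</List>
```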

3.6-2 <Mark>
<!ELEMENT Mark ( %InnerText;)*>

This element is used in the List element to mark items. See 3.6-1 for an explanation.

3.6-3 <Item>
<!ELEMENT Item ( %Text;)*>

This element is used in the List, Enum, and Table elements to specify the items. See sections 3.6-1, 3.6-4, and 3.6-5 for further information.

3.6-4 <Enum>
<!ELEMENT Enum ( Item+ )>
<!ATTLIST Enum Only CDATA #IMPLIED
               Not  CDATA #IMPLIED>

This element is used like the List element (see 3.6-1) except that the items must not have marks attached to them. Instead, the items are numbered automatically. The same comments about the Only and Not attributes as above apply.
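
A numbered list could be sketched as follows (the item texts are illustrative only). No Mark elements appear, since the numbers are generated by the converters.

```xml
<Enum>
  <Item>Write the document.</Item>
  <Item>Validate it against the DTD.</Item>
  <Item>Run the converters.</Item>
</Enum>
```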

3.6-5 <Table>
<!ELEMENT Table ( Caption?, (Row | HorLine)+ )>
<!ATTLIST Table Label   CDATA #IMPLIED
                Only    CDATA #IMPLIED
                Not     CDATA #IMPLIED
                Align   CDATA #REQUIRED>
                <!-- We allow | and l,c,r, nothing else -->
<!ELEMENT Row   ( Item+ )>
<!ELEMENT HorLine EMPTY>
<!ELEMENT Caption ( %InnerText;)*>

A table in GAPDoc consists of an optional Caption element followed by a sequence of Row and HorLine elements. A HorLine element produces a horizontal line in the table. A Row element consists of a sequence of Item elements as they also occur in List and Enum elements. The Only and Not attributes have the same functionality as described in the List element in 3.6-1.

The Align attribute is written like a LaTeX tabular alignment specifier, but only the characters "l", "r", "c", and "|" are allowed, meaning left alignment, right alignment, centered alignment, and a vertical line as delimiter between columns, respectively.

If the Label attribute is there, one can reference the table with the Ref element (see 3.5-1) using its Label attribute.

Usually only simple tables should be used. If you want a complicated table in the LaTeX output you should provide alternatives for text and HTML output. Note that in HTML-4.0 there is no possibility to interpret the "|" column separators and HorLine elements as intended. There are lines between all columns and rows or no lines at all.
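
A simple table with three columns might look like the following sketch. The label tab:orders and the cell contents are made up for this example, and the final line assumes that the Label attribute of Ref (see 3.5-1) is used to refer to the table.

```xml
<Table Align="|l|c|r|" Label="tab:orders">
  <Caption>A small example table</Caption>
  <HorLine/>
  <Row><Item>Name</Item><Item>Order</Item><Item>Index</Item></Row>
  <HorLine/>
  <Row><Item>A</Item><Item>12</Item><Item>2</Item></Row>
  <Row><Item>B</Item><Item>24</Item><Item>1</Item></Row>
  <HorLine/>
</Table>

See <Ref Label="tab:orders"/> for the orders.
```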

3.7 Types of Text

This section covers the markup of text. Various types of "text" exist. The following elements are used in the GAPDoc system to mark them. They mostly come in pairs, one long name which is easier to remember and a shortcut to make the markup "lighter".

Most of the following elements are intended to contain only character data and no further markup elements. It is however necessary to allow Alt elements to resolve the entities described in section 2.2-3.

3.7-1 <Emph> and <E>
<!ELEMENT Emph (%InnerText;)*>    <!-- Emphasize something -->
<!ELEMENT E    (%InnerText;)*>    <!-- the same as shortcut -->

This element is used to emphasize some piece of text. It may contain %InnerText; (see here).

3.7-2 <Quoted> and <Q>
<!ELEMENT Quoted (%InnerText;)*>   <!-- Quoted (in quotes) text -->
<!ELEMENT Q (%InnerText;)*>        <!-- Quoted text (shortcut) -->

This element is used to put some piece of text into " "-quotes. It may contain %InnerText; (see here).

3.7-3 <Keyword> and <K>
<!ELEMENT Keyword (#PCDATA|Alt)*>  <!-- Keyword -->
<!ELEMENT K (#PCDATA|Alt)*>        <!-- Keyword (shortcut) -->

This element is used to mark something as a keyword. Usually this will be a GAP keyword such as "if" or "for". No further markup elements are allowed within this element except for the Alt element, which is necessary.

3.7-4 <Arg> and <A>
<!ELEMENT Arg (#PCDATA|Alt)*>      <!-- Argument -->
<!ELEMENT A (#PCDATA|Alt)*>        <!-- Argument (shortcut) -->

This element is used inside Descriptions in ManSections to mark something as an argument (of a function, operation, or such). It is guaranteed that the converters typeset those exactly as in the definition of functions. No further markup elements are allowed within this element.

3.7-5 <Code> and <C>
<!ELEMENT Code (#PCDATA|Arg|Alt)*>     <!-- GAP code -->
<!ELEMENT C (#PCDATA|Arg|Alt)*>        <!-- GAP code (shortcut) -->

This element is used to mark something as a piece of code like for example a GAP expression. It is guaranteed that the converters typeset this exactly as in the Listing element (compare section 3.7-9). The only further markup elements allowed within this element are <Arg> elements (see 3.7-4).

3.7-6 <File> and <F>
<!ELEMENT File (#PCDATA|Alt)*>     <!-- Filename -->
<!ELEMENT F (#PCDATA|Alt)*>        <!-- Filename (shortcut) -->

This element is used to mark something as a filename or a pathname in the file system. No further markup elements are allowed within this element.

3.7-7 <Button> and <B>
<!ELEMENT Button (#PCDATA|Alt)*>   <!-- "Button" (also Menu, Key, ...) -->
<!ELEMENT B (#PCDATA|Alt)*>        <!-- "Button" (shortcut) -->

This element is used to mark something as a button. It can also be used for other items in a graphical user interface like menus, menu entries, or keys. No further markup elements are allowed within this element.

3.7-8 <Package>
<!ELEMENT Package (#PCDATA|Alt)*>   <!-- A package name -->

This element is used to mark something as a name of a package. This is for example used to define the entities GAP, XGAP or GAPDoc (see section 2.2-3). No further markup elements are allowed within this element.
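
The inline elements of sections 3.7-1 to 3.7-8 could be combined as in the following sketch, which uses the shortcut forms where they exist. The file name and the package name MyPackage are hypothetical.

```xml
Use the <K>for</K> keyword to loop; this is <E>much</E> simpler than a
<Q>manual</Q> loop. The expression <C>[1..10]</C> is a range. Edit the
file <F>doc/example.xml</F>, which belongs to the package
<Package>MyPackage</Package>, and press the <B>Return</B> key.
```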

3.7-9 <Listing>
<!ELEMENT Listing (#PCDATA)>  <!-- This is just for GAP code listings -->
<!ATTLIST Listing Type CDATA #IMPLIED> <!-- a comment about the type of
                                            listed code, may appear in
                                            output -->

This element is used to embed listings of programs into the document. Only character data and no other elements are allowed in the content. You should not use the character entities described in section 2.2-3 but instead type the characters directly. Only the general XML rules from section 2.1 apply. Note especially the usage of <![CDATA[ sections described there. It is guaranteed that all converters use a fixed width font for typesetting Listing elements. Compare also the usage of the Code and C elements in 3.7-5.

The Type attribute contains a comment about the type of listed code. It may appear in the output.
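
As a sketch, a listing whose content contains the characters < and & can be wrapped in a CDATA section so that these characters are typed directly. The value of the Type attribute and the code itself are chosen for illustration only.

```xml
<Listing Type="GAP code"><![CDATA[
if x < 3 and y > 0 then
  Print("x & y are fine\n");
fi;
]]></Listing>
```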

3.7-10 <Log> and <Example>
<!ELEMENT Example (#PCDATA)>  <!-- This is subject to the automatic 
                                   example checking mechanism -->
<!ELEMENT Log (#PCDATA)>      <!-- This not -->

These two elements behave exactly like the Listing element (see 3.7-9). They are intended for transcripts of GAP sessions. The only difference between the two is that Example sections are intended to be subject to an automatic checking mechanism used to ensure the correctness of the examples in the GAP manual, whereas Log is not touched by this (see section 5.4 for checking tools).

To get a good layout of the examples for display in a standard terminal we suggest using SizeScreen([72]); (see SizeScreen (Reference: SizeScreen)) in your GAP session before producing the content of Example elements.
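
A short session transcript could be embedded as in the following sketch. The CDATA section is not strictly needed here, but it guards against characters such as < and & in the session output.

```xml
<Example><![CDATA[
gap> 1 + 2;
3
gap> Factorial(5);
120
]]></Example>
```

The same content inside a Log element would be typeset in the same way but skipped by the checking mechanism.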

3.7-11 <Verb>

There is one further type of verbatim-like element.

<!ELEMENT Verb  (#PCDATA)> 

The content of such an element is guaranteed to be put into an output version exactly as it is using some fixed width font. Before the content a new line is started. If the line after the end of the start tag consists of whitespace only then this part of the content is skipped.

This element is intended to be used together with the Alt element to specify pre-formatted ASCII alternatives for complicated Display formulae or Tables.
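
The intended combination with Alt might look like the following sketch: LaTeX output receives the formula as LaTeX code, and all other converters receive a pre-formatted ASCII version. The matrix is an arbitrary example.

```xml
<Alt Only="LaTeX"><![CDATA[
\[ \left(\begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right) \]
]]></Alt>
<Alt Not="LaTeX"><Verb>
  ( 1  2 )
  ( 3  4 )
</Verb></Alt>
```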

3.8 Elements for Mathematical Formulae

3.8-1 <Math> and <Display>
<!-- Normal TeX math mode formula -->
<!ELEMENT Math (#PCDATA|A|Arg|Alt)*>   
<!-- TeX displayed math mode formula -->
<!ELEMENT Display (#PCDATA|A|Arg|Alt)*>
<!-- Mode="M" causes <M>-style formatting -->
<!ATTLIST Display Mode CDATA #IMPLIED>

These elements are used for mathematical formulae. As described in section 2.2-2 they correspond to LaTeX's math and display math mode respectively.

The formulae are typed in as in LaTeX, except that the characters with a special meaning in XML, see 2.1-9 (in particular < and &), must be escaped, either by using the corresponding entities or by enclosing the formula between "<![CDATA[" and "]]>". (The main reference for LaTeX is [Lam85].)

It is also possible to use some unicode characters for mathematical symbols directly, provided that they can be translated by Encode (6.2-2) into "LaTeX" encoding and that SimplifiedUnicodeString (6.2-2) with arguments "latin1" and "single" returns something sensible. Currently, we support the entities &CC;, &ZZ;, &NN;, &PP;, &QQ;, &HH;, &RR; for the corresponding blackboard bold letters ℂ, ℤ, ℕ, ℙ, ℚ, ℍ and ℝ, respectively.

Apart from Alt, the only element type that is allowed within the formula elements is the Arg or A element (see 3.7-4), which is used to typeset identifiers that are arguments to GAP functions or operations.

If a Display element has an attribute Mode with value "M", then the formula is formatted as in M elements (see 3.8-2). Otherwise in text and HTML output the formula is shown as LaTeX source code.

For simple formulae (and you should try to make all your formulae simple!) attempt to use the M element or the Mode="M" attribute in Display, for which there is a well defined translation into text that can be used for the text and HTML output versions of the document. So, if possible, try to avoid Math elements and Display elements without the Mode attribute, or provide useful text substitutes for complicated formulae via Alt elements (see 3.9-1 and 3.7-11).
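
The choices described above can be sketched as follows; the formulae are arbitrary examples, and the comments restate the behaviour described in this section.

```xml
<!-- Preferred for simple formulae: M has a well defined text translation -->
The map <M>x \mapsto x^2</M> is not injective on the integers.

<!-- A displayed formula that is formatted like an M element -->
<Display Mode="M">a \cdot b \equiv 1 \pmod n</Display>

<!-- Without Mode="M": shown as LaTeX source in text and HTML output -->
<Display>\sum_{i=1}^{n} i = \frac{n(n+1)}{2}</Display>
```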

3.8-2 <M>
<!-- Math with well defined translation to text output -->
<!ELEMENT M (#PCDATA|A|Arg|Alt)*>

The "M" element type is intended for formulae in the running text for which there is a sensible text version. For the LaTeX version of a GAPDoc document the M and Math elements are equivalent. The remarks in 3.8-1 about special characters and the Arg element apply here as well. A document which has all formulae enclosed in M elements can be quite readable both in text terminal output and in printed output versions.

Compared to former versions of GAPDoc many more formulae can be put into M elements. Most modern terminal emulations support unicode characters and many mathematical symbols can now be represented by such characters. But even if a terminal can only display ASCII characters, the user will see a reasonable representation of a formula.

As examples, here are some LaTeX macros which have a sensible ASCII translation and are guaranteed to be translated accordingly by text (and HTML) converters (for a full list of handled macros see RecNames(TEXTMTRANSLATIONS)):

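Table: LaTeX macros with special text translation
Macro                       Text translation
\ast                        *
\bf                         (no visible text)
\bmod                       mod
\cdot                       *
\colon                      :
\equiv                      =
\geq                        >=
\germ                       (no visible text)
\hookrightarrow             ->
\iff                        <=>
\langle                     <
\ldots                      ...
\left                       (no visible text)
\leq                        <=
\leftarrow                  <-
\Leftarrow                  <=
\limits                     (no visible text)
\longrightarrow             -->
\Longrightarrow             ==>
\mapsto                     ->
\mathbb                     (no visible text)
\mathop                     (no visible text)
\mid                        |
\pmod                       mod
\prime                      '
\rangle                     >
\right                      (no visible text)
\rightarrow                 ->
\Rightarrow                 =>
\rm, \sf, \textrm, \text    (no visible text)
\setminus                   \
\thinspace                  (no visible text)
\times                      x
\to                         ->
\vert                       |
\!                          (no visible text)
\,                          (no visible text)
\;                          (no visible text)
\{                          {
\}                          }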

 


In all other macros only the backslash is removed (except for some macros describing more exotic symbols). Whitespace is normalized (to one blank) but not removed. Note that whitespace is not added, so you may want to add a few more spaces than you usually do in your LaTeX documents.

Braces {} are removed in general, however pairs of double braces are converted to one pair of braces. This can be used to write <M>x^{12}</M> for x^12 and <M>x_{{i+1}}</M> for x_{i+1}.

3.9 Everything else

3.9-1 <Alt>

This element is used to specify alternatives for different output formats within normal text. See also sections 3.6-1, 3.6-4, and 3.6-5 for alternatives in lists and tables.

<!ELEMENT Alt (%InnerText;)*>  <!-- This is only to allow "Only" and
                                    "Not" attributes for normal text -->
<!ATTLIST Alt Only CDATA #IMPLIED
              Not  CDATA #IMPLIED>

Of course exactly one of the two attributes must occur in one element. The attribute values must be one word or a list of words, separated by spaces or commas. The words which are currently recognized by the converter programs contained in GAPDoc are: "LaTeX", "HTML", and "Text". If the Only attribute is specified then only the corresponding converter will include the content of the element into the output document. If the Not attribute is specified the corresponding converter will ignore the content of the element. You can use other words to specify special alternatives for other converters of GAPDoc documents.

In the case of "HTML" there is a second word which is recognized and this can either be "MathJax" or "noMathJax". For example a pair of Alt elements with <Alt Only="HTML noMathJax">... and <Alt Not="HTML noMathJax">... could provide special content for the case of HTML output without use of MathJax and every other output.

We fix a rule for handling the content of an Alt element with an Only attribute: its content is included directly as code for the corresponding output format. So, in the case of HTML the content is HTML code, and in the case of LaTeX the content is LaTeX code. The converters do not apply any handling of special characters to this content. In the case of LaTeX the formatting of the code is not changed.

Within the element only %InnerText; (see here) is allowed. This is to ensure that the same set of chapters, sections, and subsections show up in all output formats.
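
The use of the two attributes could be sketched as follows. The texts are invented; the CDATA section in the first element is one way to pass raw HTML code through the XML parser as character data.

```xml
<!-- Only: the content is code for the named output format -->
<Alt Only="HTML"><![CDATA[<b>Shown only in HTML output.</b>]]></Alt>
<Alt Only="Text">Shown only in text output.</Alt>

<!-- Not: a list of words, separated by spaces or commas -->
<Alt Not="HTML,Text">Skipped by the HTML and text converters.</Alt>
```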

3.9-2 <Par> and <P>
<!ELEMENT Par EMPTY>    <!-- this is intentionally empty! -->
<!ELEMENT P EMPTY>      <!-- the same as shortcut  -->

This EMPTY element marks the boundary of paragraphs. Note that an empty line in the input does not mark a new paragraph as opposed to the LaTeX convention.
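
The difference from the LaTeX convention can be seen in the following sketch, where only the P element starts a new paragraph.

```xml
This is the first paragraph.

This sentence still belongs to the first paragraph, despite the empty line.
<P/>
This sentence starts the second paragraph.
```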

(Remark: it would be much easier to parse a document and to understand its sectioning and paragraph structure if there were an element whose content is the text of a paragraph. But in practice many paragraph boundaries are implicitly clear, which would make it somewhat painful to enclose each paragraph in extra tags. The introduction of the P or Par elements as above delegates this pain to the writer of a conversion program for GAPDoc documents.)

3.9-3 <Br>
 
<!ELEMENT Br EMPTY>     <!-- a forced line break  -->

This element can be used to force a line break in the output versions of a GAPDoc element; it does not start a new paragraph. Please do not use this instead of a Par element, as this would often lead to ugly output versions of your document.

3.9-4 <Ignore>
 
<!ELEMENT Ignore (%Text;| Chapter | Section | Subsection | ManSection |
                  Heading)*>
<!ATTLIST Ignore Remark CDATA #IMPLIED>

This element can appear anywhere. Its content is ignored by the standard converters. It can be used, for example, to include data which are not part of the actual GAPDoc document, like source code, or to hide unfinished parts of the document.

Of course, one can use special converter programs which extract the contents of Ignore elements. Information on the type of the content can be stored in the optional attribute Remark.
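
For example, an unfinished passage could be hidden as in the following sketch; the value of the Remark attribute is an arbitrary choice for this illustration.

```xml
<Ignore Remark="draft">
This paragraph is not finished yet and is skipped by the standard converters.
</Ignore>
```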

