<!-- Source: GAP 4.8.9 installation with standard packages -->

<Chapter Label="ch:bibutil">
<Heading>Utilities for Bibliographies</Heading>

A standard for collecting references (in particular to mathematical
texts) is &BibTeX;
(<URL>http://www.ctan.org/tex-archive/biblio/bibtex/distribs/doc/</URL>).
A disadvantage of &BibTeX; is that the format of the
data is specified with use by &LaTeX; in mind. The data format is
less suited for conversion to other document types like plain text or
HTML.<P/>

In the first section we describe utilities for using data from &BibTeX;
files in &GAP;. <P/>

In the second section we introduce a new XML-based data format for
bibliographies, BibXMLext, which seems better suited for
tasks other than use with &LaTeX;. <P/>

A further section describes utilities for dealing with BibXMLext
data in &GAP;.

<Section Label="ParseBib">
<Heading>Parsing &BibTeX; Files</Heading>

Here are functions for parsing, normalizing and printing reference lists
in &BibTeX; format. The reference describing this format is&nbsp;<Cite
Key="La85" Where="Appendix B"/>.

<#Include Label="ParseBibFiles">

<#Include Label="NormalizeNameAndKey">

<#Include Label="WriteBibFile">

<#Include Label="LabelsFromBibTeX">

<#Include Label="InfoBibTools">

</Section>
<Section Label="BibXMLformat">
<Heading>The BibXMLext Format</Heading>

Bibliographical data in &BibTeX; files have the disadvantage that the
actual data are given in &LaTeX; syntax. This makes it difficult to use
the data for anything but &LaTeX;, say for representations of the
data as plain text or HTML. For example: mathematical formulae are in
&LaTeX; <C>$</C> environments, non-ASCII characters can be
specified in many strange ways, and it is unclear how to specify URLs
for links if the output format allows them.<P/>

Here we propose an XML data format for bibliographical data which
addresses these problems; it is called BibXMLext. In the next
section we describe some tools for
generating (an approximation to) this data format from &BibTeX; data,
and for using data given in BibXMLext format for various
purposes. <P/>

The first motivation for this development was the handling of
bibliographical data in &GAPDoc;, but the format and the tools are certainly
useful for other purposes as well.<P/>

We started from a DTD <F>bibxml.dtd</F> which is publicly available, for
example from <URL>http://bibtexml.sf.net/</URL>. This is essentially a
reformulation of the definition of the &BibTeX; format, including
several widely used additional fields. This already has the
advantage that a generic XML parser can check the validity of the
data entries, for example for missing compulsory fields in entries.
We applied the following changes and extensions to define the
DTD for BibXMLext, stored in the file <F>bibxmlext.dtd</F> which can
be found in the root directory of this &GAPDoc; package (and in Appendix
<Ref Appendix="bibxmlextdtd"/>):

<List >
<Mark>names</Mark>
<Item>Lists of names in the <C>author</C> and <C>editor</C> fields in
&BibTeX; are difficult to parse. Here they must be given by a sequence
of <C>&lt;name></C>-elements, each of which contains an optional
<C>&lt;first></C>- and a <C>&lt;last></C>-element for the first and last names,
respectively.</Item>
<Mark><C>&lt;M></C> and <C>&lt;Math></C></Mark>
<Item>These elements enclose mathematical formulae; the content is
&LaTeX; code (without the <C>$</C>). These should be handled in
the same way as the elements with the same names in &GAPDoc;, see
<Ref Subsect="M"/> and <Ref Subsect="Math"/>. In particular, simple
formulae which have a well defined plain text representation can be
given in <C>&lt;M></C>-elements.</Item>
<Mark>Encoding</Mark>
<Item>Note that in XML files we can use the full range of Unicode
characters, see <URL>http://www.unicode.org/</URL>. All non-ASCII
characters should be specified as Unicode characters. This makes dealing
with special characters easy for plain text or HTML; only for use with
&LaTeX; is some sort of translation necessary.</Item>
<Mark><C>&lt;URL></C></Mark>
<Item>These elements are allowed everywhere in the text and should be
represented by links in converted formats which allow this. They are used
in the same way as the element with the same name in &GAPDoc;, see
<Ref Subsect="URL"/>.</Item>
<Mark><C>&lt;Alt Only="..."></C> and <C>&lt;Alt Not="..."></C></Mark>
<Item>Sometimes information should be given in different ways, depending
on the output format of the data. This is possible with the
<C>&lt;Alt></C>-elements with the same definition as in &GAPDoc;, see
<Ref Subsect="Alt"/>.
</Item>
<Mark><C>&lt;C></C></Mark>
<Item>This element should be used to protect text from case changes by
converters (the extra <C>{}</C> characters in &BibTeX;
title fields).</Item>
<Mark><C>&lt;string key="..." value="..."/></C> and
<C>&lt;value key="..."/></C></Mark>
<Item>The <C>&lt;string></C>-element defines key-value pairs which can
be used in any field via the <C>&lt;value></C>-element (not only for
whole fields but also parts of the text).</Item>
<Mark><C>&lt;other type="..."></C></Mark>
<Item>This is a generic element for fields which are otherwise not
supported. An arbitrary number of them is allowed for each entry, so any
kind of additional data can be added to entries.</Item>
<Mark><C>&lt;Wrap Name="..."></C></Mark>
<Item>This generic element is allowed inside all fields. Our standard
tools simply ignore this markup (but not the element content). It can
nevertheless be a useful hook for introducing arbitrary further markup
(and our tools can easily be extended to handle it).</Item>
<Mark>Extra entities</Mark>
<Item>The DTD defines the standard XML entities (<Ref
Subsect="XMLspchar"/>) and the entities <C>&amp;nbsp;</C> (non-breakable
space), <C>&amp;ndash;</C> and <C>&amp;copyright;</C>.
Use <C>&amp;ndash;</C> in page ranges.
</Item>
</List>
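
The following fragment is a small sketch, not taken from the package,
of how several of the elements described above could appear together in
one entry. The entry id, the names, the title, the URL and the
<C>mycomment</C> type are made up for illustration. The order of the
fields is an assumption and should be checked against the DTD, which
remains the authoritative reference.
<Listing Type="Illustrative fragment"><![CDATA[
<string key="j.algebra" value="J. Algebra"/>
<entry id="Doe2001"><article>
  <author>
    <!-- one <name> per person; <first> is optional -->
    <name><first>Jane</first><last>Doe</last></name>
    <!-- non-ASCII character given as a Unicode character reference -->
    <name><last>Sz&#x0151;ke</last></name>
  </author>
  <!-- <C> protects case, <M> holds LaTeX code without the $ -->
  <title>On <C>Sylow</C> subgroups of <M>S_n</M></title>
  <!-- <value> inserts the text defined by the <string> element above -->
  <journal><value key="j.algebra"/></journal>
  <year>2001</year>
  <pages>101&ndash;120</pages>
  <note>See also <URL>http://www.example.org/doe2001</URL></note>
  <!-- <other> carries a field which is not otherwise supported -->
  <other type="mycomment">made-up annotation</other>
</article></entry>
]]></Listing>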

For further details of the DTD we refer to the file <F>bibxmlext.dtd</F>
itself, which is shown in Appendix <Ref Appendix="bibxmlextdtd"/>. That
file also recalls some information from the &BibTeX; documentation on how
the standard fields of entries should be used. Which entry types and
which fields are supported (and the ordering of the fields, which is
fixed by a DTD) can either be read off the DTD, or within &GAP; one can use
the function <Ref Func="TemplateBibXML"/> to get templates for the
various entry types.
<P/>

Here is an example of a BibXMLext document:
<Listing Type="doc/testbib.xml"><![CDATA[
<#Include SYSTEM "testbib.xml">
]]></Listing>

There is a standard XML header and a <C>DOCTYPE</C> declaration
referring to the <F>bibxmlext.dtd</F> DTD mentioned above. Local
entities can be defined in the <C>DOCTYPE</C> tag as shown in the
example in <Ref Subsect="GDent"/>. The actual content of the document is
inside a <C>&lt;file></C>-element; it consists of <C>&lt;string></C>- and
<C>&lt;entry></C>-elements. Several of the BibXMLext markup features are
shown. We will use this input document for some examples below.
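<P/>

As a hedged sketch of the overall layout just described, a minimal
document with one local entity could look as follows. The entity name
<C>Foo</C> and its replacement text are hypothetical, and whether a
<C>&lt;file></C>-element without any entries validates should be checked
against the DTD.
<Listing Type="Illustrative skeleton"><![CDATA[
<?xml version="1.0" encoding="UTF-8"?>
<!-- the internal subset in square brackets declares a local entity -->
<!DOCTYPE file SYSTEM "bibxmlext.dtd" [
  <!ENTITY Foo "the hypothetical Foo package">
]>
<file>
  <!-- <string>- and <entry>-elements go here;
       field text may then refer to the local entity as &Foo; -->
</file>
]]></Listing>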
</Section>
<Section Label="BibXMLtools">
<Heading>Utilities for BibXMLext data</Heading>

<Subsection Label="Subsect:IntroXMLBib">
<Heading>Translating &BibTeX; to BibXMLext</Heading>
First we describe a tool which can translate bibliography entries from
&BibTeX; data to BibXMLext <C>&lt;entry></C>-elements. It also does some
validation of the data. In some
cases it is desirable to improve the result by hand afterwards
(editing formulae, adding <C>&lt;URL></C>-elements, translating
non-ASCII characters to Unicode, ...).<P/>
See <Ref Func="WriteBibXMLextFile"/> below for how to write the results
to a BibXMLext file.
</Subsection>

<#Include Label="StringBibAsXMLext">

The following functions allow parsing of data which are already in
BibXMLext format.

<#Include Label="ParseBibXMLextString">

<#Include Label="WriteBibXMLextFile">

<Subsection Label="Subsect:RecBib">
<Heading>Bibliography Entries as Records</Heading>
For working with BibXMLext entries we find it convenient to first
translate the parse tree of an entry, as returned by <Ref
Func="ParseBibXMLextFiles"/>, to a record whose components are named
after the fields of the entry and hold the content of each field as a string.
These strings are generated with respect to a result type. The records are
generated by the following function, which can be customized by the user.
</Subsection>

<#Include Label="RecBibXMLEntry">

<#Include Label="AddHandlerBuildRecBibXMLEntry">

<#Include Label="StringBibXMLEntry">

The following command may be useful for generating completely new
bibliography entries in BibXMLext format. It also provides information
about the supported entry types and field names.

<#Include Label="TemplateBibXML">

</Section>

<#Include Label="SearchMRSection">

</Chapter>