Path: blob/aarch64-shenandoah-jdk8u272-b10/jaxp/src/org/w3c/dom/ls/LSSerializer.java
/*
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

/*
 * This file is available under and governed by the GNU General Public
 * License version 2 only, as published by the Free Software Foundation.
 * However, the following notice accompanied the original version of this
 * file and, per its terms, should not be removed:
 *
 * Copyright (c) 2004 World Wide Web Consortium,
 *
 * (Massachusetts Institute of Technology, European Research Consortium for
 * Informatics and Mathematics, Keio University). All Rights Reserved.
 * This work is distributed under the W3C(r) Software License [1] in the hope that
 * it will be useful, but WITHOUT ANY WARRANTY; without even the implied
 * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * [1] http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
 */

package org.w3c.dom.ls;

import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Node;
import org.w3c.dom.DOMException;

/**
 * An <code>LSSerializer</code> provides an API for serializing (writing) a
 * DOM document out into XML. The XML data is written to a string or an
 * output stream. Any changes or fixups made during the serialization affect
 * only the serialized data. The <code>Document</code> object and its
 * children are never altered by the serialization operation.
 * <p> During serialization of XML data, namespace fixup is done as defined in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>]
 * , Appendix B. [<a href='http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113'>DOM Level 2 Core</a>]
 * allows empty strings as a real namespace URI. If the
 * <code>namespaceURI</code> of a <code>Node</code> is an empty string, the
 * serialization will treat it as <code>null</code>, ignoring the prefix
 * if any.
 * <p> <code>LSSerializer</code> accepts any node type for serialization. For
 * nodes of type <code>Document</code> or <code>Entity</code>, well-formed
 * XML will be created when possible (well-formedness is guaranteed if the
 * document or entity comes from a parse operation and is unchanged since it
 * was created). The serialized output for these node types is either an
 * XML document or an External XML Entity, respectively, and is acceptable
 * input for an XML parser.
 * For all other types of nodes the serialized form
 * is implementation dependent.
 * <p>Within a <code>Document</code>, <code>DocumentFragment</code>, or
 * <code>Entity</code> being serialized, <code>Nodes</code> are processed as
 * follows:
 * <ul>
 * <li> <code>Document</code> nodes are written, including the XML
 * declaration (unless the parameter "xml-declaration" is set to
 * <code>false</code>) and a DTD subset, if one exists in the DOM. Writing a
 * <code>Document</code> node serializes the entire document.
 * </li>
 * <li>
 * <code>Entity</code> nodes, when written directly by
 * <code>LSSerializer.write</code>, output the entity expansion but no
 * namespace fixup is done. The resulting output will be valid as an
 * external entity.
 * </li>
 * <li> If the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-entities'>
 * entities</a>" is set to <code>true</code>, <code>EntityReference</code> nodes are
 * serialized as an entity reference of the form "
 * <code>&entityName;</code>" in the output. Child nodes (the expansion)
 * of the entity reference are ignored. If the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-entities'>
 * entities</a>" is set to <code>false</code>, only the children of the entity reference
 * are serialized. <code>EntityReference</code> nodes with no children (no
 * corresponding <code>Entity</code> node or the corresponding
 * <code>Entity</code> nodes have no children) are always serialized.
 * </li>
 * <li>
 * <code>CDATAsections</code> containing content characters that cannot be
 * represented in the specified output encoding are handled according to the
 * "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-split-cdata-sections'>
 * split-cdata-sections</a>" parameter.
 * If the parameter is set to <code>true</code>,
 * <code>CDATAsections</code> are split, and the unrepresentable characters
 * are serialized as numeric character references in ordinary content. The
 * exact position and number of splits are not specified. If the parameter
 * is set to <code>false</code>, unrepresentable characters in a
 * <code>CDATAsection</code> are reported as
 * <code>"wf-invalid-character"</code> errors if the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-well-formed'>
 * well-formed</a>" is set to <code>true</code>. The error is not recoverable - there is no
 * mechanism for supplying alternative characters and continuing with the
 * serialization.
 * </li>
 * <li> <code>DocumentFragment</code> nodes are serialized by
 * serializing the children of the document fragment in the order they
 * appear in the document fragment.
 * </li>
 * <li> All other node types (Element, Text,
 * etc.) are serialized to their corresponding XML source form.
 * </li>
 * </ul>
 * <p><b>Note:</b> The serialization of a <code>Node</code> does not always
 * generate a well-formed XML document, i.e. an <code>LSParser</code> might
 * throw fatal errors when parsing the resulting serialization.
 * <p> Within the character data of a document (outside of markup), any
 * characters that cannot be represented directly are replaced with
 * character references. Occurrences of '<' and '&' are replaced by
 * the predefined entities &lt; and &amp;. The other predefined
 * entities (&gt;, &apos;, and &quot;) might not be used, except
 * where needed (e.g. using &gt; in cases such as ']]>').
 * Any characters that cannot be represented directly in the output character
 * encoding are serialized as numeric character references (and since
 * character encoding standards commonly use hexadecimal representations of
 * characters, using the hexadecimal representation when serializing
 * character references is encouraged).
 * <p> To allow attribute values to contain both single and double quotes, the
 * apostrophe or single-quote character (') may be represented as
 * "&apos;", and the double-quote character (") as "&quot;". New
 * line characters and other characters that cannot be represented directly
 * in attribute values in the output character encoding are serialized as
 * numeric character references.
 * <p> Within markup, but outside of attributes, any occurrence of a character
 * that cannot be represented in the output character encoding is reported
 * as a <code>DOMError</code> fatal error. An example would be serializing
 * the element <LaCa\u00f1ada/> with <code>encoding="us-ascii"</code>.
 * This results in a <code>DOMError</code> of type
 * "wf-invalid-character-in-node-name" (as proposed in "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-well-formed'>
 * well-formed</a>").
 * <p> When requested by setting the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-normalize-characters'>
 * normalize-characters</a>" on <code>LSSerializer</code> to true, character normalization is
 * performed according to the definition of <a href='http://www.w3.org/TR/2004/REC-xml11-20040204/#dt-fullnorm'>fully
 * normalized</a> characters included in appendix E of [<a href='http://www.w3.org/TR/2004/REC-xml11-20040204/'>XML 1.1</a>] on all
 * data to be serialized, both markup and character data.
 * The character
 * normalization process affects only the data as it is being written; it
 * does not alter the DOM's view of the document after serialization has
 * completed.
 * <p> Implementations are required to support the encodings "UTF-8",
 * "UTF-16", "UTF-16BE", and "UTF-16LE" to guarantee that data is
 * serializable in all encodings that are required to be supported by all
 * XML parsers. When the encoding is UTF-8, whether or not a byte order mark
 * is serialized, or if the output is big-endian or little-endian, is
 * implementation dependent. When the encoding is UTF-16, whether or not the
 * output is big-endian or little-endian is implementation dependent, but a
 * Byte Order Mark must be generated for non-character outputs, such as
 * <code>LSOutput.byteStream</code> or <code>LSOutput.systemId</code>. If
 * the Byte Order Mark is not generated, a "byte-order-mark-needed" warning
 * is reported. When the encoding is UTF-16BE or UTF-16LE, the output is
 * big-endian (UTF-16BE) or little-endian (UTF-16LE), and the Byte Order Mark
 * is not generated. In all cases, the encoding declaration, if
 * generated, will correspond to the encoding used during the serialization
 * (e.g. <code>encoding="UTF-16"</code> will appear if UTF-16 was
 * requested).
 * <p> Namespaces are fixed up during serialization; the serialization process
 * will verify that namespace declarations, namespace prefixes and the
 * namespace URI associated with elements and attributes are consistent. If
 * inconsistencies are found, the serialized form of the document will be
 * altered to remove them.
 * The method used for doing the namespace fixup
 * while serializing a document is the algorithm defined in Appendix B.1,
 * "Namespace normalization", of [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>].
 * <p> While serializing a document, the parameter "discard-default-content"
 * controls whether or not non-specified data is serialized.
 * <p> While serializing, errors and warnings are reported to the application
 * through the error handler (<code>LSSerializer.domConfig</code>'s "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-error-handler'>
 * error-handler</a>" parameter). This specification does not attempt to define all possible
 * errors and warnings that can occur while serializing a DOM node, but some
 * common error and warning cases are defined. The types (
 * <code>DOMError.type</code>) of errors and warnings defined by this
 * specification are:
 * <dl>
 * <dt><code>"no-output-specified" [fatal]</code></dt>
 * <dd> Raised when
 * writing to a <code>LSOutput</code> if no output is specified in the
 * <code>LSOutput</code>. </dd>
 * <dt>
 * <code>"unbound-prefix-in-entity-reference" [fatal]</code> </dt>
 * <dd> Raised if the
 * configuration parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-namespaces'>
 * namespaces</a>" is set to <code>true</code> and an entity whose replacement text
 * contains unbound namespace prefixes is referenced in a location where
 * there are no bindings for the namespace prefixes. </dd>
 * <dt>
 * <code>"unsupported-encoding" [fatal]</code></dt>
 * <dd> Raised if an unsupported
 * encoding is encountered. </dd>
 * </dl>
 * <p> In addition to raising the defined errors and warnings, implementations
 * are expected to raise implementation specific errors and warnings for any
 * other error and warning cases such as IO errors (file not found,
 * permission denied,...)
 * and so on.
 * <p>See also the <a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-LS-20040407'>Document Object Model (DOM) Level 3 Load
 * and Save Specification</a>.
 */
public interface LSSerializer {
    /**
     * The <code>DOMConfiguration</code> object used by the
     * <code>LSSerializer</code> when serializing a DOM node.
     * <br> In addition to the parameters recognized by the <a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#DOMConfiguration'>
     * DOMConfiguration</a> interface defined in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>]
     * , the <code>DOMConfiguration</code> object for
     * <code>LSSerializer</code> adds, or modifies, the following
     * parameters:
     * <dl>
     * <dt><code>"canonical-form"</code></dt>
     * <dd>
     * <dl>
     * <dt><code>true</code></dt>
     * <dd>[<em>optional</em>] Writes the document according to the rules specified in [<a href='http://www.w3.org/TR/2001/REC-xml-c14n-20010315'>Canonical XML</a>].
     * In addition to the behavior described in "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-canonical-form'>
     * canonical-form</a>" [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>]
     * , setting this parameter to <code>true</code> will set the parameters
     * "format-pretty-print", "discard-default-content", and
     * "xml-declaration" to <code>false</code>. Setting one of those parameters to
     * <code>true</code> will set this parameter to <code>false</code>.
     * Serializing an XML 1.1 document when "canonical-form" is
     * <code>true</code> will generate a fatal error. </dd>
     * <dt><code>false</code></dt>
     * <dd>[<em>required</em>] (<em>default</em>) Do not canonicalize the output.
     * </dd>
     * </dl></dd>
     * <dt><code>"discard-default-content"</code></dt>
     * <dd>
     * <dl>
     * <dt>
     * <code>true</code></dt>
     * <dd>[<em>required</em>] (<em>default</em>) Use the <code>Attr.specified</code> attribute to decide what attributes
     * should be discarded. Note that some implementations might use
     * whatever information is available to the implementation (e.g. XML
     * schema, DTD, the <code>Attr.specified</code> attribute, and so on) to
     * determine what attributes and content to discard if this parameter is
     * set to <code>true</code>. </dd>
     * <dt><code>false</code></dt>
     * <dd>[<em>required</em>] Keep all attributes and all content.</dd>
     * </dl></dd>
     * <dt><code>"format-pretty-print"</code></dt>
     * <dd>
     * <dl>
     * <dt>
     * <code>true</code></dt>
     * <dd>[<em>optional</em>] Format the output by adding whitespace to produce a pretty-printed,
     * indented, human-readable form. The exact form of the transformations
     * is not specified by this specification. Pretty-printing changes the
     * content of the document and may affect the validity of the document;
     * validating implementations should preserve validity. </dd>
     * <dt>
     * <code>false</code></dt>
     * <dd>[<em>required</em>] (<em>default</em>) Don't pretty-print the result. </dd>
     * </dl></dd>
     * <dt>
     * <code>"ignore-unknown-character-denormalizations"</code> </dt>
     * <dd>
     * <dl>
     * <dt>
     * <code>true</code></dt>
     * <dd>[<em>required</em>] (<em>default</em>) If, while verifying full normalization when [<a href='http://www.w3.org/TR/2004/REC-xml11-20040204/'>XML 1.1</a>] is
     * supported, a character is encountered for which the normalization
     * properties cannot be determined, then raise a
     * <code>"unknown-character-denormalization"</code> warning (instead of
     * raising an error, if this parameter is not set) and ignore any
     * possible denormalizations caused by these characters.
     * </dd>
     * <dt>
     * <code>false</code></dt>
     * <dd>[<em>optional</em>] Report a fatal error if a character is encountered for which the
     * processor cannot determine the normalization properties. </dd>
     * </dl></dd>
     * <dt>
     * <code>"normalize-characters"</code></dt>
     * <dd> This parameter is equivalent to
     * the one defined by <code>DOMConfiguration</code> in [<a href='http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407'>DOM Level 3 Core</a>]
     * . Unlike in the Core, the default value for this parameter is
     * <code>true</code>. While DOM implementations are not required to
     * support <a href='http://www.w3.org/TR/2004/REC-xml11-20040204/#dt-fullnorm'>fully
     * normalizing</a> the characters in the document according to appendix E of [<a href='http://www.w3.org/TR/2004/REC-xml11-20040204/'>XML 1.1</a>], this
     * parameter must be activated by default if supported. </dd>
     * <dt>
     * <code>"xml-declaration"</code></dt>
     * <dd>
     * <dl>
     * <dt><code>true</code></dt>
     * <dd>[<em>required</em>] (<em>default</em>) If a <code>Document</code>, <code>Element</code>, or <code>Entity</code>
     * node is serialized, the XML declaration, or text declaration, should
     * be included. The version (<code>Document.xmlVersion</code> if the
     * document is a Level 3 document and the version is non-null, otherwise
     * use the value "1.0"), and the output encoding (see
     * <code>LSSerializer.write</code> for details on how to find the output
     * encoding) are specified in the serialized XML declaration. </dd>
     * <dt>
     * <code>false</code></dt>
     * <dd>[<em>required</em>] Do not serialize the XML and text declarations. Report a
     * <code>"xml-declaration-needed"</code> warning if this will cause
     * problems (i.e. the serialized data is of an XML version other than [<a href='http://www.w3.org/TR/2004/REC-xml-20040204'>XML 1.0</a>], or an
     * encoding would be needed to be able to re-parse the serialized data).
     * </dd>
     * </dl></dd>
     * </dl>
     */
    public DOMConfiguration getDomConfig();

    /**
     * The end-of-line sequence of characters to be used in the XML being
     * written out. Any string is supported, but XML treats only a certain
     * set of character sequences as end-of-line (See section 2.11,
     * "End-of-Line Handling" in [<a href='http://www.w3.org/TR/2004/REC-xml-20040204'>XML 1.0</a>], if the
     * serialized content is XML 1.0 or section 2.11, "End-of-Line Handling"
     * in [<a href='http://www.w3.org/TR/2004/REC-xml11-20040204/'>XML 1.1</a>], if the
     * serialized content is XML 1.1). Using other character sequences than
     * the recommended ones can result in a document that is either not
     * serializable or not well-formed.
     * <br> On retrieval, the default value of this attribute is the
     * implementation specific default end-of-line sequence. DOM
     * implementations should choose the default to match the usual
     * convention for text files in the environment being used.
     * Implementations must choose a default sequence that matches one of
     * those allowed by XML 1.0 or XML 1.1, depending on the serialized
     * content. Setting this attribute to <code>null</code> will reset its
     * value to the default value.
     * <br>
     */
    public String getNewLine();
    /**
     * The end-of-line sequence of characters to be used in the XML being
     * written out. Any string is supported, but XML treats only a certain
     * set of character sequences as end-of-line (See section 2.11,
     * "End-of-Line Handling" in [<a href='http://www.w3.org/TR/2004/REC-xml-20040204'>XML 1.0</a>], if the
     * serialized content is XML 1.0 or section 2.11, "End-of-Line Handling"
     * in [<a href='http://www.w3.org/TR/2004/REC-xml11-20040204/'>XML 1.1</a>], if the
     * serialized content is XML 1.1).
     * Using other character sequences than
     * the recommended ones can result in a document that is either not
     * serializable or not well-formed.
     * <br> On retrieval, the default value of this attribute is the
     * implementation specific default end-of-line sequence. DOM
     * implementations should choose the default to match the usual
     * convention for text files in the environment being used.
     * Implementations must choose a default sequence that matches one of
     * those allowed by XML 1.0 or XML 1.1, depending on the serialized
     * content. Setting this attribute to <code>null</code> will reset its
     * value to the default value.
     * <br>
     */
    public void setNewLine(String newLine);

    /**
     * When the application provides a filter, the serializer will call out
     * to the filter before serializing each Node. The filter implementation
     * can choose to remove the node from the stream or to terminate the
     * serialization early.
     * <br> The filter is invoked after the operations requested by the
     * <code>DOMConfiguration</code> parameters have been applied. For
     * example, CDATA sections won't be passed to the filter if "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-cdata-sections'>
     * cdata-sections</a>" is set to <code>false</code>.
     */
    public LSSerializerFilter getFilter();
    /**
     * When the application provides a filter, the serializer will call out
     * to the filter before serializing each Node. The filter implementation
     * can choose to remove the node from the stream or to terminate the
     * serialization early.
     * <br> The filter is invoked after the operations requested by the
     * <code>DOMConfiguration</code> parameters have been applied.
     * For example, CDATA sections won't be passed to the filter if "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-cdata-sections'>
     * cdata-sections</a>" is set to <code>false</code>.
     */
    public void setFilter(LSSerializerFilter filter);

    /**
     * Serialize the specified node as described above in the general
     * description of the <code>LSSerializer</code> interface. The output is
     * written to the supplied <code>LSOutput</code>.
     * <br> When writing to a <code>LSOutput</code>, the encoding is found by
     * looking at the encoding information that is reachable through the
     * <code>LSOutput</code> and the item to be written (or its owner
     * document) in this order:
     * <ol>
     * <li> <code>LSOutput.encoding</code>,
     * </li>
     * <li>
     * <code>Document.inputEncoding</code>,
     * </li>
     * <li>
     * <code>Document.xmlEncoding</code>.
     * </li>
     * </ol>
     * <br> If no encoding is reachable through the above properties, a
     * default encoding of "UTF-8" will be used. If the specified encoding
     * is not supported an "unsupported-encoding" fatal error is raised.
     * <br> If no output is specified in the <code>LSOutput</code>, a
     * "no-output-specified" fatal error is raised.
     * <br> The implementation is responsible for associating the appropriate
     * media type with the serialized data.
     * <br> When writing to an HTTP URI, an HTTP PUT is performed. When writing
     * to other types of URIs, the mechanism for writing the data to the URI
     * is implementation dependent.
     * @param nodeArg The node to serialize.
     * @param destination The destination for the serialized DOM.
     * @return Returns <code>true</code> if <code>node</code> was
     * successfully serialized.
     * Return <code>false</code> in case the
     * normal processing stopped but the implementation kept serializing
     * the document; the result of the serialization being implementation
     * dependent then.
     * @exception LSException
     * SERIALIZE_ERR: Raised if the <code>LSSerializer</code> was unable to
     * serialize the node. DOM applications should attach a
     * <code>DOMErrorHandler</code> using the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-error-handler'>
     * error-handler</a>" if they wish to get details on the error.
     */
    public boolean write(Node nodeArg,
                         LSOutput destination)
                         throws LSException;

    /**
     * A convenience method that acts as if <code>LSSerializer.write</code>
     * was called with a <code>LSOutput</code> with no encoding specified
     * and <code>LSOutput.systemId</code> set to the <code>uri</code>
     * argument.
     * @param nodeArg The node to serialize.
     * @param uri The URI to write to.
     * @return Returns <code>true</code> if <code>node</code> was
     * successfully serialized. Return <code>false</code> in case the
     * normal processing stopped but the implementation kept serializing
     * the document; the result of the serialization being implementation
     * dependent then.
     * @exception LSException
     * SERIALIZE_ERR: Raised if the <code>LSSerializer</code> was unable to
     * serialize the node. DOM applications should attach a
     * <code>DOMErrorHandler</code> using the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-error-handler'>
     * error-handler</a>" if they wish to get details on the error.
     */
    public boolean writeToURI(Node nodeArg,
                              String uri)
                              throws LSException;

    /**
     * Serialize the specified node as described above in the general
     * description of the <code>LSSerializer</code> interface. The output is
     * written to a <code>DOMString</code> that is returned to the caller.
     * The encoding used is the encoding of the <code>DOMString</code> type,
     * i.e.
     * UTF-16. Note that no Byte Order Mark is generated in a
     * <code>DOMString</code> object.
     * @param nodeArg The node to serialize.
     * @return Returns the serialized data.
     * @exception DOMException
     * DOMSTRING_SIZE_ERR: Raised if the resulting string is too long to
     * fit in a <code>DOMString</code>.
     * @exception LSException
     * SERIALIZE_ERR: Raised if the <code>LSSerializer</code> was unable to
     * serialize the node. DOM applications should attach a
     * <code>DOMErrorHandler</code> using the parameter "<a href='http://www.w3.org/TR/DOM-Level-3-Core/core.html#parameter-error-handler'>
     * error-handler</a>" if they wish to get details on the error.
     */
    public String writeToString(Node nodeArg)
                                throws DOMException, LSException;

}
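
Usage sketch (not part of LSSerializer.java): the interface above is normally obtained through DOMImplementationLS rather than instantiated directly. The example below is a minimal illustration, assuming the default JAXP parser's DOM implementation supports the "LS" 3.0 feature. The class name LSSerializerSketch, its helper methods, and the element name "greeting" are hypothetical names chosen for this example. It shows writeToString with "xml-declaration" switched off, and write to an LSOutput byte stream with an explicit UTF-8 encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical demo class; not part of the JDK.
public class LSSerializerSketch {

    /** Builds a one-element document whose text contains '<' and '&'. */
    static Document buildDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("greeting"); // illustrative name
        root.setTextContent("a < b & c");
        doc.appendChild(root);
        return doc;
    }

    /** Assumption: the DOM implementation exposes the Load and Save feature. */
    static DOMImplementationLS loadSave(Document doc) {
        return (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
    }

    /** Serializes to a String with the XML declaration suppressed. */
    static String toStringWithoutDeclaration(Document doc) {
        LSSerializer serializer = loadSave(doc).createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(doc);
    }

    /** Serializes to bytes; LSOutput.encoding takes precedence over the document's own. */
    static byte[] toUtf8Bytes(Document doc) {
        DOMImplementationLS ls = loadSave(doc);
        LSOutput output = ls.createLSOutput();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        output.setByteStream(bytes);
        output.setEncoding("UTF-8");
        ls.createLSSerializer().write(doc, output);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildDocument();
        System.out.println(toStringWithoutDeclaration(doc));
        System.out.println(new String(toUtf8Bytes(doc), StandardCharsets.UTF_8));
    }
}
```

In the String result, '<' and '&' in the character data should appear as the predefined entities, as the interface description states. A production caller would typically also register a DOMErrorHandler through the "error-handler" parameter to receive details on any serialization error.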