<?xml version="1.0" encoding="UTF-8" standalone="no"?> | |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | |
<html> | |
<head> | |
<title>ASF: XML Security Overview</title> | |
<meta http-equiv="content-type" content="text/html; charset=UTF-8" /> | |
<meta http-equiv="Content-Style-Type" content="text/css" /> | |
<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" /> | |
</head> | |
<!-- | |
* Licensed to the Apache Software Foundation (ASF) under one | |
* or more contributor license agreements. See the NOTICE file | |
* distributed with this work for additional information | |
* regarding copyright ownership. The ASF licenses this file | |
* to you under the Apache License, Version 2.0 (the "License"); | |
* you may not use this file except in compliance with the License. | |
* You may obtain a copy of the License at | |
* | |
* http://www.apache.org/licenses/LICENSE-2.0 | |
* | |
* Unless required by applicable law or agreed to in writing, software | |
* distributed under the License is distributed on an "AS IS" BASIS, | |
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |
* See the License for the specific language governing permissions and | |
* limitations under the License. | |
--> | |
<body> | |
<div id="title"> | |
<table class="HdrTitle"> | |
<tbody> | |
<tr> | |
<th rowspan="2"> | |
<a href="../index.html"> | |
<img alt="Trademark Logo" src="resources/XalanC-Logo-tm.png" width="190" height="90" /> | |
</a> | |
</th> | |
<th text-align="center" width="75%"> | |
<a href="index.html">Xalan-C/C++ Version 1.11</a> | |
</th> | |
</tr> | |
<tr> | |
<td valign="middle">XML Security Overview</td> | |
</tr> | |
</tbody> | |
</table> | |
<table class="HdrButtons" align="center" border="1"> | |
<tbody> | |
<tr> | |
<td> | |
<a href="http://www.apache.org">Apache Foundation</a> | |
</td> | |
<td> | |
<a href="http://xalan.apache.org">Xalan Project</a> | |
</td> | |
<td> | |
<a href="http://xerces.apache.org">Xerces Project</a> | |
</td> | |
<td> | |
<a href="http://www.w3.org/TR">Web Consortium</a> | |
</td> | |
<td> | |
<a href="http://www.oasis-open.org/standards">Oasis Open</a> | |
</td> | |
</tr> | |
</tbody> | |
</table> | |
</div> | |
<div id="navLeft"> | |
<ul> | |
<li> | |
<a href="resources.html">Resources</a> | |
<br /> | |
</li> | |
<li> | |
<a href="../index.html">Home</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="index.html">Xalan-C++ 1.11</a> | |
</li> | |
<li> | |
<a href="whatsnew.html">What's New</a> | |
</li> | |
<li> | |
<a href="license.html">Licenses</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="overview.html">Overview</a> | |
</li> | |
<li> | |
<a href="charter.html">Charter</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="download.html">Download</a> | |
</li> | |
<li> | |
<a href="buildlibs.html">Build Libraries</a> | |
</li> | |
<li> | |
<a href="install.html">Installation</a> | |
</li> | |
<li> | |
<a href="builddocs.html">Build Documents</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="samples.html">Sample Apps</a> | |
</li> | |
<li> | |
<a href="commandline.html">Command Line</a> | |
</li> | |
<li> | |
<a href="usagepatterns.html">Usage Patterns</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="programming.html">Programming</a> | |
</li> | |
<li> | |
<a href="extensions.html">Extensions</a> | |
</li> | |
<li> | |
<a href="extensionslib.html">Extensions Library</a> | |
</li> | |
<li> | |
<a href="apiDocs/index.html">API Reference</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="faq.html">Xalan-C FAQs</a> | |
</li></ul><hr /><ul> | |
<li> | |
<a href="whatsnew.html#bugs">Bugs</a> | |
</li> | |
<li> | |
<a href="http://xalan.apache.org/old/xalan-j/test/run.html#how-to-run-c">Testing</a> | |
</li> | |
<li>Web Security<br /> | |
</li> | |
</ul> | |
</div> | |
<div id="content"> | |
<h2>XML Security Overview</h2> | |
<ul> | |
<li> | |
<a href="#xsov_xmlParser">XML Parser Threats</a> | |
</li> | |
<li> | |
<a href="#xsov_resolvEntity">Resolving External Entities</a> | |
</li> | |
<li> | |
<a href="#xsov_trustEntity">Trusted External Entities</a> | |
</li> | |
<li> | |
<a href="#xsov_piThreat">Processing Instruction (PI) Threats</a> | |
</li> | |
<li> | |
<a href="#xsov_soapThreat">SOAP Simple Object Access Protocol</a> | |
</li> | |
<li> | |
<a href="#xsov_wsdlThreat">WSDL Web Service Description Language</a> | |
</li> | |
<li> | |
<a href="#xsov_uriThreat">URI Uniform Resource Identifiers</a> | |
</li> | |
<li> | |
<a href="#xsov_urlThreat">URL Uniform Resource Locators</a> | |
</li> | |
<li> | |
<a href="#xsov_malUtfStrings">Malformed UTF-8 and UTF-16 Strings</a> | |
</li> | |
<li> | |
<a href="#xsov_canonicalXML">Canonical XML Issues</a> | |
</li> | |
<li> | |
<a href="#xsov_xhtmlWorkaround">XHTML Output Mode - Workaround</a> | |
</li> | |
</ul> | |
<br /> | |
<p> | |
<b>This document goes well beyond XSLT. Use it as a general reference.</b> | |
</p> | |
<p>There are numerous security issues and problems that are | |
endemic to the XML architecture. | |
I will try to identify some of the most common issues and threats | |
and describe some mitigation strategies. | |
</p> | |
<p>The biggest threat issue is a matter of trust. | |
How well do you trust your sources of XML data? | |
What are the tools that can help increase the trust? | |
</p> | |
<p>Most Web Service communications use HTTP over standard TCP ports.
HTTP traffic on standard TCP ports typically passes freely through business firewalls.
How well do your proxy servers handle the Web Service security issues | |
required for your applications? | |
</p> | |
<p>How well are your resource identifiers protected? | |
How well do your applications cope with resource identifier spoofing? | |
Can your resource identifiers be trusted by outside clients? | |
Can you trust the credentials of your clients? | |
</p> | |
<p>Will the SOAP interface for your Web Service send error messages | |
to an untrusted Web Service address? | |
</p> | |
<p>Is your WSDL interface description file readily available for download, | |
thus enabling persons with malicious intent to create targeted attacks on your Web Services? | |
</p> | |
<p>Can you trust the client credentials that use your Web Service application? | |
</p> | |
<p>There are numerous security issues that are not directly involved in | |
the markup of XML or its processing. | |
These issues relate to infrastructure. | |
</p> | |
<p>Can you trust your DNS (Domain Name Service) and reduce its vulnerability to hijacking? | |
</p> | |
<p>Are your web servers hardened against known application vulnerabilities? | |
</p> | |
<p>Are your applications hardened against | |
cross site scripting and SQL injection? | |
</p> | |
<p>Can your client applications trust the scripts | |
that are transmitted as web pages? | |
</p> | |
<p>Can your web server trust the scripts that are submitted? | |
</p> | |
<p>Is application data sanitized before being consumed by your applications? | |
</p> | |
<a name="xsov_xmlParser"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>XML Parser Threats</h3> | |
<p>This list will help you find the XML threat vectors that need to be addressed. | |
Some vectors cannot be easily resolved. | |
</p> | |
<ul> | |
<li>Resolving External Entities</li> | |
<li>Implicit Trust of Internal DTD</li> | |
<li>Resource Identifier Spoofing</li> | |
<li>Malformed UTF-8 and UTF-16</li> | |
<li>Secure the trust of external DTD descriptions</li> | |
<li>Secure the trust of external Schema definitions</li> | |
<li>Secure the trust of entity import and include constructs</li> | |
<li>Configuration of Entity Resolver Catalogs</li> | |
</ul> | |
<a name="xsov_resolvEntity"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Resolving External Entities</h3> | |
<p>The XML 1.0 and XML 1.1 standards specify a <code>DOCTYPE</code> format.
Processing it may uncover significant entity resolver deficiencies.
</p> | |
<p> | |
<code><!DOCTYPE name PUBLIC "public-id" "system-id" [internal-DTD]></code> | |
<br /> | |
<code><!DOCTYPE name SYSTEM "system-id" [internal-DTD]></code> | |
</p> | |
<p>XML Parsers MUST process the <code>[internal-DTD]</code> if it exists. | |
</p> | |
<p>XML Parsers MAY process the external <code>"system-id"</code> if it can be found. | |
</p> | |
<p>XML Parsers MAY process the external <code>"public-id"</code> if it can be found. | |
</p> | |
<p>XML Parsers MAY prefer either the <code>"public-id"</code> or <code>"system-id"</code> | |
if both are specified. | |
</p> | |
<p>XML Parsers MAY ignore both the <code>"public-id"</code> and <code>"system-id"</code> | |
if present. | |
</p> | |
<p>Declaring a parameter entity in the <code>[internal-DTD]</code> and
referencing it as <code>"%entity;"</code> within the
<code>[internal-DTD]</code> will force the XML parser to import the content
referenced by the <code>"%entity;"</code> notation.
</p>
<p>Declaring a general entity in the <code>[internal-DTD]</code> and
referencing it as <code>"&entity;"</code> within the body of
the XML document will force the XML parser to import the content referenced
by the <code>"&entity;"</code> notation.
</p>
<p>The default method of resolving external entities is by resolving entity | |
name strings relative to DNS named hosts and/or path names relative to the | |
local computer system. When receiving XML documents from an outside source, | |
these entity reference locations may be unreachable, unreliable, or untrusted. | |
</p> | |
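<p>The entity declarations described above can be audited before a document is trusted.
The following is a minimal sketch in Python, not part of Xalan-C, using the standard
<code>xml.parsers.expat</code> module. The function name <code>list_external_entities</code>
and the sample document are assumptions made for illustration. Because no external entity
handler is installed, the declarations are only recorded and nothing is fetched.
</p>

```python
import xml.parsers.expat


def list_external_entities(xml_bytes):
    """Return (reference, system_id, public_id) for each external entity declared.

    Only the declarations are recorded. Nothing is fetched, because no
    ExternalEntityRefHandler is installed on the parser.
    """
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id,
                       public_id, notation_name):
        # Internal entities carry a value; external ones carry a system id.
        if system_id is not None:
            prefix = "%" if is_parameter else "&"
            found.append((prefix + name + ";", system_id, public_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)
    return found


# Hypothetical input: one internal entity and one external entity.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY greeting "hello">
  <!ENTITY secret SYSTEM "file:///etc/passwd">
]>
<doc>&greeting;</doc>
"""

print(list_external_entities(SAMPLE))
```

<p>An application could refuse any document for which this list is not empty,
or compare each system identifier against a trusted set.
</p>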
<p>Web Service SOAP XML documents MUST NOT have <code>DOCTYPE</code> definitions. | |
SOAP processors should not process DOCTYPE definitions. | |
The conformance is implementation dependent. | |
</p> | |
<p> | |
<a href="http://www.w3.org/TR/soap">http://www.w3.org/TR/soap</a> | |
</p> | |
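<p>A processor that must not accept <code>DOCTYPE</code> definitions can stop parsing
as soon as one is seen. This sketch uses Python's <code>xml.parsers.expat</code> module;
the names <code>DoctypeForbidden</code> and <code>parse_without_doctype</code> are
hypothetical and are not taken from any SOAP toolkit.
</p>

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def parse_without_doctype(xml_bytes):
    """Parse xml_bytes, refusing any document that declares a DOCTYPE.

    Returns the element names seen, in document order.
    """
    names = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort before the internal DTD or any entity is processed.
        raise DoctypeForbidden("DOCTYPE not allowed: " + name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_bytes, True)
    return names


print(parse_without_doctype(b"<Envelope><Body/></Envelope>"))
```

<p>The exception raised inside the handler propagates out of <code>Parse</code>,
so the caller can reject the message without examining the rest of it.
</p>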
<a name="xsov_trustEntity"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Trusted External Entities</h3> | |
<p>The <b> | |
<i>OASIS XML Catalogs</i> | |
</b> specification, if implemented by an application, | |
can specify a set of external entities that can be trusted by mapping known | |
identifiers to local or trusted resources. A secure application should | |
not trust entity identifiers whose resources cannot be localized and secured. | |
</p> | |
<p> | |
<a href="http://www.oasis-open.org/committees/entity">http://www.oasis-open.org/committees/entity</a> | |
</p> | |
<p>A similar method can be designed specifically for each application. | |
</p> | |
<p>A trusted application may need to pre-screen any entity definitions in XML | |
before passing the information into the core of the application. | |
</p> | |
<p>A trusted application should install some type of entity resolving catalog | |
or database that can be trusted. | |
</p> | |
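<p>An application-specific catalog can be as simple as a lookup table that maps known
identifiers to local resources and refuses everything else. The sketch below is written in
Python and does not implement the OASIS XML Catalogs file format. The class name
<code>TrustedCatalog</code> and the local path are assumptions for illustration.
</p>

```python
class UntrustedEntity(LookupError):
    """Raised when an identifier is not listed in the trusted catalog."""


class TrustedCatalog:
    """Map known public and system identifiers to local, trusted resources."""

    def __init__(self, public_ids=None, system_ids=None):
        self._public = dict(public_ids or {})
        self._system = dict(system_ids or {})

    def resolve(self, public_id, system_id):
        """Return the local resource for an identifier, or raise UntrustedEntity."""
        if public_id is not None and public_id in self._public:
            return self._public[public_id]
        if system_id is not None and system_id in self._system:
            return self._system[system_id]
        # Unknown identifiers are never fetched from their original location.
        raise UntrustedEntity("no trusted mapping for %r / %r" % (public_id, system_id))


# Hypothetical catalog entry; the local path is an assumption.
catalog = TrustedCatalog(
    public_ids={
        "-//W3C//DTD XHTML 1.0 Strict//EN": "/opt/catalog/xhtml1-strict.dtd",
    },
)

print(catalog.resolve("-//W3C//DTD XHTML 1.0 Strict//EN",
                      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"))
```

<p>The design choice here is to fail closed: an identifier that is not in the catalog
raises an error instead of falling back to a network or file system lookup.
</p>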
<a name="xsov_piThreat"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Processing Instruction (PI) Threats</h3> | |
<p>Processing instructions are a mechanism to send specific information | |
into an application. A common processing instruction is a | |
stylesheet declaration. | |
This information is part of an XML document and usually comes
after the XML header and before the root element.
</p> | |
<p>A stylesheet declaration may cause an application to look for an | |
untrusted XSLT stylesheet to use for transformation of the | |
following root element. A standard exists for associating style sheets with XML documents. | |
</p> | |
<p> | |
<a href="http://www.w3.org/TR/xml-stylesheet">http://www.w3.org/TR/xml-stylesheet</a> | |
</p> | |
<p>Examples in the xml-stylesheet recommendation describe how to use the
processing instruction to associate CSS stylesheets with XHTML.
Applications that use XSLT transformations will interpret the
xml-stylesheet processing instruction as the location of an
XSLT transformation stylesheet.
</p> | |
<p>As more processing instructions become standardized and commonly used,
the threat of their misuse increases.
</p> | |
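<p>One way to limit the stylesheet threat is to read the <code>xml-stylesheet</code>
processing instructions before any transformation and compare each <code>href</code>
with a set of trusted locations. The following Python sketch uses
<code>xml.parsers.expat</code>. The function names and the sample URLs are hypothetical,
and the pseudo-attribute parsing is deliberately simple: it does not decode character
references inside the value.
</p>

```python
import re
import xml.parsers.expat

# Matches href="..." or href='...' inside the processing instruction data.
_HREF = re.compile(r"""(?<![\w.-])href\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def stylesheet_hrefs(xml_bytes):
    """Return the href of every xml-stylesheet processing instruction."""
    hrefs = []

    def on_pi(target, data):
        if target != "xml-stylesheet":
            return
        match = _HREF.search(data)
        if match:
            value = match.group(1) if match.group(1) is not None else match.group(2)
            hrefs.append(value)

    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(xml_bytes, True)
    return hrefs


def untrusted_stylesheets(xml_bytes, trusted):
    """Return the stylesheet locations that are not in the trusted set."""
    return [href for href in stylesheet_hrefs(xml_bytes) if href not in trusted]


# Hypothetical document pointing at a stylesheet on another host.
DOC = b'<?xml-stylesheet type="text/xsl" href="http://attacker.example/x.xsl"?><doc/>'

print(untrusted_stylesheets(DOC, {"styles/site.xsl"}))
```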
<a name="xsov_soapThreat"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>SOAP Simple Object Access Protocol</h3> | |
<p>The SOAP specification explicitly forbids the transport of | |
DOCTYPE definitions and PI processing instructions. | |
</p> | |
<p>SOAP specifies a transport envelope that encapsulates
an XML message for transport. SOAP can also handle various
transmission status indicators implying confirmation of delivery,
error messages, and queue status messages.
SOAP transports can be loosely coupled and intermittent.
SOAP is used extensively in the design and deployment of Web Service architectures.
A companion Web Service specification is WSDL, the Web Service Description Language.
</p> | |
<p>The SOAP protocol as widely deployed by Microsoft and other vendors
is based on specifications that predate its adoption
by the <a href="http://www.w3.org">World Wide Web Consortium (W3C)</a>.
SOAP is not based on Microsoft technology.
It is an open specification submitted by UserLand, Ariba, Commerce One, Compaq,
DevelopMentor, HP, IBM, IONA, Lotus, Microsoft, and SAP.
<a href="http://www.w3.org/TR/2000/NOTE-SOAP-20000508">SOAP 1.1</a>
was submitted to the W3C in May 2000 and published as a W3C Note.
</p> | |
<p>The original <a href="http://www.w3.org/TR/soap11">SOAP 1.1</a> standard | |
is associated with this URI namespace prefix. | |
</p> | |
<p> | |
<code>http://schemas.xmlsoap.org/soap/</code> | |
</p> | |
<p>Naming conventions changed significantly when the W3C
adopted SOAP as a recommended standard.
The current iteration is <a href="http://www.w3.org/TR/soap12">SOAP 1.2</a>,
which is associated with this URI namespace prefix.
</p> | |
<p> | |
<code>http://www.w3.org/2003/05</code> | |
</p> | |
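<p>The two namespace prefixes make it possible to tell the SOAP versions apart by
inspecting the root element. The following Python sketch uses the standard
<code>xml.etree.ElementTree</code> module and the full envelope namespace URIs published
for SOAP 1.1 and SOAP 1.2. The function name <code>soap_version</code> is hypothetical,
and the sketch assumes the message has already been screened as untrusted input.
</p>

```python
import xml.etree.ElementTree as ET

# Envelope namespaces under the two prefixes named in the text.
SOAP11_ENVELOPE_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP12_ENVELOPE_NS = "http://www.w3.org/2003/05/soap-envelope"


def soap_version(xml_text):
    """Return "1.1", "1.2", or None when the root is not a SOAP Envelope."""
    root = ET.fromstring(xml_text)
    if root.tag == "{%s}Envelope" % SOAP11_ENVELOPE_NS:
        return "1.1"
    if root.tag == "{%s}Envelope" % SOAP12_ENVELOPE_NS:
        return "1.2"
    return None


MESSAGE = (
    '<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    "<env:Body/></env:Envelope>"
)

print(soap_version(MESSAGE))
```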
<p>The basic security threat to the SOAP architecture is
the ability to spoof Web Service addresses and tell a
SOAP server to respond to a rogue Web Service address
when a <code>mustUnderstand</code> attribute is processed
and an error indication is raised.
</p> | |
<p>Other intelligence that can be obtained might be the
location of a publicly accessible WSDL definition
of the messages being transported by SOAP,
thus allowing additional malware attacks to be automatically generated.
</p> | |
<a name="xsov_wsdlThreat"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>WSDL Web Service Description Language</h3> | |
<p>WSDL is known as the Web Service Description Language. | |
The WSDL XML document is an interface description that can be transformed
into various programming languages. | |
Such transformed interface descriptions are recognized as | |
Java Interfaces and C++ Virtual Classes. | |
</p> | |
<p>The original <a href="http://www.w3.org/TR/wsdl">WSDL 1.1</a> standard | |
is associated with this URI namespace prefix. | |
</p> | |
<p> | |
<code>http://schemas.xmlsoap.org/wsdl/</code> | |
</p> | |
<p>The current <a href="http://www.w3.org/TR/wsdl20">WSDL 2.0</a> standard
is maintained by the W3C under this namespace prefix.
</p> | |
<p> | |
<code>http://www.w3.org/</code> | |
</p> | |
<p>WSDL can provide a template for generating compliant Web Service systems
for multiple and heterogeneous platforms.
</p>
<p>A WSDL document that can benefit developers can also be used by malware
and hackers to tailor specific threats against targeted Web Services.
</p>
<p>SOA (Service Oriented Architecture),
SAAS (Software As A Service), and
PAAS (Platform As A Service) are families of
Web Services used as interfaces into what is
generally known as Cloud Computing.
</p> | |
<a name="xsov_uriThreat"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>URI Uniform Resource Identifiers</h3> | |
<p>The URI does not need to specify the location of a resource. | |
It merely provides a resource name. A catalog, database, | |
or other mechanism is used to map URIs to resource locations. | |
</p> | |
<p>The security issue here is that most URIs are used with a | |
DNS (Domain Name Service) to find a host and path to a resource. | |
The URI is then treated as a URL (Uniform Resource Locator). | |
</p> | |
<p>The mitigation of these threats requires diligence of the | |
application architects to ensure an appropriate level of trust | |
for the URIs and URLs used in their applications. | |
</p> | |
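<p>One piece of that diligence is to check a URL against an explicit set of trusted hosts
before an application connects or replies to it. This is a minimal Python sketch using
<code>urllib.parse</code>. The function name <code>is_trusted_url</code> and the host names
are assumptions, and a host check of this kind does not protect against a hijacked DNS.
</p>

```python
from urllib.parse import urlsplit


def is_trusted_url(url, trusted_hosts, allowed_schemes=("https",)):
    """Return True when url uses an allowed scheme and an exactly matching host."""
    parts = urlsplit(url)
    host = parts.hostname  # lower-cased, without user info or port
    if host is None:
        return False
    return parts.scheme in allowed_schemes and host in trusted_hosts


# Hypothetical trusted host.
TRUSTED = {"services.example.com"}

print(is_trusted_url("https://services.example.com/soap", TRUSTED))
print(is_trusted_url("https://services.example.com@attacker.example/", TRUSTED))
```

<p>Matching the whole host name, instead of searching for a substring, avoids accepting
look-alike addresses that merely contain the trusted name.
</p>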
<p>The transmission media is inherently untrusted. | |
Often SOAP bindings and HTTP transports are used. | |
Web Service addressing is readily spoofed. | |
</p> | |
<a name="xsov_urlThreat"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>URL Uniform Resource Locators</h3> | |
<p>See: <a href="#xsov_uriThreat">URI Uniform Resource Identifiers</a> | |
</p> | |
<a name="xsov_malUtfStrings"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Malformed UTF-8 and UTF-16 Strings</h3> | |
<p>Public Key Infrastructure (X.509) certificates are leased from a | |
certificate authority or are self-signed. | |
The distinguished names and parts thereof are usually rendered in Unicode.
</p> | |
<p>The value zero (U+0000) is not a permitted character in XML.
It is possible to create non-zero byte sequences that a lenient decoder equates to zero,
such as the overlong UTF-8 form 0xC0 0x80, which is not allowed.
Some attackers have successfully obtained wild-card PKI (X.509) certificates
by prepending an encoded zero to a distinguished name when applying for a certificate.
Such a certificate could be used to successfully sign anything.
</p> | |
<p>Applications should not blindly accept UTF-8 and UTF-16 strings | |
without verifying the proper encoding for those strings. | |
Contents that equate to bad Unicode character values should be denied. | |
</p> | |
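<p>The verification described above can be sketched in Python, whose built-in decoders
reject malformed sequences when used in strict mode. The function name
<code>decode_strict</code> is hypothetical. The sketch also refuses any string that
contains a zero character after decoding.
</p>

```python
def decode_strict(data, encoding="utf-8"):
    """Return the decoded text, or None when data is malformed or holds U+0000."""
    try:
        text = data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        # Overlong forms, lone surrogates, and truncated sequences end up here.
        return None
    if "\x00" in text:
        return None
    return text


print(decode_strict(b"www.example.com"))
print(decode_strict(b"\xc0\x80www.example.com"))  # overlong encoding of zero
```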
<a name="xsov_canonicalXML"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>Canonical XML Issues</h3> | |
<p>Canonical XML is a transformation of an XML document into a
canonical form useful for signing. | |
This is used in some Web Service security implementations. | |
</p> | |
<p>There are several areas where Canonical XML will create XML documents | |
that have severe application problems. | |
</p> | |
<p>The number values are rendered in Base-10 as decimal fractions. | |
The computations performed by computers are usually in Base-2 floating point arithmetic. | |
You therefore have truncation or roundoff issues when converting between | |
decimal fractions and Base-2 fractions. | |
</p> | |
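<p>The roundoff issue is easy to reproduce. The following Python sketch adds the decimal
strings "0.1" and "0.2" once as Base-2 floating point values and once as exact decimal
values. The helper names are hypothetical.
</p>

```python
from decimal import Decimal


def binary_sum(a_text, b_text):
    """Add two decimal strings after converting them to Base-2 floats."""
    return float(a_text) + float(b_text)


def decimal_sum(a_text, b_text):
    """Add two decimal strings using exact Base-10 arithmetic."""
    return Decimal(a_text) + Decimal(b_text)


# The float result differs from 0.3 because 0.1 and 0.2 have no exact Base-2 form.
print(binary_sum("0.1", "0.2") == 0.3)
print(decimal_sum("0.1", "0.2") == Decimal("0.3"))
```

<p>A value that is signed in its decimal text form and then recomputed in floating point
may therefore no longer match the signed text.
</p>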
<p>The canonical process may collapse whitespace and transform | |
multi-character line endings to single-character line endings. | |
When whitespace is significant, the canonical issues for signing can cause problems. | |
</p> | |
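<p>The effect of line-ending changes on a signature can be illustrated with a plain digest.
This Python sketch is not an implementation of Canonical XML. It only shows that the same
text hashes differently before and after its line endings are normalized.
The helper names are assumptions.
</p>

```python
import hashlib


def digest(text):
    """Return the SHA-256 hex digest of text encoded as UTF-8."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalize_line_endings(text):
    """Convert CR LF and bare CR line endings to a single LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


original = "<note>line one\r\nline two</note>"
normalized = normalize_line_endings(original)

# The digests differ, so a signature over one form does not verify the other.
print(digest(original) == digest(normalized))
```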
<p>It is possible to create XHTML documents that will not work with some browsers.
The empty <a/> anchor element is not allowed by many browsers,
therefore <a></a> is required.
Canonical XML itself writes an element with no content as a start-tag and end-tag pair,
but a general XML serializer may collapse such elements into empty-element tags.
The empty paragraph <p/> is disallowed. The <p></p> form is supported.
</p>
<p>The World Wide Web Consortium (W3C) has additional detailed discussion of | |
<a href="http://www.w3.org/TR/C14N-issues/">canonicalization issues</a>. | |
</p> | |
<a name="xsov_xhtmlWorkaround"></a> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
<h3>XHTML Output Mode - Workaround</h3> | |
<p>The Xalan-C/C++ library currently has no XHTML output mode.
Since XHTML must be well-formed XML, the natural choice is the XML output method.
</p> | |
<p>XHTML is based on HTML version 4. | |
</p> | |
<p>Empty elements declared by HTML-4 should have a space before the
trailing '/>' markup (for example <br /> and <hr />).
XML output mode does not normally emit this space when you use
<xsl:element name="br" /> in your stylesheet.
Most modern browsers accept the form with no space, but viewing the
browser source shows a warning condition.
</p> | |
<p>Non-empty elements declared by HTML-4 should not be rendered as empty XML elements.
If there is no content, the elements should be rendered with both a start-tag and end-tag
(for example <a name="xxx"></a>) instead of an XML empty-element.
XSLT processors usually create an empty-element
(for example <a name="xxx"/>) when the element being defined has no content
other than attributes.
</p> | |
<p>For XSLT processors creating XML documents for XHTML,
you can create what looks like an element with no content by including
the &#8204; character
(the zero-width non-joiner, often written as the entity &zwnj;)
as the element text content.
This also helps transitional browsers find the end tag.
</p> | |
<p> | |
<blockquote class="source"> | |
<pre> DTD <!ENTITY zwnj "&#8204;"> | |
<a name="marker">&zwnj;</a></pre> | |
</blockquote> | |
</p> | |
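<p>The same behaviour can be observed outside of XSLT. The following Python sketch uses
the <code>xml.etree.ElementTree</code> serializer, not Xalan-C, to show an element with no
content being collapsed, and the zero-width non-joiner keeping the start-tag and end-tag
apart. The function name <code>render_anchor</code> is hypothetical.
</p>

```python
import xml.etree.ElementTree as ET


def render_anchor(name, filler=None, collapse=True):
    """Serialize an <a name="..."> element, optionally with filler text."""
    elem = ET.Element("a", {"name": name})
    if filler is not None:
        elem.text = filler
    # The default us-ascii output writes non-ASCII text as character references.
    return ET.tostring(elem, short_empty_elements=collapse).decode("ascii")


print(render_anchor("marker"))                  # collapsed empty-element form
print(render_anchor("marker", collapse=False))  # explicit start-tag and end-tag
print(render_anchor("marker", "\u200c"))        # zero-width non-joiner as content
```

<p>With the filler character in place the serializer has real text content to write,
so it cannot emit the empty-element form.
</p>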
<p>Transitional XHTML is not usually well-formed XML. | |
It becomes a mix of HTML version 4 and XML markup. | |
Strict XHTML is required to be well-formed XML. | |
</p> | |
<p align="right" size="2"> | |
<a href="#content">(top)</a> | |
</p> | |
</div> | |
<div id="footer">Copyright © 1999-2012 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Sun 09/09/2012</div> | |
</div> | |
</body> | |
</html> |