| <?xml version="1.0" standalone="no"?> |
| <!DOCTYPE s1 SYSTEM "../../style/dtd/document.dtd"> |
| <!-- |
| * Licensed to the Apache Software Foundation (ASF) under one |
| * or more contributor license agreements. See the NOTICE file |
| * distributed with this work for additional information |
| * regarding copyright ownership. The ASF licenses this file |
| * to you under the Apache License, Version 2.0 (the "License"); |
| * you may not use this file except in compliance with the License. |
| * You may obtain a copy of the License at |
| * |
| * http://www.apache.org/licenses/LICENSE-2.0 |
| * |
| * Unless required by applicable law or agreed to in writing, software |
| * distributed under the License is distributed on an "AS IS" BASIS, |
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| * See the License for the specific language governing permissions and |
| * limitations under the License. |
| --> |
| |
| <s1 title="XML Security Overview"> |
| <ul> |
| <li><link anchor="xsov_xmlParser">XML Parser Threats</link></li> |
| <li><link anchor="xsov_resolvEntity">Resolving External Entities</link></li> |
| <li><link anchor="xsov_trustEntity">Trusted External Entities</link></li> |
| <li><link anchor="xsov_piThreat">Processing Instruction (PI) Threats</link></li> |
| <li><link anchor="xsov_soapThreat">SOAP Simple Object Access Protocol</link></li> |
| <li><link anchor="xsov_wsdlThreat">WSDL Web Service Description Language</link></li> |
| <li><link anchor="xsov_uriThreat">URI Uniform Resource Identifiers</link></li> |
| <li><link anchor="xsov_urlThreat">URL Uniform Resource Locators</link></li> |
| <li><link anchor="xsov_malUtfStrings">Malformed UTF-8 and UTF-16 Strings</link></li> |
| <li><link anchor="xsov_canonicalXML">Canonical XML Issues</link></li> |
| <li><link anchor="xsov_xhtmlWorkaround">XHTML Output Mode - Workaround</link></li> |
| </ul> |
| |
| <br/> |
| <p><em>This document goes well beyond XSLT. Use it as a general reference.</em> |
| </p> |
| <p>There are numerous security issues and problems that are |
| endemic to the XML architecture. |
| I will try to identify some of the most common issues and threats |
| and describe some mitigation strategies. |
| </p> |
| <p>The biggest threat issue is a matter of trust. |
| How well do you trust your sources of XML data? |
| What are the tools that can help increase the trust? |
| </p> |
| <p>Most Web Service communications uses HTTP over standard TCP ports. |
| The HTTP protocol on standard TCP ports has free access through business firewalls. |
| How well do your proxy servers handle the Web Service security issues |
| required for your applications? |
| </p> |
| <p>How well are your resource identifiers protected? |
| How well do your applications cope with resource identifier spoofing? |
| Can your resource identifiers be trusted by outside clients? |
| Can you trust the credentials of your clients? |
| </p> |
| <p>Will the SOAP interface for your Web Service send error messages |
| to an untrusted Web Service address? |
| </p> |
| <p>Is your WSDL interface description file readily available for download, |
| thus enabling persons with malicious intent to create targeted attacks on your Web Services? |
| </p> |
| <p>Can you trust the client credentials that use your Web Service application? |
| </p> |
| <p>There are numerous security issues that are not directly involved in |
| the markup of XML or its processing. |
| These issues relate to infrastructure. |
| </p> |
| <p>Can you trust your DNS (Domain Name Service) and reduce its vulnerability to hijacking? |
| </p> |
| <p>Are your web servers hardened against known application vulnerabilities? |
| </p> |
| <p>Are your applications hardened against |
| cross site scripting and SQL injection? |
| </p> |
| <p>Can your client applications trust the scripts |
| that are transmitted as web pages? |
| </p> |
| <p>Can your web server trust the scripts that are submitted? |
| </p> |
| <p>Is application data sanitized before being consumed by your applications? |
| </p> |
| |
| <anchor name="xsov_xmlParser"/> |
| <s2 title="XML Parser Threats"> |
| |
| <p>This list will help you find the XML threat vectors that need to be addressed. |
| Some vectors cannot be easily resolved. |
| </p> |
| <ul> |
| <li>Resolving External Entities</li> |
| <li>Implicit Trust of Internal DTD</li> |
| <li>Resource Identifier Spoofing</li> |
| <li>Malformed UTF-8 and UTF-16</li> |
| <li>Secure the trust of external DTD descriptions</li> |
| <li>Secure the trust of external Schema definitions</li> |
| <li>Secure the trust of entity import and include constructs</li> |
| <li>Configuration of Entity Resolver Catalogs</li> |
| </ul> |
| </s2> |
| |
| <anchor name="xsov_resolvEntity"/> |
| <s2 title="Resolving External Entities"> |
| |
| <p>The XML1.0 and XML1.1 standards specify a <code>DOCTYPE</code> format. |
| The processing may uncover significant entity resolver deficiencies. |
| </p> |
| |
| <p><code><!DOCTYPE name PUBLIC "public-id" "system-id" [internal-DTD]></code><br/> |
| <code><!DOCTYPE name SYSTEM "system-id" [internal-DTD]></code> |
| </p> |
| <p>XML Parsers MUST process the <code>[internal-DTD]</code> if it exists. |
| </p> |
| <p>XML Parsers MAY process the external <code>"system-id"</code> if it can be found. |
| </p> |
| <p>XML Parsers MAY process the external <code>"public-id"</code> if it can be found. |
| </p> |
| <p>XML Parsers MAY prefer either the <code>"public-id"</code> or <code>"system-id"</code> |
| if both are specified. |
| </p> |
| <p>XML Parsers MAY ignore both the <code>"public-id"</code> and <code>"system-id"</code> |
| if present. |
| </p> |
| <p>Declaring a parameter entity notation <code>"%entity;"</code> |
| in the <code>[internal-DTD]</code> and expanding the content within the |
| <code>[internal-DTD]</code> will force the XML parser to import the content |
| referenced by the <code>"%entity;"</code> notation. |
| </p> |
| <p>Declaring a general entity notation <code>"&entity;"</code> in the |
| <code>[internal-DTD]</code> and expanding the content within the body of |
| the XML document will force the XML parser to import the content referenced |
| by the <code>"&entity"</code> notation. |
| </p> |
| <p>The default method of resolving external entities is by resolving entity |
| name strings relative to DNS named hosts and/or path names relative to the |
| local computer system. When receiving XML documents from an outside source, |
| these entity reference locations may be unreachable, unreliable, or untrusted. |
| </p> |
| <p>Web Service SOAP XML documents MUST NOT have <code>DOCTYPE</code> definitions. |
| SOAP processors should not process DOCTYPE definitions. |
| The conformance is implementation dependent. |
| </p> |
| <p><jump href="http://www.w3.org/TR/soap">http://www.w3.org/TR/soap</jump> |
| </p> |
| </s2> |
| |
| <anchor name="xsov_trustEntity"/> |
| <s2 title="Trusted External Entities"> |
| |
| <p>The <ref>OASIS XML Catalogs</ref> specification, if implemented by an application, |
| can specify a set of external entities that can be trusted by mapping known |
| identifiers to local or trusted resources. A secure application should |
| not trust entity identifiers whose resources cannot be localized and secured. |
| </p> |
| <p><jump href="http://www.oasis-open.org/committees/entity">http://www.oasis-open.org/committees/entity</jump> |
| </p> |
| <p>A similar method can be designed specifically for each application. |
| </p> |
| <p>A trusted application may need to pre-screen any entity definitions in XML |
| before passing the information into the core of the application. |
| </p> |
| <p>A trusted application should install some type of entity resolving catalog |
| or database that can be trusted. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_piThreat"/> |
| <s2 title="Processing Instruction (PI) Threats"> |
| |
| <p>Processing instructions are a mechanism to send specific information |
| into an application. A common processing instruction is a |
| stylesheet declaration. |
| This information is part of an XML document and comes usually |
| after the XML header and before the root element. |
| </p> |
| <p>A stylesheet declaration may cause an application to look for an |
| untrusted XSLT stylesheet to use for transformation of the |
| following root element. A standard exists for associating style sheets with XML documents. |
| </p> |
| <p><jump href="http://www.w3.org/TR/xml-stylesheet">http://www.w3.org/TR/xml-stylesheet</jump> |
| </p> |
| <p>Examples in the xml-stylesheet recommendation describes how to use the |
| processing instruction to associate CSS stylesheets for XHTML. |
| Applications that use XSLT transformations will interpret the |
| xml-stylesheet processing instruction as the location of a |
| XSLT transformation stylesheet. |
| </p> |
| <p>As more processing instructions become standardized and in common use, |
| their threat of misuse increases. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_soapThreat"/> |
| <s2 title="SOAP Simple Object Access Protocol"> |
| |
| <p>The SOAP specification explicitly forbids the transport of |
| DOCTYPE definitions and PI processing instructions. |
| </p> |
| <p>The SOAP specifies a transport envelope that encapsulates |
| an XML message for transport. SOAP can also handle various |
| transmission status indicators implying confirmation of delivery, |
| error messages, and queue status messages. |
| SOAP transports can be loosely coupled and intermittent. |
| SOAP is used extensively in the design and deployment of Web Service architectures. |
| A companion Web Service specification is WSDL, the Web Service Definition Language. |
| </p> |
| <p>The SOAP protocol as widely deployed by Microsoft and other vendors |
| is based on specifications that predate the adoption |
| by the <jump href="http://www.w3.org">World Wide Web Consortium (W3C)</jump>. |
| SOAP is not based on Microsoft technology. |
| It is an open standard drafted by UserLand, Ariba, Commerce One, Compaq, |
| Developmentor, HP, IBM, IONA, Lotus, Microsoft, and SAP. |
| <jump href="http://www.w3.org/TR/2000/NOTE-SOAP-20000508">SOAP 1.1</jump> |
| was presented to the W3C in May 2000 as an official Internet standard. |
| </p> |
| <p>The original <jump href="http://www.w3.org/TR/soap11">SOAP 1.1</jump> standard |
| is associated with this URI namespace prefix. |
| </p> |
| <p><code>http://schemas.xmlsoap.org/soap/</code> |
| </p> |
| <p>There are significant changes in naming conventions since SOAP 1.1 |
| was adopted by W3C as a recommended standard. |
| The current iteration is <jump href="http://www.w3.org/TR/soap12">SOAP 1.2</jump> |
| and is associated with this URI namespace prefix. |
| </p> |
| <p><code>http://www.w3.org/2003/05</code> |
| </p> |
| <p>The basic security threat to the SOAP architecture is |
| the ability to spoof Web Service addresses and telling a |
| SOAP server to respond to a rogue Web Service address |
| when a <code>mustUnderstand</code> attribute is processed |
| and an error indication is raised. |
| </p> |
| <p>Other intelligence that can be obtained might be the |
| location of a public accessible WSDL definition |
| of the messages being transported by SOAP, |
| thus allowing additional malware attacks to be automatically generated. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_wsdlThreat"/> |
| <s2 title="WSDL Web Service Description Language"> |
| |
| <p>WSDL is known as the Web Service Description Language. |
| The WSDL XML document is a an interface description that can be transformed |
| into various programming languages. |
| Such transformed interface descriptions are recognized as |
| Java Interfaces and C++ Virtual Classes. |
| </p> |
| <p>The original <jump href="http://www.w3.org/TR/wsdl">WSDL 1.1</jump> standard |
| is associated with this URI namespace prefix. |
| </p> |
| <p><code>http://schemas.xmlsoap.org/wsdl/</code> |
| </p> |
| <p>The current <jump href="http://www.w3.org/TR/wsdl20">WSDL 2.0</jump> standard |
| is maintained by W3C in their namespace with prefix. |
| </p> |
| <p><code>http://www.w3.org/</code> |
| </p> |
| <p>The WSDL can provide a template for generating a compliant Web Service systems |
| for multiple and hetrogeneous platforms. |
| </p> |
| <p>A WSDL document that can benefit developers can also be used by malware |
| and hackers to taylor specific threats against targeted Web Services. |
| </p> |
| <p>The SOA (Service Oriented Architecure), |
| SAAS (Software As A Service), |
| PAAS (Platform As A Service) are families of |
| Web Services used as interfaces into what is |
| generally known as Cloud Computing. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_uriThreat"/> |
| <s2 title="URI Uniform Resource Identifiers"> |
| |
| <p>The URI does not need to specify the location of a resource. |
| It merely provides a resource name. A catalog, database, |
| or other mechanism is used to map URIs to resource locations. |
| </p> |
| <p>The security issue here is that most URIs are used with a |
| DNS (Domain Name Service) to find a host and path to a resource. |
| The URI is then treated as a URL (Uniform Resource Locator). |
| </p> |
| <p>The mitigation of these threats requires diligence of the |
| application architects to ensure an appropriate level of trust |
| for the URIs and URLs used in their applications. |
| </p> |
| <p>The transmission media is inherently untrusted. |
| Often SOAP bindings and HTTP transports are used. |
| Web Service addressing is readily spoofed. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_urlThreat"/> |
| <s2 title="URL Uniform Resource Locators"> |
| |
| <p>See: <link anchor="xsov_uriThreat">URI Uniform Resource Identifiers</link> |
| </p> |
| </s2> |
| |
| <anchor name="xsov_malUtfStrings"/> |
| <s2 title="Malformed UTF-8 and UTF-16 Strings"> |
| |
| <p>Public Key Infrastructure (X.509) certificates are leased from a |
| certificate authority or are self-signed. |
| The distinguished names and parts thereof are usually rendered in unicode. |
| </p> |
| <p>The value of zero is not a valid Unicode character. |
| It is possible to create non-zero UTF-8 and UTF-16 sequences that equate to zero, |
| which is not allowed. |
| Some rogue hackers have successfully obtained wild-card PKI (X.509) certificates |
| by prepending a UTF-8(zero) in a distinguished name when applying for a certificate. |
| Such a certificate could be used to successfully sign anything. |
| </p> |
| <p>Applications should not blindly accept UTF-8 and UTF-16 strings |
| without verifying the proper encoding for those strings. |
| Contents that equate to bad Unicode character values should be denied. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_canonicalXML"/> |
| <s2 title="Canonical XML Issues"> |
| |
| <p>Canonical XML is a tranformation of an XML document into a |
| canonical form useful for signing. |
| This is used in some Web Service security implementations. |
| </p> |
| <p>There are several areas where Canonical XML will create XML documents |
| that have severe application problems. |
| </p> |
| <p>The number values are rendered in Base-10 as decimal fractions. |
| The computations performed by computers are usually in Base-2 floating point arithmetic. |
| You therefore have truncation or roundoff issues when converting between |
| decimal fractions and Base-2 fractions. |
| </p> |
| <p>The canonical process may collapse whitespace and transform |
| multi-character line endings to single-character line endings. |
| When whitespace is significant, the canonical issues for signing can cause problems. |
| </p> |
| <p>It is possible to create XHTML documents that will not work with some browsers. |
| The empty <a/> anchor element is not allowed by many browsers, |
| therefore <a></a> is required. |
| A standard XML canonical process may collapse elements with no content into empty elements. |
| The empty paragraph<p/> is disallowed. The <p></p> is supported. |
| </p> |
| <p>The World Wide Web Consortium (W3C) has additional detailed discussion of |
| <jump href="http://www.w3.org/TR/C14N-issues/">canonicalization issues</jump>. |
| </p> |
| </s2> |
| |
| <anchor name="xsov_xhtmlWorkaround"/> |
| <s2 title="XHTML Output Mode - Workaround"> |
| |
| <p>The Xalan-C/C++ library currently has no XHTML output mode. |
| Since XHTML is to be well-formed XML, the desire is to use the XML output method. |
| </p> |
| <p>XHTML is based on HTML version 4. |
| </p> |
| <p>Empty elements declared by HTML-4 should have a space before the |
| trailing '/>' markup (i.e. <br /> and <hr />). |
| XML output mode does not normally have this space when using |
| the <xsl:element name="br" /> in your stylesheet. |
| Most modern browsers are ok with no space, but viewing the |
| browser source shows a warning condition. |
| </p> |
| <p>Non-empty elements declared by HTML-4 should not be rendered as empty XML elements. |
| If there is no content, the elements should be rendered with both a start-tag and end-tag |
| (i.e. <a name="xxx"></a>) instead of an XML empty-element. |
| XSLT processors usually create an empty-element |
| (i.e. <a name="xxx"/>) when the element being defined has no content |
| other than attributes. |
| </p> |
| <p>For XSLT processors creating XML documents for XHTML, |
| you can create what looks like an element with no content by including |
| the &#8204; character |
| (a zero-width non-joining character often known as &zwnj;) |
| as the element text content. |
| This also allows transitional browsers the ability to find the end tag. |
| </p> |
| <p><source> DTD <!ENTITY zwnj "&#8204;"> |
| |
| <a name="marker">&zwnj;</a></source> |
| </p> |
| <p>Transitional XHTML is not usually well-formed XML. |
| It becomes a mix of HTML version 4 and XML markup. |
| Strict XHTML is required to be well-formed XML. |
| </p> |
| </s2> |
| </s1> |