<html>
<head>
<meta http-equiv="content-type" content="">
<title>Sending Binary Data with SOAP</title>
</head>
<body>
<h1>Sending Binary Data with SOAP</h1>
<ul>
<li><a href="#1">Introduction</a></li>
<li><a href="#2">MTOM with Axis2 </a>
<ul>
<li><a href="#21">Programming Model</a></li>
<li><a href="#22">Enabling MTOM optimization at client side</a></li>
<li><a href="#23">Enabling MTOM optimization at server side</a></li>
<li><a href="#24">Accessing received Binary Data (Sample Code) </a>
<ul>
<li><a href="#241">Service</a></li>
<li><a href="#242">Client</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#3">SOAP with Attachments with Axis2</a></li>
<li><a href="#4">Advanced Topics </a>
<ul>
<li><a href="#41">File Caching for Attachments</a></li>
</ul>
</li>
</ul>
<p><a name="1"></a></p>
<h2>Introduction</h2>
<p>Despite the flexibility, interoperability and global acceptance of XML,
there are times when serializing data into XML does not make sense. Web
services users may need to transmit binary attachments of various sorts, such
as images, drawings and XML documents, together with a SOAP message. Such data
is often in a particular binary format.<br/>
Traditionally, two techniques have been used to deal with opaque data in
XML:</p>
<ol>
<li><strong>"By value"</strong>
<blockquote>
<p>Sending binary data by value is achieved by embedding the opaque data
(after some form of encoding) as element or attribute content of the XML
component of the data. The main advantage of this technique is that
applications can process and describe the data by looking only at the XML
component of the data.</p>
<p>XML supports opaque data as content through the use of either base64
or hexadecimal text encoding. Both techniques bloat the data. With UTF-8
as the underlying text encoding, base64 encoding increases the size of the
binary data to about 1.33 times the original size, while hexadecimal
encoding doubles it. These factors double again if UTF-16 text encoding is
used. Also of concern is the overhead in processing costs (both real and
perceived) for these formats, especially when decoding back into raw
binary.</p>
</blockquote>
</li>
<li><strong>"By reference"</strong>
<blockquote>
<p>Sending binary data by reference is achieved by attaching pure
binary data as external unparsed general entities outside of the XML
document and then embedding reference URIs to those entities as
element or attribute values. This avoids the unnecessary bloating of
data and the waste of processing power. The primary obstacle to using
these unparsed entities is their heavy reliance on DTDs, which impedes
modularity as well as the use of XML namespaces.</p>
<p>Several specifications were introduced in the Web services
world to deal with this binary attachment problem using the "by
reference" technique. <a
href="http://www.w3.org/TR/SOAP-attachments">SOAP with Attachments</a>
is one such example. Since SOAP prohibits document type declarations
(DTDs) in messages, the attached data cannot be represented as part of
the message infoset, which creates two data models. This scenario
is like sending attachments with an e-mail message: even though the
attachments are related to the message content, they are not inside the
message. As a result, technologies that process and describe data based
on the XML component of the data malfunction. WS-Security is one
example.</p>
</blockquote>
</li>
</ol>
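<p>The size factors quoted for the "by value" encodings can be checked with a
few lines of plain Java. The following is a minimal sketch that uses only the
JDK; the class name <code>EncodingOverhead</code> and the 3000-byte sample
buffer are illustrative assumptions and are not part of Axis2.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not part of Axis2) that measures how much
// base64 and hexadecimal text encodings expand opaque binary data.
class EncodingOverhead {

    // Base64 text for the given bytes: 4 characters per 3 input bytes.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Hexadecimal text for the given bytes: 2 characters per input byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString().toUpperCase();
    }

    public static void main(String[] args) {
        byte[] data = new byte[3000]; // stand-in for opaque binary content
        String b64 = toBase64(data);
        String hex = toHex(data);
        System.out.println("raw bytes:        " + data.length);
        System.out.println("base64 in UTF-8:  " + b64.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("hex in UTF-8:     " + hex.getBytes(StandardCharsets.UTF_8).length);
        // UTF_16BE is used so that no byte order mark is counted.
        System.out.println("base64 in UTF-16: " + b64.getBytes(StandardCharsets.UTF_16BE).length);
        System.out.println("hex in UTF-16:    " + hex.getBytes(StandardCharsets.UTF_16BE).length);
    }
}
```

<p>For the 3000-byte buffer the encoded sizes are 4000 and 6000 bytes in
UTF-8, and 8000 and 12000 bytes in UTF-16, which matches the factors given
above.</p>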
<p><a href="http://www.w3.org/TR/2004/PR-soap12-mtom-20041116/">MTOM (SOAP
Message Transmission Optimization Mechanism)</a> is another specification
that focuses on solving the "attachments" problem. MTOM tries to combine
the advantages of the two techniques above.
MTOM is actually a "by reference" method. The wire format of an MTOM optimized
message is the same as that of a SOAP with Attachments message, which also
makes it backward compatible with SwA endpoints. The most notable feature of
MTOM is the use of the xop:Include element, which is defined in the <a
href="http://www.w3.org/TR/2004/PR-xop10-20041116/">XML Binary Optimized
Packaging (XOP)</a> specification, to reference the binary attachments
(external unparsed general entities) of the message. With the use of this
element, the attached binary content logically becomes inline (by
value) with the SOAP document even though it is actually attached separately.
This merges the two realms by making it possible to work with only one data
model. Applications can then process and describe the data by looking only
at the XML part, which removes the reliance on DTDs. On a lighter note, MTOM
has standardized the referencing mechanism of SwA. The following is an extract
from the <a href="http://www.w3.org/TR/2004/PR-xop10-20041116/">XOP</a>
specification.</p>
<p><em>At the conceptual level, this binary data can be thought of as being
base64-encoded in the XML Document. As this conceptual form might be needed
during some processing of the XML Document (e.g., for signing the XML
document), it is necessary to have a one to one correspondence between XML
Infosets and XOP Packages. Therefore, the conceptual representation of such
binary data is as if it were base64-encoded, using the canonical lexical form
of XML Schema base64Binary datatype (see <a href="#XMLSchemaP2">[XML Schema
Part 2: Datatypes Second Edition] </a><a
href="http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#base64Binary">3.2.16
base64Binary</a>). In the reverse direction, XOP is capable of optimizing
only base64-encoded Infoset data that is in the canonical lexical
form.</em></p>
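<p>The canonical-form requirement in the extract can be illustrated without
any Axis2 classes. The sketch below treats base64 text as canonical when it is
exactly what a standard encoder would produce for the decoded bytes (no
whitespace, no line breaks). The class <code>CanonicalBase64</code> is a
hypothetical helper for illustration only; it is not how Axis2 or AXIOM
performs this check.</p>

```java
import java.util.Base64;

// Hypothetical helper (not part of Axis2): checks whether base64 text is
// in the form a standard encoder emits, i.e. without whitespace or line
// breaks and with normal padding.
class CanonicalBase64 {

    static boolean isCanonical(String text) {
        try {
            // The MIME decoder ignores characters outside the base64
            // alphabet, so whitespace does not make decoding fail.
            byte[] decoded = Base64.getMimeDecoder().decode(text);
            // Re-encoding yields the whitespace-free form; only text that
            // already has that form compares equal.
            return Base64.getEncoder().encodeToString(decoded).equals(text);
        } catch (IllegalArgumentException e) {
            return false; // not decodable as base64 at all
        }
    }

    public static void main(String[] args) {
        System.out.println(isCanonical("SGVsbG8="));   // "Hello", canonical
        System.out.println(isCanonical("SGVs bG8="));  // embedded space
        System.out.println(isCanonical("SGVs\nbG8=")); // embedded line break
    }
}
```

<p>Only the first string is reported as canonical. Text in the other two
forms would have to be normalized before it could be treated as optimizable
binary content.</p>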
<p>Apache Axis2 supports <strong>Base64 encoding</strong>, <strong>SOAP with
Attachments</strong> &amp; <strong>MTOM (SOAP Message Transmission
Optimization Mechanism).</strong></p>
<p><a name="2"></a></p>
<h1>MTOM with Axis2</h1>
<p><a name="21"></a></p>
<h2>Programming Model</h2>
<p>AXIOM is an object model (and may be the first) that has the ability to hold
binary data. It has been given this ability by allowing OMText to hold raw
binary content in the form of a javax.activation.DataHandler. OMText was
chosen for this purpose for two reasons. One is that XOP (MTOM) is capable
of optimizing only base64-encoded Infoset data that is in the canonical
lexical form of the XML Schema base64Binary datatype. The other is to preserve
the infoset at both the sender and the receiver (the binary content is stored
in the same kind of object regardless of whether it is optimized or not).</p>
<p>MTOM allows portions of the message to be encoded selectively, so a single
SOAP message can carry base64-encoded data as well as externally attached raw
binary data referenced by an "XOP" element (optimized content).
The user can specify whether an OMText node that contains raw binary data or
base64-encoded binary data qualifies for optimization, either when the node is
constructed or later. To get the most out of MTOM, users are advised to send
smaller binary attachments using base64 encoding (non-optimized) and larger
attachments as optimized content.</p>
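<p>One way to reason about "smaller" versus "larger" is to compare the size of
the base64 text with the size of the raw bytes plus the fixed cost of an extra
MIME part. The following is a rough sketch under stated assumptions: the class
<code>OptimizationAdvisor</code> and the 300-byte per-part overhead are
hypothetical values chosen for illustration, not figures taken from Axis2 or
the MTOM specification.</p>

```java
// Hypothetical helper (not part of Axis2) for deciding whether binary
// content is worth sending as optimized (attached) content.
class OptimizationAdvisor {

    // Assumed cost in bytes of one extra MIME part (boundary line,
    // headers, xop:Include element). An illustrative guess only.
    static final long ASSUMED_MIME_PART_OVERHEAD = 300;

    // Length of the base64 text for n raw bytes: 4 characters per
    // group of 3 bytes, with the last group padded.
    static long base64Length(long n) {
        return 4 * ((n + 2) / 3);
    }

    // Optimize only when the attached form is smaller than inline base64.
    static boolean shouldOptimize(long rawSize, long mimePartOverhead) {
        return base64Length(rawSize) > rawSize + mimePartOverhead;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 900, 903, 1_000_000};
        for (long size : sizes) {
            System.out.println(size + " bytes -> optimize: "
                    + shouldOptimize(size, ASSUMED_MIME_PART_OVERHEAD));
        }
    }
}
```

<p>The boolean result could then be passed as the "optimized" flag when the
OMText node is created, for example as the second argument of
<code>fac.createText(dataHandler, ...)</code>.</p>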
<source><pre>OMElement imageElement = fac.createOMElement("image", omNs);

// Create the DataHandler for the image.
// The user has the option to use a FileDataSource or an ImageDataSource
// in this scenario.
Image image;
image = new ImageIO()
        .loadImage(new FileInputStream(inputImageFileName));
ImageDataSource dataSource = new ImageDataSource("test.jpg", image);
DataHandler dataHandler = new DataHandler(dataSource);

// Create an OMText node with the above DataHandler and set "optimized" to true.
OMText textData = fac.createText(dataHandler, true);
imageElement.addChild(textData);

// The user can set "optimized" to false later by using the following:
// textData.doOptimize(false);</pre>
</source>
<p>A user can also create an optimizable binary content node from a base64
encoded string that contains the encoded binary content, together with the
MIME type of the actual binary representation.</p>
<source><pre>String base64String = "some_string";
OMText binaryNode = fac.createText(base64String, "image/jpeg", true);</pre>
</source>
<p>Axis2 uses javax.activation.DataHandler to handle the binary data. All
optimized binary content nodes are serialized as Base64 strings if MTOM
is not enabled. One can also create binary content nodes that will not be
optimized in any case. They are serialized and sent as Base64 strings.</p>
<source><pre>// Create an OMText node with the DataHandler below and set "optimized" to false.
// This data will be sent as a Base64 encoded string regardless of whether MTOM is enabled.
javax.activation.DataHandler dataHandler =
        new javax.activation.DataHandler(new javax.activation.FileDataSource("someLocation"));
OMText textData = fac.createText(dataHandler, false);
imageElement.addChild(textData);</pre>
</source>
<p><a name="22"></a></p>
<h2>Enabling MTOM optimization at client side</h2>
<p>Set the "enableMTOM" property in the Options to true, when sending
messages.</p>
<source><pre>ServiceClient serviceClient = new ServiceClient();
Options options = new Options();
options.setTo(targetEPR);
options.setProperty(Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
serviceClient.setOptions(options);</pre>
</source>
<p>When this property is set to true, any SOAP envelope that contains
optimizable content (OMText nodes containing binary content with the
optimizable flag set to "true") is serialized as an MTOM optimized message.
Messages are not packaged as MTOM if they do not contain any optimizable
content, even though MTOM is enabled. However, a policy assertion may require
that all requests be optimized even when they contain no optimizable content.
To support this case there is an entry called "forced mime", which has to be
set as follows:</p>
<source><pre> ServiceClient serviceClient = new ServiceClient ();
Options options = new Options();
options.setTo(targetEPR);
options.setProperty(Constants.Configuration.FORCE_MIME, Constants.VALUE_TRUE);
serviceClient.setOptions(options);</pre>
</source>
<p></p>
<p>Axis2 serializes all binary content nodes as Base64 encoded strings,
regardless of whether they qualify for optimization, if either of the
following holds:</p>
<ul>
<li>The "enableMTOM" property is set to false.</li>
<li>The envelope contains any element information items named xop:Include
(see <a href="#XOP">[XML-binary Optimized Packaging] </a><a
href="http://www.w3.org/TR/2005/REC-xop10-20050125/#xop_infosets">3. XOP
Infosets Constructs </a>).</li>
</ul>
<p>MTOM is <strong>always enabled</strong> in Axis2 when it comes to receiving
messages. Axis2 automatically identifies and de-serializes any MTOM message it
receives.</p>
<p></p>
<p><a name="23"></a></p>
<h2>Enabling MTOM optimization at server side</h2>
<p>The Axis2 server automatically identifies incoming MTOM optimized messages
based on the content-type and de-serializes them accordingly. The user can
enable MTOM on the server side for outgoing messages:</p>
<ul>
<li>Globally for all services
<blockquote>
<p>Add the "enableMTOM" parameter to Axis2.xml and set it to true.
When it is set, <strong>outgoing</strong> messages <strong>that contain
optimizable content</strong> are serialized and sent as MTOM optimized
messages. If it is not set, all the binary data in binary content nodes is
serialized as Base64 encoded strings.</p>
<source>
<pre>&lt;parameter name="enableMTOM" locked="false"&gt;true&lt;/parameter&gt;</pre>
</source>
<p>The server must be restarted after setting this parameter.</p>
</blockquote>
</li>
</ul>
<p><a name="24"></a></p>
<h2>Accessing received Binary Data</h2>
<ul>
<li><strong><a name="241"></a>Service</strong></li>
</ul>
<source><pre>public class MTOMService {
    public OMElement mtomSample(OMElement element) throws Exception {
        OMElement imageEle = element.getFirstElement();
        OMElement imageName = (OMElement) imageEle.getNextSibling();
        OMText binaryNode = (OMText) imageEle.getFirstChild();
        String fileName = imageName.getText();

        // Extract the data and save it
        DataHandler actualDH;
        actualDH = binaryNode.getDataHandler();
        Image actualObject = new ImageIO().loadImage(actualDH.getDataSource()
                .getInputStream());
        FileOutputStream imageOutStream = new FileOutputStream(fileName);
        try {
            new JDK13IO().saveImage("image/jpeg", actualObject, imageOutStream);
        } finally {
            // Release the file handle even if saving fails
            imageOutStream.close();
        }

        // Set the response
        OMFactory fac = OMAbstractFactory.getOMFactory();
        OMNamespace ns = fac.createOMNamespace("urn://fakenamespace", "ns");
        OMElement ele = fac.createOMElement("response", ns);
        ele.setText("Image Saved");
        return ele;
    }
}</pre>
</source><ul>
<li><strong><a name="242"></a>Client</strong></li>
</ul>
<source><pre>Options options = new Options();
options.setTo(targetEPR);
// Enable MTOM
options.setProperty(Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
options.setTransportInfo(Constants.TRANSPORT_HTTP,
        Constants.TRANSPORT_HTTP, false);
options.setSoapVersionURI(SOAP12Constants.SOAP_ENVELOPE_NAMESPACE_URI);

// "call", "operationName" and "payload" are assumed to have been created
// earlier and configured with the options above.
OMElement result = (OMElement) call.invokeBlocking(operationName
        .getLocalPart(), payload);
OMElement ele = (OMElement) result.getFirstChild();
OMText binaryNode = (OMText) ele.getFirstChild();

// Retrieve the DataHandler and then process the data as required
DataHandler actualDH;
actualDH = binaryNode.getDataHandler();
Image actualObject = new ImageIO().loadImage(actualDH.getDataSource()
        .getInputStream());</pre>
</source>
<p><a name="3"></a></p>
<h1>SOAP with Attachments (SwA) with Axis2</h1>
<p>Axis2 handles SwA messages on the inflow only. When Axis2 receives a SwA
message, it extracts the binary attachment parts and puts a reference to those
parts in the Message Context. Users can access binary attachments using the
content-id. Care should be taken to strip the "cid:" prefix when the content-id
is taken from "href" attributes. To access the message context from
a service, users need to use the message context injection mechanism by
introducing an "init" method in the service class (see the following service
example).</p>
<p>Note: Axis2 supports content-id referencing only. Axis2 does not support
Content-Location referencing of MIME parts.</p>
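<p>The prefix handling described above can be isolated in a small helper. The
following is a minimal sketch in plain Java; the class
<code>ContentIds</code> and its method names are hypothetical and are not part
of the Axis2 API. It does not attempt percent-decoding of the reference.</p>

```java
// Hypothetical helper (not part of Axis2) for matching "href" references
// against the Content-Id headers of MIME parts.
class ContentIds {

    // Turns an href value such as "cid:abc" into the bare content-id "abc".
    // Values without the prefix are returned trimmed but otherwise unchanged.
    static String fromHref(String href) {
        String value = href.trim();
        // regionMatches is safe for values shorter than the prefix.
        if (value.regionMatches(true, 0, "cid:", 0, 4)) {
            value = value.substring(4);
        }
        return value;
    }

    // Content-Id header values are wrapped in angle brackets; strip them.
    static String fromHeader(String headerValue) {
        String value = headerValue.trim();
        if (value.length() >= 2 && value.startsWith("<") && value.endsWith(">")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    // True when an href in the envelope refers to the given MIME part.
    static boolean refersTo(String href, String contentIdHeader) {
        return fromHref(href).equals(fromHeader(contentIdHeader));
    }

    public static void main(String[] args) {
        String href = "cid:3936AE19FBED55AE4620B81C73BDD76E";
        String header = "<3936AE19FBED55AE4620B81C73BDD76E>";
        System.out.println(fromHref(href));
        System.out.println(refersTo(href, header));
    }
}
```

<p>The value returned by <code>fromHref</code> is the kind of bare content-id
that is used to look up the attachment in the sample service below.</p>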
<ul>
<li><strong>Sample service which accepts a SwA message</strong></li>
</ul>
<source><pre>public class EchoSwA {
    private MessageContext msgcts;

    public void init(MessageContext msgcts) {
        this.msgcts = msgcts;
    }

    public OMElement echoAttachment(OMElement omEle) {
        OMElement child = (OMElement) omEle.getFirstChild();
        // Retrieve the href attribute, which contains the content-id
        OMAttribute attr = (OMAttribute) child.getFirstAttribute(new QName("href"));
        String contentID = attr.getValue();

        // Content-id processing to remove the "cid:" prefix
        contentID = contentID.trim();
        if (contentID.regionMatches(true, 0, "cid:", 0, 4)) {
            contentID = contentID.substring(4);
        }

        // Retrieve the MIMEHelper instance (which contains references to the
        // attachments) from the Message Context
        MIMEHelper attachments = (MIMEHelper) msgcts.getProperty(MIMEHelper.ATTACHMENTS);
        // Retrieve the DataHandler referenced by the content-id
        DataHandler dataHandler = attachments.getDataHandler(contentID);

        // Echo back the attachment. This sends out an MTOM message
        OMText textNode = new OMTextImpl(dataHandler);
        omEle.build();
        child.detach();
        omEle.addChild(textNode);
        return omEle;
    }
}</pre>
</source>
<p>The MTOM specification is designed to be backward compatible with the SOAP
with Attachments specification. Even though the representation is different,
both technologies have the same wire format. We can safely assume that any
SOAP with Attachments endpoint can accept MTOM optimized messages and treat
them as SOAP with Attachments messages, since any MTOM optimized message is a
valid SwA message. Because of that, Axis2 does not define a separate
programming model or serialization for SwA. Users can use the MTOM programming
model and serialization to send messages to SwA endpoints.</p>
<p>Note: The above has been tested with Axis 1.x.</p>
<ul>
<li><strong>A sample SwA message from Axis 1.x</strong></li>
</ul>
<source><pre>Content-Type: multipart/related; type="text/xml";
        start="&lt;9D645C8EBB837CE54ABD027A3659535D&gt;";
        boundary="----=_Part_0_1977511.1123163571138"

------=_Part_0_1977511.1123163571138
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: binary
Content-Id: &lt;9D645C8EBB837CE54ABD027A3659535D&gt;

&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;soapenv:Envelope xmlns:soapenv="...."....&gt;
    ........
    &lt;source href="cid:3936AE19FBED55AE4620B81C73BDD76E" xmlns=""/&gt;
    ........
&lt;/soapenv:Envelope&gt;
------=_Part_0_1977511.1123163571138
Content-Type: text/plain
Content-Transfer-Encoding: binary
Content-Id: &lt;3936AE19FBED55AE4620B81C73BDD76E&gt;

<em>Binary Data.....</em>
------=_Part_0_1977511.1123163571138--</pre>
</source><ul>
<li><strong>Corresponding MTOM message from Axis2</strong></li>
</ul>
<source><pre>Content-Type: multipart/related; boundary=MIMEBoundary4A7AE55984E7438034;
        type="application/xop+xml"; start="&lt;0.09BC7F4BE2E4D3EF1B@apache.org&gt;";
        start-info="text/xml; charset=utf-8"

--MIMEBoundary4A7AE55984E7438034
content-type: application/xop+xml; charset=utf-8; type="application/soap+xml;"
content-transfer-encoding: binary
content-id: &lt;0.09BC7F4BE2E4D3EF1B@apache.org&gt;

&lt;?xml version='1.0' encoding='utf-8'?&gt;
&lt;soapenv:Envelope xmlns:soapenv="...."....&gt;
    ........
    &lt;xop:Include href="cid:1.A91D6D2E3D7AC4D580@apache.org"
            xmlns:xop="http://www.w3.org/2004/08/xop/include"&gt;
    &lt;/xop:Include&gt;
    ........
&lt;/soapenv:Envelope&gt;
--MIMEBoundary4A7AE55984E7438034
content-type: application/octet-stream
content-transfer-encoding: binary
content-id: &lt;1.A91D6D2E3D7AC4D580@apache.org&gt;

<em>Binary Data.....</em>
--MIMEBoundary4A7AE55984E7438034--</pre>
</source>
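<p>The two sample messages differ mainly in the "type" parameter of the
top-level Content-Type header, which is what lets a receiver tell an MTOM
package from a plain SwA message. The sketch below parses such a header value
in plain Java. The class <code>ContentTypeSniffer</code> is a hypothetical
illustration and not the parser used inside Axis2; it handles quoted parameter
values but not backslash escapes.</p>

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of Axis2) that inspects a Content-Type
// header value to distinguish MTOM packages from plain SwA messages.
class ContentTypeSniffer {

    // Splits the header value into the media type (stored under the empty
    // key) and its parameters. Semicolons inside quotes do not split.
    static Map<String, String> parse(String header) {
        Map<String, String> result = new LinkedHashMap<>();
        StringBuilder token = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i <= header.length(); i++) {
            boolean atEnd = i == header.length();
            char c = atEnd ? ';' : header.charAt(i);
            if (!atEnd && c == '"') {
                inQuotes = !inQuotes; // quotes only delimit values; drop them
            } else if (c == ';' && (atEnd || !inQuotes)) {
                store(result, token.toString().trim());
                token.setLength(0);
            } else {
                token.append(c);
            }
        }
        return result;
    }

    private static void store(Map<String, String> result, String token) {
        if (token.isEmpty()) {
            return;
        }
        int eq = token.indexOf('=');
        if (eq < 0) {
            result.put("", token.toLowerCase(Locale.ROOT));
        } else {
            String name = token.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            result.put(name, token.substring(eq + 1).trim());
        }
    }

    static boolean isMultipartRelated(String header) {
        return "multipart/related".equals(parse(header).get(""));
    }

    // An MTOM package is multipart/related with type="application/xop+xml".
    static boolean isMtom(String header) {
        Map<String, String> p = parse(header);
        return "multipart/related".equals(p.get(""))
                && "application/xop+xml".equalsIgnoreCase(p.get("type"));
    }

    public static void main(String[] args) {
        String swa = "multipart/related; type=\"text/xml\"; "
                + "start=\"<9D645C8EBB837CE54ABD027A3659535D>\"; "
                + "boundary=\"----=_Part_0_1977511.1123163571138\"";
        String mtom = "multipart/related; boundary=MIMEBoundary4A7AE55984E7438034; "
                + "type=\"application/xop+xml\"; "
                + "start=\"<0.09BC7F4BE2E4D3EF1B@apache.org>\"; "
                + "start-info=\"text/xml; charset=utf-8\"";
        System.out.println("SwA sample is MTOM:  " + isMtom(swa));
        System.out.println("MTOM sample is MTOM: " + isMtom(mtom));
        System.out.println("MTOM boundary: " + parse(mtom).get("boundary"));
    }
}
```

<p>Both samples are recognized as multipart/related, but only the second one
carries the application/xop+xml type that marks an MTOM package.</p>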
<p><a name="4"></a></p>
<h1>Advanced Topics</h1>
<p><a name="41"></a></p>
<h2>File Caching For Attachments</h2>
<p>Axis2 comes with a file caching mechanism for incoming attachments,
which gives Axis2 the ability to handle very large attachments without
buffering them in memory at any time. Axis2 file caching streams the incoming
MIME parts directly into files, after reading the MIME part headers.</p>
<p>A user can also specify a size threshold for the file caching. When this
threshold value is specified, only the attachments whose size is bigger than
the threshold value are cached in files. Smaller attachments remain
in memory.</p>
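<p>The threshold behaviour can be pictured with a small, self-contained
sketch: read at most one byte more than the threshold into memory, and spill
to a file only if the stream turns out to be larger. The class
<code>ThresholdCache</code> below is a hypothetical illustration written
against the JDK only; it is not the Axis2 implementation.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch (not Axis2 code) of threshold-based attachment caching.
class ThresholdCache {
    final byte[] inMemory; // set when the content fits within the threshold
    final Path file;       // set when the content was spilled to disk

    private ThresholdCache(byte[] inMemory, Path file) {
        this.inMemory = inMemory;
        this.file = file;
    }

    static ThresholdCache store(InputStream in, int threshold, Path dir) throws IOException {
        ByteArrayOutputStream head = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int n;
        // Read at most threshold + 1 bytes so we can tell "equal" from "bigger".
        while (head.size() <= threshold
                && (n = in.read(buf, 0, Math.min(buf.length, threshold + 1 - head.size()))) != -1) {
            head.write(buf, 0, n);
        }
        if (head.size() <= threshold) {
            return new ThresholdCache(head.toByteArray(), null);
        }
        // Bigger than the threshold: write what was read, then stream the rest.
        Path file = Files.createTempFile(dir, "attachment", ".tmp");
        try (OutputStream out = Files.newOutputStream(file)) {
            head.writeTo(out);
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new ThresholdCache(null, file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("attachments");
        ThresholdCache small = store(new ByteArrayInputStream(new byte[100]), 4000, dir);
        ThresholdCache large = store(new ByteArrayInputStream(new byte[10000]), 4000, dir);
        System.out.println("small kept in memory: " + (small.inMemory != null));
        System.out.println("large cached in file: " + large.file);
    }
}
```

<p>Only the part of a large attachment up to the threshold is ever held in
memory in this sketch; the remainder is copied to the file in fixed-size
chunks.</p>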
<p>NOTE: A directory for temporarily storing the attachments must be
specified. Care should also be taken to <strong>clean that directory</strong>
from time to time.</p>
<p>The following parameters need to be set in Axis2.xml in order to enable
file caching.</p>
<source><pre><em>&lt;axisconfig name="AxisJava2.0"&gt;
&lt;!-- ================================================= --&gt;
&lt;!-- Parameters --&gt;
&lt;!-- ================================================= --&gt;</em>
&lt;parameter name="cacheAttachments" locked="xsd:false"&gt;true&lt;/parameter&gt;
&lt;parameter name="attachmentDIR" locked="xsd:false"&gt;<em>temp directory</em>&lt;/parameter&gt;
&lt;parameter name="sizeThreshold" locked="xsd:false"&gt;4000&lt;/parameter&gt;
.........
.........
&lt;/axisconfig&gt;</pre>
</source></body>
</html>