<html> | |
<head> | |
<meta http-equiv="content-type" content=""> | |
<title>Sending Binary Data with SOAP</title>
</head> | |
<body> | |
<h1>Sending Binary Data with SOAP</h1> | |
<ul> | |
<li><a href="#1">Introduction</a></li> | |
<li><a href="#2">MTOM with Axis2 </a> | |
<ul> | |
<li><a href="#21">Programming Model</a></li> | |
<li><a href="#22">Enabling MTOM optimization at client side</a></li> | |
<li><a href="#23">Enabling MTOM optimization at server side</a></li> | |
<li><a href="#24">Accessing received Binary Data (Sample Code) </a> | |
<ul> | |
<li><a href="#241">Service</a></li> | |
<li><a href="#242">Client</a></li> | |
</ul> | |
</li> | |
</ul> | |
</li> | |
<li><a href="#3">SOAP with Attachments with Axis2</a></li> | |
<li><a href="#4">Advanced Topics </a> | |
<ul> | |
<li><a href="#41">File Caching for Attachments</a></li> | |
</ul> | |
</li> | |
</ul> | |
<p><a name="1"></a></p>
<h2>Introduction</h2>
<p>Despite the flexibility, interoperability and global acceptance of XML,
there are times when serializing data into XML does not make sense. Web
services users may need to transmit binary attachments of various sorts, such as
images, drawings and XML documents, together with a SOAP message. Such data is
often in a particular binary format.<br/>
Traditionally, two techniques have been used to deal with opaque data in
XML:</p>
<ol> | |
<li><strong>"By value"</strong>
<blockquote>
<p>Sending binary data by value is achieved by embedding the opaque data
(after some form of encoding) as element or attribute content of
the XML component of the data. The main advantage of this technique is that
applications can process and describe the data by looking
only at the XML component of the data.</p>
<p>XML supports opaque data as content through either base64
or hexadecimal text encoding. Both techniques bloat the size of the
data. With UTF-8 as the underlying text encoding, base64 increases the
size of the binary data by a factor of about 1.33, while
hexadecimal encoding doubles it. These factors
double again if UTF-16 text encoding is used. Also of concern is the
processing overhead (both real and perceived) of these formats,
especially when decoding back into raw binary.</p>
</blockquote>
</li>
<li><strong>"By reference"</strong> | |
<blockquote> | |
<p>Sending binary data by reference is achieved by attaching the pure
binary data as external unparsed general entities outside the XML
document and then embedding reference URIs to those entities as
element or attribute values. This avoids unnecessary bloating of the
data and wasted processing power. The primary obstacle to using
unparsed entities is their heavy reliance on DTDs, which impedes
modularity as well as the use of XML namespaces.</p>
<p>Several specifications were introduced in the Web services
world to deal with the binary attachment problem using the "by
reference" technique. <a
href="http://www.w3.org/TR/SOAP-attachments">SOAP with Attachments</a>
is one such example. Since SOAP prohibits document type declarations
(DTDs) in messages, the attached data cannot be represented
as part of the message infoset, which creates two data models. The situation
is like sending attachments with an e-mail message: even though the
attachments are related to the message content, they are not inside the
message. As a result, technologies that process and describe
data based on the XML component of the data malfunction. One example
is WS-Security.</p>
</blockquote> | |
</li> | |
</ol> | |
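<p>The size factors quoted for the "by value" technique can be checked with a
few lines of Java. The following is a minimal sketch that uses only the
standard java.util.Base64 class; the class name EncodingOverhead and its
helper methods are illustrative and are not part of Axis2.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncodingOverhead {

    /** Number of characters needed to write n bytes as padded base64. */
    static int base64Length(int n) {
        return 4 * ((n + 2) / 3);
    }

    /** Number of characters needed to write n bytes as hexadecimal. */
    static int hexLength(int n) {
        return 2 * n;
    }

    public static void main(String[] args) {
        byte[] data = new byte[3000]; // stand-in for opaque binary content
        String base64 = Base64.getEncoder().encodeToString(data);

        // base64 uses ASCII characters only: one byte each in UTF-8,
        // two bytes each in UTF-16.
        int utf8Bytes = base64.getBytes(StandardCharsets.UTF_8).length;
        int utf16Bytes = base64.getBytes(StandardCharsets.UTF_16BE).length;

        System.out.println("raw bytes:          " + data.length);            // 3000
        System.out.println("base64 as UTF-8:    " + utf8Bytes);              // 4000
        System.out.println("base64 as UTF-16:   " + utf16Bytes);             // 8000
        System.out.println("hex characters:     " + hexLength(data.length)); // 6000
    }
}
```

<p>For 3000 raw bytes the base64 text occupies 4000 bytes in UTF-8 (a factor
of about 1.33) and 8000 bytes in UTF-16, which is the doubling described
above.</p>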
<p><a href="http://www.w3.org/TR/2004/PR-soap12-mtom-20041116/">MTOM (SOAP
Message Transmission Optimization Mechanism)</a> is another specification
that addresses the "attachments" problem. MTOM tries to combine
the advantages of the two techniques above.
On the wire, MTOM is a "by reference" method: the wire format of an MTOM optimized
message is the same as that of a SOAP with Attachments message, which also makes it
backward compatible with SwA endpoints. The most notable feature of MTOM is
its use of the xop:Include element, defined in the <a
href="http://www.w3.org/TR/2004/PR-xop10-20041116/">XML Binary Optimized
Packaging (XOP)</a> specification, to reference the binary attachments
(external unparsed general entities) of the message. With this
element, the attached binary content logically becomes inline (by
value) with the SOAP document even though it is actually attached separately.
This merges the two realms by making it possible to work with a single data
model. Applications can process and describe the data by looking only
at the XML part, which removes the reliance on DTDs. MTOM has also
standardized the referencing mechanism of SwA. The following is an extract from
the <a href="http://www.w3.org/TR/2004/PR-xop10-20041116/">XOP</a>
specification.</p>
<p><em>At the conceptual level, this binary data can be thought of as being | |
base64-encoded in the XML Document. As this conceptual form might be needed | |
during some processing of the XML Document (e.g., for signing the XML | |
document), it is necessary to have a one to one correspondence between XML | |
Infosets and XOP Packages. Therefore, the conceptual representation of such | |
binary data is as if it were base64-encoded, using the canonical lexical form | |
of XML Schema base64Binary datatype (see <a href="#XMLSchemaP2">[XML Schema | |
Part 2: Datatypes Second Edition] </a><a | |
href="http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#base64Binary">3.2.16 | |
base64Binary</a>). In the reverse direction, XOP is capable of optimizing | |
only base64-encoded Infoset data that is in the canonical lexical | |
form.</em></p> | |
<p>Apache Axis2 supports <strong>Base64 encoding</strong>, <strong>SOAP with
Attachments</strong> and <strong>MTOM (SOAP Message Transmission
Optimization Mechanism)</strong>.</p>
<p><a name="2"></a></p>
<h1>MTOM with Axis2</h1>
<p><a name="21"></a></p>
<h2>Programming Model</h2>
<p>AXIOM is an object model (and may be the first) that can hold
binary data. It has this ability because OMText can hold raw
binary content in the form of a javax.activation.DataHandler. OMText was
chosen for this purpose for two reasons. The first is that XOP (MTOM) is capable
of optimizing only base64-encoded infoset data that is in the canonical
lexical form of the XML Schema base64Binary datatype. The second is to preserve
the infoset at both sender and receiver (the binary content is stored in the
same kind of object regardless of whether it is optimized or not).</p>
<p>MTOM allows portions of a message to be encoded selectively, so a single
SOAP message can carry base64-encoded data as well as externally attached raw binary data
referenced by an "XOP" element (optimized content).
The user can specify whether an OMText node that contains raw binary data or
base64-encoded binary data is qualified to be optimized, either when
the node is constructed or later. To get the most out of
MTOM, users are advised to send smaller binary attachments using
base64 encoding (non-optimized) and larger attachments as optimized
content.</p>
<source><pre> OMElement imageElement = fac.createOMElement("image", omNs); | |
// Creating the Data Handler for the image. | |
// User has the option to use a FileDataSource or a ImageDataSource | |
// in this scenario... | |
Image image; | |
image = new ImageIO() | |
.loadImage(new FileInputStream(inputImageFileName)); | |
ImageDataSource dataSource = new ImageDataSource("test.jpg",image); | |
DataHandler dataHandler = new DataHandler(dataSource); | |
//create an OMText node with the above DataHandler and set optimized to true | |
OMText textData = fac.createText(dataHandler, true); | |
imageElement.addChild(textData); | |
//User can set optimized to false by using the following | |
//textData.doOptimize(false);</pre> | |
</source> | |
<p>A user can also create an optimizable binary content node from a base64
encoded string that contains the encoded binary content, together with the MIME
type of the actual binary representation.</p>
<source><pre> String base64String = "some_string"; | |
OMText binaryNode = fac.createText(base64String,"image/jpg",true);</pre> | |
</source> | |
<p>Axis2 uses javax.activation.DataHandler to handle binary data. All
optimized binary content nodes are serialized as Base64 strings if MTOM
is not enabled. One can also create binary content nodes that will not be
optimized in any case. They are serialized and sent as Base64 strings.</p>
<source><pre> //create an OMText node with the DataHandler below and set "optimized" to false
//This data will be sent as a Base64 encoded string regardless of whether MTOM is enabled
javax.activation.DataHandler dataHandler = new javax.activation.DataHandler(new javax.activation.FileDataSource("someLocation"));
OMText textData = fac.createText(dataHandler, false);
imageElement.addChild(textData);</pre>
</source> | |
<p><a name="22"></a></p>
<h2>Enabling MTOM optimization at client side</h2>
<p>Set the "enableMTOM" property in the Options to true when sending
messages.</p>
<source><pre> ServiceClient serviceClient = new ServiceClient();
Options options = new Options();
options.setTo(targetEPR);
options.setProperty(Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
serviceClient.setOptions(options);</pre>
</source> | |
<p>When this property is set to true, any SOAP envelope that contains
optimizable content (OMText nodes containing binary content with the optimizable
flag set to "true") is serialized as an MTOM optimized message. Messages are
not packaged as MTOM if they do not contain any optimizable content, even
when MTOM is enabled. However, a policy assertion may require
that all requests be optimized even when they contain
no optimizable content. To support this case there is an entry called
"forced mime", which is set as follows:</p>
<source><pre> ServiceClient serviceClient = new ServiceClient (); | |
Options options = new Options(); | |
options.setTo(targetEPR); | |
options.setProperty(Constants.Configuration.FORCE_MIME, Constants.VALUE_TRUE); | |
serviceClient.setOptions(options);</pre> | |
</source> | |
<p>Axis2 serializes all binary content nodes as Base64 encoded strings,
regardless of whether they are qualified to be optimized, if:</p>
<ul>
<li>the "enableMTOM" property is set to false, or</li>
<li>the envelope contains any element information items named xop:Include
(see <a href="#XOP">[XML-binary Optimized Packaging] </a><a
href="http://www.w3.org/TR/2005/REC-xop10-20050125/#xop_infosets">3. XOP
Infosets Constructs </a>).</li>
</ul>
<p>MTOM is <strong>always enabled</strong> in Axis2 when it comes to receiving messages.
Axis2 automatically identifies and de-serializes any MTOM message it
receives.</p>
<p></p> | |
<p><a name="23"></a></p> | |
<h2>Enabling MTOM optimization at server side</h2>
<p>The Axis2 server automatically identifies incoming MTOM optimized messages
based on the content type and de-serializes them accordingly. Users can enable MTOM
on the server side for outgoing messages:</p>
<ul>
<li>Globally for all services
<blockquote>
<p>Add the "enableMTOM" parameter to axis2.xml and set it to true.
When it is set, <strong>outgoing</strong> messages <strong>which contain optimizable
content</strong> are serialized and sent as MTOM optimized messages. If it
is not set, all the binary data in binary content nodes is
serialized as Base64 encoded strings.</p>
</blockquote>
</li>
</ul>
<source>
<pre><parameter name="enableMTOM" locked="false">true</parameter></pre>
</source>
<p>The server must be restarted after setting this parameter.</p>
<p><a name="24"></a></p> | |
<h2>Accessing received Binary Data</h2> | |
<ul> | |
<li><strong><a name="241"></a>Service</strong></li> | |
</ul> | |
<source><pre>public class MTOMService { | |
public OMElement mtomSample(OMElement element) throws Exception { | |
OMElement imageEle = element.getFirstElement(); | |
OMElement imageName = (OMElement) imageEle.getNextSibling(); | |
OMText binaryNode = (OMText) imageEle.getFirstChild(); | |
String fileName = imageName.getText(); | |
//Extracting the data and saving | |
DataHandler actualDH; | |
actualDH = binaryNode.getDataHandler(); | |
Image actualObject = new ImageIO().loadImage(actualDH.getDataSource() | |
.getInputStream()); | |
FileOutputStream imageOutStream = new FileOutputStream(fileName);
new JDK13IO().saveImage("image/jpeg", actualObject, imageOutStream);
//closing the stream so that the image is flushed to disk
imageOutStream.close();
//setting response | |
OMFactory fac = OMAbstractFactory.getOMFactory(); | |
OMNamespace ns = fac.createOMNamespace("urn://fakenamespace", "ns"); | |
OMElement ele = fac.createOMElement("response", ns); | |
ele.setText("Image Saved"); | |
return ele; | |
} | |
}</pre> | |
</source><ul> | |
<li><strong><a name="242"></a>Client</strong></li>
</ul> | |
<source><pre> Options options = new Options(); | |
options.setTo(targetEPR); | |
// enabling MTOM
options.setProperty(Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
options.setTransportInfo(Constants.TRANSPORT_HTTP, | |
Constants.TRANSPORT_HTTP, false); | |
options.setSoapVersionURI(SOAP12Constants.SOAP_ENVELOPE_NAMESPACE_URI); | |
OMElement result = (OMElement) call.invokeBlocking(operationName. | |
getLocalPart(),payload); | |
OMElement ele = (OMElement) result.getFirstChild(); | |
OMText binaryNode = (OMText) ele.getFirstChild(); | |
// Retrieving the DataHandler & then do whatever the processing to the data | |
DataHandler actualDH; | |
actualDH = binaryNode.getDataHandler(); | |
Image actualObject = new ImageIO().loadImage(actualDH.getDataSource() | |
.getInputStream());</pre> | |
</source> | |
<p><a name="3"></a></p> | |
<h1>SOAP with Attachments (SwA) with Axis2</h1> | |
<p>Axis2 handles SwA messages on the inflow only. When Axis2 receives a SwA
message, it extracts the binary attachment parts and puts a reference to those
parts in the Message Context. Users can access binary attachments using the
content-id. Care should be taken to strip the "cid:" prefix when the content-id
is taken from an "href" attribute. To access the message context from
a service, users need to use the message context injection mechanism by
adding an "init" method to the service class (see the following service
example).</p>
<p>Note: Axis2 supports content-id referencing only. Axis2 does not support
Content-Location referencing of MIME parts.</p>
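<p>To make the relationship between an "href" value and the Content-Id header
of a MIME part concrete, the following is a small sketch in plain Java. The
class ContentIdUtil and its methods are hypothetical helpers written for this
illustration; they are not Axis2 classes, and the sketch does not handle
percent-encoded content-ids.</p>

```java
class ContentIdUtil {

    /** Strips a leading "cid:" (case-insensitive) from an href value. */
    static String fromHref(String href) {
        String value = href.trim();
        if (value.regionMatches(true, 0, "cid:", 0, 4)) {
            value = value.substring(4);
        }
        return value;
    }

    /** Strips the surrounding angle brackets from a Content-Id header value. */
    static String fromHeader(String header) {
        String value = header.trim();
        if (value.length() >= 2 && value.startsWith("<") && value.endsWith(">")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        String href = "cid:3936AE19FBED55AE4620B81C73BDD76E";
        String header = "<3936AE19FBED55AE4620B81C73BDD76E>";
        // Both forms reduce to the same bare content-id.
        System.out.println(fromHref(href).equals(fromHeader(header)));
    }
}
```

<p>Using regionMatches instead of substring means that a value shorter than
the prefix is returned unchanged rather than causing an exception.</p>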
<ul> | |
<li><strong>Sample service which accepts a SwA message</strong></li> | |
</ul> | |
<source><pre>public class EchoSwA { | |
private MessageContext msgcts; | |
public void init(MessageContext msgcts) { | |
this.msgcts = msgcts; | |
} | |
public OMElement echoAttachment(OMElement omEle) { | |
OMElement child = (OMElement)omEle.getFirstChild(); | |
//retrieving the href attribute which contains the Content Id
OMAttribute attr = (OMAttribute)child.getFirstAttribute(new QName("href"));
String contentID = attr.getValue();
//content-id processing to remove the "cid:" prefix
contentID = contentID.trim();
if (contentID.regionMatches(true, 0, "cid:", 0, 4)) {
contentID = contentID.substring(4);
}
// Retrieving the MIMEHelper instance (which contains reference to attachments) | |
// from the Message Context | |
MIMEHelper attachments = (MIMEHelper)msgcts.getProperty(MIMEHelper.ATTACHMENTS); | |
// Retrieving the respective DataHandler referenced by the content-id | |
DataHandler dataHandler = attachments.getDataHandler(contentID); | |
// Echoing back the attachment. Sends out MTOM message | |
OMText textNode = new OMTextImpl(dataHandler); | |
omEle.build(); | |
child.detach(); | |
omEle.addChild(textNode); | |
return omEle; | |
} | |
}</pre> | |
</source> | |
<p>The MTOM specification is designed to be backward compatible with the SOAP
with Attachments specification. Even though the representation is different,
both technologies have the same wire format. We can safely assume that any
SOAP with Attachments endpoint can accept MTOM optimized messages and treat
them as SOAP with Attachments messages, since any MTOM optimized message is a valid
SwA message. Because of that, Axis2 does not define a separate programming
model or serialization for SwA. Users can use the MTOM programming model and
serialization to send messages to SwA endpoints.</p>
<p>Note: The above has been tested with Axis 1.x.</p>
<ul> | |
<li><strong>A sample SwA message from Axis 1.x</strong></li> | |
</ul> | |
<source><pre>Content-Type: multipart/related; type="text/xml"; | |
start="<9D645C8EBB837CE54ABD027A3659535D>"; | |
boundary="----=_Part_0_1977511.1123163571138" | |
------=_Part_0_1977511.1123163571138 | |
Content-Type: text/xml; charset=UTF-8 | |
Content-Transfer-Encoding: binary | |
Content-Id: <9D645C8EBB837CE54ABD027A3659535D> | |
<?xml version="1.0" encoding="UTF-8"?> | |
<soapenv:Envelope xmlns:soapenv="...."....> | |
........ | |
<source href="cid:3936AE19FBED55AE4620B81C73BDD76E" xmlns=""/>
........ | |
</soapenv:Envelope> | |
------=_Part_0_1977511.1123163571138 | |
Content-Type: text/plain | |
Content-Transfer-Encoding: binary | |
Content-Id: <3936AE19FBED55AE4620B81C73BDD76E> | |
<em>Binary Data.....</em> | |
------=_Part_0_1977511.1123163571138--</pre> | |
</source><ul> | |
<li><strong>Corresponding MTOM message from Axis2</strong></li> | |
</ul> | |
<source><pre>Content-Type: multipart/related; boundary=MIMEBoundary4A7AE55984E7438034; | |
type="application/xop+xml"; start="<0.09BC7F4BE2E4D3EF1B@apache.org>"; | |
start-info="text/xml; charset=utf-8" | |
--MIMEBoundary4A7AE55984E7438034 | |
content-type: application/xop+xml; charset=utf-8; type="application/soap+xml;" | |
content-transfer-encoding: binary | |
content-id: <0.09BC7F4BE2E4D3EF1B@apache.org> | |
<?xml version='1.0' encoding='utf-8'?> | |
<soapenv:Envelope xmlns:soapenv="...."....> | |
........ | |
<xop:Include href="cid:1.A91D6D2E3D7AC4D580@apache.org" | |
xmlns:xop="http://www.w3.org/2004/08/xop/include"> | |
</xop:Include> | |
........ | |
</soapenv:Envelope> | |
--MIMEBoundary4A7AE55984E7438034 | |
content-type: application/octet-stream | |
content-transfer-encoding: binary | |
content-id: <1.A91D6D2E3D7AC4D580@apache.org> | |
<em>Binary Data.....</em> | |
--MIMEBoundary4A7AE55984E7438034--</pre> | |
</source> | |
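<p>The two sample messages differ in the "type" parameter of their
multipart/related Content-Type header: "text/xml" for the SwA message and
"application/xop+xml" for the MTOM message. The following sketch shows one way
such a header could be inspected in plain Java. It only illustrates the idea
of identifying MTOM by content type; MtomContentType is a hypothetical class,
not the code Axis2 uses, and the simple split on ";" assumes the "type" value
itself contains no semicolon.</p>

```java
class MtomContentType {

    /** Returns true if the header is multipart/related with type application/xop+xml. */
    static boolean looksLikeMtom(String contentType) {
        if (contentType == null) {
            return false;
        }
        String[] parts = contentType.split(";");
        if (parts.length == 0 || !parts[0].trim().equalsIgnoreCase("multipart/related")) {
            return false;
        }
        for (int i = 1; i < parts.length; i++) {
            String[] nameAndValue = parts[i].split("=", 2);
            if (nameAndValue.length == 2 && nameAndValue[0].trim().equalsIgnoreCase("type")) {
                // Drop the optional quotes around the parameter value.
                String value = nameAndValue[1].trim().replace("\"", "");
                return value.equalsIgnoreCase("application/xop+xml");
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String mtom = "multipart/related; boundary=MIMEBoundary4A7AE55984E7438034; "
                + "type=\"application/xop+xml\"; start=\"<0.09BC7F4BE2E4D3EF1B@apache.org>\"; "
                + "start-info=\"text/xml; charset=utf-8\"";
        String swa = "multipart/related; type=\"text/xml\"; "
                + "start=\"<9D645C8EBB837CE54ABD027A3659535D>\"; "
                + "boundary=\"----=_Part_0_1977511.1123163571138\"";
        System.out.println("MTOM sample: " + looksLikeMtom(mtom));
        System.out.println("SwA sample:  " + looksLikeMtom(swa));
    }
}
```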
<p><a name="4"></a></p>
<h1>Advanced Topics</h1>
<p><a name="41"></a></p>
<h2>File Caching for Attachments</h2>
<p>Axis2 comes with a file caching mechanism for incoming attachments,
which gives Axis2 the ability to handle very large attachments without
buffering them in memory at any time. Axis2 file caching streams the incoming
MIME parts directly into files after reading the MIME part headers.</p>
<p>A user can also specify a size threshold for file caching. When this
threshold value is specified, only attachments whose size is bigger than
the threshold value are cached in files. Smaller attachments remain
in memory.</p>
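<p>The threshold rule can be pictured with the following sketch, which keeps a
part in memory when it fits within the threshold and otherwise streams it to a
file. It is an illustration in plain Java (11 or later) under the assumption
that the decision is made while reading the stream; the class ThresholdCache,
its Part type and the method store are hypothetical and do not reflect how
Axis2 implements file caching.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ThresholdCache {

    /** Holds either the bytes of a small part or the path of a cached file. */
    static final class Part {
        final byte[] inMemory;
        final Path onDisk;

        Part(byte[] inMemory, Path onDisk) {
            this.inMemory = inMemory;
            this.onDisk = onDisk;
        }

        boolean isCached() {
            return onDisk != null;
        }
    }

    static Part store(InputStream in, int sizeThreshold, Path attachmentDir) throws IOException {
        // Read one byte more than the threshold: if that extra byte exists,
        // the part is bigger than the threshold and must go to a file.
        byte[] head = in.readNBytes(sizeThreshold + 1);
        if (head.length <= sizeThreshold) {
            return new Part(head, null);
        }
        Path file = Files.createTempFile(attachmentDir, "attachment", ".tmp");
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(head);    // bytes already consumed from the stream
            in.transferTo(out); // stream the remainder without buffering it all
        }
        return new Part(null, file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("attachments");
        Part small = store(new ByteArrayInputStream(new byte[3000]), 4000, dir);
        Part large = store(new ByteArrayInputStream(new byte[5000]), 4000, dir);
        System.out.println("3000-byte part cached in file: " + small.isCached());
        System.out.println("5000-byte part cached in file: " + large.isCached());
    }
}
```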
<p>NOTE: A directory for temporarily storing the
attachments must be specified. Care should also be taken to <strong>clean that directory</strong> from time to
time.</p>
<p>The following parameters need to be set in axis2.xml in order to enable
file caching.</p>
<source><pre><em><axisconfig name="AxisJava2.0"> | |
<!-- ================================================= --> | |
<!-- Parameters --> | |
<!-- ================================================= --></em> | |
<parameter name="cacheAttachments" locked="xsd:false">true</parameter> | |
<parameter name="attachmentDIR" locked="xsd:false"><em>temp directory</em></parameter> | |
<parameter name="sizeThreshold" locked="xsd:false">4000</parameter> | |
......... | |
......... | |
</axisconfig></pre> | |
</source></body> | |
</html> |