blob: eebc851dc7ede5a342c903aa2abf2d7a46522ccb [file] [log] [blame]
== Splitting big XML payloads
=== Introduction
This example shows how to deal with big XML files in Camel.
The XPath tokenizer will load the entire XML content into memory, so
it's not well suited for very big XML payloads.
Instead, you can use the StAX or XML tokenizers to efficiently iterate
the XML payload in a streamed fashion.
For more information please read the[official documentation].
There are 2 tests:
. `StaxTokenizerTest` : requires using JAXB and process messages using
a SAX ContentHandler
. `XmlTokenizerTest` : easier to use but can't handle complex XML
structures (i.e. nested naming clash)
The test XML contains a simple collection of records.
<?xml version="1.0" encoding="UTF-8"?>
<records xmlns="">
<value>The quick brown fox jumps over the lazy dog</value>
You can customize numOfRecords and maxWaitTime to do performance tests
with different payloads. +
Max JVM heap is restricted to 20 MB to show that it works with a very
limited amount of memory (see `pom.xml`).
There are also a number of optional runtime settings: - no cache enabled
- no parallel processing - no mock endpoints with in-memory exchange
store - enabled Throughput Logging for DEBUG level - disabled JMX
=== Build and run
The test XML file is built once beforehand using `@BeforeClass`.
$ mvn test
=== Test results
Tested on MacBook Pro 2,8 GHz Intel Core i7; 16 GB 2133 MHz LPDDR3; Java
|tokenizer |numOfRecords |maxWaitTime (ms) |XML size (kB) |time (ms)
|StAX |40000 |5000 |3543 |3052
|XML |40000 |5000 |3543 |2756
|StAX |1000000 |20000 |89735 |11740
|XML |1000000 |20000 |89735 |11137
|StAX |15000000 |200000 |1366102 |132176
|XML |15000000 |200000 |1366102 |132549
=== Help and contributions
If you hit any problem using Camel or have some feedback, then please[let us know].
We also love contributors, so[get involved] :-)
The Camel riders!