| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| |
| |
| <title>Apache Jena - SAX Input into Jena and ARP</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| |
| <link href="/css/bootstrap.min.css" rel="stylesheet" media="screen"> |
| <link href="/css/bootstrap-extension.css" rel="stylesheet" type="text/css"> |
| <link href="/css/jena.css" rel="stylesheet" type="text/css"> |
| <link rel="shortcut icon" href="/images/favicon.ico" /> |
| |
| <script src="https://code.jquery.com/jquery-2.2.4.min.js" |
| integrity="sha256-BbhdlvQf/xTY9gja0Dq3HiwQF8LaCRTXxZKRutelT44=" |
| crossorigin="anonymous"></script> |
| <script src="/js/jena-navigation.js" type="text/javascript"></script> |
| <script src="/js/bootstrap.min.js" type="text/javascript"></script> |
| |
| <script src="/js/improve.js" type="text/javascript"></script> |
| |
| |
| </head> |
| |
| <body> |
| |
| |
| |
| <div class="container"> |
| <div class="row"> |
| <div class="col-md-12"> |
| <h1 class="title">SAX Input into Jena and ARP</h1> |
| |
| <p>Normally, both ARP and Jena are used to read files either from the |
| local machine or from the Web. A different use case, addressed |
| here, is when the XML source is available in-memory in some way. In |
| these cases, ARP and Jena can be used as a SAX event handler, |
| turning SAX events into triples, or a DOM tree can be parsed into a |
| Jena Model.</p> |
| <h2 id="contents">Contents</h2> |
| <ul> |
| <li><a href="#1-overview">Overview</a></li> |
| <li><a href="#sample-code">Sample Code</a></li> |
| <li><a href="#initializing-sax-event-source">Initializing SAX event source</a></li> |
| <li><a href="#error-handler">Error Handler</a></li> |
| <li><a href="#options">Options</a></li> |
| <li><a href="#xml-lang-and-namespaces">XML Lang and Namespaces</a></li> |
| <li><a href="#using-your-own-triple-handler">Using your own triple handler</a></li> |
| <li><a href="#using-a-dom-as-input">Using a DOM as input</a></li> |
| </ul> |
| <h2 id="1-overview">Overview</h2> |
| <p>To read an arbitrary SAX source as triples to be added into a Jena |
| model, it is not possible to use a |
| <code>Model.</code><a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/Model.html#read(java.io.InputStream,%20java.lang.String)"><code>read</code></a>() |
| operation. Instead, you construct a SAX event handler of class |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2Model.html"><code>SAX2Model</code></a>, |
| using the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2Model.html#create(java.lang.String,%20org.apache.jena.rdf.model.Model)"><code>create</code></a> |
| method, install it as the handler on your SAX event source, and |
| then stream the SAX events. It is possible to have fine-grained |
| control over the SAX events, for instance, by inserting or deleting |
| events, before passing them to the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2Model.html"><code>SAX2Model</code></a> |
| handler.</p> |
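<p>The fine-grained control over SAX events can be sketched with the JDK's
<code>XMLFilterImpl</code>, which sits between the SAX source and the handler.
This is a minimal sketch under stated assumptions: the class name
<code>SkipNamespaceFilter</code>, the helper <code>elementsSeen</code> and the
recording handler are hypothetical and only illustrate the technique. In real
use a <code>SAX2Model</code> would be installed in place of the recording
handler.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: deletes every element (and its content) in one namespace. */
class SkipNamespaceFilter extends XMLFilterImpl {
    private final String skippedNamespace;
    private int skipDepth = 0; // greater than 0 while inside a skipped element

    SkipNamespaceFilter(XMLReader parent, String skippedNamespace) {
        super(parent);
        this.skippedNamespace = skippedNamespace;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || skippedNamespace.equals(uri)) {
            skipDepth++; // swallow the event instead of forwarding it
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--;
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }

    /** Parses xml through the filter; returns the local names that reached the handler. */
    static String elementsSeen(String xml, String skippedNamespace) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader source = factory.newSAXParser().getXMLReader();

        StringBuilder seen = new StringBuilder();
        XMLReader filtered = new SkipNamespaceFilter(source, skippedNamespace);
        // Stand-in for the real handler; a SAX2Model would be installed here instead.
        filtered.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes a) {
                seen.append(localName).append(' ');
            }
        });
        filtered.parse(new InputSource(new StringReader(xml)));
        return seen.toString().trim();
    }
}
```

<p>Inserting events works the same way: the filter calls the
<code>super</code> methods with extra events at the point where they are
needed.</p>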
| <h2 id="sample-code">Sample Code</h2> |
| <p>This code uses the Xerces parser as a SAX event stream, and adds |
| the triples to a |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/Model.html"><code>Model</code></a> using |
| default options.</p> |
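| <pre><code>import java.io.FileInputStream; |
| import java.io.InputStream; |
| |
| import org.apache.jena.rdf.arp.SAX2Model; |
| import org.apache.jena.rdf.arp.SAX2RDF; |
| import org.apache.jena.rdf.model.Model; |
| import org.apache.jena.rdf.model.ModelFactory; |
| import org.apache.xerces.parsers.SAXParser; |
| import org.xml.sax.InputSource; |
| import org.xml.sax.SAXParseException; |
| import org.xml.sax.XMLReader; |
| |
| // Use your own SAX source. |
| XMLReader saxParser = new SAXParser(); |
| |
| // Declare the base URI first: it is used as the system id below. |
| String base = "http://example.org/"; |
| |
| // set up SAX input |
| InputStream in = new FileInputStream("kb.rdf"); |
| InputSource ins = new InputSource(in); |
| ins.setSystemId(base); |
| |
| Model m = ModelFactory.createDefaultModel(); |
| |
| // create handler, linked to Model |
| SAX2Model handler = SAX2Model.create(base, m); |
| |
| // install handler on SAX event stream |
| SAX2RDF.installHandlers(saxParser, handler); |
| |
| try { |
|     try { |
|         saxParser.parse(ins); |
|     } finally { |
|         // MUST ensure handler is closed. |
|         handler.close(); |
|     } |
| } catch (SAXParseException e) { |
|     // Fatal parsing errors end here, |
|     // but they will already have been reported. |
| } |
| </code></pre> |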
| <h2 id="initializing-sax-event-source">Initializing SAX event source</h2> |
| <p>If your SAX event source implements <code>XMLReader</code>, then the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html#installHandlers(org.xml.sax.XMLReader,%20org.apache.jena.rdf.arp.XMLHandler)">installHandlers</a> |
| static method can be used as shown in the sample. Otherwise, you |
| have to do it yourself. The |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html#installHandlers(org.xml.sax.XMLReader,%20org.apache.jena.rdf.arp.XMLHandler)"><code>installHandlers</code></a> |
| code is like this:</p> |
| <pre><code>static public void installHandlers(XMLReader rdr, XMLHandler sax2rdf) |
|     throws SAXException |
| { |
|     rdr.setEntityResolver(sax2rdf); |
|     rdr.setDTDHandler(sax2rdf); |
|     rdr.setContentHandler(sax2rdf); |
|     rdr.setErrorHandler(sax2rdf); |
|     rdr.setFeature("http://xml.org/sax/features/namespaces", true); |
|     rdr.setFeature( |
|         "http://xml.org/sax/features/namespace-prefixes", |
|         true); |
|     rdr.setProperty( |
|         "http://xml.org/sax/properties/lexical-handler", |
|         sax2rdf); |
| } |
| </code></pre> |
| <p>For some other SAX source, the exact code will differ, but the |
| required operations are as above.</p> |
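<p>The same operations can be carried out by hand on any
<code>XMLReader</code>. The following sketch uses only JDK classes: the
<code>CountingHandler</code> class and the <code>install</code> and
<code>parse</code> helpers are hypothetical names, and the JDK's
<code>DefaultHandler2</code> stands in for the ARP handler so that the effect
of each setting can be observed without Jena.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class ManualInstall {
    /** Stand-in for the ARP handler: counts the elements and comments it receives. */
    static class CountingHandler extends DefaultHandler2 {
        int elements = 0;
        int comments = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }

        @Override
        public void comment(char[] ch, int start, int length) {
            comments++; // only delivered once the lexical-handler property is set
        }
    }

    /** Performs the same operations as installHandlers, one by one. */
    static void install(XMLReader rdr, DefaultHandler2 handler) throws SAXException {
        rdr.setEntityResolver(handler);
        rdr.setDTDHandler(handler);
        rdr.setContentHandler(handler);
        rdr.setErrorHandler(handler);
        rdr.setFeature("http://xml.org/sax/features/namespaces", true);
        rdr.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
        rdr.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
    }

    /** Parses the given XML text with a JAXP-supplied reader and returns the handler. */
    static CountingHandler parse(String xml) throws Exception {
        XMLReader rdr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        CountingHandler handler = new CountingHandler();
        install(rdr, handler);
        rdr.parse(new InputSource(new StringReader(xml)));
        return handler;
    }
}
```

<p>A SAX source that is not an <code>XMLReader</code> needs the equivalent of
each of these seven calls in whatever form its own API offers.</p>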
| <h2 id="error-handler">Error Handler</h2> |
| <p>The <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2Model.html">SAX2Model</a> |
| handler supports the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/RDFReader.html#setErrorHandler(org.apache.jena.rdf.model.RDFErrorHandler)">setErrorHandler</a> |
| method, from the Jena |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/RDFReader.html">RDFReader</a> |
| interface. This is used in the same way as that method to control |
| error reporting.</p> |
| <p>A specific fatal error, new in Jena 2.3, is ERR_INTERRUPTED, which |
| indicates that the current Thread received an interrupt. This |
| allows long jobs to be aborted on user request.</p> |
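<p>The general idea of aborting a parse on interrupt can be illustrated with a
plain SAX handler. This is only a sketch of the mechanism, not ARP's own code:
the class <code>InterruptAwareHandler</code> and its <code>parse</code> helper
are hypothetical, and the sketch throws a generic <code>SAXException</code>
rather than reporting <code>ERR_INTERRUPTED</code>.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that stops the parse when its thread has been interrupted. */
class InterruptAwareHandler extends DefaultHandler {
    int elements = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Thread.interrupted() also clears the flag, so the abort is reported once.
        if (Thread.interrupted()) {
            throw new SAXException("parse interrupted");
        }
        elements++;
    }

    /**
     * Returns true if the parse completed and false if it ended with a
     * SAXException (in this sketch that covers both an interrupt and malformed XML).
     */
    static boolean parse(String xml) throws Exception {
        InterruptAwareHandler handler = new InterruptAwareHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

<p>Another thread requests the abort by calling <code>interrupt()</code> on
the thread that is running the parse.</p>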
| <h2 id="options">Options</h2> |
| <p>The <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2Model.html"><code>SAX2Model</code></a> |
| handler supports the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/RDFReader.html#setProperty(java.lang.String,%20java.lang.Object)"><code>setProperty</code></a> |
| method, from the Jena |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/RDFReader.html"><code>RDFReader</code></a> |
| interface. This is used in nearly the same way to have fine-grained |
| control over ARP's behaviour, particularly over error reporting; see |
| the <a href="iohowto.html#arp_properties">I/O howto</a>. Setting SAX or |
| Xerces properties cannot be done using this method.</p> |
| <h2 id="xml-lang-and-namespaces">XML Lang and Namespaces</h2> |
| <p>If you are only treating some document subset as RDF/XML then it is |
| necessary to ensure that ARP knows the correct value for <code>xml:lang</code> |
| and desirable that it knows the correct mappings of namespace |
| prefixes.</p> |
| <p>There is a second version of the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2Model.html#create(java.lang.String,%20org.apache.jena.rdf.model.Model,%20java.lang.String)"><code>create</code></a> |
| method, which allows specification of the <code>xml:lang</code> value from the |
| outer context. If this is inappropriate, it is possible, but hard |
| work, to synthesize an appropriate SAX event.</p> |
| <p>For the namespace prefixes, it is possible to call the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html#startPrefixMapping(java.lang.String,%20java.lang.String)"><code>startPrefixMapping</code></a> |
| SAX event, before passing the other SAX events, to declare each |
| namespace, one by one. Failure to do this is permitted, but, for |
| instance, a Jena Model will then not know the (advisory) namespace |
| prefix bindings. These should be paired with endPrefixMapping |
| events, but nothing untoward is likely if such code is omitted.</p> |
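<p>Declaring the outer namespace prefixes can be sketched against the standard
<code>ContentHandler</code> interface. The helper names <code>declare</code>
and <code>undeclare</code>, the <code>Recorder</code> class and the array of
prefix and URI pairs are assumptions made for this sketch. In real use the
handler argument would be the <code>SAX2RDF</code> or <code>SAX2Model</code>
instance.</p>

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class OuterPrefixes {
    /** Declares each {prefix, namespace URI} pair on the handler, one by one. */
    static void declare(ContentHandler handler, String[][] bindings) throws SAXException {
        for (String[] binding : bindings) {
            handler.startPrefixMapping(binding[0], binding[1]);
        }
    }

    /** Ends the mappings in reverse order, once the other events have been passed. */
    static void undeclare(ContentHandler handler, String[][] bindings) throws SAXException {
        for (int i = bindings.length - 1; i >= 0; i--) {
            handler.endPrefixMapping(bindings[i][0]);
        }
    }

    /** Stand-in for the ARP handler: records the mapping events it receives. */
    static class Recorder extends DefaultHandler {
        final StringBuilder log = new StringBuilder();

        @Override
        public void startPrefixMapping(String prefix, String uri) {
            log.append("start ").append(prefix).append('=').append(uri).append('\n');
        }

        @Override
        public void endPrefixMapping(String prefix) {
            log.append("end ").append(prefix).append('\n');
        }
    }
}
```

<p>Call <code>declare</code> before forwarding the events of the RDF/XML
subset and <code>undeclare</code> after the last of them.</p>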
| <h2 id="using-your-own-triple-handler">Using your own triple handler</h2> |
| <p>As with ARP, it is possible to use this functionality without |
| using other Jena features, in particular without using a Jena |
| Model. Instead of using the class SAX2Model, you use its superclass |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html">SAX2RDF</a>. The |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html#create(java.lang.String)">create</a> |
| method on this class does not provide any means of specifying what |
| to do with the triples. Instead, the class implements the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html">ARPConfig</a> |
| interface, which permits the setting of handlers and parser |
| options, as described in the documentation for using |
| <a href="standalone.html">ARP without Jena</a>.</p> |
| <p>Thus you need to:</p> |
| <ol> |
| <li>Create a SAX2RDF using |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/SAX2RDF.html#create(java.lang.String)">SAX2RDF.create()</a></li> |
| <li>Attach your StatementHandler and SAXErrorHandler and optionally |
| your NamespaceHandler and ExtendedHandler to the SAX2RDF instance.</li> |
| <li>Install the SAX2RDF instance as the SAX handler on your SAX |
| source.</li> |
| <li>Follow the remainder of the code sample above.</li> |
| </ol> |
| <h2 id="using-a-dom-as-input">Using a DOM as Input</h2> |
| <p>None of the approaches listed here work with Java 1.4.1_04. We |
| suggest using Java 1.4.2_04 or greater for this functionality. |
| This issue has no impact on any other Jena functionality.</p> |
| <h3 id="using-a-dom-as-input-to-jena">Using a DOM as Input to Jena</h3> |
| <p>The <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/DOM2Model.html"><code>DOM2Model</code></a> |
| subclass of SAX2Model allows the parsing of a DOM using ARP. The |
| procedure to follow is:</p> |
| <ul> |
| <li>Construct a <code>DOM2Model</code>, using a factory method such as |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/DOM2Model.html#createD2M(java.lang.String,%20org.apache.jena.rdf.model.Model)"><code>createD2M</code></a>, |
| specifying the xml:base of the document to be loaded, the Model to |
| load into, and optionally the xml:lang value (particularly useful if |
| using a DOM Node from within a Document).</li> |
| <li>Set any properties, error handlers etc. on the <code>DOM2Model</code> |
| object.</li> |
| <li>The DOM is parsed simply by calling the |
| <a href="/documentation/javadoc/jena/org/apache/jena/rdf/arp/DOM2Model.html#load(org.w3c.dom.Node)"><code>load(Node)</code></a> |
| method.</li> |
| </ul> |
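<p>To see how a DOM can be fed to a SAX handler at all, the JDK's identity
transformation can replay a DOM node as SAX events. This is a sketch of the
general technique using only JDK classes, and it is not a statement of how
<code>DOM2Model</code> is implemented. The class <code>DomReplay</code>, its
helpers and the <code>Recorder</code> handler are hypothetical names.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomReplay {
    /** Replays a DOM node as SAX events on the given handler. */
    static void replay(Node node, ContentHandler handler) throws Exception {
        // With no stylesheet, newTransformer() yields the identity transformation.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new DOMSource(node), new SAXResult(handler));
    }

    /** Builds a namespace-aware DOM from a string, for demonstration only. */
    static Document parseDom(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Stand-in for the ARP handler: records the qualified names of the elements. */
    static class Recorder extends DefaultHandler {
        final StringBuilder names = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            names.append(qName).append(' ');
        }
    }
}
```

<p>With <code>DOM2Model</code> none of this is needed, because
<code>load(Node)</code> takes the DOM node directly.</p>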
| <h3 id="using-a-dom-as-input-to-arp">Using a DOM as Input to ARP</h3> |
| <p>DOM2Model is a subclass of SAX2RDF, and handlers etc. can be set on |
| the DOM2Model as for SAX2RDF. Using a null model as the argument to |
| the factory indicates this usage.</p> |
| |
| |
| </div> |
| </div> |
| |
| </div> |
| |
| <footer class="footer"> |
| <div class="container" style="font-size:80%" > |
| <p> |
| Copyright © 2011–2022 The Apache Software Foundation, Licensed under the |
| <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. |
| </p> |
| <p> |
| Apache Jena, Jena, the Apache Jena project logo, Apache and the Apache feather logos are trademarks of |
| The Apache Software Foundation. |
| <br/> |
| <a href="https://privacy.apache.org/policies/privacy-policy-public.html" |
| >Apache Software Foundation Privacy Policy</a>. |
| </p> |
| </div> |
| </footer> |
| |
| |
| <script type="text/javascript"> |
| var link = $('a[href="' + this.location.pathname + '"]'); |
| if (link != undefined) |
| link.parents('li,ul').addClass('active'); |
| </script> |
| |
| </body> |
| </html> |