<!DOCTYPE html>
<html lang="en">
<head>
    

    <title>Apache Jena - Typed literals how-to</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">

    <link href="/css/bootstrap.min.css" rel="stylesheet" media="screen">
    <link href="/css/bootstrap-extension.css" rel="stylesheet" type="text/css">
    <link href="/css/jena.css" rel="stylesheet" type="text/css">
    <link rel="shortcut icon" href="/images/favicon.ico" />

    <script src="https://code.jquery.com/jquery-2.2.4.min.js"
            integrity="sha256-BbhdlvQf/xTY9gja0Dq3HiwQF8LaCRTXxZKRutelT44="
            crossorigin="anonymous"></script>
    <script src="/js/jena-navigation.js" type="text/javascript"></script>
    <script src="/js/bootstrap.min.js" type="text/javascript"></script>

    <script src="/js/improve.js" type="text/javascript"></script>

    
</head>

<body>

<nav class="navbar navbar-default" role="navigation">
    <div class="container">
        <div class="navbar-header">
            <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-ex1-collapse">
                <span class="icon-bar"></span>
                <span class="icon-bar"></span>
                <span class="icon-bar"></span>
            </button>
            <a class="navbar-brand" href="/index.html">
                <img class="logo-menu" src="/images/jena-logo/jena-logo-notext-small.png" alt="jena logo">Apache Jena</a>
        </div>

        <div class="collapse navbar-collapse navbar-ex1-collapse">
            <ul class="nav navbar-nav">
                <li id="homepage"><a href="/index.html"><span class="glyphicon glyphicon-home"></span> Home</a></li>
                <li id="download"><a href="/download/index.cgi"><span class="glyphicon glyphicon-download-alt"></span> Download</a></li>
                <li class="dropdown">
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown"><span class="glyphicon glyphicon-book"></span> Learn <b class="caret"></b></a>
                    <ul class="dropdown-menu">
                        <li class="dropdown-header">Tutorials</li>
                        <li><a href="/tutorials/index.html">Overview</a></li>
                        <li><a href="/documentation/fuseki2/index.html">Fuseki Triplestore</a></li>
                        <li><a href="/documentation/notes/index.html">How-To's</a></li>
                        <li><a href="/documentation/query/manipulating_sparql_using_arq.html">Manipulating SPARQL using ARQ</a></li>
                        <li><a href="/tutorials/rdf_api.html">RDF core API tutorial</a></li>
                        <li><a href="/tutorials/sparql.html">SPARQL tutorial</a></li>
                        <li><a href="/tutorials/using_jena_with_eclipse.html">Using Jena with Eclipse</a></li>
                        <li class="divider"></li>
                        <li class="dropdown-header">References</li>
                        <li><a href="/documentation/index.html">Overview</a></li>
                        <li><a href="/documentation/query/index.html">ARQ (SPARQL)</a></li>
                        <li><a href="/documentation/assembler/index.html">Assembler</a></li>
                        <li><a href="/documentation/tools/index.html">Command-line tools</a></li>
                        <li><a href="/documentation/rdfs/">Data with RDFS Inferencing</a></li>
                        <li><a href="/documentation/geosparql/index.html">GeoSPARQL</a></li>
                        <li><a href="/documentation/inference/index.html">Inference API</a></li>
                        <li><a href="/documentation/javadoc.html">Javadoc</a></li>
                        <li><a href="/documentation/ontology/">Ontology API</a></li>
                        <li><a href="/documentation/permissions/index.html">Permissions</a></li>
                        <li><a href="/documentation/extras/querybuilder/index.html">Query Builder</a></li>
                        <li><a href="/documentation/rdf/index.html">RDF API</a></li>
                        <li><a href="/documentation/rdfconnection/">RDF Connection - SPARQL API</a></li>
                        <li><a href="/documentation/io/">RDF I/O</a></li>
                        <li><a href="/documentation/rdfstar/index.html">RDF-star</a></li>
                        <li><a href="/documentation/shacl/index.html">SHACL</a></li>
                        <li><a href="/documentation/shex/index.html">ShEx</a></li>
                        <li><a href="/documentation/jdbc/index.html">SPARQL over JDBC</a></li>
                        <li><a href="/documentation/tdb/index.html">TDB</a></li>
                        <li><a href="/documentation/tdb2/index.html">TDB2</a></li>
                        <li><a href="/documentation/query/text-query.html">Text Search</a></li>
                    </ul>
                </li>

                <li class="drop down">
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown"><span class="glyphicon glyphicon-book"></span> Javadoc <b class="caret"></b></a>
                    <ul class="dropdown-menu">
                        <li><a href="/documentation/javadoc.html">All Javadoc</a></li>
                        <li><a href="/documentation/javadoc/arq/">ARQ</a></li>
                        <li><a href="/documentation/javadoc_elephas.html">Elephas</a></li>
                        <li><a href="/documentation/javadoc/fuseki2/">Fuseki</a></li>
                        <li><a href="/documentation/javadoc/geosparql/">GeoSPARQL</a></li>
                        <li><a href="/documentation/javadoc/jdbc/">JDBC</a></li>
                        <li><a href="/documentation/javadoc/jena/">Jena Core</a></li>
                        <li><a href="/documentation/javadoc/permissions/">Permissions</a></li>
                        <li><a href="/documentation/javadoc/extras/querybuilder/">Query Builder</a></li>
                        <li><a href="/documentation/javadoc/shacl/">SHACL</a></li>
                        <li><a href="/documentation/javadoc/tdb/">TDB</a></li>
                        <li><a href="/documentation/javadoc/text/">Text Search</a></li>
                    </ul>
                </li>

                <li id="ask"><a href="/help_and_support/index.html"><span class="glyphicon glyphicon-question-sign"></span> Ask</a></li>

                <li class="dropdown">
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown"><span class="glyphicon glyphicon-bullhorn"></span> Get involved <b class="caret"></b></a>
                    <ul class="dropdown-menu">
                        <li><a href="/getting_involved/index.html">Contribute</a></li>
                        <li><a href="/help_and_support/bugs_and_suggestions.html">Report a bug</a></li>
                        <li class="divider"></li>
                        <li class="dropdown-header">Project</li>
                        <li><a href="/about_jena/about.html">About Jena</a></li>
                        <li><a href="/about_jena/architecture.html">Architecture</a></li>
                        <li><a href="/about_jena/citing.html">Citing</a></li>
                        <li><a href="/about_jena/team.html">Project team</a></li>
                        <li><a href="/about_jena/contributions.html">Related projects</a></li>
                        <li><a href="/about_jena/roadmap.html">Roadmap</a></li>
                        <li class="divider"></li>
                        <li class="dropdown-header">ASF</li>
                        <li><a href="http://www.apache.org/">Apache Software Foundation</a></li>
                        <li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li>
                        <li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
                        <li><a href="http://www.apache.org/security/">Security</a></li>
                        <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
                    </ul>
                </li>


    

            </ul>
        </div>
    </div>
</nav>


<div class="container">
    <div class="row">
        <div class="col-md-12">
            <div id="breadcrumbs">
                
                    









                
            </div>
            <h1 class="title">Typed literals how-to</h1>
            
	<h2 id="what-are-typed-literals">What are typed literals?</h2>
<p>In the original RDF specifications there were two types of literal
values defined - plain literals (which are basically strings with
an optional language tag) and XML literals (which are more or less
plain literals plus a &ldquo;well-formed-xml&rdquo; flag).</p>
<p>Part of the remit for the 2001
<a href="http://www.w3.org/2001/sw/RDFCore/">RDF Core</a> working group was to
add to RDF support for typed values, i.e. things like numbers.
These notes describe the support for typed literals in
Jena2.</p>
<p>Before going into the Jena details here are some informal reminders
of how typed literals work in RDF. We refer readers to the RDF core
<a href="http://www.w3.org/TR/rdf-mt/">semantics</a>,
<a href="http://www.w3.org/TR/rdf-syntax-grammar">syntax</a> and
<a href="http://www.w3.org/TR/rdf-concepts/">concepts</a> documents for more
precise details.</p>
<p>In RDF, typed literal values comprise a string (the lexical form of
the literal) and a datatype (identified by a URI). The datatype is
supposed to denote a mapping from lexical forms to some space of
values. The pair comprising the literal then denotes an element of
the value space of the datatype. For example, a typed literal
comprising <code>(&quot;true&quot;, xsd:boolean)</code> would denote the abstract true
value <code>T</code>.</p>
<p>In the RDF/XML syntax typed literals are notated with syntax such
as:</p>
<pre><code>&lt;age rdf:datatype=&quot;http://www.w3.org/2001/XMLSchema#int&quot;&gt;13&lt;/age&gt;
</code></pre>
<p>In N-Triples syntax the notation is:</p>
<pre><code>&quot;13&quot;^^&lt;http://www.w3.org/2001/XMLSchema#int&gt;
</code></pre>
<p>In Turtle, it can be abbreviated:</p>
<pre><code>&quot;13&quot;^^xsd:int
</code></pre>
<p>This <code>^^</code> notation will appear in literals printed by Jena.</p>
<p>Note that a literal is either typed or plain (an old style literal)
and which it is can be determined statically. There is no way to
define a literal as having a lexical value of, say &ldquo;13&rdquo; but leave
its datatype open and then infer the datatype from some schema or
ontology definition.</p>
<p>In the new scheme of things well-formed XML literals are treated as
typed literals whose datatype is the special type
<code>rdf:XMLLiteral</code>.</p>
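<p>As a minimal sketch of the notation above, the following reads a
one-triple Turtle document and inspects the literal it contains. The
<code>ex:</code> namespace, the <code>ex:age</code> property and the class name are
made up for illustration, and Jena is assumed to be on the classpath.</p>
<pre><code>import java.io.StringReader;

import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.rdf.model.StmtIterator;

public class ReadTypedLiteral {
    // Hypothetical example data: ex:alice and ex:age are illustrative names only.
    static final String TURTLE =
        &quot;@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .\n&quot; +
        &quot;@prefix ex: &lt;http://example.org/&gt; .\n&quot; +
        &quot;ex:alice ex:age \&quot;13\&quot;^^xsd:int .\n&quot;;

    /** Parses the Turtle above and returns the single literal object it contains. */
    static Literal readAge() {
        Model model = ModelFactory.createDefaultModel();
        model.read(new StringReader(TURTLE), null, &quot;TURTLE&quot;);
        StmtIterator it = model.listStatements();
        Statement s = it.nextStatement();
        it.close();
        return s.getObject().asLiteral();
    }

    public static void main(String[] args) {
        Literal age = readAge();
        // The lexical form and the datatype URI are the two halves of a typed literal.
        System.out.println(&quot;lexical form: &quot; + age.getLexicalForm());
        System.out.println(&quot;datatype URI: &quot; + age.getDatatypeURI());
    }
}
</code></pre>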
<h2 id="basic-api-operations">Basic API operations</h2>
<p>Jena will correctly parse typed literals within RDF/XML, N-Triples
and Turtle source files. The same Java object,
<a href="/documentation/javadoc/jena/org/apache/jena/rdf/model/Literal.html"><code>Literal</code></a>,
will represent &ldquo;plain&rdquo; and &ldquo;typed&rdquo; literals. Literal now supports
some new methods:</p>
<ul>
<li>
<p><code>getDatatype()</code>
Returns null for a plain literal or a Java object which represents
the datatype of a typed Literal.</p>
</li>
<li>
<p><code>getDatatypeURI()</code>
Returns null for a plain literal or the URI of the datatype of a
typed Literal.</p>
</li>
<li>
<p><code>getValue()</code>
Returns a Java object representing the value of the literal, for
example for an xsd:int this will be a java.lang.Integer, for plain
literals it will be a String.</p>
</li>
</ul>
<p>The converse operation, creating a typed literal in a model from a
lexical form or a Java object, can be achieved using:</p>
<ul>
<li>
<p><code>model.createTypedLiteral(value, datatype)</code>
This allows the <code>value</code> to be specified by a lexical form (i.e. a
String) or by a Java object representing the typed value; the
<code>datatype</code> can be specified by a URI string or a Java object
representing the datatype.</p>
</li>
</ul>
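<p>The operations listed above can be sketched as follows. This is an
illustration rather than a definitive recipe: the class and helper method
names are invented here, and only the <code>Model</code> and <code>Literal</code> calls
come from Jena.</p>
<pre><code>import org.apache.jena.datatypes.RDFDatatype;
import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class TypedLiteralBasics {
    static final String XSD_INT = &quot;http://www.w3.org/2001/XMLSchema#int&quot;;

    /** Value given as a lexical form, datatype given as a Java object. */
    static Literal fromLexicalForm(Model m) {
        return m.createTypedLiteral(&quot;13&quot;, XSDDatatype.XSDint);
    }

    /** Value given as a lexical form, datatype given as a URI string. */
    static Literal fromUri(Model m) {
        return m.createTypedLiteral(&quot;13&quot;, XSD_INT);
    }

    /** Value given as a Java object, datatype given as a Java object. */
    static Literal fromValue(Model m) {
        return m.createTypedLiteral(Integer.valueOf(13), XSDDatatype.XSDint);
    }

    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        Literal[] literals = { fromLexicalForm(m), fromUri(m), fromValue(m) };
        for (Literal l : literals) {
            RDFDatatype dt = l.getDatatype();
            // Print lexical form, datatype URI and the Java class of the value.
            System.out.println(l.getLexicalForm() + &quot; : &quot; + dt.getURI()
                + &quot; : &quot; + l.getValue().getClass().getName());
        }
    }
}
</code></pre>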
<p>In addition there is a built in mapping from standard Java wrapper
objects to XSD datatypes (see later) so that the simpler call:</p>
<pre><code>model.createTypedLiteral(Object)
</code></pre>
<p>will create a typed literal with the datatype appropriate for
representing that Java object. For example,</p>
<pre><code>Literal l = model.createTypedLiteral(Integer.valueOf(25));
</code></pre>
<p>will create a typed literal with the lexical value &ldquo;25&rdquo;, of type
<code>xsd:int</code>.</p>
<p>Note that there are also functions which look similar but do not
use typed literals. For example:</p>
<pre><code>Literal l = model.createLiteral(25);
int age = l.getInt();
</code></pre>
<p>These work by converting the primitive to a string and storing
the resulting string as a plain literal. The inverse operation then
attempts to parse the string of the plain literal (as an int in
this example). These are for backward compatibility with earlier
versions of Jena and older datasets. In normal circumstances
<code>createTypedLiteral</code> is preferable.</p>
<h3 id="equality-issues">Equality issues</h3>
<p>There is a well defined notion of when two typed literals should be
equal, based on the equality defined for the datatype in question.
Jena2 implements this equality function by using the method
<code>sameValueAs</code>. Thus two literals (&ldquo;13&rdquo;, xsd:int) and (&ldquo;13&rdquo;,
xsd:decimal) will test as sameValueAs each other but neither will
test sameValueAs (&ldquo;13&rdquo;, xsd:string).</p>
<p>Note that this is a different function from the Java <code>equals</code>
method. Had we changed the <code>equals</code> method to test for semantic
equality, problems would have arisen because the two objects are not
substitutable in the Java sense (for example they return different
values from a <code>getDatatype()</code> call). This would, for example, have
made it impossible to cache literals in a hash table.</p>
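<p>A small sketch of the difference between the two notions of equality,
using the same three literals as the text above. The class and helper
names are illustrative only.</p>
<pre><code>import org.apache.jena.datatypes.RDFDatatype;
import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class LiteralEquality {
    private static final Model M = ModelFactory.createDefaultModel();

    /** Builds the literal (&quot;13&quot;, dt). */
    static Literal thirteen(RDFDatatype dt) {
        return M.createTypedLiteral(&quot;13&quot;, dt);
    }

    /** Semantic (value) equality between (&quot;13&quot;, a) and (&quot;13&quot;, b). */
    static boolean sameValue(RDFDatatype a, RDFDatatype b) {
        return thirteen(a).sameValueAs(thirteen(b));
    }

    /** Java object equality between (&quot;13&quot;, a) and (&quot;13&quot;, b). */
    static boolean javaEquals(RDFDatatype a, RDFDatatype b) {
        return thirteen(a).equals(thirteen(b));
    }

    public static void main(String[] args) {
        System.out.println(&quot;int vs decimal, sameValueAs: &quot;
            + sameValue(XSDDatatype.XSDint, XSDDatatype.XSDdecimal));
        System.out.println(&quot;int vs decimal, equals:      &quot;
            + javaEquals(XSDDatatype.XSDint, XSDDatatype.XSDdecimal));
        System.out.println(&quot;int vs string, sameValueAs:  &quot;
            + sameValue(XSDDatatype.XSDint, XSDDatatype.XSDstring));
    }
}
</code></pre>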
<h2 id="how-datatypes-are-represented">How datatypes are represented</h2>
<p>Datatypes for typed literals are represented by instances of the
interface
<a href="/documentation/javadoc/jena/org/apache/jena/datatypes/RDFDatatype.html"><code>org.apache.jena.datatypes.RDFDatatype</code></a>.
Instances of this interface can be used to parse and serialize
typed data, test for equality and test if a typed or lexical value
is a legal value for this datatype.</p>
<p>Prebuilt instances of this interface are included for all the main
XSD datatypes (see <a href="#xsd">below</a>).</p>
<p>In addition, it is possible for an application to define new
datatypes and register them against some URI (see
<a href="#userdef">below</a>).</p>
<h3 id="error-detection">Error detection</h3>
<p>When Jena parses a typed literal whose lexical value is not legal for
the declared datatype it does not immediately throw an error. This
is because the RDFCore working group has defined that illegal
datatype values are errors but are not syntactic errors, so we try
to avoid having parsers break at this point. Instead a literal is
created which is marked internally as ill-formed and the first time
an application attempts to access its value (with <code>getValue()</code>) an
error will be thrown.</p>
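<p>The deferred error can be sketched like this, assuming the default
settings are in force. The class and method names are illustrative; the
exception type is Jena&rsquo;s <code>DatatypeFormatException</code>.</p>
<pre><code>import org.apache.jena.datatypes.DatatypeFormatException;
import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class LazyValidation {
    /**
     * Creates lexicalForm^^xsd:int and reports whether reading its value fails.
     * Under default settings the creation step itself is not expected to throw.
     */
    static boolean valueAccessFails(String lexicalForm) {
        Model m = ModelFactory.createDefaultModel();
        Literal l = m.createTypedLiteral(lexicalForm, XSDDatatype.XSDint);
        try {
            l.getValue();   // validation of the lexical form happens here
            return false;
        } catch (DatatypeFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(&quot;13  fails on getValue(): &quot; + valueAccessFails(&quot;13&quot;));
        System.out.println(&quot;abc fails on getValue(): &quot; + valueAccessFails(&quot;abc&quot;));
        // A lexical form can also be checked up front, without creating a literal.
        System.out.println(&quot;abc is a valid xsd:int:  &quot; + XSDDatatype.XSDint.isValid(&quot;abc&quot;));
    }
}
</code></pre>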
<p>When Jena is reading a file there is also the issue of what to do
when it encounters a typed value whose datatype URI is not one that
it knows about. The default behaviour is to create a new datatype
object (whose value space is the same as its lexical space). Again
this behaviour seems in keeping with the working group preference
that illegal datatypes are semantic but not syntactic errors.</p>
<p>However, both of these behaviours can mean that simple common
errors (such as mis-spelling the xsd namespace) may go unnoticed
until very late on. To overcome this we have provided some global
switches that allow you to force Jena to report such errors
earlier. These are static boolean parameters:</p>
<pre><code>org.apache.jena.shared.impl.JenaParameters.enableEagerLiteralValidation
org.apache.jena.shared.impl.JenaParameters.enableSilentAcceptanceOfUnknownDatatypes
</code></pre>
<p>They are placed here in an impl package (and thus only visible in
the full javadoc, not the API javadoc) because they should not be
regarded as stable. We plan to develop a cleaner way of setting
mode switches for Jena and these switches will migrate there in due
course, if they prove to be useful.</p>
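<p>As a hedged sketch of how the first switch might be used: the example
below turns eager validation on for a single call and restores the previous
setting afterwards. The class and method names are invented for
illustration, and since the switches are described above as unstable their
exact behaviour may differ between Jena versions.</p>
<pre><code>import org.apache.jena.datatypes.DatatypeFormatException;
import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.shared.impl.JenaParameters;

public class EagerValidation {
    /**
     * Returns true if creating lexicalForm^^xsd:int is rejected immediately
     * while eager literal validation is switched on.
     */
    static boolean rejectedAtCreation(String lexicalForm) {
        boolean saved = JenaParameters.enableEagerLiteralValidation;
        JenaParameters.enableEagerLiteralValidation = true;
        try {
            ModelFactory.createDefaultModel()
                .createTypedLiteral(lexicalForm, XSDDatatype.XSDint);
            return false;
        } catch (DatatypeFormatException e) {
            return true;
        } finally {
            // The switch is global, so restore it for the rest of the application.
            JenaParameters.enableEagerLiteralValidation = saved;
        }
    }

    public static void main(String[] args) {
        System.out.println(&quot;13  rejected at creation: &quot; + rejectedAtCreation(&quot;13&quot;));
        System.out.println(&quot;abc rejected at creation: &quot; + rejectedAtCreation(&quot;abc&quot;));
    }
}
</code></pre>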
<h2 id="xsd-data-types">XSD data types</h2>
<p>Jena includes prebuilt, and pre-registered, instances of
<code>RDFDatatype</code> for all of the relevant XSD types:</p>
<pre><code>float double int long short byte unsignedByte unsignedShort
unsignedInt unsignedLong decimal integer nonPositiveInteger
nonNegativeInteger positiveInteger negativeInteger boolean string
normalizedString anyURI token Name QName language NMTOKEN ENTITIES
NMTOKENS ENTITY ID NCName IDREF IDREFS NOTATION hexBinary
base64Binary date time dateTime duration gDay gMonth gYear
gYearMonth gMonthDay
</code></pre>
<p>These are all available as static member variables from
<a href="/documentation/javadoc/jena/org/apache/jena/datatypes/xsd/XSDDatatype.html"><code>org.apache.jena.datatypes.xsd.XSDDatatype</code></a>.</p>
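<p>For instance, the static member for <code>xsd:int</code> can be used directly to
validate, parse and serialize values. This is a small sketch with an
illustrative class name; the same calls apply to the other members.</p>
<pre><code>import org.apache.jena.datatypes.RDFDatatype;
import org.apache.jena.datatypes.xsd.XSDDatatype;

public class XsdDatatypeMembers {
    /** Tests whether a lexical form is legal for xsd:int. */
    static boolean isValidInt(String lexicalForm) {
        return XSDDatatype.XSDint.isValid(lexicalForm);
    }

    public static void main(String[] args) {
        RDFDatatype dt = XSDDatatype.XSDint;
        System.out.println(&quot;URI: &quot; + dt.getURI());
        System.out.println(&quot;13 valid:   &quot; + isValidInt(&quot;13&quot;));
        System.out.println(&quot;13.5 valid: &quot; + isValidInt(&quot;13.5&quot;));

        // parse maps a lexical form to a Java value; unparse goes the other way.
        Object value = dt.parse(&quot;13&quot;);
        System.out.println(value.getClass().getName() + &quot; -&gt; &quot; + dt.unparse(value));
    }
}
</code></pre>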
<p>Of these types, the following are registered as the default type to
use to represent certain Java classes:</p>
<table>
<thead>
<tr>
<th>Java class</th>
<th>xsd type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Float</td>
<td>float</td>
</tr>
<tr>
<td>Double</td>
<td>double</td>
</tr>
<tr>
<td>Integer</td>
<td>int</td>
</tr>
<tr>
<td>Long</td>
<td>long</td>
</tr>
<tr>
<td>Short</td>
<td>short</td>
</tr>
<tr>
<td>Byte</td>
<td>byte</td>
</tr>
<tr>
<td>BigInteger</td>
<td>integer</td>
</tr>
<tr>
<td>BigDecimal</td>
<td>decimal</td>
</tr>
<tr>
<td>Boolean</td>
<td>boolean</td>
</tr>
<tr>
<td>String</td>
<td>string</td>
</tr>
</tbody>
</table>
<p>Thus when creating a typed literal from a Java <code>BigInteger</code>,
<code>xsd:integer</code> will be used. The converse mapping is more adaptive.
When parsing an xsd:integer the Java value object used will be an
Integer, Long or BigInteger depending on the size of the specific
value being represented.</p>
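<p>Both directions of the mapping can be sketched as follows. The class and
helper names are illustrative, and the sample values were chosen here to
fall into the int, long and beyond-long ranges.</p>
<pre><code>import java.math.BigInteger;

import org.apache.jena.datatypes.xsd.XSDDatatype;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class JavaClassMapping {
    private static final Model M = ModelFactory.createDefaultModel();

    /** Datatype URI chosen by default for a Java value. */
    static String datatypeFor(Object value) {
        return M.createTypedLiteral(value).getDatatypeURI();
    }

    /** Name of the Java class used to hold the value of lexicalForm^^xsd:integer. */
    static String integerValueClass(String lexicalForm) {
        return M.createTypedLiteral(lexicalForm, XSDDatatype.XSDinteger)
                .getValue().getClass().getName();
    }

    public static void main(String[] args) {
        // Java class to XSD type
        System.out.println(datatypeFor(Integer.valueOf(25)));
        System.out.println(datatypeFor(new BigInteger(&quot;25&quot;)));
        System.out.println(datatypeFor(Boolean.TRUE));

        // xsd:integer to Java class, depending on magnitude
        System.out.println(integerValueClass(&quot;13&quot;));
        System.out.println(integerValueClass(&quot;5000000000&quot;));
        System.out.println(integerValueClass(&quot;123456789012345678901234567890&quot;));
    }
}
</code></pre>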
<h2 id="user-defined-xsd-data-types">User defined XSD data types</h2>
<p>XML schema allows derived types to be defined in which a base type
is modified through some facet restriction such as limiting the
min/max of an integer or restricting a string to a regular
expression. It also allows new types to be created by unioning
other types or by constructing lists of other types.</p>
<p>Jena provides support for derived and union types but not for list
types.</p>
<p>These are supported through the <code>XSDDatatype.loadUserDefined</code>
method which allows an XML schema datatype file to be loaded. This
registers a new <code>RDFDatatype</code> that can be used to create, parse,
serialize and test instances of that datatype.</p>
<p>There is one difficult issue here: what URI should be given to the
user defined datatype? This is not defined by XML Schema, nor RDF nor
OWL. Jena2 adopts the position that the defined
datatype should have the base URI of the schema file with a
fragment identifier given by the datatype name.</p>
<p>To illustrate working with the defined types, the following code
tries to create and use two instances of an <code>over12</code> type, assuming
the schema that defines it has already been loaded:</p>
<pre><code>// Assumes tm is the TypeMapper the schema's datatypes were loaded into and
// uri is the base URI of that schema file; neither is defined in this fragment.
Model m = ModelFactory.createDefaultModel();
RDFDatatype over12Type = tm.getSafeTypeByName(uri + &quot;#over12&quot;);
Object value = null;
try {
    value = &quot;15&quot;;
    // getValue() forces the lexical form to be checked against the datatype
    m.createTypedLiteral((String)value, over12Type).getValue();
    System.out.println(&quot;Over 12 value of &quot; + value + &quot; is OK&quot;);
    value = &quot;12&quot;;
    m.createTypedLiteral((String)value, over12Type).getValue();
    System.out.println(&quot;Over 12 value of &quot; + value + &quot; is OK&quot;);
} catch (DatatypeFormatException e) {
    System.out.println(&quot;Over 12 value of &quot; + value + &quot; is illegal&quot;);
}
</code></pre>
<p>which produces the output:</p>
<pre><code>Over 12 value of 15 is OK
Over 12 value of 12 is illegal
</code></pre>
<h2 id="user-defined-non-xsd-data-types">User defined non-XSD data types</h2>
<p>RDF allows any URI to be used as a datatype but provides no
standard for how to map the datatype URI to a datatype definition.</p>
<p>Within Jena2 we allow new datatypes to be created and registered by
using the
<a href="/documentation/javadoc/jena/org/apache/jena/datatypes/TypeMapper.html"><code>TypeMapper</code></a>
class.</p>
<p>The easiest way to define a new RDFDatatype is to subclass
BaseDatatype and define implementations for parse, unparse and
isEqual.</p>
<p>For example here is the outline of a type used to represent
rational numbers:</p>
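<pre><code>import org.apache.jena.datatypes.BaseDatatype;
import org.apache.jena.datatypes.DatatypeFormatException;
import org.apache.jena.datatypes.RDFDatatype;
import org.apache.jena.graph.impl.LiteralLabel;

class RationalType extends BaseDatatype {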
    public static final String theTypeURI = &quot;urn:x-hp-dt:rational&quot;;
    public static final RDFDatatype theRationalType = new RationalType();

    /** private constructor - single global instance */
    private RationalType() {
        super(theTypeURI);
    }

    /**
     * Convert a value of this datatype out
     * to lexical form.
     */
    public String unparse(Object value) {
        Rational r = (Rational) value;
        return Integer.toString(r.getNumerator()) + &quot;/&quot; + r.getDenominator();
    }

    /**
     * Parse a lexical form of this datatype to a value
     * @throws DatatypeFormatException if the lexical form is not legal
     */
    public Object parse(String lexicalForm) throws DatatypeFormatException {
        int index = lexicalForm.indexOf(&quot;/&quot;);
        if (index == -1) {
            throw new DatatypeFormatException(lexicalForm, theRationalType, &quot;&quot;);
        }
        try {
            int numerator = Integer.parseInt(lexicalForm.substring(0, index));
            int denominator = Integer.parseInt(lexicalForm.substring(index+1));
            return new Rational(numerator, denominator);
        } catch (NumberFormatException e) {
            throw new DatatypeFormatException(lexicalForm, theRationalType, &quot;&quot;);
        }
    }

    /**
     * Compares two instances of values of the given datatype.
     * This does not allow rationals to be compared to other number
     * formats, lang tag is not significant.
     */
    public boolean isEqual(LiteralLabel value1, LiteralLabel value2) {
        return value1.getDatatype() == value2.getDatatype()
             &amp;&amp; value1.getValue().equals(value2.getValue());
    }
}
</code></pre>
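<p>The outline above relies on a <code>Rational</code> value class that is not shown.
One possible shape for it is sketched below, as an assumption rather than
the class the outline was written against. It keeps the numerator and
denominator as given, so the lexical form round-trips unchanged, and
compares by cross-multiplication, so 1/2 and 2/4 count as equal.</p>
<pre><code>public final class Rational {
    private final int numerator;
    private final int denominator;

    public Rational(int numerator, int denominator) {
        if (denominator == 0) {
            throw new IllegalArgumentException(&quot;denominator must not be zero&quot;);
        }
        this.numerator = numerator;
        this.denominator = denominator;
    }

    public int getNumerator() { return numerator; }

    public int getDenominator() { return denominator; }

    /** Value equality: a/b equals c/d when a*d == c*b (computed in long to avoid overflow). */
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Rational)) {
            return false;
        }
        Rational r = (Rational) other;
        return (long) numerator * r.denominator == (long) r.numerator * denominator;
    }

    /** Hash of the reduced, sign-normalised fraction so that equal values hash alike. */
    @Override
    public int hashCode() {
        long n = numerator;
        long d = denominator;
        long g = gcd(Math.abs(n), Math.abs(d));
        if (d &lt; 0) {
            n = -n;
            d = -d;
        }
        return Long.hashCode(31 * (n / g) + d / g);
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    @Override
    public String toString() {
        return numerator + &quot;/&quot; + denominator;
    }
}
</code></pre>
<p>Because <code>isEqual</code> in the outline delegates to <code>equals</code> on the values, the
choice made here decides whether <code>&quot;1/2&quot;</code> and <code>&quot;2/4&quot;</code> are the same value
under <code>sameValueAs</code>.</p>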
<p>To register and use this type you simply need the call:</p>
<pre><code>RDFDatatype rtype = RationalType.theRationalType;
TypeMapper.getInstance().registerDatatype(rtype);
...
// Create a rational literal
Literal l1 = m.createTypedLiteral(&quot;3/5&quot;, rtype);
</code></pre>
<p>Note that whilst any serialization of RDF containing such user
defined literals will be perfectly legal, a client application has
no standard way of looking up the datatype URI you have chosen.
This has to be done &ldquo;out of band&rdquo;, as they say.</p>
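<p>Within a single JVM the registration can at least be looked up again
through the <code>TypeMapper</code>. The sketch below uses a plain <code>BaseDatatype</code>
and a made-up URI purely for illustration; a receiving application would
still need to perform an equivalent registration itself.</p>
<pre><code>import org.apache.jena.datatypes.BaseDatatype;
import org.apache.jena.datatypes.RDFDatatype;
import org.apache.jena.datatypes.TypeMapper;
import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class RegisterDatatype {
    // Hypothetical datatype URI, chosen only for this example.
    static final String TYPE_URI = &quot;urn:x-example:opaque&quot;;

    /** Registers a minimal datatype under TYPE_URI and returns it. */
    static RDFDatatype register() {
        RDFDatatype dt = new BaseDatatype(TYPE_URI);
        TypeMapper.getInstance().registerDatatype(dt);
        return dt;
    }

    public static void main(String[] args) {
        RDFDatatype registered = register();

        // Look the datatype up again by URI, as a parser would when reading data.
        RDFDatatype found = TypeMapper.getInstance().getTypeByName(TYPE_URI);
        System.out.println(&quot;same instance: &quot; + (found == registered));

        Model m = ModelFactory.createDefaultModel();
        Literal l = m.createTypedLiteral(&quot;anything&quot;, TYPE_URI);
        // Only the lexical form and the URI travel with the serialized literal.
        System.out.println(l.getLexicalForm() + &quot; : &quot; + l.getDatatypeURI());
    }
}
</code></pre>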
<h2 id="a-note-on-xmllang">A note on xml:lang</h2>
<p>Plain literals have an xml:lang tag as well as a string value. Two
plain literals with the same string but different lang tags are not
equal.</p>
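<p>A brief sketch of that rule, with an illustrative class name and sample
strings chosen here:</p>
<pre><code>import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class LangTags {
    /** Compares two plain literals with the same string and the given lang tags. */
    static boolean sameLiteral(String text, String lang1, String lang2) {
        Model m = ModelFactory.createDefaultModel();
        Literal a = m.createLiteral(text, lang1);
        Literal b = m.createLiteral(text, lang2);
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(&quot;en vs en: &quot; + sameLiteral(&quot;chat&quot;, &quot;en&quot;, &quot;en&quot;));
        System.out.println(&quot;en vs fr: &quot; + sameLiteral(&quot;chat&quot;, &quot;en&quot;, &quot;fr&quot;));
    }
}
</code></pre>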
<p>XML Schema states that xml:lang is not meaningful on xsd
datatypes.</p>
<p>Thus for almost all typed literals there is no xml:lang tag.</p>
<p>At the time of last call the RDF specifications allowed the special
case that <code>rdf:XMLLiteral</code>s could have a lang tag that would be
significant in equality testing. Thus in preview releases of Jena2
the <code>createTypedLiteral</code> calls took an extra lang tag argument.</p>
<p>However, at the time of writing that specification has been changed
so that lang tags will never be significant on typed literals
(whether this means that xml:lang is not significant on XMLLiterals
or means that XMLLiteral will cease to be a typed literal is not
completely certain).</p>
<p>For this reason we have removed the lang tag from the
<code>createTypedLiteral</code> calls and deprecated the <code>createLiteral</code> call
which allowed both wellFormedXML and lang tag to be specified.</p>
<p>We do not expect to need to change the API even if the working
group decision changes again; the most we might expect to do would
be to undeprecate the 3-argument version of <code>createLiteral</code>.</p>


        </div>
    </div>

</div>

<footer class="footer">
    <div class="container" style="font-size:80%" >
        <p>
            Copyright &copy; 2011&ndash;2022 The Apache Software Foundation, Licensed under the
            <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
        </p>
        <p>
            Apache Jena, Jena, the Apache Jena project logo, Apache and the Apache feather logos are trademarks of
            The Apache Software Foundation.
            <br/>
          <a href="https://privacy.apache.org/policies/privacy-policy-public.html"
             >Apache Software Foundation Privacy Policy</a>.
        </p>
    </div>
</footer>


<script type="text/javascript">
    var link = $('a[href="' + this.location.pathname + '"]');
    if (link != undefined)
        link.parents('li,ul').addClass('active');
</script>

</body>
</html>
