| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| |
| |
| <title>Apache Jena - SDB Database Notes</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| |
| <link href="/css/bootstrap.min.css" rel="stylesheet" media="screen"> |
| <link href="/css/bootstrap-extension.css" rel="stylesheet" type="text/css"> |
| <link href="/css/jena.css" rel="stylesheet" type="text/css"> |
| <link rel="shortcut icon" href="/images/favicon.ico" /> |
| |
| <script src="https://code.jquery.com/jquery-2.2.4.min.js" |
| integrity="sha256-BbhdlvQf/xTY9gja0Dq3HiwQF8LaCRTXxZKRutelT44=" |
| crossorigin="anonymous"></script> |
| <script src="/js/jena-navigation.js" type="text/javascript"></script> |
| <script src="/js/bootstrap.min.js" type="text/javascript"></script> |
| |
| <script src="/js/improve.js" type="text/javascript"></script> |
| |
| |
| </head> |
| |
| <body> |
| |
| <nav class="navbar navbar-default" role="navigation"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-ex1-collapse"> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| <a class="navbar-brand" href="/index.html"> |
| <img class="logo-menu" src="/images/jena-logo/jena-logo-notext-small.png" alt="jena logo">Apache Jena</a> |
| </div> |
| |
| <div class="collapse navbar-collapse navbar-ex1-collapse"> |
| <ul class="nav navbar-nav"> |
| <li id="homepage"><a href="/index.html"><span class="glyphicon glyphicon-home"></span> Home</a></li> |
| <li id="download"><a href="/download/index.cgi"><span class="glyphicon glyphicon-download-alt"></span> Download</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown"><span class="glyphicon glyphicon-book"></span> Learn <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li class="dropdown-header">Tutorials</li> |
| <li><a href="/tutorials/index.html">Overview</a></li> |
| <li><a href="/documentation/fuseki2/index.html">Fuseki Triplestore</a></li> |
| <li><a href="/documentation/notes/index.html">How-To's</a></li> |
| <li><a href="/documentation/query/manipulating_sparql_using_arq.html">Manipulating SPARQL using ARQ</a></li> |
| <li><a href="/tutorials/rdf_api.html">RDF core API tutorial</a></li> |
| <li><a href="/tutorials/sparql.html">SPARQL tutorial</a></li> |
| <li><a href="/tutorials/using_jena_with_eclipse.html">Using Jena with Eclipse</a></li> |
| <li class="divider"></li> |
| <li class="dropdown-header">References</li> |
| <li><a href="/documentation/index.html">Overview</a></li> |
| <li><a href="/documentation/query/index.html">ARQ (SPARQL)</a></li> |
| <li><a href="/documentation/assembler/index.html">Assembler</a></li> |
| <li><a href="/documentation/tools/index.html">Command-line tools</a></li> |
| <li><a href="/documentation/rdfs/">Data with RDFS Inferencing</a></li> |
| <li><a href="/documentation/geosparql/index.html">GeoSPARQL</a></li> |
| <li><a href="/documentation/inference/index.html">Inference API</a></li> |
| <li><a href="/documentation/javadoc.html">Javadoc</a></li> |
| <li><a href="/documentation/ontology/">Ontology API</a></li> |
| <li><a href="/documentation/permissions/index.html">Permissions</a></li> |
| <li><a href="/documentation/extras/querybuilder/index.html">Query Builder</a></li> |
| <li><a href="/documentation/rdf/index.html">RDF API</a></li> |
| <li><a href="/documentation/rdfconnection/">RDF Connection - SPARQL API</a></li> |
| <li><a href="/documentation/io/">RDF I/O</a></li> |
| <li><a href="/documentation/rdfstar/index.html">RDF-star</a></li> |
| <li><a href="/documentation/shacl/index.html">SHACL</a></li> |
| <li><a href="/documentation/shex/index.html">ShEx</a></li> |
| <li><a href="/documentation/jdbc/index.html">SPARQL over JDBC</a></li> |
| <li><a href="/documentation/tdb/index.html">TDB</a></li> |
| <li><a href="/documentation/tdb2/index.html">TDB2</a></li> |
| <li><a href="/documentation/query/text-query.html">Text Search</a></li> |
| </ul> |
| </li> |
| |
| <li class="drop down"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown"><span class="glyphicon glyphicon-book"></span> Javadoc <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="/documentation/javadoc.html">All Javadoc</a></li> |
| <li><a href="/documentation/javadoc/arq/">ARQ</a></li> |
| <li><a href="/documentation/javadoc_elephas.html">Elephas</a></li> |
| <li><a href="/documentation/javadoc/fuseki2/">Fuseki</a></li> |
| <li><a href="/documentation/javadoc/geosparql/">GeoSPARQL</a></li> |
| <li><a href="/documentation/javadoc/jdbc/">JDBC</a></li> |
| <li><a href="/documentation/javadoc/jena/">Jena Core</a></li> |
| <li><a href="/documentation/javadoc/permissions/">Permissions</a></li> |
| <li><a href="/documentation/javadoc/extras/querybuilder/">Query Builder</a></li> |
| <li><a href="/documentation/javadoc/shacl/">SHACL</a></li> |
| <li><a href="/documentation/javadoc/tdb/">TDB</a></li> |
| <li><a href="/documentation/javadoc/text/">Text Search</a></li> |
| </ul> |
| </li> |
| |
| <li id="ask"><a href="/help_and_support/index.html"><span class="glyphicon glyphicon-question-sign"></span> Ask</a></li> |
| |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown"><span class="glyphicon glyphicon-bullhorn"></span> Get involved <b class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="/getting_involved/index.html">Contribute</a></li> |
| <li><a href="/help_and_support/bugs_and_suggestions.html">Report a bug</a></li> |
| <li class="divider"></li> |
| <li class="dropdown-header">Project</li> |
| <li><a href="/about_jena/about.html">About Jena</a></li> |
| <li><a href="/about_jena/architecture.html">Architecture</a></li> |
| <li><a href="/about_jena/citing.html">Citing</a></li> |
| <li><a href="/about_jena/team.html">Project team</a></li> |
| <li><a href="/about_jena/contributions.html">Related projects</a></li> |
| <li><a href="/about_jena/roadmap.html">Roadmap</a></li> |
| <li class="divider"></li> |
| <li class="dropdown-header">ASF</li> |
| <li><a href="http://www.apache.org/">Apache Software Foundation</a></li> |
| <li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li> |
| <li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li> |
| <li><a href="http://www.apache.org/security/">Security</a></li> |
| <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> |
| </ul> |
| </li> |
| |
| |
| |
| |
| <li id="edit"><a href="https://github.com/apache/jena-site/edit/main/source/documentation/archive/sdb/db_notes.md" title="Edit this page on GitHub"><span class="glyphicon glyphicon-pencil"></span> Edit this page</a></li> |
| </ul> |
| </div> |
| </div> |
| </nav> |
| |
| |
| <div class="container"> |
| <div class="row"> |
| <div class="col-md-12"> |
| <div id="breadcrumbs"> |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| <ol class="breadcrumb">
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| <li><a href='/documentation'>DOCUMENTATION</a></li>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| <li><a href='/documentation/archive'>ARCHIVE</a></li>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| <li><a href='/documentation/archive/sdb'>SDB</a></li>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| <li class="active">DB NOTES</li>
|
|
|
|
|
|
|
|
|
| </ol>
|
|
|
|
|
|
|
| |
| |
| </div> |
| <h1 class="title">SDB Database Notes</h1> |
| |
| <ul> |
| <li><a href="#db2">DB2</a></li> |
| <li><a href="#derby">Derby</a></li> |
| <li><a href="#ms-sql">MS SQL</a></li> |
| <li><a href="#mysql">MySQL</a></li> |
| <li><a href="#postgresql">PostgresQL</a></li> |
| </ul> |
| <h2 id="db2">DB2</h2> |
| <h3 id="database-creation">Database creation</h3> |
| <p>The database should be created with code set UTF-8 so unicode is |
| enabled (SDB creates tables CCSID UNICODE for full |
| internationalization support).</p> |
| <h2 id="derby">Derby</h2> |
| <h3 id="loading-restriction">Loading Restriction</h3> |
| <p>Only one load operation can be active at any one time. Limitations |
| on temporary tables in Derby mean the loader tables are not |
| temporary and hence are shared by all connections.</p> |
| <h2 id="ms-sql">MS SQL</h2> |
| <p>The collation sequence for the database must be one that is binary |
| (BIN in the name). It does not matter which one is used. Without |
| BIN, string matching is case insensitive but RDF requires case |
| sensitive literals and IRIs. The normal layout is not affected by |
| this because it does not use string comparisons.</p> |
| <h2 id="mysql">MySQL</h2> |
| <h3 id="national-characters">National Characters</h3> |
| <p>SDB formats all table columns used for storing text in the MySQL |
| schema to UTF-8. However, this does not cause the data to be |
| transmitted in UTF-8 over the JDBC connection.</p> |
| <p>The best way is to run the server with a default character set of |
| UTF-8. This is set in the MySQL server configuration file:</p> |
| <pre><code>[mysql] |
| default-character-set=utf8 |
| </code></pre> |
| <p>A less reliable way is to pass parameters to the JDBC driver in the |
| JDBC URL. The application will need to explicitly set the JDBC URL |
| in the |
| <a href="store_description.html" title="SDB/Store Description">store configuration</a> |
| file.</p> |
| <pre><code> ...?useUnicode=true&characterEncoding=UTF-8 |
| </code></pre> |
| <h3 id="connection-timeouts">Connection timeouts</h3> |
| <p>If you get the connection timing out after (by default) 8 hours of |
| no activity, try setting <code>autoReconnect=true</code> in the JDBC URL.</p> |
| <h3 id="tuning">Tuning</h3> |
| <ol> |
| <li>For InndoDB, the critical parameter is |
| <code>innodb_buffer_pool_size</code>. See the MySQL sample configuration files |
| for details.</li> |
| <li>Using ANALYZE TABLE on the database tables can improve the |
| choices made by the MySQL optimizer.</li> |
| </ol> |
| <h3 id="connection-timeout">Connection Timeout</h3> |
| <p>MySQL closes the JDBC connection after a period of no use (8 hours |
| by default).</p> |
| <p>While deprecated my MySQL, <code>?autoReconnect=true</code> may help here.</p> |
| <p>Other ways of addressing the problem are to make a simple query |
| call on a regular basis just to keep the connection alive (e.g. |
| <code>SELECT * { <http://example/junk> <http://example/junk> <http://example/junk> }</code>).</p> |
| <p>Some connection pool systems automatic compensate for this feature |
| of MySQL.</p> |
| <h2 id="postgresql">PostgresQL</h2> |
| <h3 id="databases-must-use-utf-8-encoding">Databases must use UTF-8 encoding</h3> |
| <p>Create SDB stores with encoding UTF-8.</p> |
| <p>International character sets can cause corrupted databases |
| otherwise. The database will not pass the SDB test suite.</p> |
| <p>Set this when creating the database with pgAdmin or if you use the |
| command line, for example:</p> |
| <pre><code> CREATE DATABASE "YourStoreName" |
| WITH OWNER = "user" |
| ENCODING = 'UTF8' |
| TABLESPACE = pg_default; |
| </code></pre> |
| <h3 id="improving-loading-rates">Improving loading rates</h3> |
| <p>The index layout (“layout2/index”) usually loads faster than the |
| hash form.</p> |
| <p><em>Existing store</em></p> |
| <p>When loading into an existing store, where there is existing data |
| and <code>ANALYZE</code> has been run, the process is:</p> |
| <ul> |
| <li> |
| <p>Drop indexes</p> |
| <p>sdbconfig –drop</p> |
| </li> |
| <li> |
| <p>Load data</p> |
| <p>sdbload file</p> |
| </li> |
| <li> |
| <p>Redo the indexes</p> |
| <p>sdbconfig –index</p> |
| </li> |
| </ul> |
| <p><em>Fresh store</em></p> |
| <p>PostgreSQL needs statistics to improve load performance through the |
| use of <code>ANALYSE</code>.</p> |
| <p>When loading the first time, there are no statistics so, for a |
| large load, it is advisable to load a sample, run <code>ANALYSE</code> and |
| then load the whole data.</p> |
| <ul> |
| <li> |
| <p>Create the database without indexes (just the primary keys).</p> |
| <p>sdbconfig –format</p> |
| </li> |
| <li> |
| <p>Load a sample of the triples (say, a 100K or a million triples</p> |
| <ul> |
| <li>until</li> |
| </ul> |
| </li> |
| </ul> |
| <p>the load rate starts to drop appreciably). The sample must be |
| representative of the data.</p> |
| <pre><code> sdbload --time sample |
| </code></pre> |
| <ul> |
| <li> |
| <p>Run <code>ANALYZE</code> on the database.</p> |
| </li> |
| <li> |
| <p>If your sample is one part of a large set of files, this set is |
| not necessary at all. If you are loading one single large file then |
| you might wish to empty the database. This is only needed if the |
| data has bNodes in</p> |
| </li> |
| </ul> |
| <p>it because the load process suppresses duplicates.</p> |
| <pre><code> sdbconfig --truncate |
| </code></pre> |
| <ul> |
| <li> |
| <p>Now load the data or rest of the data.</p> |
| <p>sdbload –time file</p> |
| </li> |
| <li> |
| <p>Add the indexes. This only takes a few minutes even on a very |
| large store but calculating them during loading (that is, |
| <code>--create</code>, not <code>--format</code>) is noticeably slower.</p> |
| <p>sdbconfig –index</p> |
| </li> |
| <li> |
| <p>Run <code>ANALYZE</code> on the database again.</p> |
| </li> |
| </ul> |
| <h3 id="tuning-1">Tuning</h3> |
| <p>It is essential to run the PostgreSQL ANALYZE command on a |
| database, either during or after building. This is done via the |
| command line <code>psql</code> or via <code>pgAdmin</code>. The PostgreSQL documentation |
| describes ways to run this as a background daemon.</p> |
| <p>Various of the PostgreSQL configuration parameters will affect |
| performance, particularly <code>effective_cache_size</code>. The parameter |
| <code>enable_seqscan</code> <em>may</em> help avoid some unexpected slow queries.</p> |
| |
| |
| </div> |
| </div> |
| |
| </div> |
| |
| <footer class="footer"> |
| <div class="container" style="font-size:80%" > |
| <p> |
| Copyright © 2011–2022 The Apache Software Foundation, Licensed under the |
| <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. |
| </p> |
| <p> |
| Apache Jena, Jena, the Apache Jena project logo, Apache and the Apache feather logos are trademarks of |
| The Apache Software Foundation. |
| <br/> |
| <a href="https://privacy.apache.org/policies/privacy-policy-public.html" |
| >Apache Software Foundation Privacy Policy</a>. |
| </p> |
| </div> |
| </footer> |
| |
| |
| <script type="text/javascript"> |
| var link = $('a[href="' + this.location.pathname + '"]'); |
| if (link != undefined) |
| link.parents('li,ul').addClass('active'); |
| </script> |
| |
| </body> |
| </html> |