blob: 20037b35828ff21b2674f955406d39bebfabf4e0 [file] [log] [blame]
<!DOCTYPE HTML>
<html lang="de">
<head>
<!-- Generated by javadoc (17) -->
<title>ModelParameterChunker (Apache OpenNLP Tools 2.3.3 API)</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="description" content="declaration: package: opennlp.tools.ml.model, class: ModelParameterChunker">
<meta name="generator" content="javadoc/ClassWriterImpl">
<link rel="stylesheet" type="text/css" href="../../../../stylesheet.css" title="Style">
<link rel="stylesheet" type="text/css" href="../../../../script-dir/jquery-ui.min.css" title="Style">
<link rel="stylesheet" type="text/css" href="../../../../jquery-ui.overrides.css" title="Style">
<script type="text/javascript" src="../../../../script.js"></script>
<script type="text/javascript" src="../../../../script-dir/jquery-3.6.1.min.js"></script>
<script type="text/javascript" src="../../../../script-dir/jquery-ui.min.js"></script>
</head>
<body class="class-declaration-page">
<script type="text/javascript">var evenRowColor = "even-row-color";
var oddRowColor = "odd-row-color";
var tableTab = "table-tab";
var activeTableTab = "active-table-tab";
var pathtoroot = "../../../../";
loadScripts(document, 'script');</script>
<noscript>
<div>JavaScript is disabled on your browser.</div>
</noscript>
<div class="flex-box">
<header role="banner" class="flex-header">
<nav role="navigation">
<!-- ========= START OF TOP NAVBAR ======= -->
<div class="top-nav" id="navbar-top">
<div class="skip-nav"><a href="#skip-navbar-top" title="Skip navigation links">Skip navigation links</a></div>
<ul id="navbar-top-firstrow" class="nav-list" title="Navigation">
<li><a href="../../../../index.html">Overview</a></li>
<li><a href="package-summary.html">Package</a></li>
<li class="nav-bar-cell1-rev">Class</li>
<li><a href="package-tree.html">Tree</a></li>
<li><a href="../../../../deprecated-list.html">Deprecated</a></li>
<li><a href="../../../../index-all.html">Index</a></li>
<li><a href="../../../../help-doc.html#class">Help</a></li>
</ul>
</div>
<div class="sub-nav">
<div>
<ul class="sub-nav-list">
<li>Summary:&nbsp;</li>
<li>Nested&nbsp;|&nbsp;</li>
<li><a href="#field-summary">Field</a>&nbsp;|&nbsp;</li>
<li>Constr&nbsp;|&nbsp;</li>
<li><a href="#method-summary">Method</a></li>
</ul>
<ul class="sub-nav-list">
<li>Detail:&nbsp;</li>
<li><a href="#field-detail">Field</a>&nbsp;|&nbsp;</li>
<li>Constr&nbsp;|&nbsp;</li>
<li><a href="#method-detail">Method</a></li>
</ul>
</div>
<div class="nav-list-search"><label for="search-input">SEARCH:</label>
<input type="text" id="search-input" value="search" disabled="disabled">
<input type="reset" id="reset-button" value="reset" disabled="disabled">
</div>
</div>
<!-- ========= END OF TOP NAVBAR ========= -->
<span class="skip-nav" id="skip-navbar-top"></span></nav>
</header>
<div class="flex-content">
<main role="main">
<!-- ======== START OF CLASS DATA ======== -->
<div class="header">
<div class="sub-title"><span class="package-label-in-type">Package</span>&nbsp;<a href="package-summary.html">opennlp.tools.ml.model</a></div>
<h1 title="Class ModelParameterChunker" class="title">Class ModelParameterChunker</h1>
</div>
<div class="inheritance" title="Inheritance Tree"><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html" title="class or interface in java.lang" class="external-link">java.lang.Object</a>
<div class="inheritance">opennlp.tools.ml.model.ModelParameterChunker</div>
</div>
<section class="class-description" id="class-description">
<hr>
<div class="type-signature"><span class="modifiers">public final class </span><span class="element-name type-name-label">ModelParameterChunker</span>
<span class="extends-implements">extends <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html" title="class or interface in java.lang" class="external-link">Object</a></span></div>
<div class="block">A helper class that handles Strings with more than 64k (65535 bytes) in length.
This is achieved via the signature <a href="#SIGNATURE_CHUNKED_PARAMS"><code>SIGNATURE_CHUNKED_PARAMS</code></a> at the beginning of
the String instance to be written to a <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataOutputStream.html" title="class or interface in java.io" class="external-link"><code>DataOutputStream</code></a>.
<p>
Background: In OpenNLP, for large(r) corpora, we train models whose (UTF String) parameters will exceed
the <code>MAX_CHUNK_SIZE_BYTES</code> bytes limit set in <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataOutputStream.html" title="class or interface in java.io" class="external-link"><code>DataOutputStream</code></a>.
For writing and reading those models, we have to chunk up those string instances in 64kB blocks and
recombine them correctly upon reading a (binary) model file.
<p>
The problem was raised in <a href="https://issues.apache.org/jira/browse/OPENNLP-1366">ticket OPENNLP-1366</a>.
<p>
Solution strategy:
<ul>
<li>If writing parameters to a <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataOutputStream.html" title="class or interface in java.io" class="external-link"><code>DataOutputStream</code></a> blows up with a <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/UTFDataFormatException.html" title="class or interface in java.io" class="external-link"><code>UTFDataFormatException</code></a> a
large String instance is chunked up and written as appropriate blocks.</li>
<li>To indicate that chunking was conducted, we start with the <a href="#SIGNATURE_CHUNKED_PARAMS"><code>SIGNATURE_CHUNKED_PARAMS</code></a> indicator,
directly followed by the number of chunks used. This way, when reading in chunked model parameters,
recombination is achieved transparently.</li>
</ul>
<p>
Note: Both, existing (binary) model files and newly trained models which don't require the chunking
technique, will be supported like in previous OpenNLP versions.</div>
<dl class="notes">
<dt>Author:</dt>
<dd><a href="mailto:martin.wiesner@hs-heilbronn.de">Martin Wiesner</a>, <a href="mailto:struberg@apache.org">Mark Struberg</a></dd>
</dl>
</section>
<section class="summary">
<ul class="summary-list">
<!-- =========== FIELD SUMMARY =========== -->
<li>
<section class="field-summary" id="field-summary">
<h2>Field Summary</h2>
<div class="caption"><span>Fields</span></div>
<div class="summary-table three-column-summary">
<div class="table-header col-first">Modifier and Type</div>
<div class="table-header col-second">Field</div>
<div class="table-header col-last">Description</div>
<div class="col-first even-row-color"><code>static final <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a></code></div>
<div class="col-second even-row-color"><code><a href="#SIGNATURE_CHUNKED_PARAMS" class="member-name-link">SIGNATURE_CHUNKED_PARAMS</a></code></div>
<div class="col-last even-row-color">&nbsp;</div>
</div>
</section>
</li>
<!-- ========== METHOD SUMMARY =========== -->
<li>
<section class="method-summary" id="method-summary">
<h2>Method Summary</h2>
<div id="method-summary-table">
<div class="table-tabs" role="tablist" aria-orientation="horizontal"><button id="method-summary-table-tab0" role="tab" aria-selected="true" aria-controls="method-summary-table.tabpanel" tabindex="0" onkeydown="switchTab(event)" onclick="show('method-summary-table', 'method-summary-table', 3)" class="active-table-tab">All Methods</button><button id="method-summary-table-tab1" role="tab" aria-selected="false" aria-controls="method-summary-table.tabpanel" tabindex="-1" onkeydown="switchTab(event)" onclick="show('method-summary-table', 'method-summary-table-tab1', 3)" class="table-tab">Static Methods</button><button id="method-summary-table-tab4" role="tab" aria-selected="false" aria-controls="method-summary-table.tabpanel" tabindex="-1" onkeydown="switchTab(event)" onclick="show('method-summary-table', 'method-summary-table-tab4', 3)" class="table-tab">Concrete Methods</button></div>
<div id="method-summary-table.tabpanel" role="tabpanel">
<div class="summary-table three-column-summary" aria-labelledby="method-summary-table-tab0">
<div class="table-header col-first">Modifier and Type</div>
<div class="table-header col-second">Method</div>
<div class="table-header col-last">Description</div>
<div class="col-first even-row-color method-summary-table method-summary-table-tab1 method-summary-table-tab4"><code>static <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a></code></div>
<div class="col-second even-row-color method-summary-table method-summary-table-tab1 method-summary-table-tab4"><code><a href="#readUTF(java.io.DataInputStream)" class="member-name-link">readUTF</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataInputStream.html" title="class or interface in java.io" class="external-link">DataInputStream</a>&nbsp;dis)</code></div>
<div class="col-last even-row-color method-summary-table method-summary-table-tab1 method-summary-table-tab4">
<div class="block">Reads model parameters from <code>dis</code>.</div>
</div>
<div class="col-first odd-row-color method-summary-table method-summary-table-tab1 method-summary-table-tab4"><code>static void</code></div>
<div class="col-second odd-row-color method-summary-table method-summary-table-tab1 method-summary-table-tab4"><code><a href="#writeUTF(java.io.DataOutputStream,java.lang.String)" class="member-name-link">writeUTF</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataOutputStream.html" title="class or interface in java.io" class="external-link">DataOutputStream</a>&nbsp;dos,
<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>&nbsp;s)</code></div>
<div class="col-last odd-row-color method-summary-table method-summary-table-tab1 method-summary-table-tab4">
<div class="block">Writes the model parameter <code>s</code> to <code>dos</code>.</div>
</div>
</div>
</div>
</div>
<div class="inherited-list">
<h3 id="methods-inherited-from-class-java.lang.Object">Methods inherited from class&nbsp;java.lang.<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html" title="class or interface in java.lang" class="external-link">Object</a></h3>
<code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#equals(java.lang.Object)" title="class or interface in java.lang" class="external-link">equals</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#getClass()" title="class or interface in java.lang" class="external-link">getClass</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#hashCode()" title="class or interface in java.lang" class="external-link">hashCode</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#notify()" title="class or interface in java.lang" class="external-link">notify</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#notifyAll()" title="class or interface in java.lang" class="external-link">notifyAll</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#toString()" title="class or interface in java.lang" class="external-link">toString</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#wait()" title="class or interface in java.lang" class="external-link">wait</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#wait(long)" title="class or interface in java.lang" class="external-link">wait</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#wait(long,int)" title="class or interface in java.lang" class="external-link">wait</a></code></div>
</section>
</li>
</ul>
</section>
<section class="details">
<ul class="details-list">
<!-- ============ FIELD DETAIL =========== -->
<li>
<section class="field-details" id="field-detail">
<h2>Field Details</h2>
<ul class="member-list">
<li>
<section class="detail" id="SIGNATURE_CHUNKED_PARAMS">
<h3>SIGNATURE_CHUNKED_PARAMS</h3>
<div class="member-signature"><span class="modifiers">public static final</span>&nbsp;<span class="return-type"><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a></span>&nbsp;<span class="element-name">SIGNATURE_CHUNKED_PARAMS</span></div>
<dl class="notes">
<dt>See Also:</dt>
<dd>
<ul class="see-list">
<li><a href="../../../../constant-values.html#opennlp.tools.ml.model.ModelParameterChunker.SIGNATURE_CHUNKED_PARAMS">Constant Field Values</a></li>
</ul>
</dd>
</dl>
</section>
</li>
</ul>
</section>
</li>
<!-- ============ METHOD DETAIL ========== -->
<li>
<section class="method-details" id="method-detail">
<h2>Method Details</h2>
<ul class="member-list">
<li>
<section class="detail" id="readUTF(java.io.DataInputStream)">
<h3>readUTF</h3>
<div class="member-signature"><span class="modifiers">public static</span>&nbsp;<span class="return-type"><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a></span>&nbsp;<span class="element-name">readUTF</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataInputStream.html" title="class or interface in java.io" class="external-link">DataInputStream</a>&nbsp;dis)</span>
throws <span class="exceptions"><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/IOException.html" title="class or interface in java.io" class="external-link">IOException</a></span></div>
<div class="block">Reads model parameters from <code>dis</code>. In case the stream start with <a href="#SIGNATURE_CHUNKED_PARAMS"><code>SIGNATURE_CHUNKED_PARAMS</code></a>,
the number of chunks is detected and the original large parameter string is reconstructed from several
chunks.</div>
<dl class="notes">
<dt>Parameters:</dt>
<dd><code>dis</code> - The stream which will be used to read the model parameter from.</dd>
<dt>Throws:</dt>
<dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/IOException.html" title="class or interface in java.io" class="external-link">IOException</a></code></dd>
</dl>
</section>
</li>
<li>
<section class="detail" id="writeUTF(java.io.DataOutputStream,java.lang.String)">
<h3>writeUTF</h3>
<div class="member-signature"><span class="modifiers">public static</span>&nbsp;<span class="return-type">void</span>&nbsp;<span class="element-name">writeUTF</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataOutputStream.html" title="class or interface in java.io" class="external-link">DataOutputStream</a>&nbsp;dos,
<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>&nbsp;s)</span>
throws <span class="exceptions"><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/IOException.html" title="class or interface in java.io" class="external-link">IOException</a></span></div>
<div class="block">Writes the model parameter <code>s</code> to <code>dos</code>. In case <code>s</code> does exceed
<code>MAX_CHUNK_SIZE_BYTES</code> in length, the chunking mechanism is used; otherwise the parameter is
written 'as is'.</div>
<dl class="notes">
<dt>Parameters:</dt>
<dd><code>dos</code> - The <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/DataOutputStream.html" title="class or interface in java.io" class="external-link"><code>DataOutputStream</code></a> stream which will be used to persist the model.</dd>
<dd><code>s</code> - The input string that is checked for length and chunked if <code>MAX_CHUNK_SIZE_BYTES</code> is
exceeded.</dd>
<dt>Throws:</dt>
<dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/IOException.html" title="class or interface in java.io" class="external-link">IOException</a></code></dd>
</dl>
</section>
</li>
</ul>
</section>
</li>
</ul>
</section>
<!-- ========= END OF CLASS DATA ========= -->
</main>
<footer role="contentinfo">
<hr>
<p class="legal-copy"><small>Copyright &#169; 2023 <a href="https://www.apache.org/">The Apache Software Foundation</a>. All rights reserved.</small></p>
</footer>
</div>
</div>
</body>
</html>