
 <!DOCTYPE html>
 <!--[if IE]><![endif]-->
 <html>

   <head>
     <meta charset="utf-8">
     <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
     <title>Namespace Lucene.Net.Codecs.Memory
    | Apache Lucene.NET 4.8.0-beta00013 Documentation </title>
     <meta name="viewport" content="width=device-width">
     <meta name="title" content="Namespace Lucene.Net.Codecs.Memory
    | Apache Lucene.NET 4.8.0-beta00013 Documentation ">
     <meta name="generator" content="docfx 2.56.2.0">

     <link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
     <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
     <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
     <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
     <meta property="docfx:navrel" content="toc.html">
     <meta property="docfx:tocrel" content="codecs/toc.html">

     <meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">

   </head>
   <body data-spy="scroll" data-target="#affix" data-offset="120">
     <span id="forkongithub"><a href="https://github.com/apache/lucenenet" target="_blank">Fork me on GitHub</a></span>
     <div id="wrapper">
       <header>

         <nav id="autocollapse" class="navbar ng-scope" role="navigation">
           <div class="container">
             <div class="navbar-header">
               <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
                 <span class="sr-only">Toggle navigation</span>
                 <span class="icon-bar"></span>
                 <span class="icon-bar"></span>
                 <span class="icon-bar"></span>
               </button>

               <a class="navbar-brand" href="/">
                 <img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
               </a>
             </div>
             <div class="collapse navbar-collapse" id="navbar">
               <form class="navbar-form navbar-right" role="search" id="search">
                 <div class="form-group">
                   <input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
                 </div>
               </form>
             </div>
           </div>
         </nav>

         <div class="subnav navbar navbar-default">
           <div class="container hide-when-search">
             <ul class="level0 breadcrumb">
                 <li>
                     <a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
                      <span id="breadcrumb">
                         <ul class="breadcrumb">
                           <li></li>
                         </ul>
                     </span>
                 </li>
             </ul>
           </div>
         </div>
       </header>
       <div class="container body-content">

         <div id="search-results">
           <div class="search-list"></div>
           <div class="sr-items">
             <p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
           </div>
           <ul id="pagination"></ul>
         </div>
       </div>
       <div role="main" class="container body-content hide-when-search">

         <div class="sidenav hide-when-search">
           <a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
           <div class="sidetoggle collapse" id="sidetoggle">
             <div id="sidetoc"></div>
           </div>
         </div>
         <div class="article row grid-right">
           <div class="col-md-10">
             <article class="content wrap" id="_content" data-uid="Lucene.Net.Codecs.Memory">

   <h1 id="Lucene_Net_Codecs_Memory" data-uid="Lucene.Net.Codecs.Memory" class="text-break">Namespace Lucene.Net.Codecs.Memory
   </h1>
   <div class="markdown level0 summary"><!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
 -->
 <p>Term dictionary, DocValues or Postings formats that are read entirely into memory.</p>
 </div>
   <div class="markdown level0 conceptual"></div>
   <div class="markdown level0 remarks"></div>
     <h3 id="classes">Classes
   </h3>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.DirectDocValuesFormat.html">DirectDocValuesFormat</a></h4>
       <section><p>In-memory docvalues format that does no (or very little)
 compression.  Indexed values are stored on disk, but
 then at search time all values are loaded into memory as
 simple .NET arrays.  For numeric values, it uses
 byte[], short[], int[], long[] as necessary to fit the
 range of the values.  For binary values, there is a <span class="xref">System.Int32</span>
 (4 bytes) overhead per value.</p>
 <p>Limitations:
 <ul><li>For binary and sorted fields the total space
        required for all binary values cannot exceed about
        2.1 GB (see <a class="xref" href="Lucene.Net.Codecs.Memory.DirectDocValuesFormat.html#Lucene_Net_Codecs_Memory_DirectDocValuesFormat_MAX_TOTAL_BYTES_LENGTH">MAX_TOTAL_BYTES_LENGTH</a>).</li><li>For sorted set fields, the sum of the size of each
        document&apos;s set of values cannot exceed about 2.1 B
        values (see <a class="xref" href="Lucene.Net.Codecs.Memory.DirectDocValuesFormat.html#Lucene_Net_Codecs_Memory_DirectDocValuesFormat_MAX_SORTED_SET_ORDS">MAX_SORTED_SET_ORDS</a>).  For example,
        if every document has 10 values (10 instances of
 <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Documents.SortedSetDocValuesField.html">SortedSetDocValuesField</a>) added, then no
 more than ~210 M documents can be added to one
 segment. </li></ul>
 </p>
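 <p>As a rough illustration of the two points above (picking the narrowest array that fits the value range, and the per-segment ceiling for sorted set fields), here is a minimal Python sketch. It is not the Lucene.NET implementation: the function names are hypothetical, signed element ranges are assumed, and the limit is approximated as 2^31 - 1 rather than read from MAX_SORTED_SET_ORDS.</p>

```python
# Illustrative sketch only; names are hypothetical and not part of Lucene.NET.
INT32_MAX = 2**31 - 1  # assumed stand-in for the "about 2.1 B" ceiling

# Array element types named in the text, with their bit widths.
WIDTHS = [
    ("byte[]", 8),
    ("short[]", 16),
    ("int[]", 32),
    ("long[]", 64),
]


def narrowest_array_type(min_value, max_value):
    """Return the first array type whose signed range covers [min_value, max_value]."""
    for name, bits in WIDTHS:
        low = -(2 ** (bits - 1))
        high = 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("range does not fit in 64 bits")


def max_docs_per_segment(values_per_doc, limit=INT32_MAX):
    """Upper bound on documents when every document adds values_per_doc ords."""
    return limit // values_per_doc
```

 <p>With 10 values per document, <code>max_docs_per_segment(10)</code> lands in the same range as the ~210 M figure quoted above.</p>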
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.DirectPostingsFormat.html">DirectPostingsFormat</a></h4>
       <section><p>Wraps <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.Lucene41.Lucene41PostingsFormat.html">Lucene41PostingsFormat</a> for on-disk
 storage, but at read time loads and stores all
 terms &amp; postings directly in RAM as byte[] and int[].</p>
 <p><p><strong>WARNING</strong>: This is
 exceptionally RAM intensive: it makes no effort to
 compress the postings data, storing terms as separate
 byte[] and postings as separate int[], but as a result it
 gives a substantial increase in search performance.</p>
 <p>
 <p>This postings format supports <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.TermsEnum.html#Lucene_Net_Index_TermsEnum_Ord">Ord</a>
 and <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.TermsEnum.html#Lucene_Net_Index_TermsEnum_SeekExact_System_Int64_">SeekExact(Int64)</a>.</p>
 <p>
 <p>Because this holds all term bytes as a single
 byte[], you cannot have more than 2.1GB worth of term
 bytes in a single segment.
 </p></p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdPostingsFormat.html">FSTOrdPostingsFormat</a></h4>
       <section><p>FSTOrd term dict + Lucene41PBF</p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdPulsing41PostingsFormat.html">FSTOrdPulsing41PostingsFormat</a></h4>
       <section><p>FSTOrd + Pulsing41
 <p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdTermsReader.html">FSTOrdTermsReader</a></h4>
       <section><p>FST-based terms dictionary reader.
 <p>
 The FST index maps each term to its ord, and during seek
 the ord is used to fetch metadata from a single block.
 The term dictionary is fully memory resident.
 <p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdTermsWriter.html">FSTOrdTermsWriter</a></h4>
       <section><p>FST-based term dict, using ord as FST output.
 <p>
 The FST holds the mapping between &lt;term, ord&gt;, and
 each term&apos;s metadata is delta-encoded into a single byte block.
 <p>
 Typically the byte block consists of four parts:
 <ol><li>term statistics: docFreq, totalTermFreq;</li><li>monotonic long[], e.g. the pointer to the postings list for that term;</li><li>generic byte[], e.g. other information customized by postings base.</li><li>single-level skip list to speed up metadata decoding by ord.</li></ol>
 <p>
 <p>
 Files:
 <ul><li><code>.tix</code>: <a href="#Termindex">Term Index</a></li><li><code>.tbk</code>: <a href="#Termblock">Term Block</a></li></ul>
 </p></p>
 <p><a name="Termindex" id="Termindex"></a>
 <h3>Term Index</h3>
 <p>
  The .tix contains a list of FSTs, one for each field.
  The FST maps a term to its ordinal (ord) in the current field.
 </p></p>
 <ul><li>TermIndex(.tix) --&gt; Header, TermFST<sup>NumFields</sup>, Footer</li><li>TermFST --&gt; <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Util.Fst.FST-1.html">FST&lt;T&gt;</a></li><li>Header --&gt; CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>) </li><li>Footer --&gt; CodecFooter (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteFooter_Lucene_Net_Store_IndexOutput_">WriteFooter(IndexOutput)</a>) </li></ul>

 <p>Notes:</p>
 <ul><li>
  Since terms are already sorted before writing to <a href="#Termblock">Term Block</a>,
  their ords can be used directly to seek term metadata in the term block.
 </li></ul>
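 <p>The note above can be pictured with a small Python sketch. It is a simplification under stated assumptions: a plain dictionary stands in for the FST, the term block is an in-memory list, and the metadata fields (<code>doc_freq</code>, <code>postings_fp</code>) are hypothetical names chosen for illustration.</p>

```python
# Sketch only: a dict stands in for the FST and a list for the term block.
terms = sorted([b"cherry", b"apple", b"banana"])

# Terms are sorted before writing, so each term's ord is its position.
term_to_ord = {term: index for index, term in enumerate(terms)}

# Hypothetical per-term metadata, stored in ord order.
term_block = [
    {"doc_freq": 3, "postings_fp": 0},
    {"doc_freq": 1, "postings_fp": 40},
    {"doc_freq": 7, "postings_fp": 52},
]


def seek_exact(term):
    """Return the metadata for term, or None if the term is absent."""
    index = term_to_ord.get(term)
    if index is None:
        return None
    return term_block[index]
```

 <p>Because the metadata is laid out in ord order, the lookup needs no search inside the block once the ord is known.</p>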

 <a name="Termblock" id="Termblock"></a>
 <h3>Term Block</h3>
 <p>
 The .tbk contains all the statistics and metadata for terms, along with a field summary
 (per-field data such as the number of documents in the current field). For each field, there are four blocks:
 <ul><li>statistics bytes block: contains term statistics; </li><li>metadata longs block: delta-encodes monotonic part of metadata; </li><li>metadata bytes block: encodes other parts of metadata; </li><li>skip block: contains skip data, to speed up metadata seeking and decoding</li></ul>
 </p>
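 <p>To make the interplay between the metadata longs block and the skip block more concrete, the following Python sketch delta-encodes a non-decreasing sequence and records a skip entry at regular intervals. It is an illustration, not the on-disk layout: the interval of 4 and the function names are assumptions.</p>

```python
SKIP_INTERVAL = 4  # assumed value for illustration only


def encode(values, interval=SKIP_INTERVAL):
    """Delta-encode a non-decreasing list and record one skip entry per interval.

    Each skip entry holds the value that precedes the term at that skip point.
    """
    deltas, skips, previous = [], [], 0
    for index, value in enumerate(values):
        if index % interval == 0:
            skips.append(previous)
        deltas.append(value - previous)
        previous = value
    return deltas, skips


def decode_at(deltas, skips, index, interval=SKIP_INTERVAL):
    """Recover values[index] by starting at the nearest skip entry."""
    start = (index // interval) * interval
    value = skips[index // interval]
    for i in range(start, index + 1):
        value += deltas[i]
    return value
```

 <p>Decoding a single entry touches at most <code>interval</code> deltas instead of every delta from the start of the block.</p>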

 <p><p>File Format:</p>
 <ul><li>TermBlock(.tbk) --&gt; Header, <em>PostingsHeader</em>, FieldSummary, DirOffset</li><li>FieldSummary --&gt; NumFields, &lt;FieldNumber, NumTerms, SumTotalTermFreq?, SumDocFreq,
                                         DocCount, LongsSize, DataBlock &gt; <sup>NumFields</sup>, Footer</li><li>DataBlock --&gt; StatsBlockLength, MetaLongsBlockLength, MetaBytesBlockLength,
                       SkipBlock, StatsBlock, MetaLongsBlock, MetaBytesBlock </li><li>SkipBlock --&gt; &lt; StatsFPDelta, MetaLongsSkipFPDelta, MetaBytesSkipFPDelta,
                            MetaLongsSkipDelta<sup>LongsSize</sup> &gt;<sup>NumTerms</sup></li><li>StatsBlock --&gt; &lt; DocFreq[Same?], (TotalTermFreq-DocFreq) ? &gt; <sup>NumTerms</sup></li><li>MetaLongsBlock --&gt; &lt; LongDelta<sup>LongsSize</sup>, BytesSize &gt; <sup>NumTerms</sup></li><li>MetaBytesBlock --&gt; Byte <sup>MetaBytesBlockLength</sup></li><li>Header --&gt; CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>) </li><li>DirOffset --&gt; Uint64 (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteInt64_System_Int64_">WriteInt64(Int64)</a>) </li><li>NumFields, FieldNumber, DocCount, DocFreq, LongsSize,
        FieldNumber, DocCount --&gt; VInt (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt32_System_Int32_">WriteVInt32(Int32)</a>) </li><li>NumTerms, SumTotalTermFreq, SumDocFreq, StatsBlockLength, MetaLongsBlockLength, MetaBytesBlockLength,
        StatsFPDelta, MetaLongsSkipFPDelta, MetaBytesSkipFPDelta, MetaLongsSkipStart, TotalTermFreq,
        LongDelta --&gt; VLong (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt64_System_Int64_">WriteVInt64(Int64)</a>) </li><li>Footer --&gt; CodecFooter (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteFooter_Lucene_Net_Store_IndexOutput_">WriteFooter(IndexOutput)</a>) </li></ul>
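 <p>Many of the fields above are written as VInt or VLong values. The Python sketch below shows the usual variable-length scheme (seven payload bits per byte, low-order group first, high bit set when more bytes follow). It is a standalone illustration for non-negative values; the function names are hypothetical and it does not call the DataOutput API.</p>

```python
def write_vint(value):
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)  # final byte has the high bit clear
    return bytes(out)


def read_vint(data, pos=0):
    """Decode one value starting at pos and return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        current = data[pos]
        pos += 1
        value |= (current & 0x7F) << shift
        if (current & 0x80) == 0:
            return value, pos
        shift += 7
```

 <p>Small values, which dominate delta-encoded data, take a single byte under this scheme.</p>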
 <p>Notes: </p>
 <ul><li>
   The formats of PostingsHeader and MetaBytes are defined by the specific postings implementation:
   they contain arbitrary per-file data (such as parameters or versioning information), and per-term data
   (non-monotonic ones like pulsed postings data).
 </li><li>
  During initialization the reader loads all the blocks into memory. The SkipBlock is decoded so that, during seek,
  the term dictionary can look up file pointers directly. StatsFPDelta, MetaLongsSkipFPDelta, etc. are file offsets
  for every SkipInterval&apos;s term. MetaLongsSkipDelta is the difference from the previous one, which indicates
  the value of the preceding metadata longs for every SkipInterval&apos;s term.
 </li><li>
  DocFreq is the count of documents which contain the term. TotalTermFreq is the total number of occurrences of the term.
  Usually these two values are the same for long-tail terms, therefore one bit is stolen from DocFreq to flag this case,
  so that the encoding of TotalTermFreq may be omitted.
 </li></ul>
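 <p>The last note describes stealing one bit from DocFreq. A minimal Python sketch of that idea follows. Placing the flag in the lowest bit is an assumption made for illustration, and the functions return plain integers rather than writing VInt/VLong bytes.</p>

```python
def encode_stats(doc_freq, total_term_freq):
    """Return the integers that would be written for one term."""
    if total_term_freq == doc_freq:
        # Flag bit set: TotalTermFreq equals DocFreq and is omitted.
        return [(doc_freq << 1) | 1]
    # Flag bit clear: the difference follows as a second value.
    return [doc_freq << 1, total_term_freq - doc_freq]


def decode_stats(values):
    """Inverse of encode_stats; returns (doc_freq, total_term_freq)."""
    doc_freq = values[0] >> 1
    if values[0] & 1:
        return doc_freq, doc_freq
    return doc_freq, doc_freq + values[1]
```

 <p>For a term that occurs once in each of its documents, only one integer is stored instead of two.</p>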
 <p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTPostingsFormat.html">FSTPostingsFormat</a></h4>
       <section><p>FST term dict + Lucene41PBF</p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTPulsing41PostingsFormat.html">FSTPulsing41PostingsFormat</a></h4>
       <section><p>FST + Pulsing41, test only, since
 FST does no delta encoding here!
 <p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTTermsReader.html">FSTTermsReader</a></h4>
       <section><p>FST-based terms dictionary reader.
 <p>
 The FST directly maps each term to its metadata
 and is memory resident.
 <p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTTermsWriter.html">FSTTermsWriter</a></h4>
       <section><p>FST-based term dict, using metadata as FST output.
 <p>
 The FST directly holds the mapping between &lt;term, metadata&gt;.
 <p>
 Term metadata consists of three parts:
 <ol><li>term statistics: docFreq, totalTermFreq;</li><li>monotonic long[], e.g. the pointer to the postings list for that term;</li><li>generic byte[], e.g. other information needed by the postings reader.</li></ol>
 <p>
 File:
 <ul><li><code>.tst</code>: <a href="#Termdictionary">Term Dictionary</a></li></ul>
 </p>
 <p>
 <p><a name="Termdictionary" id="Termdictionary"></a>
 <h3>Term Dictionary</h3>
 </p>
 <p>
  The .tst contains a list of FSTs, one for each field.
  The FST maps a term to its corresponding statistics (e.g. docfreq)
  and metadata (e.g. information for postings list reader like file pointer
  to postings list).
 </p>
 <p>
 Typically the metadata is separated into two parts:
 <ul><li>
    Monotonic long array: Some metadata always increases along with the term order.
    The FST uses this part to share outputs between arcs.
 </li><li>
  Generic byte array: Used to store non-monotonic metadata.
 </li></ul>
 </p></p>
 <p>File format:
 <ul><li>TermsDict(.tst) --&gt; Header, <em>PostingsHeader</em>, FieldSummary, DirOffset</li><li>FieldSummary --&gt; NumFields, &lt;FieldNumber, NumTerms, SumTotalTermFreq?,
                                      SumDocFreq, DocCount, LongsSize, TermFST &gt;<sup>NumFields</sup></li><li>TermFST --&gt; FST&lt;TermData&gt;</li><li>TermData --&gt; Flag, BytesSize?, LongDelta<sup>LongsSize</sup>?, Byte<sup>BytesSize</sup>?,
                      &lt; DocFreq[Same?], (TotalTermFreq-DocFreq) &gt; ? </li><li>Header --&gt; CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>) </li><li>DirOffset --&gt; Uint64 (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteInt64_System_Int64_">WriteInt64(Int64)</a>) </li><li>DocFreq, LongsSize, BytesSize, NumFields,
        FieldNumber, DocCount --&gt; VInt (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt32_System_Int32_">WriteVInt32(Int32)</a>) </li><li>TotalTermFreq, NumTerms, SumTotalTermFreq, SumDocFreq, LongDelta --&gt;
        VLong (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt64_System_Int64_">WriteVInt64(Int64)</a>) </li></ul>
 <p>Notes:</p>
 <ul><li>
   The formats of PostingsHeader and the generic meta bytes are defined by the specific postings implementation:
   they contain arbitrary per-file data (such as parameters or versioning information), and per-term data
   (non-monotonic ones like pulsed postings data).
 </li><li>
  The format of TermData is determined by FST, typically monotonic metadata will be dense around shallow arcs,
  while in deeper arcs only generic bytes and term statistics exist.
 </li><li>
  The byte Flag is used to indicate which parts of the metadata exist on the current arc. Specifically, the monotonic part
  is omitted when it is an array of 0s.
 </li><li>
  Since LongsSize is per-field fixed, it is only written once in field summary.
 </li></ul>
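 <p>The Flag byte can be thought of as a small bit set that records which parts of TermData are present. The Python sketch below shows the idea only: the bit assignments and the function name are hypothetical and are not taken from the file format.</p>

```python
# Hypothetical bit assignments, chosen for illustration only.
HAS_LONGS = 1
HAS_BYTES = 2
HAS_STATS = 4


def make_flag(longs, meta_bytes, doc_freq):
    """Build a flag byte describing which metadata parts are present."""
    flag = 0
    if any(value != 0 for value in longs):
        flag |= HAS_LONGS  # monotonic part is omitted when it is all zeros
    if meta_bytes:
        flag |= HAS_BYTES
    if doc_freq > 0:
        flag |= HAS_STATS
    return flag
```

 <p>A reader would test the same bits to decide which fields to decode for the current arc.</p>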
 <p>
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.MemoryDocValuesFormat.html">MemoryDocValuesFormat</a></h4>
       <section><p>In-memory docvalues format. </p>
 </section>
       <h4><a class="xref" href="Lucene.Net.Codecs.Memory.MemoryPostingsFormat.html">MemoryPostingsFormat</a></h4>
       <section><p>Stores terms &amp; postings (docs, positions, payloads) in
 RAM, using an FST.</p>
 <p><p>Note that this codec implements advance as a linear
 scan!  This means if you store large fields in here,
 queries that rely on advance (AND BooleanQuery,
 PhraseQuery) will be relatively slow!
 </p></p>
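 <p>To show why a linear-scan advance becomes slow on long postings lists, here is a small Python sketch. It compares a hypothetical <code>advance_linear</code> helper with a binary search over the same sorted document IDs. Neither function is part of Lucene.NET; the binary search is only a point of comparison, not a claim about how other formats implement advance.</p>

```python
import bisect


def advance_linear(doc_ids, position, target):
    """Step forward from position until doc_ids[i] >= target.

    Returns the index found, or len(doc_ids) if no such document exists.
    The cost grows with the number of entries skipped.
    """
    i = position
    while i < len(doc_ids) and doc_ids[i] < target:
        i += 1
    return i


def advance_binary(doc_ids, position, target):
    """Same result via binary search, for comparison."""
    return bisect.bisect_left(doc_ids, target, position)
```

 <p>Both helpers return the same index, but the linear version visits every skipped entry, which is what conjunctions and phrase queries pay for on large fields.</p>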
 <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
 </section>
 </article>
           </div>

           <div class="hidden-sm col-md-2" role="complementary">
             <div class="sideaffix">
               <div class="contribution">
                 <ul class="nav">
                   <li>
                     <a href="https://github.com/apache/lucenenet/blob/docs/4.8.0-beta00013/src/Lucene.Net.Codecs/Memory/package.md/#L2" class="contribution-link">Improve this Doc</a>
                   </li>
                 </ul>
               </div>
               <nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
               <!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
               </nav>
             </div>
           </div>
         </div>
       </div>

       <footer>
         <div class="grad-bottom"></div>
         <div class="footer">
           <div class="container">
             <span class="pull-right">
               <a href="#top">Back to top</a>
             </span>
             Copyright © 2020 The Apache Software Foundation, Licensed under the <a href='http://www.apache.org/licenses/LICENSE-2.0' target='_blank'>Apache License, Version 2.0</a><br> <small>Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation. <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</small>

           </div>
         </div>
       </footer>
     </div>

     <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
     <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
     <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
   </body>
 </html>
	<!DOCTYPE html>
	<!--[if IE]><![endif]-->
	<html>

	<head>
	<meta charset="utf-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
	<title>Namespace Lucene.Net.Codecs.Memory
	\| Apache Lucene.NET 4.8.0-beta00013 Documentation </title>
	<meta name="viewport" content="width=device-width">
	<meta name="title" content="Namespace Lucene.Net.Codecs.Memory
	\| Apache Lucene.NET 4.8.0-beta00013 Documentation ">
	<meta name="generator" content="docfx 2.56.2.0">

	<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
	<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
	<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
	<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
	<meta property="docfx:navrel" content="toc.html">
	<meta property="docfx:tocrel" content="codecs/toc.html">

	<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">

	</head>
	<body data-spy="scroll" data-target="#affix" data-offset="120">
	<span id="forkongithub"><a href="https://github.com/apache/lucenenet" target="_blank">Fork me on GitHub</a></span>
	<div id="wrapper">
	<header>

	<nav id="autocollapse" class="navbar ng-scope" role="navigation">
	<div class="container">
	<div class="navbar-header">
	<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
	<span class="sr-only">Toggle navigation</span>
	<span class="icon-bar"></span>
	<span class="icon-bar"></span>
	<span class="icon-bar"></span>
	</button>

	<a class="navbar-brand" href="/">
	<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
	</a>
	</div>
	<div class="collapse navbar-collapse" id="navbar">
	<form class="navbar-form navbar-right" role="search" id="search">
	<div class="form-group">
	<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
	</div>
	</form>
	</div>
	</div>
	</nav>

	<div class="subnav navbar navbar-default">
	<div class="container hide-when-search">
	<ul class="level0 breadcrumb">
	<li>
	<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
	<span id="breadcrumb">
	<ul class="breadcrumb">
	<li></li>
	</ul>
	</span>
	</li>
	</ul>
	</div>
	</div>
	</header>
	<div class="container body-content">

	<div id="search-results">
	<div class="search-list"></div>
	<div class="sr-items">
	<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
	</div>
	<ul id="pagination"></ul>
	</div>
	</div>
	<div role="main" class="container body-content hide-when-search">

	<div class="sidenav hide-when-search">
	<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
	<div class="sidetoggle collapse" id="sidetoggle">
	<div id="sidetoc"></div>
	</div>
	</div>
	<div class="article row grid-right">
	<div class="col-md-10">
	<article class="content wrap" id="_content" data-uid="Lucene.Net.Codecs.Memory">

	<h1 id="Lucene_Net_Codecs_Memory" data-uid="Lucene.Net.Codecs.Memory" class="text-break">Namespace Lucene.Net.Codecs.Memory
	</h1>
	<div class="markdown level0 summary"><!--
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->
	<p>Term dictionary, DocValues or Postings formats that are read entirely into memory.</p>
	</div>
	<div class="markdown level0 conceptual"></div>
	<div class="markdown level0 remarks"></div>
	<h3 id="classes">Classes
	</h3>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.DirectDocValuesFormat.html">DirectDocValuesFormat</a></h4>
	<section><p>In-memory docvalues format that does no (or very little)
	compression. Indexed values are stored on disk, but
	then at search time all values are loaded into memory as
	simple .NET arrays. For numeric values, it uses
	byte[], short[], int[], long[] as necessary to fit the
	range of the values. For binary values, there is an <span class="xref">System.Int32</span>
	(4 bytes) overhead per value.</p>
	<p>Limitations:
	<ul><li>For binary and sorted fields the total space
	required for all binary values cannot exceed about
	2.1 GB (see <a class="xref" href="Lucene.Net.Codecs.Memory.DirectDocValuesFormat.html#Lucene_Net_Codecs_Memory_DirectDocValuesFormat_MAX_TOTAL_BYTES_LENGTH">MAX_TOTAL_BYTES_LENGTH</a>).</li><li>For sorted set fields, the sum of the size of each
	document's set of values cannot exceed about 2.1 B
	values (see <a class="xref" href="Lucene.Net.Codecs.Memory.DirectDocValuesFormat.html#Lucene_Net_Codecs_Memory_DirectDocValuesFormat_MAX_SORTED_SET_ORDS">MAX_SORTED_SET_ORDS</a>). For example,
	if every document has 10 values (10 instances of
	<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Documents.SortedSetDocValuesField.html">SortedSetDocValuesField</a>) added, then no
	more than ~210 M documents can be added to one
	segment. </li></ul>
	</p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.DirectPostingsFormat.html">DirectPostingsFormat</a></h4>
	<section><p>Wraps <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.Lucene41.Lucene41PostingsFormat.html">Lucene41PostingsFormat</a> format for on-disk
	storage, but then at read time loads and stores all
	terms & postings directly in RAM as byte[], int[].</p>
	<p><p><strong>WARNING</strong>: This is
	exceptionally RAM intensive: it makes no effort to
	compress the postings data, storing terms as separate
	byte[] and postings as separate int[], but as a result it
	gives substantial increase in search performance.</p>
	<p>
	<p>This postings format supports <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.TermsEnum.html#Lucene_Net_Index_TermsEnum_Ord">Ord</a>
	and <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.TermsEnum.html#Lucene_Net_Index_TermsEnum_SeekExact_System_Int64_">SeekExact(Int64)</a>.</p>
	<p>
	<p>Because this holds all term bytes as a single
	byte[], you cannot have more than 2.1GB worth of term
	bytes in a single segment.
	</p></p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdPostingsFormat.html">FSTOrdPostingsFormat</a></h4>
	<section><p>FSTOrd term dict + Lucene41PBF</p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdPulsing41PostingsFormat.html">FSTOrdPulsing41PostingsFormat</a></h4>
	<section><p>FSTOrd + Pulsing41
	<p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdTermsReader.html">FSTOrdTermsReader</a></h4>
	<section><p>FST-based terms dictionary reader.
	<p>
	The FST index maps each term and its ord, and during seek
	the ord is used fetch metadata from a single block.
	The term dictionary is fully memory resident.
	<p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTOrdTermsWriter.html">FSTOrdTermsWriter</a></h4>
	<section><p>FST-based term dict, using ord as FST output.
	<p>
	The FST holds the mapping between <term, ord>, and
	term's metadata is delta encoded into a single byte block.
	<p>
	Typically the byte block consists of four parts:
	<ol><li>term statistics: docFreq, totalTermFreq;</li><li>monotonic long[], e.g. the pointer to the postings list for that term;</li><li>generic byte[], e.g. other information customized by postings base.</li><li>single-level skip list to speed up metadata decoding by ord.</li></ol>
	<p>
	<p>
	Files:
	<ul><li><code>.tix</code>: <a href="#Termindex">Term Index</a></li><li><code>.tbk</code>: <a href="#Termblock">Term Block</a></li></ul>
	</p></p>
	<p><a name="Termindex" id="Termindex"></a>
	<h3>Term Index</h3>
	<p>
	The .tix contains a list of FSTs, one for each field.
	The FST maps a term to its corresponding order in current field.
	</p></p>
	<ul><li>TermIndex(.tix) --> Header, TermFST<sup>NumFields</sup>, Footer</li><li>TermFST --> <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Util.Fst.FST-1.html">FST<T></a></li><li>Header --> CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>) </li><li>Footer --> CodecFooter (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteFooter_Lucene_Net_Store_IndexOutput_">WriteFooter(IndexOutput)</a>) </li></ul>

	<p>Notes:</p>
	<ul><li>
	Since terms are already sorted before writing to <a href="#Termblock">Term Block</a>,
	their ords can directly used to seek term metadata from term block.
	</li></ul>

	<a name="Termblock" id="Termblock"></a>
	<h3>Term Block</h3>
	<p>
	The .tbk contains all the statistics and metadata for terms, along with field summary (e.g.
	per-field data like number of documents in current field). For each field, there are four blocks:
	<ul><li>statistics bytes block: contains term statistics; </li><li>metadata longs block: delta-encodes monotonic part of metadata; </li><li>metadata bytes block: encodes other parts of metadata; </li><li>skip block: contains skip data, to speed up metadata seeking and decoding</li></ul>
	</p>
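	<p>The interplay between the delta-encoded longs block and the skip block can be sketched as follows. This is a simplified illustration, not the codec's implementation: it assumes a single monotonic long per term, a skip interval of 8, and plain Python lists instead of byte blocks. The names <code>encode</code> and <code>decode_at</code> are hypothetical.</p>

```python
SKIP_INTERVAL = 8  # assumed value, for illustration only

def encode(longs):
    """Delta-encode a monotonic sequence and, at every SKIP_INTERVAL-th term,
    record a skip entry holding the value that precedes that term."""
    deltas, skips, prev = [], [], 0
    for ord_, value in enumerate(longs):
        if ord_ % SKIP_INTERVAL == 0:
            skips.append(prev)
        deltas.append(value - prev)
        prev = value
    return deltas, skips

def decode_at(deltas, skips, ord_):
    """Recover the value for a term ord: start from the nearest skip entry,
    then add at most SKIP_INTERVAL deltas."""
    block = ord_ // SKIP_INTERVAL
    value = skips[block]
    for i in range(block * SKIP_INTERVAL, ord_ + 1):
        value += deltas[i]
    return value

pointers = [10 * i * i for i in range(20)]  # monotonic postings pointers
deltas, skips = encode(pointers)
```

	<p>Without the skip entries, decoding the value for a given ord would require summing every delta from the first term onward.</p>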

	<p><p>File Format:</p>
	<ul><li>TermBlock(.tbk) --> Header, <em>PostingsHeader</em>, FieldSummary, DirOffset</li><li>FieldSummary --> NumFields, &lt;FieldNumber, NumTerms, SumTotalTermFreq?, SumDocFreq,
	DocCount, LongsSize, DataBlock &gt; <sup>NumFields</sup>, Footer</li><li>DataBlock --> StatsBlockLength, MetaLongsBlockLength, MetaBytesBlockLength,
	SkipBlock, StatsBlock, MetaLongsBlock, MetaBytesBlock </li><li>SkipBlock --> < StatsFPDelta, MetaLongsSkipFPDelta, MetaBytesSkipFPDelta,
	MetaLongsSkipDelta<sup>LongsSize</sup> ><sup>NumTerms</sup></li><li>StatsBlock --> < DocFreq[Same?], (TotalTermFreq-DocFreq) ? > <sup>NumTerms</sup></li><li>MetaLongsBlock --> < LongDelta<sup>LongsSize</sup>, BytesSize > <sup>NumTerms</sup></li><li>MetaBytesBlock --> Byte <sup>MetaBytesBlockLength</sup></li><li>Header --> CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>) </li><li>DirOffset --> Uint64 (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteInt64_System_Int64_">WriteInt64(Int64)</a>) </li><li>NumFields, FieldNumber, DocCount, DocFreq,
	LongsSize --> VInt (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt32_System_Int32_">WriteVInt32(Int32)</a>) </li><li>NumTerms, SumTotalTermFreq, SumDocFreq, StatsBlockLength, MetaLongsBlockLength, MetaBytesBlockLength,
	StatsFPDelta, MetaLongsSkipFPDelta, MetaBytesSkipFPDelta, MetaLongsSkipStart, TotalTermFreq,
	LongDelta --> VLong (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt64_System_Int64_">WriteVInt64(Int64)</a>) </li><li>Footer --> CodecFooter (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteFooter_Lucene_Net_Store_IndexOutput_">WriteFooter(IndexOutput)</a>) </li></ul>
	<p>Notes: </p>
	<ul><li>
	The formats of PostingsHeader and MetaBytes are customized by the specific postings implementation:
	they contain arbitrary per-file data (such as parameters or versioning information), and per-term data
	(non-monotonic ones like pulsed postings data).
	</li><li>
	During initialization the reader loads all the blocks into memory. SkipBlock is decoded so that, during seek,
	the term dictionary can look up file pointers directly. StatsFPDelta, MetaLongsSkipFPDelta, etc. are file offsets
	for every SkipInterval-th term. MetaLongsSkipDelta is the difference from the previous entry, and indicates
	the value of the preceding metadata longs for every SkipInterval-th term.
	</li><li>
	DocFreq is the count of documents which contain the term. TotalTermFreq is the total number of occurrences of the term.
	Usually these two values are the same for long-tail terms, so one bit is stolen from DocFreq to flag this case,
	allowing the encoding of TotalTermFreq to be omitted.
	</li></ul>
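	<p>The bit-stealing trick in the last note can be sketched as below. This is an illustrative Python sketch rather than the codec's code: it assumes the stolen bit is the lowest bit of the written DocFreq value and uses the common 7-bits-per-byte variable-length integer scheme. All function names are hypothetical.</p>

```python
def write_vint(out, value):
    """Append value as a variable-length integer (7 payload bits per byte)."""
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)

def read_vint(data, pos):
    """Read a variable-length integer; return (value, next position)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if (b & 0x80) == 0:
            return value, pos
        shift += 7

def write_stats(out, doc_freq, total_term_freq):
    """Write DocFreq with a stolen low bit; omit the delta when both are equal."""
    if total_term_freq == doc_freq:
        write_vint(out, (doc_freq << 1) | 1)   # low bit set: values are equal
    else:
        write_vint(out, doc_freq << 1)         # low bit clear: delta follows
        write_vint(out, total_term_freq - doc_freq)

def read_stats(data, pos):
    """Return (doc_freq, total_term_freq, next position)."""
    code, pos = read_vint(data, pos)
    doc_freq = code >> 1
    if code & 1:
        return doc_freq, doc_freq, pos
    delta, pos = read_vint(data, pos)
    return doc_freq, doc_freq + delta, pos
```

	<p>For a term with DocFreq equal to TotalTermFreq, only one value is written; otherwise the difference follows as a second value.</p>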
	<p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTPostingsFormat.html">FSTPostingsFormat</a></h4>
	<section><p>FST term dict + Lucene41 postings base format (Lucene41PBF)</p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTPulsing41PostingsFormat.html">FSTPulsing41PostingsFormat</a></h4>
	<section><p>FST + Pulsing41, for testing only, since
	the FST does no delta encoding here!
	<p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTTermsReader.html">FSTTermsReader</a></h4>
	<section><p>FST-based terms dictionary reader.
	<p>
	The FST directly maps each term to its metadata,
	and it is memory resident.
	<p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.FSTTermsWriter.html">FSTTermsWriter</a></h4>
	<section><p>FST-based term dict, using metadata as FST output.
	<p>
	The FST directly holds the mapping between &lt;term, metadata&gt;.
	<p>
	Term metadata consists of three parts:
	<ol><li>term statistics: docFreq, totalTermFreq;</li><li>monotonic long[], e.g. the pointer to the postings list for that term;</li><li>generic byte[], e.g. other information needed by the postings reader.</li></ol>
	<p>
	File:
	<ul><li><code>.tst</code>: <a href="#Termdictionary">Term Dictionary</a></li></ul>
	</p>
	<p>
	<p><a name="Termdictionary" id="Termdictionary"></a>
	<h3>Term Dictionary</h3>
	</p>
	<p>
	The .tst contains a list of FSTs, one for each field.
	The FST maps a term to its corresponding statistics (e.g. docfreq)
	and metadata (e.g. information for postings list reader like file pointer
	to postings list).
	</p>
	<p>
	Typically the metadata is separated into two parts:
	<ul><li>
	Monotonic long array: Some metadata will always be in ascending order
	with the corresponding term. This part is used by the FST to share outputs between arcs.
	</li><li>
	Generic byte array: Used to store non-monotonic metadata.
	</li></ul>
	</p></p>
	<p>File format:
	<ul><li>TermsDict(.tst) --> Header, <em>PostingsHeader</em>, FieldSummary, DirOffset</li><li>FieldSummary --> NumFields, &lt;FieldNumber, NumTerms, SumTotalTermFreq?,
	SumDocFreq, DocCount, LongsSize, TermFST &gt;<sup>NumFields</sup></li><li>TermFST --> FST&lt;TermData&gt;</li><li>TermData --> Flag, BytesSize?, LongDelta<sup>LongsSize</sup>?, Byte<sup>BytesSize</sup>?,
	< DocFreq[Same?], (TotalTermFreq-DocFreq) > ? </li><li>Header --> CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>) </li><li>DirOffset --> Uint64 (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteInt64_System_Int64_">WriteInt64(Int64)</a>) </li><li>DocFreq, LongsSize, BytesSize, NumFields,
	FieldNumber, DocCount --> VInt (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt32_System_Int32_">WriteVInt32(Int32)</a>) </li><li>TotalTermFreq, NumTerms, SumTotalTermFreq, SumDocFreq, LongDelta -->
	VLong (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteVInt64_System_Int64_">WriteVInt64(Int64)</a>) </li></ul>
	<p>Notes:</p>
	<ul><li>
	The formats of PostingsHeader and the generic meta bytes are customized by the specific postings implementation:
	they contain arbitrary per-file data (such as parameters or versioning information), and per-term data
	(non-monotonic ones like pulsed postings data).
	</li><li>
	The format of TermData is determined by the FST. Typically, monotonic metadata will be dense around shallow arcs,
	while in deeper arcs only generic bytes and term statistics exist.
	</li><li>
	The Flag byte indicates which parts of the metadata exist on the current arc. In particular, the monotonic part
	is omitted when it is an array of 0s.
	</li><li>
	Since LongsSize is fixed per field, it is only written once in the field summary.
	</li></ul>
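	<p>The role of the Flag byte can be illustrated with a small Python sketch. The bit assignments and the names <code>make_flag</code> and <code>present_parts</code> are hypothetical and chosen only for illustration; the actual layout is defined by the FST outputs implementation.</p>

```python
HAS_LONGS = 1  # hypothetical bit assignments, for illustration only
HAS_BYTES = 2
HAS_STATS = 4

def make_flag(longs, data, doc_freq):
    """Build a flag byte describing which parts of the metadata follow."""
    flag = 0
    if any(v != 0 for v in longs):
        flag |= HAS_LONGS      # an all-zero monotonic part is omitted
    if data:
        flag |= HAS_BYTES      # generic bytes are present
    if doc_freq > 0:
        flag |= HAS_STATS      # term statistics are present
    return flag

def present_parts(flag):
    """List the names of the parts a reader should expect after the flag."""
    names = []
    if flag & HAS_LONGS:
        names.append("longs")
    if flag & HAS_BYTES:
        names.append("bytes")
    if flag & HAS_STATS:
        names.append("stats")
    return names
```

	<p>A reader would inspect the flag first and then decode only the parts it announces.</p>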
	<p>
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.MemoryDocValuesFormat.html">MemoryDocValuesFormat</a></h4>
	<section><p>In-memory docvalues format. </p>
	</section>
	<h4><a class="xref" href="Lucene.Net.Codecs.Memory.MemoryPostingsFormat.html">MemoryPostingsFormat</a></h4>
	<section><p>Stores terms &amp; postings (docs, positions, payloads) in
	RAM, using an FST.</p>
	<p><p>Note that this codec implements advance as a linear
	scan! This means that if you store large fields in here,
	queries that rely on advance (AND BooleanQuery,
	PhraseQuery) will be relatively slow!
	</p></p>
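	<p>To make the cost of a linear-scan advance concrete, here is a minimal Python sketch. It is not the codec's code; <code>advance_linear</code> is a hypothetical helper operating on a plain sorted list of document IDs, and it counts the postings it steps over.</p>

```python
def advance_linear(doc_ids, position, target):
    """Step forward one posting at a time until doc_ids[position] >= target.
    Returns (new position, number of steps taken); the new position equals
    len(doc_ids) if no such document exists."""
    steps = 0
    while position < len(doc_ids) and doc_ids[position] < target:
        position += 1
        steps += 1
    return position, steps

doc_ids = list(range(0, 1000, 2))  # 500 postings: 0, 2, 4, ...
position, steps = advance_linear(doc_ids, 0, 900)
```

	<p>The number of steps grows with the number of postings skipped, which is why conjunctions and phrase queries over large fields are affected most.</p>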
	<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>
	</section>
	</article>
	</div>

	<div class="hidden-sm col-md-2" role="complementary">
	<div class="sideaffix">
	<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
	<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
	</nav>
	</div>
	</div>
	</div>
	</div>

	<footer>
	<div class="grad-bottom"></div>
	<div class="footer">
	<div class="container">
	<span class="pull-right">
	<a href="#top">Back to top</a>
	</span>
	Copyright © 2020 The Apache Software Foundation, Licensed under the <a href='http://www.apache.org/licenses/LICENSE-2.0' target='_blank'>Apache License, Version 2.0</a><br> <small>Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation. <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</small>

	</div>
	</div>
	</footer>
	</div>

	<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
	<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
	<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
	</body>
	</html>