| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <!-- NewPage --> |
| <html lang="en"> |
| <head> |
| <!-- Generated by javadoc (1.8.0_121) on Fri Apr 14 22:10:58 PDT 2017 --> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <title>RandomSampler (Mahout Math 0.13.0 API)</title> |
| <meta name="date" content="2017-04-14"> |
| <link rel="stylesheet" type="text/css" href="../../../../../../../stylesheet.css" title="Style"> |
| <script type="text/javascript" src="../../../../../../../script.js"></script> |
| </head> |
| <body> |
| <script type="text/javascript"><!-- |
| try { |
| if (location.href.indexOf('is-external=true') == -1) { |
| parent.document.title="RandomSampler (Mahout Math 0.13.0 API)"; |
| } |
| } |
| catch(err) { |
| } |
| //--> |
| var methods = {"i0":9}; |
| var tabs = {65535:["t0","All Methods"],1:["t1","Static Methods"],8:["t4","Concrete Methods"]}; |
| var altColor = "altColor"; |
| var rowColor = "rowColor"; |
| var tableTab = "tableTab"; |
| var activeTableTab = "activeTableTab"; |
| </script> |
| <noscript> |
| <div>JavaScript is disabled on your browser.</div> |
| </noscript> |
| <!-- ========= START OF TOP NAVBAR ======= --> |
| <div class="topNav"><a name="navbar.top"> |
| <!-- --> |
| </a> |
| <div class="skipNav"><a href="#skip.navbar.top" title="Skip navigation links">Skip navigation links</a></div> |
| <a name="navbar.top.firstrow"> |
| <!-- --> |
| </a> |
| <ul class="navList" title="Navigation"> |
| <li><a href="../../../../../../../overview-summary.html">Overview</a></li> |
| <li><a href="package-summary.html">Package</a></li> |
| <li class="navBarCell1Rev">Class</li> |
| <li><a href="class-use/RandomSampler.html">Use</a></li> |
| <li><a href="package-tree.html">Tree</a></li> |
| <li><a href="../../../../../../../deprecated-list.html">Deprecated</a></li> |
| <li><a href="../../../../../../../index-all.html">Index</a></li> |
| <li><a href="../../../../../../../help-doc.html">Help</a></li> |
| </ul> |
| </div> |
| <div class="subNav"> |
| <ul class="navList"> |
| <li>Prev Class</li> |
| <li>Next Class</li> |
| </ul> |
| <ul class="navList"> |
| <li><a href="../../../../../../../index.html?org/apache/mahout/math/jet/random/sampling/RandomSampler.html" target="_top">Frames</a></li> |
| <li><a href="RandomSampler.html" target="_top">No Frames</a></li> |
| </ul> |
| <ul class="navList" id="allclasses_navbar_top"> |
| <li><a href="../../../../../../../allclasses-noframe.html">All Classes</a></li> |
| </ul> |
| <div> |
| <script type="text/javascript"><!-- |
| allClassesLink = document.getElementById("allclasses_navbar_top"); |
| if(window==top) { |
| allClassesLink.style.display = "block"; |
| } |
| else { |
| allClassesLink.style.display = "none"; |
| } |
| //--> |
| </script> |
| </div> |
| <div> |
| <ul class="subNavList"> |
| <li>Summary: </li> |
| <li>Nested | </li> |
| <li>Field | </li> |
| <li>Constr | </li> |
| <li><a href="#method.summary">Method</a></li> |
| </ul> |
| <ul class="subNavList"> |
| <li>Detail: </li> |
| <li>Field | </li> |
| <li>Constr | </li> |
| <li><a href="#method.detail">Method</a></li> |
| </ul> |
| </div> |
| <a name="skip.navbar.top"> |
| <!-- --> |
| </a></div> |
| <!-- ========= END OF TOP NAVBAR ========= --> |
| <!-- ======== START OF CLASS DATA ======== --> |
| <div class="header"> |
| <div class="subTitle">org.apache.mahout.math.jet.random.sampling</div> |
| <h2 title="Class RandomSampler" class="title">Class RandomSampler</h2> |
| </div> |
| <div class="contentContainer"> |
| <ul class="inheritance"> |
| <li><a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true" title="class or interface in java.lang">java.lang.Object</a></li> |
| <li> |
| <ul class="inheritance"> |
| <li>org.apache.mahout.math.jet.random.sampling.RandomSampler</li> |
| </ul> |
| </li> |
| </ul> |
| <div class="description"> |
| <ul class="blockList"> |
| <li class="blockList"> |
| <hr> |
| <br> |
| <pre>public final class <span class="typeNameLabel">RandomSampler</span> |
| extends <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true" title="class or interface in java.lang">Object</a></pre> |
| <div class="block">Efficiently, in both space and time, computes a sorted <i>Simple Random Sample Without Replacement |
| (SRSWOR)</i>, that is, a sorted set of <tt>n</tt> random numbers from an interval of <tt>N</tt> numbers. |
| Example: Computing <tt>n=3</tt> random numbers from the interval <tt>[1,50]</tt> may yield |
| the sorted random set <tt>(7,13,47)</tt>. |
| Since the result is a set (sampling without replacement), no element occurs more than once. |
| Each of the <tt>N</tt> numbers has the same probability of being included in the <tt>n</tt> chosen numbers. |
| |
| <p><b>Problem:</b> This class solves problems such as the following: <i> |
| Suppose we have a file containing 10^12 objects. |
| We would like to take a truly random subset of 10^6 objects and do something with it, |
| for example, compute the sum over some instance field. |
| How do we choose the subset? In particular, how do we avoid picking the same element more than once? |
| How do we do this quickly and without consuming excessive memory? |
| How do we avoid slowly jumping back and forth within the file? </i> |
| |
| <p><b>Sorted Simple Random Sample Without Replacement (SRSWOR):</b> |
| What are the exact semantics of this class? What is an SRSWOR? In exactly which sense is a returned set "random"? |
| It is random in the sense that each of the <tt>N</tt> numbers has the |
| same probability of being included in the <tt>n</tt> chosen numbers. |
| For those who think in implementations rather than abstract interfaces: |
| <i>Suppose we have an empty list. |
| We pick a random number between 1 and 10^12 and add it to the list only if it has not |
| been picked before, i.e. if it is not already contained in the list. |
| We repeat this step until we have collected 10^6 distinct numbers. |
| Finally we sort the list ascending and return it.</i> |
| <dl> |
| <dt>It is exactly in this sense that this class returns "random" sets. |
| <b>Note, however, that the implementation of this class uses a technique orders of magnitude |
| better (in both time and space) than the one outlined above.</b></dt></dl> |
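<p>For comparison, the naive procedure outlined above can be sketched in a few lines. This is only an illustration of the semantics, not the technique this class uses. The class name <tt>NaiveSortedSample</tt> and its method are hypothetical, and the modulo step introduces a slight bias that a careful implementation would avoid.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.TreeSet;

/** Naive reference procedure: draw, reject duplicates, keep sorted. Illustration only. */
final class NaiveSortedSample {

    /** Returns n distinct values from [low, low+N-1], sorted ascending. */
    static long[] sample(int n, long N, long low, Random rng) {
        if (n < 0 || n > N) {
            throw new IllegalArgumentException("need 0 <= n <= N");
        }
        // A TreeSet rejects duplicates and keeps its elements sorted,
        // at the cost of O(n) extra memory and O(n log n) time.
        TreeSet<Long> picked = new TreeSet<>();
        while (picked.size() < n) {
            // Offset in [0, N); floorMod maps nextLong() into that range.
            // (The slight modulo bias is ignored in this sketch.)
            long offset = Math.floorMod(rng.nextLong(), N);
            picked.add(low + offset);
        }
        long[] result = new long[n];
        int i = 0;
        for (long value : picked) {
            result[i++] = value;
        }
        return result;
    }

    public static void main(String[] args) {
        // Three distinct numbers from [1,50], printed in ascending order.
        System.out.println(Arrays.toString(sample(3, 50, 1, new Random())));
    }
}
```

<p>The sketch has to hold all <tt>n</tt> picked numbers in memory at once, which is the overhead a sequential sampling technique avoids.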
| |
| <p><b>Performance:</b> Space requirements are zero. Running time is <tt>O(n)</tt> on average, |
| <tt>O(N)</tt> in the worst case. |
| <h2>Performance (200 MHz Pentium Pro, JDK 1.2, NT)</h2> |
| <center> |
| <table border="1" summary="performance table"> |
| <tr> |
| <td align="center" width="20%">n</td> |
| <td align="center" width="20%">N</td> |
| <td align="center" width="20%">Running time [seconds]</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">10<sup>3</sup></td> |
| <td align="center" width="20%">1.2*10<sup>3</sup></td> |
| <td align="center" width="20%">0.0014</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">10<sup>3</sup></td> |
| <td align="center" width="20%">10<sup>7</sup></td> |
| <td align="center" width="20%">0.006</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">10<sup>5</sup></td> |
| <td align="center" width="20%">10<sup>7</sup></td> |
| <td align="center" width="20%">0.7</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">9.0*10<sup>6</sup></td> |
| <td align="center" width="20%">10<sup>7</sup></td> |
| <td align="center" width="20%">8.5</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">9.9*10<sup>6</sup></td> |
| <td align="center" width="20%">10<sup>7</sup></td> |
| <td align="center" width="20%">2.0 (samples more than 95%)</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">10<sup>4</sup></td> |
| <td align="center" width="20%">10<sup>12</sup></td> |
| <td align="center" width="20%">0.07</td> |
| </tr> |
| <tr> |
| <td align="center" width="20%">10<sup>7</sup></td> |
| <td align="center" width="20%">10<sup>12</sup></td> |
| <td align="center" width="20%">60</td> |
| </tr> |
| </table> |
| </center> |
| |
| <p><b>Scalability:</b> This random sampler is designed to be scalable. In iterator style, |
| it is able to compute and deliver sorted random sets stepwise in units called <i>blocks</i>. |
| Example: Computing <tt>n=9</tt> random numbers from the interval <tt>[1,50]</tt> in |
| 3 blocks may yield the blocks <tt>(7,13,14), (27,37,42), (45,46,49)</tt>. |
| (The maximum of a block is guaranteed to be less than the minimum of its successor block. |
| Every block is sorted ascending. No element ever occurs twice, either within a block or across blocks.) |
| A block can be computed and retrieved with method <tt>nextBlock</tt>. |
| Successive calls to method <tt>nextBlock</tt> will deliver as many random numbers as required. |
| |
| <p>Computing and retrieving samples in blocks is useful if you need very many random |
| numbers that cannot be stored in main memory at the same time. |
| For example, if you want to compute 10^10 such numbers you can do this by computing |
| them in blocks of, say, 500 elements each. |
| You then need space for only one block of 500 elements (about 4 KB, at 8 bytes per <tt>long</tt>). |
| When you are finished processing the first 500 elements you call <tt>nextBlock</tt> to |
| fill the next 500 elements into the block, process them, and so on. |
| If you have the time and the need, such blocks let you compute random sets |
| of up to roughly <tt>n=9*10^18</tt> random numbers (the range of a <tt>long</tt>). |
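<p>The block-wise delivery described above can be illustrated with a small stateful sampler. This is a sketch under stated assumptions, not the implementation of this class: <tt>BlockSamplerSketch</tt> is a hypothetical name, and the body uses plain selection sampling (Knuth's Algorithm S), which examines every candidate and therefore costs <tt>O(N)</tt> time. It does show how a sorted sample can be produced one block at a time while reusing a single buffer.

```java
import java.util.Arrays;
import java.util.Random;

/**
 * Hypothetical block-wise sampler (illustration only, not the Mahout class).
 * Uses selection sampling, which costs O(N) time in total.
 */
final class BlockSamplerSketch {
    private long remainingToChoose;   // sample elements not yet delivered
    private long remainingCandidates; // candidates not yet examined
    private long nextCandidate;       // next value to be examined
    private final Random rng;

    BlockSamplerSketch(long n, long N, long low, Random rng) {
        if (n < 0 || n > N) {
            throw new IllegalArgumentException("need 0 <= n <= N");
        }
        this.remainingToChoose = n;
        this.remainingCandidates = N;
        this.nextCandidate = low;
        this.rng = rng;
    }

    /** Fills values[fromIndex .. fromIndex+count-1] with the next count sample elements. */
    void nextBlock(int count, long[] values, int fromIndex) {
        if (count < 0 || count > remainingToChoose) {
            throw new IllegalArgumentException("count exceeds the elements left to deliver");
        }
        int filled = 0;
        while (filled < count) {
            // Select the current candidate with probability
            // remainingToChoose / remainingCandidates.
            if (rng.nextDouble() * remainingCandidates < remainingToChoose) {
                values[fromIndex + filled++] = nextCandidate;
                remainingToChoose--;
            }
            remainingCandidates--;
            nextCandidate++;
        }
    }

    public static void main(String[] args) {
        // n=9 numbers from [1,50], delivered in 3 blocks of 3 into one buffer.
        BlockSamplerSketch sampler = new BlockSamplerSketch(9, 50, 1, new Random());
        long[] block = new long[3];
        for (int b = 0; b < 3; b++) {
            sampler.nextBlock(3, block, 0);
            System.out.println(Arrays.toString(block));
        }
    }
}
```

<p>Because the sampler only remembers three counters between calls, the memory footprint is the block buffer itself, independent of <tt>n</tt>.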
| |
| <p>If you do not need the block feature, you can also directly call |
| the static methods of this class without needing to construct a <tt>RandomSampler</tt> instance first. |
| |
| <p><b>Random number generation:</b> By default, this class uses <tt>MersenneTwister</tt>, a very |
| strong random number generator, much better than <tt>java.util.Random</tt>. |
| You can also use other strong random number generators from Paul Houle's RngPack package. |
| For example, <tt>Ranecu</tt>, <tt>Ranmar</tt> and <tt>Ranlux</tt> are strong, well-analyzed, |
| research-grade pseudo-random number generators with known periods. |
| |
| <p><b>Implementation:</b> After J.S. Vitter, An Efficient Algorithm for Sequential Random Sampling, |
| ACM Transactions on Mathematical Software, Vol 13, 1987. |
| Paper available <a href="http://www.cs.duke.edu/~jsv">here</a>.</div> |
| </li> |
| </ul> |
| </div> |
| <div class="summary"> |
| <ul class="blockList"> |
| <li class="blockList"> |
| <!-- ========== METHOD SUMMARY =========== --> |
| <ul class="blockList"> |
| <li class="blockList"><a name="method.summary"> |
| <!-- --> |
| </a> |
| <h3>Method Summary</h3> |
| <table class="memberSummary" border="0" cellpadding="3" cellspacing="0" summary="Method Summary table, listing methods, and an explanation"> |
| <caption><span id="t0" class="activeTableTab"><span>All Methods</span><span class="tabEnd"> </span></span><span id="t1" class="tableTab"><span><a href="javascript:show(1);">Static Methods</a></span><span class="tabEnd"> </span></span><span id="t4" class="tableTab"><span><a href="javascript:show(8);">Concrete Methods</a></span><span class="tabEnd"> </span></span></caption> |
| <tr> |
| <th class="colFirst" scope="col">Modifier and Type</th> |
| <th class="colLast" scope="col">Method and Description</th> |
| </tr> |
| <tr id="i0" class="altColor"> |
| <td class="colFirst"><code>static void</code></td> |
| <td class="colLast"><code><span class="memberNameLink"><a href="../../../../../../../org/apache/mahout/math/jet/random/sampling/RandomSampler.html#sample-long-long-int-long-long:A-int-java.util.Random-">sample</a></span>(long n, |
| long N, |
| int count, |
| long low, |
| long[] values, |
| int fromIndex, |
| <a href="http://docs.oracle.com/javase/7/docs/api/java/util/Random.html?is-external=true" title="class or interface in java.util">Random</a> randomGenerator)</code> |
| <div class="block">Efficiently computes a sorted random set of <tt>count</tt> elements from the interval <tt>[low,low+N-1]</tt>.</div> |
| </td> |
| </tr> |
| </table> |
| <ul class="blockList"> |
| <li class="blockList"><a name="methods.inherited.from.class.java.lang.Object"> |
| <!-- --> |
| </a> |
| <h3>Methods inherited from class java.lang.<a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true" title="class or interface in java.lang">Object</a></h3> |
| <code><a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#clone--" title="class or interface in java.lang">clone</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#equals-java.lang.Object-" title="class or interface in java.lang">equals</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#finalize--" title="class or interface in java.lang">finalize</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#getClass--" title="class or interface in java.lang">getClass</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#hashCode--" title="class or interface in java.lang">hashCode</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#notify--" title="class or interface in java.lang">notify</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#notifyAll--" title="class or interface in java.lang">notifyAll</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#toString--" title="class or interface in java.lang">toString</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#wait--" title="class or interface in java.lang">wait</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#wait-long-" title="class or interface in java.lang">wait</a>, <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html?is-external=true#wait-long-int-" title="class or interface in java.lang">wait</a></code></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </div> |
| <div class="details"> |
| <ul class="blockList"> |
| <li class="blockList"> |
| <!-- ============ METHOD DETAIL ========== --> |
| <ul class="blockList"> |
| <li class="blockList"><a name="method.detail"> |
| <!-- --> |
| </a> |
| <h3>Method Detail</h3> |
| <a name="sample-long-long-int-long-long:A-int-java.util.Random-"> |
| <!-- --> |
| </a> |
| <ul class="blockListLast"> |
| <li class="blockList"> |
| <h4>sample</h4> |
| <pre>public static void sample(long n, |
| long N, |
| int count, |
| long low, |
| long[] values, |
| int fromIndex, |
| <a href="http://docs.oracle.com/javase/7/docs/api/java/util/Random.html?is-external=true" title="class or interface in java.util">Random</a> randomGenerator)</pre> |
| <div class="block">Efficiently computes a sorted random set of <tt>count</tt> elements from the interval <tt>[low,low+N-1]</tt>. Since |
| we are talking about a random set, no element will occur more than once. |
| |
| <p>Running time is <tt>O(count)</tt>, on average. Space requirements are zero. |
| |
| <p>Numbers are filled into the specified array starting at index <tt>fromIndex</tt> to the right. The array is |
| returned sorted ascending in the range filled with numbers. |
| |
| <p><b>Random number generation:</b> By default, this method uses <tt>MersenneTwister</tt>, a very strong random number |
| generator, much better than <tt>java.util.Random</tt>. You can also use other strong random number generators from |
| Paul Houle's RngPack package. For example, <tt>Ranecu</tt>, <tt>Ranmar</tt> and <tt>Ranlux</tt> are strong, well-analyzed, |
| research-grade pseudo-random number generators with known periods.</div> |
| <dl> |
| <dt><span class="paramLabel">Parameters:</span></dt> |
| <dd><code>n</code> - the total number of elements to choose (must be <tt>n &gt;= 0</tt> and <tt>n &lt;= N</tt>).</dd> |
| <dd><code>N</code> - the interval to choose random numbers from is <tt>[low,low+N-1]</tt>.</dd> |
| <dd><code>count</code> - the number of elements to be filled into <tt>values</tt> by this call (must be <tt>&gt;= 0</tt> and |
| <tt>&lt;= n</tt>). Normally, you will set <tt>count=n</tt>.</dd> |
| <dd><code>low</code> - the interval to choose random numbers from is <tt>[low,low+N-1]</tt>. Hint: If |
| <tt>low==0</tt>, random numbers are drawn from the interval <tt>[0,N-1]</tt>.</dd> |
| <dd><code>values</code> - the array into which the random numbers are to be filled; must have a length <tt>&gt;= |
| count+fromIndex</tt>.</dd> |
| <dd><code>fromIndex</code> - the first index within <tt>values</tt> to be filled with numbers (inclusive).</dd> |
| <dd><code>randomGenerator</code> - a random number generator. Set this parameter to <tt>null</tt> to use the default random |
| number generator.</dd> |
| </dl> |
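<p>The roles of the parameters can be illustrated with a stand-in that accepts the same argument list. The sketch below is not the Mahout implementation: <tt>SkipSamplingSketch</tt> and <tt>sampleSketch</tt> are hypothetical names, and the body follows the simple skip-based scheme usually referred to as Algorithm A in Vitter's work, which runs in <tt>O(N)</tt> time in the worst case. Unlike the documented method, it does not accept <tt>null</tt> for the generator.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical stand-in mirroring the documented parameter list. Illustration only. */
final class SkipSamplingSketch {

    /**
     * Writes the first count elements of a sorted sample of n values from
     * [low, low+N-1] into values[fromIndex .. fromIndex+count-1].
     */
    static void sampleSketch(long n, long N, int count, long low,
                             long[] values, int fromIndex, Random rng) {
        if (n < 0 || n > N || count < 0 || count > n) {
            throw new IllegalArgumentException("need 0 <= count <= n <= N");
        }
        if (values.length < fromIndex + count) {
            throw new IllegalArgumentException("values is too short");
        }
        long candidate = low;   // next value that could be selected
        double top = N - n;     // candidates that will be skipped in total
        double remaining = N;   // candidates not yet passed
        long toChoose = n;      // sample elements still to select
        for (int filled = 0; filled < count; filled++) {
            long skip = 0;
            if (toChoose >= 2) {
                // Grow the skip until its tail probability drops below v.
                double v = rng.nextDouble();
                double quot = top / remaining;
                while (quot > v) {
                    skip++;
                    top--;
                    remaining--;
                    quot = quot * top / remaining;
                }
            } else {
                // One element left: pick uniformly among the remaining candidates.
                skip = (long) (remaining * rng.nextDouble());
            }
            candidate += skip;
            values[fromIndex + filled] = candidate++;
            remaining--;
            toChoose--;
        }
    }

    public static void main(String[] args) {
        // Fill slots 2..4 of a 5-element array with 3 numbers from [1,50].
        long[] values = new long[5];
        sampleSketch(3, 50, 3, 1, values, 2, new Random());
        System.out.println(Arrays.toString(values));
    }
}
```

<p>In the usage shown in <tt>main</tt>, the first two slots of <tt>values</tt> are left untouched because <tt>fromIndex</tt> is 2, and the array length of 5 satisfies the <tt>count+fromIndex</tt> requirement.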
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </div> |
| </div> |
| <!-- ========= END OF CLASS DATA ========= --> |
| <p class="legalCopy"><small>Copyright © 2008–2017 <a href="http://www.apache.org/">The Apache Software Foundation</a>. All rights reserved.</small></p> |
| </body> |
| </html> |