<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Sources for a Dataset — Source • Arrow R Package</title>
<!-- jquery -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.3.7/cosmo/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.7.1/css/all.min.css" integrity="sha256-nAmazAk6vS34Xqo0BSrTb+abbtFlgsFK7NKSi6o7Y78=" crossorigin="anonymous" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.7.1/css/v4-shims.min.css" integrity="sha256-6qHlizsOWFskGlwVOKuns+D1nB6ssZrHQrNj1wGplHc=" crossorigin="anonymous" />
<!-- clipboard.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script>
<!-- headroom.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.9.4/headroom.min.js" integrity="sha256-DJFC1kqIhelURkuza0AvYal5RxMtpzLjFhsnVIeuk+U=" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.9.4/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script>
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<meta property="og:title" content="Sources for a Dataset — Source" />
<meta property="og:description" content="A Dataset can have one or more Sources. A Source contains one or more
Fragments, such as files, of a common type and partitioning.
SourceFactory is used to create a Source, inspect the Schema of the
fragments contained in it, and declare a partitioning.
FileSystemSourceFactory is a subclass of SourceFactory for
discovering files in the local file system, the only currently supported
file system.
In general, you'll deal with SourceFactory rather than Source itself." />
<meta name="twitter:card" content="summary" />
<!-- mathjax -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<!-- Matomo -->
<script>
var _paq = window._paq = window._paq || [];
/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
_paq.push(["setDoNotTrack", true]);
_paq.push(["disableCookies"]);
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
var u="https://analytics.apache.org/";
_paq.push(['setTrackerUrl', u+'matomo.php']);
_paq.push(['setSiteId', '20']);
var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
})();
</script>
<!-- End Matomo Code -->
</head>
<body>
<div class="container template-reference-topic">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">Arrow R Package</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.16.0</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="https://arrow.apache.org/">❯❯❯</a>
</li>
<li>
<a href="../articles/arrow.html">Get started</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
Articles
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/dataset.html">Working with Arrow Datasets and dplyr</a>
</li>
<li>
<a href="../articles/install.html">Installing the Arrow Package on Linux</a>
</li>
</ul>
</li>
<li>
<a href="../news/index.html">Changelog</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
Project docs
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="https://arrow.apache.org/docs/format/README.html">Specification</a>
</li>
<li>
<a href="https://arrow.apache.org/docs/c_glib">C GLib</a>
</li>
<li>
<a href="https://arrow.apache.org/docs/cpp">C++</a>
</li>
<li>
<a href="https://arrow.apache.org/docs/java">Java</a>
</li>
<li>
<a href="https://arrow.apache.org/docs/js">JavaScript</a>
</li>
<li>
<a href="https://arrow.apache.org/docs/python">Python</a>
</li>
<li>
<a href="../index.html">R</a>
</li>
</ul>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/apache/arrow">
<span class="fab fa fab fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="row">
<div class="col-md-9 contents">
<div class="page-header">
<h1>Sources for a Dataset</h1>
<small class="dont-index">Source: <a href='https://github.com/apache/arrow/blob/master/R/dataset.R'><code>R/dataset.R</code></a></small>
<div class="hidden name"><code>Source.Rd</code></div>
</div>
<div class="ref-description">
<p>A <a href='Dataset.html'>Dataset</a> can have one or more <code>Source</code>s. A <code>Source</code> contains one or more
<code>Fragments</code>, such as files, of a common type and partitioning.
<code>SourceFactory</code> is used to create a <code>Source</code>, inspect the <a href='Schema.html'>Schema</a> of the
fragments contained in it, and declare a partitioning.
<code>FileSystemSourceFactory</code> is a subclass of <code>SourceFactory</code> for
discovering files in the local file system, the only currently supported
file system.</p>
<p>In general, you'll deal with <code>SourceFactory</code> rather than <code>Source</code> itself.</p>
</div>
<h2 class="hasAnchor" id="factory"><a class="anchor" href="#factory"></a>Factory</h2>
<p>For the <code>SourceFactory$create()</code> factory method, see <code><a href='open_source.html'>open_source()</a></code>, an
alias for it.</p>
<p><code>FileSystemSourceFactory$create()</code> is a lower-level factory method and
takes the following arguments:</p><ul>
<li><p><code>filesystem</code>: A <a href='FileSystem.html'>FileSystem</a></p></li>
<li><p><code>selector</code>: A <a href='FileSelector.html'>FileSelector</a></p></li>
<li><p><code>format</code>: A string identifier of the format of the files matched by <code>selector</code>.
Currently supported options are "parquet", "arrow", and "ipc" (an alias for
the Arrow file format).</p></li>
</ul>
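<p>The three arguments above can be combined roughly as follows. This is a minimal sketch, not a definitive recipe: the directory <code>data_dir</code> and the file written into it are hypothetical, and the calls to <code>LocalFileSystem$create()</code> and <code>FileSelector$create()</code> (including its <code>recursive</code> argument) are assumptions taken from those classes' own help pages rather than from this page.</p>
<pre><code class="r">library(arrow)

# Hypothetical directory, populated here only so the factory has a file to discover
data_dir &lt;- file.path(tempdir(), "source-demo")
dir.create(data_dir, showWarnings = FALSE)
write_parquet(data.frame(x = 1:3), file.path(data_dir, "part-0.parquet"))

# Lower-level construction: the file system, the selection of files in it,
# and the format shared by those files
factory &lt;- FileSystemSourceFactory$create(
  filesystem = LocalFileSystem$create(),
  selector = FileSelector$create(data_dir, recursive = TRUE),
  format = "parquet"
)
</code></pre>
<p>In most cases <code><a href='open_source.html'>open_source()</a></code> is the more convenient entry point, since it builds the file system and selector for you.</p>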
<h2 class="hasAnchor" id="methods"><a class="anchor" href="#methods"></a>Methods</h2>
<p><code>Source</code> has one defined method:</p><ul>
<li><p><code>$schema</code>: Active binding, returns the <a href='Schema.html'>Schema</a> of the <code>Source</code></p></li>
</ul>
<p><code>SourceFactory</code> and its subclasses have the following methods:</p><ul>
<li><p><code>$Inspect()</code>: Walks the files in the directory and returns a common <a href='Schema.html'>Schema</a></p></li>
<li><p><code>$Finish(schema)</code>: Returns a <code>Source</code></p></li>
</ul>
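<p>The methods listed above are typically used in sequence: inspect the files to obtain a common schema, then finish the factory to obtain a <code>Source</code>. The sketch below assumes a hypothetical directory <code>data_dir</code> holding a single Parquet file, and assumes that <code><a href='open_source.html'>open_source()</a></code> accepts a path and a <code>format</code> argument as described on its own help page.</p>
<pre><code class="r">library(arrow)

# Hypothetical directory with one Parquet file, created only for illustration
data_dir &lt;- file.path(tempdir(), "source-methods-demo")
dir.create(data_dir, showWarnings = FALSE)
write_parquet(data.frame(x = 1:3), file.path(data_dir, "part-0.parquet"))

# open_source() returns a SourceFactory
factory &lt;- open_source(data_dir, format = "parquet")

# Walk the discovered files and derive a common Schema
schema &lt;- factory$Inspect()

# Build the Source using that Schema
src &lt;- factory$Finish(schema)

# Active binding: the Schema of the finished Source
src$schema
</code></pre>
<p>Keeping <code>$Inspect()</code> and <code>$Finish()</code> as separate steps leaves room to examine the inferred schema, or pass a different one, before the <code>Source</code> is created.</p>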
<h2 class="hasAnchor" id="see-also"><a class="anchor" href="#see-also"></a>See also</h2>
<div class='dont-index'><p><a href='Dataset.html'>Dataset</a> for what to do with a <code>Source</code></p></div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<h2>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#factory">Factory</a></li>
<li><a href="#methods">Methods</a></li>
<li><a href="#see-also">See also</a></li>
</ul>
</div>
</div>
<footer>
<div class="copyright">
<p>Developed by Romain François, Jeroen Ooms, Neal Richardson, Apache Arrow.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.4.1.</p>
</div>
</footer>
</div>
<script type="text/javascript" src="/docs/_static/versionwarning.js"></script> </body>
</html>