| <!DOCTYPE html><html><head><meta http-equiv="X-UA-Compatible" content="IE=edge"/><meta content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" name="viewport"/><title>Spark 4.0.0-preview1 ScalaDoc - org.apache.spark.SparkContext</title><meta content="Spark 4.0.0-preview1 ScalaDoc - org.apache.spark.SparkContext" name="description"/><meta content="Spark 4.0.0-preview1 ScalaDoc org.apache.spark.SparkContext" name="keywords"/><meta http-equiv="content-type" content="text/html; charset=UTF-8"/><link href="../../../lib/index.css" media="screen" type="text/css" rel="stylesheet"/><link href="../../../lib/template.css" media="screen" type="text/css" rel="stylesheet"/><link href="../../../lib/print.css" media="print" type="text/css" rel="stylesheet"/><link href="../../../lib/diagrams.css" media="screen" type="text/css" rel="stylesheet" id="diagrams-css"/><script type="text/javascript" src="../../../lib/jquery.min.js"></script><script type="text/javascript" src="../../../lib/index.js"></script><script type="text/javascript" src="../../../index.js"></script><script type="text/javascript" src="../../../lib/scheduler.js"></script><script type="text/javascript" src="../../../lib/template.js"></script><script type="text/javascript">/* this variable can be used by the JS to determine the path to the root document */ |
| var toRoot = '../../../';</script></head><body><div id="search"><span id="doc-title">Spark 4.0.0-preview1 ScalaDoc<span id="doc-version"></span></span> <span class="close-results"><span class="left"><</span> Back</span><div id="textfilter"><span class="input"><input autocapitalize="none" placeholder="Search" id="index-input" type="text" accesskey="/"/><i class="clear material-icons"></i><i id="search-icon" class="material-icons"></i></span></div></div><div id="search-results"><div id="search-progress"><div id="progress-fill"></div></div><div id="results-content"><div id="entity-results"></div><div id="member-results"></div></div></div><div id="content-scroll-container" style="-webkit-overflow-scrolling: touch;"><div id="content-container" style="-webkit-overflow-scrolling: touch;"><div id="subpackage-spacer"><div id="packages"><h1>Packages</h1><ul><li class="indented0 " name="_root_.root" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="_root_" class="anchorToMember"></a><a id="root:_root_" class="anchorToMember"></a> <span class="permalink"><a href="../../../index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="../../../index.html" title=""><span class="name">root</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="../../../index.html" name="_root_" id="_root_" class="extype">root</a></dd></dl></div></li><li class="indented1 " name="_root_.org" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="org" class="anchorToMember"></a><a id="org:org" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="../../index.html" title=""><span class="name">org</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="../../../index.html" name="_root_" id="_root_" class="extype">root</a></dd></dl></div></li><li class="indented2 " name="org.apache" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="apache" class="anchorToMember"></a><a id="apache:apache" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="../index.html" title=""><span class="name">apache</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="../../index.html" name="org" id="org" class="extype">org</a></dd></dl></div></li><li class="indented3 " name="org.apache.spark" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="spark" class="anchorToMember"></a><a id="spark:spark" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="index.html" title="Core Spark functionality."><span class="name">spark</span></a></span><p class="shortcomment cmt">Core Spark 
functionality.</p><div class="fullcomment"><div class="comment cmt"><p>Core Spark functionality. <a href="" name="org.apache.spark.SparkContext" id="org.apache.spark.SparkContext" class="extype">org.apache.spark.SparkContext</a> serves as the main entry point to |
| Spark, while <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">org.apache.spark.rdd.RDD</a> is the data type representing a distributed collection, |
| and provides most parallel operations.</p><p>In addition, <a href="rdd/PairRDDFunctions.html" name="org.apache.spark.rdd.PairRDDFunctions" id="org.apache.spark.rdd.PairRDDFunctions" class="extype">org.apache.spark.rdd.PairRDDFunctions</a> contains operations available only on RDDs |
| of key-value pairs, such as <code>groupByKey</code> and <code>join</code>; <a href="rdd/DoubleRDDFunctions.html" name="org.apache.spark.rdd.DoubleRDDFunctions" id="org.apache.spark.rdd.DoubleRDDFunctions" class="extype">org.apache.spark.rdd.DoubleRDDFunctions</a> |
| contains operations available only on RDDs of Doubles; and |
| <a href="rdd/SequenceFileRDDFunctions.html" name="org.apache.spark.rdd.SequenceFileRDDFunctions" id="org.apache.spark.rdd.SequenceFileRDDFunctions" class="extype">org.apache.spark.rdd.SequenceFileRDDFunctions</a> contains operations available on RDDs that can |
| be saved as SequenceFiles. These operations are automatically available on any RDD of the right |
| type (e.g. RDD[(Int, Int)]) through implicit conversions.</p><p>Java programmers should reference the <a href="api/java/index.html" name="org.apache.spark.api.java" id="org.apache.spark.api.java" class="extype">org.apache.spark.api.java</a> package |
| for Spark programming APIs in Java.</p><p>Classes and methods marked with <span class="experimental badge" style="float: none;"> |
| Experimental</span> are user-facing features which have not been officially adopted by the |
| Spark project. These are subject to change or removal in minor releases.</p><p>Classes and methods marked with <span class="developer badge" style="float: none;"> |
| Developer API</span> are intended for advanced users who want to extend Spark through |
| lower-level interfaces. These are subject to change or removal in minor releases. |
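| </p><p>Returning to the RDD operations above, here is a minimal, illustrative sketch (the <code>local[2]</code> master and the toy data are assumptions for the example, not part of the API):</p><pre> |
| import org.apache.spark.{SparkConf, SparkContext} |
| // A minimal sketch: the local master and toy data are illustrative only. |
| val sc = new SparkContext(new SparkConf().setAppName("example").setMaster("local[2]")) |
| val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3))) |
| // groupByKey and reduceByKey come from PairRDDFunctions via implicit conversion. |
| val grouped = pairs.groupByKey() |
| val summed = pairs.reduceByKey(_ + _) |
| summed.collect().foreach(println) |
| sc.stop() |
| </pre><p>The conversions are applied automatically, so no extra import is needed for the pair operations. |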
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="../index.html" name="org.apache" id="org.apache" class="extype">apache</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.api" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="api" class="anchorToMember"></a><a id="api:api" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/api/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="api/index.html" title=""><span class="name">api</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.broadcast" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="broadcast" class="anchorToMember"></a><a id="broadcast:broadcast" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/broadcast/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="broadcast/index.html" title="Spark's broadcast variables, used to broadcast immutable datasets to all nodes."><span class="name">broadcast</span></a></span><p class="shortcomment cmt">Spark's broadcast variables, used to broadcast immutable datasets to all nodes.</p><div class="fullcomment"><div class="comment cmt"><p>Spark's broadcast variables, used to broadcast immutable datasets to all nodes. |
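| </p><p>As a brief sketch of the pattern (assuming an existing SparkContext named <code>sc</code>; the lookup table is made up for illustration):</p><pre> |
| // Assumes an existing SparkContext `sc`; the lookup table is illustrative. |
| val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2)) |
| val ids = sc.parallelize(Seq("a", "b", "a")) |
| // Tasks read the broadcast value instead of shipping the map with each closure. |
| val resolved = ids.map(k => lookup.value.getOrElse(k, -1)) |
| </pre><p>Because the value is shipped to each node only once, broadcasting suits datasets that many tasks read but never modify. |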
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.graphx" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="graphx" class="anchorToMember"></a><a id="graphx:graphx" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/graphx/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="graphx/index.html" title="ALPHA COMPONENT GraphX is a graph processing framework built on top of Spark."><span class="name">graphx</span></a></span><p class="shortcomment cmt"><span class="badge" style="float: right;">ALPHA COMPONENT</span> |
| GraphX is a graph processing framework built on top of Spark.</p><div class="fullcomment"><div class="comment cmt"><p><span class="badge" style="float: right;">ALPHA COMPONENT</span> |
| GraphX is a graph processing framework built on top of Spark. |
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.input" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="input" class="anchorToMember"></a><a id="input:input" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/input/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="input/index.html" title=""><span class="name">input</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.io" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="io" class="anchorToMember"></a><a id="io:io" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/io/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="io/index.html" title="IO codecs used for compression."><span class="name">io</span></a></span><p class="shortcomment cmt">IO codecs used for compression.</p><div class="fullcomment"><div class="comment cmt"><p>IO codecs used for compression. See <a href="io/CompressionCodec.html" name="org.apache.spark.io.CompressionCodec" id="org.apache.spark.io.CompressionCodec" class="extype">org.apache.spark.io.CompressionCodec</a>. |
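| </p><p>Codec selection is configuration-driven; a small sketch (choosing <code>zstd</code> is just an example of one built-in codec name):</p><pre> |
| import org.apache.spark.SparkConf |
| // Selecting a compression codec; "zstd" is one of the built-in short names. |
| val conf = new SparkConf() |
|   .setAppName("codec-example") |
|   .set("spark.io.compression.codec", "zstd") |
| </pre><p>This setting governs compression of internal data such as shuffle output and broadcast variables; it does not affect how user input files are read. |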
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.launcher" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="launcher" class="anchorToMember"></a><a id="launcher:launcher" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/launcher/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="launcher/index.html" title=""><span class="name">launcher</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.mapred" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="mapred" class="anchorToMember"></a><a id="mapred:mapred" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/mapred/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="mapred/index.html" title=""><span class="name">mapred</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.metrics" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="metrics" class="anchorToMember"></a><a id="metrics:metrics" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/metrics/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="metrics/index.html" title=""><span class="name">metrics</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.ml" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="ml" class="anchorToMember"></a><a id="ml:ml" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/ml/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="ml/index.html" title="DataFrame-based machine learning APIs to let users quickly assemble and configure practical machine learning pipelines."><span class="name">ml</span></a></span><p class="shortcomment cmt">DataFrame-based machine learning APIs to let users quickly assemble and configure practical |
| machine learning pipelines.</p><div class="fullcomment"><div class="comment cmt"><p>DataFrame-based machine learning APIs to let users quickly assemble and configure practical |
| machine learning pipelines. |
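| </p><p>A sketch of assembling such a pipeline (the column names and parameter values are illustrative):</p><pre> |
| import org.apache.spark.ml.Pipeline |
| import org.apache.spark.ml.classification.LogisticRegression |
| import org.apache.spark.ml.feature.{HashingTF, Tokenizer} |
| // A sketch; column names and parameter values are illustrative. |
| val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words") |
| val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features") |
| val lr = new LogisticRegression().setMaxIter(10) |
| val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr)) |
| </pre><p>Fitting the pipeline on a DataFrame with <code>text</code> and <code>label</code> columns yields a single reusable PipelineModel. |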
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.mllib" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="mllib" class="anchorToMember"></a><a id="mllib:mllib" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/mllib/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="mllib/index.html" title="RDD-based machine learning APIs (in maintenance mode)."><span class="name">mllib</span></a></span><p class="shortcomment cmt">RDD-based machine learning APIs (in maintenance mode).</p><div class="fullcomment"><div class="comment cmt"><p>RDD-based machine learning APIs (in maintenance mode).</p><p>The <code>spark.mllib</code> package is in maintenance mode as of the Spark 2.0.0 release to encourage |
| migration to the DataFrame-based APIs under the <a href="ml/index.html" name="org.apache.spark.ml" id="org.apache.spark.ml" class="extype">org.apache.spark.ml</a> package. |
| While in maintenance mode,</p><ul><li>no new features in the RDD-based <code>spark.mllib</code> package will be accepted, unless they block |
| implementing new features in the DataFrame-based <code>spark.ml</code> package;</li><li>bug fixes in the RDD-based APIs will still be accepted.</li></ul><p>The developers will continue adding more features to the DataFrame-based APIs in the 2.x series |
| to reach feature parity with the RDD-based APIs. |
| Once we reach feature parity, this package will be deprecated. |
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd><dt>See also</dt><dd><span class="cmt"><p><a href="https://issues.apache.org/jira/browse/SPARK-4591">SPARK-4591</a> to track |
| the progress of feature parity</p></span></dd></dl></div></li><li class="indented4 " name="org.apache.spark.partial" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="partial" class="anchorToMember"></a><a id="partial:partial" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/partial/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="partial/index.html" title="Support for approximate results."><span class="name">partial</span></a></span><p class="shortcomment cmt">Support for approximate results.</p><div class="fullcomment"><div class="comment cmt"><p>Support for approximate results. This provides a convenient API and implementations for |
| approximate calculations. |
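| </p><p>For instance, a sketch using <code>countApprox</code> (assuming an existing RDD named <code>rdd</code>):</p><pre> |
| // Assumes an existing RDD `rdd`; wait at most 1000 ms for the count. |
| val approx = rdd.countApprox(timeout = 1000L, confidence = 0.95) |
| // initialValue is available immediately; getFinalValue() blocks until ready. |
| println(approx.initialValue) |
| </pre><p>The returned PartialResult refines its estimate as more partitions complete. |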
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd><dt>See also</dt><dd><span class="cmt"><p><a href="rdd/RDD.html#countApprox(timeout:Long,confidence:Double):org.apache.spark.partial.PartialResult[org.apache.spark.partial.BoundedDouble]" name="org.apache.spark.rdd.RDD#countApprox" id="org.apache.spark.rdd.RDD#countApprox" class="extmbr">org.apache.spark.rdd.RDD.countApprox</a></p></span></dd></dl></div></li><li class="indented4 " name="org.apache.spark.paths" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="paths" class="anchorToMember"></a><a id="paths:paths" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/paths/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="paths/index.html" title=""><span class="name">paths</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.rdd" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="rdd" class="anchorToMember"></a><a id="rdd:rdd" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/rdd/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="rdd/index.html" title="Provides several RDD implementations."><span class="name">rdd</span></a></span><p class="shortcomment cmt">Provides several RDD implementations.</p><div class="fullcomment"><div class="comment cmt"><p>Provides several RDD implementations. See <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">org.apache.spark.rdd.RDD</a>. |
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.resource" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="resource" class="anchorToMember"></a><a id="resource:resource" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/resource/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="resource/index.html" title=""><span class="name">resource</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.scheduler" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="scheduler" class="anchorToMember"></a><a id="scheduler:scheduler" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/scheduler/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="scheduler/index.html" title="Spark's scheduling components."><span class="name">scheduler</span></a></span><p class="shortcomment cmt">Spark's scheduling components.</p><div class="fullcomment"><div class="comment cmt"><p>Spark's scheduling components. This includes the <code>org.apache.spark.scheduler.DAGScheduler</code> and |
| lower-level <code>org.apache.spark.scheduler.TaskScheduler</code>. |
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.security" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="security" class="anchorToMember"></a><a id="security:security" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/security/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="security/index.html" title=""><span class="name">security</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.serializer" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="serializer" class="anchorToMember"></a><a id="serializer:serializer" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/serializer/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="serializer/index.html" title="Pluggable serializers for RDD and shuffle data."><span class="name">serializer</span></a></span><p class="shortcomment cmt">Pluggable serializers for RDD and shuffle data.</p><div class="fullcomment"><div class="comment cmt"><p>Pluggable serializers for RDD and shuffle data. |
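| </p><p>Switching serializers is a configuration change; a sketch using Kryo (<code>MyRecord</code> is a hypothetical user type):</p><pre> |
| import org.apache.spark.SparkConf |
| // MyRecord is a hypothetical user type; registering classes with Kryo is |
| // optional but improves performance. |
| case class MyRecord(id: Long, name: String) |
| val conf = new SparkConf() |
|   .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") |
|   .registerKryoClasses(Array(classOf[MyRecord])) |
| </pre><p>Kryo is typically faster and more compact than Java serialization, at the cost of registering classes up front. |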
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd><dt>See also</dt><dd><span class="cmt"><p><a href="serializer/Serializer.html" name="org.apache.spark.serializer.Serializer" id="org.apache.spark.serializer.Serializer" class="extype">org.apache.spark.serializer.Serializer</a></p></span></dd></dl></div></li><li class="indented4 " name="org.apache.spark.shuffle" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="shuffle" class="anchorToMember"></a><a id="shuffle:shuffle" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/shuffle/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="shuffle/index.html" title=""><span class="name">shuffle</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.sql" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="sql" class="anchorToMember"></a><a id="sql:sql" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/sql/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="sql/index.html" title="Allows the execution of relational queries, including those expressed in SQL using Spark."><span class="name">sql</span></a></span><p class="shortcomment cmt">Allows the execution of relational queries, including those expressed in SQL using Spark.</p><div class="fullcomment"><div class="comment cmt"><p>Allows the execution of relational queries, including those expressed in SQL using Spark. |
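| </p><p>A minimal sketch (the temporary view name and data are illustrative):</p><pre> |
| import org.apache.spark.sql.SparkSession |
| // A minimal sketch; the temp view and data are illustrative. |
| val spark = SparkSession.builder().appName("sql-demo").master("local[*]").getOrCreate() |
| val df = spark.range(10).toDF("id") |
| df.createOrReplaceTempView("nums") |
| spark.sql("SELECT sum(id) FROM nums").show() |
| </pre><p>SQL queries return DataFrames, so the relational and programmatic APIs compose freely. |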
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.status" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="status" class="anchorToMember"></a><a id="status:status" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/status/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="status/index.html" title=""><span class="name">status</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.storage" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="storage" class="anchorToMember"></a><a id="storage:storage" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/storage/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="storage/index.html" title=""><span class="name">storage</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.streaming" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="streaming" class="anchorToMember"></a><a id="streaming:streaming" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/streaming/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="streaming/index.html" title="Spark Streaming functionality."><span class="name">streaming</span></a></span><p class="shortcomment cmt">Spark Streaming functionality.</p><div class="fullcomment"><div class="comment cmt"><p>Spark Streaming functionality. <a href="streaming/StreamingContext.html" name="org.apache.spark.streaming.StreamingContext" id="org.apache.spark.streaming.StreamingContext" class="extype">org.apache.spark.streaming.StreamingContext</a> serves as the main |
| entry point to Spark Streaming, while <a href="streaming/dstream/DStream.html" name="org.apache.spark.streaming.dstream.DStream" id="org.apache.spark.streaming.dstream.DStream" class="extype">org.apache.spark.streaming.dstream.DStream</a> is the data |
| type representing a continuous sequence of RDDs that together form a continuous stream of data.</p><p>In addition, <a href="streaming/dstream/PairDStreamFunctions.html" name="org.apache.spark.streaming.dstream.PairDStreamFunctions" id="org.apache.spark.streaming.dstream.PairDStreamFunctions" class="extype">org.apache.spark.streaming.dstream.PairDStreamFunctions</a> contains operations |
| available only on DStreams |
| of key-value pairs, such as <code>groupByKey</code> and <code>reduceByKey</code>. These operations are automatically |
| available on any DStream of the right type (e.g. DStream[(Int, Int)]) through implicit |
| conversions.</p><p>For the Java API of Spark Streaming, take a look at the |
| <a href="streaming/api/java/JavaStreamingContext.html" name="org.apache.spark.streaming.api.java.JavaStreamingContext" id="org.apache.spark.streaming.api.java.JavaStreamingContext" class="extype">org.apache.spark.streaming.api.java.JavaStreamingContext</a> which serves as the entry point, and |
| the <a href="streaming/api/java/JavaDStream.html" name="org.apache.spark.streaming.api.java.JavaDStream" id="org.apache.spark.streaming.api.java.JavaDStream" class="extype">org.apache.spark.streaming.api.java.JavaDStream</a> and the |
| <a href="streaming/api/java/JavaPairDStream.html" name="org.apache.spark.streaming.api.java.JavaPairDStream" id="org.apache.spark.streaming.api.java.JavaPairDStream" class="extype">org.apache.spark.streaming.api.java.JavaPairDStream</a> which have the DStream functionality. |
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.types" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="types" class="anchorToMember"></a><a id="types:types" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/types/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="types/index.html" title=""><span class="name">types</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.ui" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="ui" class="anchorToMember"></a><a id="ui:ui" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/ui/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="ui/index.html" title=""><span class="name">ui</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.unsafe" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="unsafe" class="anchorToMember"></a><a id="unsafe:unsafe" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/unsafe/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="unsafe/index.html" title=""><span class="name">unsafe</span></a></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="indented4 " name="org.apache.spark.util" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="util" class="anchorToMember"></a><a id="util:util" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/util/index.html" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">package</span></span> <span class="symbol"><a href="util/index.html" title="Spark utilities."><span class="name">util</span></a></span><p class="shortcomment cmt">Spark utilities.</p><div class="fullcomment"><div class="comment cmt"><p>Spark utilities. |
| </p></div><dl class="attributes block"><dt>Definition Classes</dt><dd><a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></dd></dl></div></li><li class="current-entities indented3"><span class="separator"></span> <a href="Aggregator.html" title=":: DeveloperApi :: A set of functions used to aggregate data." class="class"></a><a href="Aggregator.html" title=":: DeveloperApi :: A set of functions used to aggregate data.">Aggregator</a></li><li class="current-entities indented3"><a href="BarrierTaskContext$.html" title="" class="object"></a> <a href="BarrierTaskContext.html" title=":: Experimental :: A TaskContext with extra contextual info and tooling for tasks in a barrier stage." class="class"></a><a href="BarrierTaskContext.html" title=":: Experimental :: A TaskContext with extra contextual info and tooling for tasks in a barrier stage.">BarrierTaskContext</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="BarrierTaskInfo.html" title=":: Experimental :: Carries all task infos of a barrier task." class="class"></a><a href="BarrierTaskInfo.html" title=":: Experimental :: Carries all task infos of a barrier task.">BarrierTaskInfo</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="ComplexFutureAction.html" title="A FutureAction for actions that could trigger multiple Spark jobs." class="class"></a><a href="ComplexFutureAction.html" title="A FutureAction for actions that could trigger multiple Spark jobs.">ComplexFutureAction</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="ContextAwareIterator.html" title=":: DeveloperApi :: A TaskContext aware iterator." class="class"></a><a href="ContextAwareIterator.html" title=":: DeveloperApi :: A TaskContext aware iterator.">ContextAwareIterator</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="Dependency.html" title=":: DeveloperApi :: Base class for dependencies." class="class"></a><a href="Dependency.html" title=":: DeveloperApi :: Base class for dependencies.">Dependency</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="ErrorClassesJsonReader.html" title="A reader to load error information from one or more JSON files." class="class"></a><a href="ErrorClassesJsonReader.html" title="A reader to load error information from one or more JSON files.">ErrorClassesJsonReader</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="ExceptionFailure.html" title=":: DeveloperApi :: Task failed due to a runtime exception." class="class"></a><a href="ExceptionFailure.html" title=":: DeveloperApi :: Task failed due to a runtime exception.">ExceptionFailure</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="ExecutorLostFailure.html" title=":: DeveloperApi :: The task failed because the executor that it was running on was lost." class="class"></a><a href="ExecutorLostFailure.html" title=":: DeveloperApi :: The task failed because the executor that it was running on was lost.">ExecutorLostFailure</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="FetchFailed.html" title=":: DeveloperApi :: Task failed to fetch shuffle data from a remote node." 
class="class"></a><a href="FetchFailed.html" title=":: DeveloperApi :: Task failed to fetch shuffle data from a remote node.">FetchFailed</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="FutureAction.html" title="A future for the result of an action to support cancellation." class="trait"></a><a href="FutureAction.html" title="A future for the result of an action to support cancellation.">FutureAction</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="HashPartitioner.html" title="A org.apache.spark.Partitioner that implements hash-based partitioning using Java's Object.hashCode." class="class"></a><a href="HashPartitioner.html" title="A org.apache.spark.Partitioner that implements hash-based partitioning using Java's Object.hashCode.">HashPartitioner</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="InterruptibleIterator.html" title=":: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality." class="class"></a><a href="InterruptibleIterator.html" title=":: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.">InterruptibleIterator</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="JobExecutionStatus.html" title="" class="class"></a><a href="JobExecutionStatus.html" title="">JobExecutionStatus</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="JobSubmitter.html" title="Handle via which a "run" function passed to a ComplexFutureAction can submit jobs for execution." class="trait"></a><a href="JobSubmitter.html" title="Handle via which a "run" function passed to a ComplexFutureAction can submit jobs for execution.">JobSubmitter</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="NarrowDependency.html" title=":: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD." class="class"></a><a href="NarrowDependency.html" title=":: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.">NarrowDependency</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="OneToOneDependency.html" title=":: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs." class="class"></a><a href="OneToOneDependency.html" title=":: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.">OneToOneDependency</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="Partition.html" title="An identifier for a partition in an RDD." class="trait"></a><a href="Partition.html" title="An identifier for a partition in an RDD.">Partition</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="PartitionEvaluator.html" title="An evaluator for computing RDD partitions." class="trait"></a><a href="PartitionEvaluator.html" title="An evaluator for computing RDD partitions.">PartitionEvaluator</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="PartitionEvaluatorFactory.html" title="A factory to create PartitionEvaluator." 
class="trait"></a><a href="PartitionEvaluatorFactory.html" title="A factory to create PartitionEvaluator.">PartitionEvaluatorFactory</a></li><li class="current-entities indented3"><a href="Partitioner$.html" title="" class="object"></a> <a href="Partitioner.html" title="An object that defines how the elements in a key-value pair RDD are partitioned by key." class="class"></a><a href="Partitioner.html" title="An object that defines how the elements in a key-value pair RDD are partitioned by key.">Partitioner</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="QueryContext.html" title="Query context of a SparkThrowable." class="trait"></a><a href="QueryContext.html" title="Query context of a SparkThrowable.">QueryContext</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="QueryContextType.html" title="The type of QueryContext." class="class"></a><a href="QueryContextType.html" title="The type of QueryContext.">QueryContextType</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="RangeDependency.html" title=":: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs." class="class"></a><a href="RangeDependency.html" title=":: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.">RangeDependency</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="RangePartitioner.html" title="A org.apache.spark.Partitioner that partitions sortable records by range into roughly equal ranges." class="class"></a><a href="RangePartitioner.html" title="A org.apache.spark.Partitioner that partitions sortable records by range into roughly equal ranges.">RangePartitioner</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="Resubmitted$.html" title=":: DeveloperApi :: A org.apache.spark.scheduler.ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed." class="object"></a><a href="Resubmitted$.html" title=":: DeveloperApi :: A org.apache.spark.scheduler.ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.">Resubmitted</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SerializableWritable.html" title="" class="class"></a><a href="SerializableWritable.html" title="">SerializableWritable</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="ShuffleDependency.html" title=":: DeveloperApi :: Represents a dependency on the output of a shuffle stage." class="class"></a><a href="ShuffleDependency.html" title=":: DeveloperApi :: Represents a dependency on the output of a shuffle stage.">ShuffleDependency</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SimpleFutureAction.html" title="A FutureAction holding the result of an action that triggers a single job." class="class"></a><a href="SimpleFutureAction.html" title="A FutureAction holding the result of an action that triggers a single job.">SimpleFutureAction</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkConf.html" title="Configuration for a Spark application." 
class="class"></a><a href="SparkConf.html" title="Configuration for a Spark application.">SparkConf</a></li><li class="current-entities indented3"><a href="SparkContext$.html" title="The SparkContext object contains a number of implicit conversions and parameters for use with various Spark features." class="object"></a> <a href="" title="Main entry point for Spark functionality." class="class"></a><a href="" title="Main entry point for Spark functionality.">SparkContext</a></li><li class="current-entities indented3"><a href="SparkEnv$.html" title="" class="object"></a> <a href="SparkEnv.html" title=":: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, RpcEnv, block manager, map output tracker, etc." class="class"></a><a href="SparkEnv.html" title=":: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, RpcEnv, block manager, map output tracker, etc.">SparkEnv</a></li><li class="current-entities indented3"><a href="SparkException$.html" title="" class="object"></a> <a href="SparkException.html" title="" class="class"></a><a href="SparkException.html" title="">SparkException</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkExecutorInfo.html" title="Exposes information about Spark Executors." class="trait"></a><a href="SparkExecutorInfo.html" title="Exposes information about Spark Executors.">SparkExecutorInfo</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkFiles$.html" title="Resolves paths to files added through SparkContext.addFile()." class="object"></a><a href="SparkFiles$.html" title="Resolves paths to files added through SparkContext.addFile().">SparkFiles</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkFirehoseListener.html" title="Class that allows users to receive all SparkListener events." class="class"></a><a href="SparkFirehoseListener.html" title="Class that allows users to receive all SparkListener events.">SparkFirehoseListener</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkJobInfo.html" title="Exposes information about Spark Jobs." class="trait"></a><a href="SparkJobInfo.html" title="Exposes information about Spark Jobs.">SparkJobInfo</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkStageInfo.html" title="Exposes information about Spark Stages." class="trait"></a><a href="SparkStageInfo.html" title="Exposes information about Spark Stages.">SparkStageInfo</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkStatusTracker.html" title="Low-level status reporting APIs for monitoring job and stage progress." class="class"></a><a href="SparkStatusTracker.html" title="Low-level status reporting APIs for monitoring job and stage progress.">SparkStatusTracker</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="SparkThrowable.html" title="Interface mixed into Throwables thrown from Spark." class="trait"></a><a href="SparkThrowable.html" title="Interface mixed into Throwables thrown from Spark.">SparkThrowable</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="Success$.html" title=":: DeveloperApi :: Task succeeded." 
class="object"></a><a href="Success$.html" title=":: DeveloperApi :: Task succeeded.">Success</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="TaskCommitDenied.html" title=":: DeveloperApi :: Task requested the driver to commit, but was denied." class="class"></a><a href="TaskCommitDenied.html" title=":: DeveloperApi :: Task requested the driver to commit, but was denied.">TaskCommitDenied</a></li><li class="current-entities indented3"><a href="TaskContext$.html" title="" class="object"></a> <a href="TaskContext.html" title="Contextual information about a task which can be read or mutated during execution." class="class"></a><a href="TaskContext.html" title="Contextual information about a task which can be read or mutated during execution.">TaskContext</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="TaskEndReason.html" title=":: DeveloperApi :: Various possible reasons why a task ended." class="trait"></a><a href="TaskEndReason.html" title=":: DeveloperApi :: Various possible reasons why a task ended.">TaskEndReason</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="TaskFailedReason.html" title=":: DeveloperApi :: Various possible reasons why a task failed." class="trait"></a><a href="TaskFailedReason.html" title=":: DeveloperApi :: Various possible reasons why a task failed.">TaskFailedReason</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="TaskKilled.html" title=":: DeveloperApi :: Task was killed intentionally and needs to be rescheduled." class="class"></a><a href="TaskKilled.html" title=":: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.">TaskKilled</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="TaskKilledException.html" title=":: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected)." class="class"></a><a href="TaskKilledException.html" title=":: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).">TaskKilledException</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="TaskResultLost$.html" title=":: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched." class="object"></a><a href="TaskResultLost$.html" title=":: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.">TaskResultLost</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="UnknownReason$.html" title=":: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result." 
class="object"></a><a href="UnknownReason$.html" title=":: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.">UnknownReason</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="WritableConverter$.html" title="" class="object"></a><a href="WritableConverter$.html" title="">WritableConverter</a></li><li class="current-entities indented3"><span class="separator"></span> <a href="WritableFactory$.html" title="" class="object"></a><a href="WritableFactory$.html" title="">WritableFactory</a></li></ul></div></div><div id="content"><body class="class type"><div id="definition"><a href="SparkContext$.html" title="See companion object"><div class="big-circle class-companion-object">c</div></a><p id="owner"><a href="../../index.html" name="org" id="org" class="extype">org</a>.<a href="../index.html" name="org.apache" id="org.apache" class="extype">apache</a>.<a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a></p><h1><a href="SparkContext$.html" title="See companion object">SparkContext</a><span class="permalink"><a href="../../../org/apache/spark/SparkContext.html" title="Permalink"><i class="material-icons"></i></a></span></h1><h3><span class="morelinks"><div>Companion <a href="SparkContext$.html" title="See companion object">object SparkContext</a></div></span></h3></div><h4 id="signature" class="signature"><span class="modifier_kind"><span class="modifier"></span> <span class="kind">class</span></span> <span class="symbol"><span class="name">SparkContext</span><span class="result"> extends <span name="org.apache.spark.internal.Logging" class="extype">Logging</span></span></span></h4><div id="comment" class="fullcommenttop"><div class="comment cmt"><p>Main entry point for Spark functionality. A SparkContext represents the connection to a Spark |
| cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster. |
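| </p><p>A sketch of the basic lifecycle (the local master and toy data are illustrative), consistent with the note below about a single active context:</p><pre> |
| import org.apache.spark.{SparkConf, SparkContext} |
| // A sketch of the lifecycle; the local master and toy data are illustrative. |
| val conf = new SparkConf().setAppName("demo").setMaster("local[*]") |
| val sc = new SparkContext(conf) |
| try { |
|   val rdd = sc.parallelize(1 to 100) |
|   println(rdd.reduce(_ + _)) |
| } finally { |
|   sc.stop() // only one SparkContext may be active per JVM |
| } |
| </pre><p>Wrapping the work in try/finally ensures stop() runs even if a job fails. |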
| </p></div><dl class="attributes block"><dt>Source</dt><dd><a href="https://github.com/apache/spark/tree/v4.0.0-preview1/core/src/main/scala/org/apache/spark/SparkContext.scala" target="_blank">SparkContext.scala</a></dd><dt>Note</dt><dd><span class="cmt"><p>Only one <code>SparkContext</code> should be active per JVM. You must <code>stop()</code> the |
| active <code>SparkContext</code> before creating a new one.</p></span></dd></dl><div class="toggleContainer"><div class="toggle block"><span>Linear Supertypes</span><div class="superTypes hiddenContent"><span name="org.apache.spark.internal.Logging" class="extype">Logging</span>, <span name="scala.AnyRef" class="extype">AnyRef</span>, <span name="scala.Any" class="extype">Any</span></div></div></div></div><div id="mbrsel"><div class="toggle"></div><div id="memberfilter"><i class="material-icons arrow"></i><span class="input"><input placeholder="Filter all members" id="mbrsel-input" type="text" accesskey="/"/></span><i class="clear material-icons"></i></div><div id="filterby"><div id="order"><span class="filtertype">Ordering</span><ol><li class="alpha in"><span>Alphabetic</span></li><li class="inherit out"><span>By Inheritance</span></li></ol></div><div class="ancestors"><span class="filtertype">Inherited<br/></span><ol id="linearization"><li class="in" name="org.apache.spark.SparkContext"><span>SparkContext</span></li><li class="in" name="org.apache.spark.internal.Logging"><span>Logging</span></li><li class="in" name="scala.AnyRef"><span>AnyRef</span></li><li class="in" name="scala.Any"><span>Any</span></li></ol></div><div class="ancestors"><span class="filtertype"></span><ol><li class="hideall out"><span>Hide All</span></li><li class="showall in"><span>Show All</span></li></ol></div><div id="visbl"><span class="filtertype">Visibility</span><ol><li class="public in"><span>Public</span></li><li class="protected out"><span>Protected</span></li></ol></div></div></div><div id="template"><div id="allMembers"><div id="constructors" class="members"><h3>Instance Constructors</h3><ol><li class="indented0 " name="org.apache.spark.SparkContext#<init>" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="<init>(master:String,appName:String,sparkHome:String,jars:Seq[String],environment:scala.collection.Map[String,String]):org.apache.spark.SparkContext" class="anchorToMember"></a><a id="<init>:SparkContext" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#<init>(master:String,appName:String,sparkHome:String,jars:Seq[String],environment:scala.collection.Map[String,String]):org.apache.spark.SparkContext" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">new</span></span> <span class="symbol"><span class="name">SparkContext</span><span class="params">(<span name="master">master: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="appName">appName: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="sparkHome">sparkHome: <span name="scala.Predef.String" class="extype">String</span> = <span class="symbol">null</span></span>, <span name="jars">jars: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>] = <span class="symbol">Nil</span></span>, <span name="environment">environment: <span name="scala.collection.Map" class="extype">Map</span>[<span name="scala.Predef.String" class="extype">String</span>, <span name="scala.Predef.String" class="extype">String</span>] = <span class="symbol">Map()</span></span>)</span></span><p class="shortcomment cmt">Alternative constructor that allows setting common Spark properties directly |
| </p><div class="fullcomment"><div class="comment cmt"><p>Alternative constructor that allows setting common Spark properties directly |
| </p></div><dl class="paramcmts block"><dt class="param">master</dt><dd class="cmt"><p>Cluster URL to connect to (e.g. spark://host:port, local[4]).</p></dd><dt class="param">appName</dt><dd class="cmt"><p>A name for your application, to display on the cluster web UI.</p></dd><dt class="param">sparkHome</dt><dd class="cmt"><p>Location where Spark is installed on cluster nodes.</p></dd><dt class="param">jars</dt><dd class="cmt"><p>Collection of JARs to send to the cluster. These can be paths on the local file |
| system or HDFS, HTTP, HTTPS, or FTP URLs.</p></dd><dt class="param">environment</dt><dd class="cmt"><p>Environment variables to set on worker nodes.</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#<init>" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="<init>(master:String,appName:String,conf:org.apache.spark.SparkConf):org.apache.spark.SparkContext" class="anchorToMember"></a><a id="<init>:SparkContext" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#<init>(master:String,appName:String,conf:org.apache.spark.SparkConf):org.apache.spark.SparkContext" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">new</span></span> <span class="symbol"><span class="name">SparkContext</span><span class="params">(<span name="master">master: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="appName">appName: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="conf">conf: <a href="SparkConf.html" name="org.apache.spark.SparkConf" id="org.apache.spark.SparkConf" class="extype">SparkConf</a></span>)</span></span><p class="shortcomment cmt">Alternative constructor that allows setting common Spark properties directly |
| </p><div class="fullcomment"><div class="comment cmt"><p>Alternative constructor that allows setting common Spark properties directly |
| </p></div><dl class="paramcmts block"><dt class="param">master</dt><dd class="cmt"><p>Cluster URL to connect to (e.g. spark://host:port, local[4]).</p></dd><dt class="param">appName</dt><dd class="cmt"><p>A name for your application, to display on the cluster web UI</p></dd><dt class="param">conf</dt><dd class="cmt"><p>a <a href="SparkConf.html" name="org.apache.spark.SparkConf" id="org.apache.spark.SparkConf" class="extype">org.apache.spark.SparkConf</a> object specifying other Spark parameters</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#<init>" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="<init>():org.apache.spark.SparkContext" class="anchorToMember"></a><a id="<init>:SparkContext" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#<init>():org.apache.spark.SparkContext" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">new</span></span> <span class="symbol"><span class="name">SparkContext</span><span class="params">()</span></span><p class="shortcomment cmt">Create a SparkContext that loads settings from system properties (for instance, when |
| launching with ./bin/spark-submit).</p></li><li class="indented0 " name="org.apache.spark.SparkContext#<init>" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="<init>(config:org.apache.spark.SparkConf):org.apache.spark.SparkContext" class="anchorToMember"></a><a id="<init>:SparkContext" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#<init>(config:org.apache.spark.SparkConf):org.apache.spark.SparkContext" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">new</span></span> <span class="symbol"><span class="name">SparkContext</span><span class="params">(<span name="config">config: <a href="SparkConf.html" name="org.apache.spark.SparkConf" id="org.apache.spark.SparkConf" class="extype">SparkConf</a></span>)</span></span><p class="shortcomment cmt"></p><div class="fullcomment"><div class="comment cmt"></div><dl class="paramcmts block"><dt class="param">config</dt><dd class="cmt"><p>a Spark Config object describing the application configuration. Any settings in |
| this config overrides the default configs as well as system properties.</p></dd></dl></div></li></ol></div><div id="types" class="types members"><h3>Type Members</h3><ol><li class="indented0 " name="org.apache.spark.internal.Logging.LogStringContext" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="LogStringContextextendsAnyRef" class="anchorToMember"></a><a id="LogStringContext:LogStringContext" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#LogStringContextextendsAnyRef" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">implicit </span> <span class="kind">class</span></span> <span class="symbol"><span class="name">LogStringContext</span><span class="result"> extends <span name="scala.AnyRef" class="extype">AnyRef</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li></ol></div><div class="values members"><h3>Value Members</h3><ol><li class="indented0 " name="scala.AnyRef#!=" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="!=(x$1:Any):Boolean" class="anchorToMember"></a><a id="!=(Any):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#!=(x$1:Any):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name" title="gt4s: $bang$eq">!=</span><span class="params">(<span name="arg0">arg0: <span name="scala.Any" class="extype">Any</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd></dl></div></li><li class="indented0 " name="scala.AnyRef###" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="##:Int" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html###:Int" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name" title="gt4s: $hash$hash">##</span><span class="result">: <span name="scala.Int" class="extype">Int</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd></dl></div></li><li class="indented0 " name="scala.AnyRef#==" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="==(x$1:Any):Boolean" class="anchorToMember"></a><a id="==(Any):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#==(x$1:Any):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name" title="gt4s: $eq$eq">==</span><span class="params">(<span name="arg0">arg0: <span name="scala.Any" class="extype">Any</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd></dl></div></li><li class="indented0 " 
name="org.apache.spark.SparkContext#addArchive" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="addArchive(path:String):Unit" class="anchorToMember"></a><a id="addArchive(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#addArchive(path:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">addArchive</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">:: Experimental :: |
| Add an archive to be downloaded and unpacked with this Spark job on every node.</p><div class="fullcomment"><div class="comment cmt"><p>:: Experimental :: |
| Add an archive to be downloaded and unpacked with this Spark job on every node.</p><p>If an archive is added during execution, it will not be available until the next TaskSet |
| starts. |
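| </p><p>For example, a sketch assuming an active context <code>sc</code> and a hypothetical archive <code>/tmp/deps.zip</code>: |
| </p><pre>sc.addArchive(<span class="lit">"/tmp/deps.zip"</span>) |
| <span class="cmt">// On executors, locate the unpacked directory by the archive's file name:</span> |
| <span class="kw">val</span> depsDir = SparkFiles.get(<span class="lit">"deps.zip"</span>)</pre><p> |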
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>can be either a local file, a file in HDFS (or other Hadoop-supported |
| filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs, |
| use <code>SparkFiles.get(paths-to-files)</code> to find its download/unpacked location. |
| The given path should be one of .zip, .tar, .tar.gz, .tgz and .jar.</p></dd></dl><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@Experimental</span><span class="args">()</span> </dd><dt>Since</dt><dd><p>3.1.0</p></dd><dt>Note</dt><dd><span class="cmt"><p>A path can be added only once. Subsequent additions of the same path are ignored.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#addFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="addFile(path:String,recursive:Boolean):Unit" class="anchorToMember"></a><a id="addFile(String,Boolean):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#addFile(path:String,recursive:Boolean):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">addFile</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="recursive">recursive: <span name="scala.Boolean" class="extype">Boolean</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Add a file to be downloaded with this Spark job on every node.</p><div class="fullcomment"><div class="comment cmt"><p>Add a file to be downloaded with this Spark job on every node.</p><p>If a file is added during execution, it will not be available until the next TaskSet starts. |
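| </p><p>For example, a sketch adding a whole directory (hypothetical HDFS path) by passing <code>recursive = true</code>: |
| </p><pre>sc.addFile(<span class="lit">"hdfs://namenode/conf-dir"</span>, recursive = <span class="kw">true</span>)</pre><p> |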
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>can be either a local file, a file in HDFS (or other Hadoop-supported |
| filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs, |
| use <code>SparkFiles.get(fileName)</code> to find its download location.</p></dd><dt class="param">recursive</dt><dd class="cmt"><p>if true, a directory can be given in <code>path</code>. Currently directories are |
| only supported for Hadoop-supported filesystems.</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>A path can be added only once. Subsequent additions of the same path are ignored.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#addFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="addFile(path:String):Unit" class="anchorToMember"></a><a id="addFile(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#addFile(path:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">addFile</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Add a file to be downloaded with this Spark job on every node.</p><div class="fullcomment"><div class="comment cmt"><p>Add a file to be downloaded with this Spark job on every node.</p><p>If a file is added during execution, it will not be available until the next TaskSet starts. |
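| </p><p>For example, a sketch assuming an active context <code>sc</code> and a hypothetical file <code>/tmp/lookup.txt</code>: |
| </p><pre>sc.addFile(<span class="lit">"/tmp/lookup.txt"</span>) |
| <span class="cmt">// On executors, resolve the downloaded copy by its file name:</span> |
| <span class="kw">val</span> localPath = SparkFiles.get(<span class="lit">"lookup.txt"</span>)</pre><p> |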
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>can be either a local file, a file in HDFS (or other Hadoop-supported |
| filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs, |
| use <code>SparkFiles.get(fileName)</code> to find its download location.</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>A path can be added only once. Subsequent additions of the same path are ignored.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#addJar" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="addJar(path:String):Unit" class="anchorToMember"></a><a id="addJar(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#addJar(path:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">addJar</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Adds a JAR dependency for all tasks to be executed on this <code>SparkContext</code> in the future.</p><div class="fullcomment"><div class="comment cmt"><p>Adds a JAR dependency for all tasks to be executed on this <code>SparkContext</code> in the future.</p><p>If a jar is added during execution, it will not be available until the next TaskSet starts. |
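| </p><p>For example, a sketch with hypothetical jar paths: |
| </p><pre>sc.addJar(<span class="lit">"/opt/libs/my-udfs.jar"</span>) <span class="cmt">// shipped to executors</span> |
| sc.addJar(<span class="lit">"local:/opt/libs/preinstalled.jar"</span>) <span class="cmt">// already present on every worker</span></pre><p> |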
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>can be either a local file, a file in HDFS (or other Hadoop-supported filesystems), |
| an HTTP, HTTPS or FTP URI, or local:/path for a file on every worker node.</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>A path can be added only once. Subsequent additions of the same path are ignored.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#addJobTag" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="addJobTag(tag:String):Unit" class="anchorToMember"></a><a id="addJobTag(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#addJobTag(tag:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">addJobTag</span><span class="params">(<span name="tag">tag: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Add a tag to be assigned to all the jobs started by this thread.</p><div class="fullcomment"><div class="comment cmt"><p>Add a tag to be assigned to all the jobs started by this thread.</p><p>Often, a unit of execution in an application consists of multiple Spark actions or jobs. |
| Application programmers can use this method to group all those jobs together and give them a |
| group tag. The application can then use <code>org.apache.spark.SparkContext.cancelJobsWithTag</code> to cancel |
| all running jobs with this tag. For example:</p><pre><span class="cmt">// In the main thread:</span> |
| sc.addJobTag(<span class="lit">"myjobs"</span>) |
| sc.parallelize(<span class="num">1</span> to <span class="num">10000</span>, <span class="num">2</span>).map { i <span class="kw">=></span> Thread.sleep(<span class="num">10</span>); i }.count() |
| |
| <span class="cmt">// In a separate thread:</span> |
| spark.cancelJobsWithTag(<span class="lit">"myjobs"</span>)</pre><p>There may be multiple tags present at the same time, so different parts of application may use |
| different tags to perform cancellation at different levels of granularity. |
| </p></div><dl class="paramcmts block"><dt class="param">tag</dt><dd class="cmt"><p>The tag to be added. Cannot contain ',' (comma) character.</p></dd></dl><dl class="attributes block"><dt>Since</dt><dd><p>3.5.0</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#addSparkListener" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="addSparkListener(listener:org.apache.spark.scheduler.SparkListenerInterface):Unit" class="anchorToMember"></a><a id="addSparkListener(SparkListenerInterface):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#addSparkListener(listener:org.apache.spark.scheduler.SparkListenerInterface):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">addSparkListener</span><span class="params">(<span name="listener">listener: <span name="org.apache.spark.scheduler.SparkListenerInterface" class="extype">SparkListenerInterface</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Register a listener to receive up-calls from events that happen during execution.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Register a listener to receive up-calls from events that happen during execution. |
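| </p><p>For example, a minimal sketch (assuming an active context <code>sc</code>) that logs job completions: |
| </p><pre><span class="kw">import</span> org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd} |
| sc.addSparkListener(<span class="kw">new</span> SparkListener { |
|   <span class="kw">override def</span> onJobEnd(jobEnd: SparkListenerJobEnd): Unit = |
|     println(s<span class="lit">"Job ${jobEnd.jobId} finished with ${jobEnd.jobResult}"</span>) |
| })</pre><p> |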
| </p></div><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#appName" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="appName:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#appName:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">appName</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#applicationAttemptId" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="applicationAttemptId:Option[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#applicationAttemptId:Option[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">applicationAttemptId</span><span class="result">: <span name="scala.Option" class="extype">Option</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#applicationId" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="applicationId:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#applicationId:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">applicationId</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span><p class="shortcomment cmt">A unique identifier for the Spark application.</p><div class="fullcomment"><div class="comment cmt"><p>A unique identifier for the Spark application. |
| Its format depends on the scheduler implementation. |
| (e.g. 'local-1433865536131' for a local Spark application, |
| or 'application_1433865536131_34483' when running on YARN). |
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#archives" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="archives:Seq[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#archives:Seq[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">archives</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span></li><li class="indented0 " name="scala.Any#asInstanceOf" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="asInstanceOf[T0]:T0" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#asInstanceOf[T0]:T0" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">asInstanceOf</span><span class="tparams">[<span name="T0">T0</span>]</span><span class="result">: <span name="scala.Any.asInstanceOf.T0" class="extype">T0</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>Any</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#binaryFiles" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="binaryFiles(path:String,minPartitions:Int):org.apache.spark.rdd.RDD[(String,org.apache.spark.input.PortableDataStream)]" class="anchorToMember"></a><a id="binaryFiles(String,Int):RDD[(String,PortableDataStream)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#binaryFiles(path:String,minPartitions:Int):org.apache.spark.rdd.RDD[(String,org.apache.spark.input.PortableDataStream)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">binaryFiles</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="scala.Predef.String" class="extype">String</span>, <a href="input/PortableDataStream.html" name="org.apache.spark.input.PortableDataStream" id="org.apache.spark.input.PortableDataStream" class="extype">PortableDataStream</a>)]</span></span><p class="shortcomment cmt">Get an RDD for a Hadoop-readable dataset as PortableDataStream for each file |
| (useful for binary data)</p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a Hadoop-readable dataset as PortableDataStream for each file |
| (useful for binary data)</p><p>For example, if you have the following files:</p><pre>hdfs:<span class="cmt">//a-hdfs-path/part-00000</span> |
| hdfs:<span class="cmt">//a-hdfs-path/part-00001</span> |
| ... |
| hdfs:<span class="cmt">//a-hdfs-path/part-nnnnn</span></pre><p>Do |
| <code>val rdd = sparkContext.binaryFiles("hdfs://a-hdfs-path")</code>,</p><p>then <code>rdd</code> contains</p><pre>(a-hdfs-path/part-<span class="num">00000</span>, its content) |
| (a-hdfs-path/part-<span class="num">00001</span>, its content) |
| ... |
| (a-hdfs-path/part-nnnnn, its content)</pre></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>Directory to the input data files, the path can be comma separated paths as the |
| list of inputs.</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>A suggestion value of the minimal splitting number for input data.</p></dd><dt>returns</dt><dd class="cmt"><p>RDD representing tuples of file path and corresponding file content</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Small files are preferred; very large files may cause bad performance.</p></span>, <span class="cmt"><p>On some filesystems, <code>.../path/*</code> can be a more efficient way to read all files |
| in a directory rather than <code>.../path/</code> or <code>.../path</code></p></span>, <span class="cmt"><p>Partitioning is determined by data locality. This may result in too few partitions |
| by default.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#binaryRecords" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="binaryRecords(path:String,recordLength:Int,conf:org.apache.hadoop.conf.Configuration):org.apache.spark.rdd.RDD[Array[Byte]]" class="anchorToMember"></a><a id="binaryRecords(String,Int,Configuration):RDD[Array[Byte]]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#binaryRecords(path:String,recordLength:Int,conf:org.apache.hadoop.conf.Configuration):org.apache.spark.rdd.RDD[Array[Byte]]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">binaryRecords</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="recordLength">recordLength: <span name="scala.Int" class="extype">Int</span></span>, <span name="conf">conf: <span name="org.apache.hadoop.conf.Configuration" class="extype">Configuration</span> = <span class="symbol"><span class="name"><a href="#hadoopConfiguration:org.apache.hadoop.conf.Configuration">hadoopConfiguration</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="scala.Array" class="extype">Array</span>[<span name="scala.Byte" class="extype">Byte</span>]]</span></span><p class="shortcomment cmt">Load data from a flat binary file, assuming the length of each record is constant.</p><div class="fullcomment"><div class="comment cmt"><p>Load data from a flat binary file, assuming the length of each record is constant. |
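| </p><p>For example, a sketch reading a hypothetical file of fixed 16-byte records: |
| </p><pre><span class="kw">val</span> records = sc.binaryRecords(<span class="lit">"hdfs://a-hdfs-path/data.bin"</span>, <span class="num">16</span>) |
| <span class="cmt">// each element is an Array[Byte] of length 16</span></pre><p> |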
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>Directory to the input data files, the path can be comma separated paths as the |
| list of inputs.</p></dd><dt class="param">recordLength</dt><dd class="cmt"><p>The length at which to split the records</p></dd><dt class="param">conf</dt><dd class="cmt"><p>Configuration for setting up the dataset.</p></dd><dt>returns</dt><dd class="cmt"><p>An RDD of data with values, represented as byte arrays</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>We ensure that the byte array for each record in the resulting RDD |
| has the provided record length.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#broadcast" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="broadcast[T](value:T)(implicitevidence$9:scala.reflect.ClassTag[T]):org.apache.spark.broadcast.Broadcast[T]" class="anchorToMember"></a><a id="broadcast[T](T)(ClassTag[T]):Broadcast[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#broadcast[T](value:T)(implicitevidence$9:scala.reflect.ClassTag[T]):org.apache.spark.broadcast.Broadcast[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">broadcast</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="value">value: <span name="org.apache.spark.SparkContext.broadcast.T" class="extype">T</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.broadcast.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="broadcast/Broadcast.html" name="org.apache.spark.broadcast.Broadcast" id="org.apache.spark.broadcast.Broadcast" class="extype">Broadcast</a>[<span name="org.apache.spark.SparkContext.broadcast.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Broadcast a read-only variable to the cluster, returning a |
| <a href="broadcast/Broadcast.html" name="org.apache.spark.broadcast.Broadcast" id="org.apache.spark.broadcast.Broadcast" class="extype">org.apache.spark.broadcast.Broadcast</a> object for reading it in distributed functions.</p><div class="fullcomment"><div class="comment cmt"><p>Broadcast a read-only variable to the cluster, returning a |
| <a href="broadcast/Broadcast.html" name="org.apache.spark.broadcast.Broadcast" id="org.apache.spark.broadcast.Broadcast" class="extype">org.apache.spark.broadcast.Broadcast</a> object for reading it in distributed functions. |
| The variable will be sent to each executor only once. |
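| </p><p>For example, a sketch broadcasting a small lookup table instead of capturing it in every task closure: |
| </p><pre><span class="kw">val</span> lookup = sc.broadcast(Map(<span class="lit">"a"</span> -> <span class="num">1</span>, <span class="lit">"b"</span> -> <span class="num">2</span>)) |
| sc.parallelize(Seq(<span class="lit">"a"</span>, <span class="lit">"b"</span>, <span class="lit">"a"</span>)).map(k <span class="kw">=></span> lookup.value(k)).collect()</pre><p> |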
| </p></div><dl class="paramcmts block"><dt class="param">value</dt><dd class="cmt"><p>value to broadcast to the Spark nodes</p></dd><dt>returns</dt><dd class="cmt"><p><code>Broadcast</code> object, a read-only variable cached on each machine</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelAllJobs" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="cancelAllJobs():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelAllJobs():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelAllJobs</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel all jobs that have been scheduled or are running.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelJob(jobId:Int):Unit" class="anchorToMember"></a><a id="cancelJob(Int):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelJob(jobId:Int):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelJob</span><span class="params">(<span name="jobId">jobId: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel a given job if it's scheduled or running.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel a given job if it's scheduled or running. |
| </p></div><dl class="paramcmts block"><dt class="param">jobId</dt><dd class="cmt"><p>the job ID to cancel</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Throws <code>InterruptedException</code> if the cancel message cannot be sent</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelJob(jobId:Int,reason:String):Unit" class="anchorToMember"></a><a id="cancelJob(Int,String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelJob(jobId:Int,reason:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelJob</span><span class="params">(<span name="jobId">jobId: <span name="scala.Int" class="extype">Int</span></span>, <span name="reason">reason: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel a given job if it's scheduled or running.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel a given job if it's scheduled or running. |
| </p></div><dl class="paramcmts block"><dt class="param">jobId</dt><dd class="cmt"><p>the job ID to cancel</p></dd><dt class="param">reason</dt><dd class="cmt"><p>optional reason for cancellation</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Throws <code>InterruptedException</code> if the cancel message cannot be sent</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelJobGroup" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelJobGroup(groupId:String):Unit" class="anchorToMember"></a><a id="cancelJobGroup(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelJobGroup(groupId:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelJobGroup</span><span class="params">(<span name="groupId">groupId: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel active jobs for the specified group.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel active jobs for the specified group. See <code>org.apache.spark.SparkContext.setJobGroup</code> |
| for more information. |
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelJobGroupAndFutureJobs" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelJobGroupAndFutureJobs(groupId:String):Unit" class="anchorToMember"></a><a id="cancelJobGroupAndFutureJobs(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelJobGroupAndFutureJobs(groupId:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelJobGroupAndFutureJobs</span><span class="params">(<span name="groupId">groupId: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel active jobs for the specified group, as well as the future jobs in this job group.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel active jobs for the specified group, as well as the future jobs in this job group. |
| Note: the maximum number of job groups that can be tracked is set by |
| 'spark.scheduler.numCancelledJobGroupsToTrack'. Once the limit is reached and a new job group |
| is to be added, the oldest job group tracked will be discarded. |
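| </p><p>For example, a sketch with an illustrative group id: |
| </p><pre><span class="cmt">// In the main thread:</span> |
| sc.setJobGroup(<span class="lit">"nightly-etl"</span>, <span class="lit">"nightly ETL jobs"</span>) |
| |
| <span class="cmt">// In a separate thread, cancel the group's running and future jobs:</span> |
| sc.cancelJobGroupAndFutureJobs(<span class="lit">"nightly-etl"</span>)</pre><p> |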
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelJobsWithTag" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelJobsWithTag(tag:String):Unit" class="anchorToMember"></a><a id="cancelJobsWithTag(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelJobsWithTag(tag:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelJobsWithTag</span><span class="params">(<span name="tag">tag: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel active jobs that have the specified tag.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel active jobs that have the specified tag. See <code>org.apache.spark.SparkContext.addJobTag</code>. |
| </p></div><dl class="paramcmts block"><dt class="param">tag</dt><dd class="cmt"><p>The tag to be cancelled. Cannot contain ',' (comma) character.</p></dd></dl><dl class="attributes block"><dt>Since</dt><dd><p>3.5.0</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelStage" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelStage(stageId:Int):Unit" class="anchorToMember"></a><a id="cancelStage(Int):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelStage(stageId:Int):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelStage</span><span class="params">(<span name="stageId">stageId: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel a given stage and all jobs associated with it.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel a given stage and all jobs associated with it. |
| </p></div><dl class="paramcmts block"><dt class="param">stageId</dt><dd class="cmt"><p>the stage ID to cancel</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Throws <code>InterruptedException</code> if the cancel message cannot be sent</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#cancelStage" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="cancelStage(stageId:Int,reason:String):Unit" class="anchorToMember"></a><a id="cancelStage(Int,String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#cancelStage(stageId:Int,reason:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">cancelStage</span><span class="params">(<span name="stageId">stageId: <span name="scala.Int" class="extype">Int</span></span>, <span name="reason">reason: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Cancel a given stage and all jobs associated with it.</p><div class="fullcomment"><div class="comment cmt"><p>Cancel a given stage and all jobs associated with it. |
| </p></div><dl class="paramcmts block"><dt class="param">stageId</dt><dd class="cmt"><p>the stage ID to cancel</p></dd><dt class="param">reason</dt><dd class="cmt"><p>reason for cancellation</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Throws <code>InterruptedException</code> if the cancel message cannot be sent</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#checkpointFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="checkpointFile[T](path:String)(implicitevidence$5:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="checkpointFile[T](String)(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#checkpointFile[T](path:String)(implicitevidence$5:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">checkpointFile</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.checkpointFile.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.checkpointFile.T" class="extype">T</span>]</span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected[<a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a>] </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#clearCallSite" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="clearCallSite():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#clearCallSite():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">clearCallSite</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Clear the thread-local property for overriding the call sites |
| of actions and RDDs.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#clearJobGroup" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="clearJobGroup():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#clearJobGroup():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">clearJobGroup</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Clear the current thread's job group ID and its description.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#clearJobTags" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="clearJobTags():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#clearJobTags():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">clearJobTags</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Clear the current thread's job tags.</p><div class="fullcomment"><div class="comment cmt"><p>Clear the current thread's job tags. |
| </p></div><dl class="attributes block"><dt>Since</dt><dd><p>3.5.0</p></dd></dl></div></li><li class="indented0 " name="scala.AnyRef#clone" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="clone():Object" class="anchorToMember"></a><a id="clone():AnyRef" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#clone():Object" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">clone</span><span class="params">()</span><span class="result">: <span name="scala.AnyRef" class="extype">AnyRef</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected[<span name="java.lang" class="extype">lang</span>] </dd><dt>Definition Classes</dt><dd>AnyRef</dd><dt>Annotations</dt><dd><span class="name">@throws</span><span class="args">(<span><span class="defval">classOf[java.lang.CloneNotSupportedException]</span></span>)</span> <span class="name">@IntrinsicCandidate</span><span class="args">()</span> <span class="name">@native</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#collectionAccumulator" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="collectionAccumulator[T](name:String):org.apache.spark.util.CollectionAccumulator[T]" class="anchorToMember"></a><a id="collectionAccumulator[T](String):CollectionAccumulator[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#collectionAccumulator[T](name:String):org.apache.spark.util.CollectionAccumulator[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">collectionAccumulator</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="name">name: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <a href="util/CollectionAccumulator.html" name="org.apache.spark.util.CollectionAccumulator" id="org.apache.spark.util.CollectionAccumulator" class="extype">CollectionAccumulator</a>[<span name="org.apache.spark.SparkContext.collectionAccumulator.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Create and register a <code>CollectionAccumulator</code>, which starts with empty list and accumulates |
| inputs by adding them into the list.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#collectionAccumulator" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="collectionAccumulator[T]:org.apache.spark.util.CollectionAccumulator[T]" class="anchorToMember"></a><a id="collectionAccumulator[T]:CollectionAccumulator[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#collectionAccumulator[T]:org.apache.spark.util.CollectionAccumulator[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">collectionAccumulator</span><span class="tparams">[<span name="T">T</span>]</span><span class="result">: <a href="util/CollectionAccumulator.html" name="org.apache.spark.util.CollectionAccumulator" id="org.apache.spark.util.CollectionAccumulator" class="extype">CollectionAccumulator</a>[<span name="org.apache.spark.SparkContext.collectionAccumulator.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Create and register a <code>CollectionAccumulator</code>, which starts with empty list and accumulates |
| inputs by adding them into the list.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#defaultMinPartitions" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="defaultMinPartitions:Int" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#defaultMinPartitions:Int" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">defaultMinPartitions</span><span class="result">: <span name="scala.Int" class="extype">Int</span></span></span><p class="shortcomment cmt">Default min number of partitions for Hadoop RDDs when not given by user |
| Notice that we use math.min so the "defaultMinPartitions" cannot be higher than 2.</p><div class="fullcomment"><div class="comment cmt"><p>Default min number of partitions for Hadoop RDDs when not given by the user. |
| Notice that we use math.min so the "defaultMinPartitions" cannot be higher than 2. |
| The reasons for this are discussed in https://github.com/mesos/spark/pull/718 |
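| </p><p>In other words, the value is effectively: |
| </p><pre>math.min(sc.defaultParallelism, <span class="num">2</span>)</pre><p> |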
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#defaultParallelism" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="defaultParallelism:Int" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#defaultParallelism:Int" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">defaultParallelism</span><span class="result">: <span name="scala.Int" class="extype">Int</span></span></span><p class="shortcomment cmt">Default level of parallelism to use when not given by user (e.g.</p><div class="fullcomment"><div class="comment cmt"><p>Default level of parallelism to use when not given by user (e.g. parallelize and makeRDD).</p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#deployMode" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="deployMode:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#deployMode:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">deployMode</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#doubleAccumulator" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="doubleAccumulator(name:String):org.apache.spark.util.DoubleAccumulator" class="anchorToMember"></a><a id="doubleAccumulator(String):DoubleAccumulator" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#doubleAccumulator(name:String):org.apache.spark.util.DoubleAccumulator" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">doubleAccumulator</span><span class="params">(<span name="name">name: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <a href="util/DoubleAccumulator.html" name="org.apache.spark.util.DoubleAccumulator" id="org.apache.spark.util.DoubleAccumulator" class="extype">DoubleAccumulator</a></span></span><p class="shortcomment cmt">Create and register a double accumulator, which starts with 0 and accumulates inputs by <code>add</code>.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#doubleAccumulator" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="doubleAccumulator:org.apache.spark.util.DoubleAccumulator" class="anchorToMember"></a><a id="doubleAccumulator:DoubleAccumulator" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#doubleAccumulator:org.apache.spark.util.DoubleAccumulator" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">doubleAccumulator</span><span class="result">: <a href="util/DoubleAccumulator.html" name="org.apache.spark.util.DoubleAccumulator" id="org.apache.spark.util.DoubleAccumulator" class="extype">DoubleAccumulator</a></span></span><p class="shortcomment 
cmt">Create and register a double accumulator, which starts with 0 and accumulates inputs by <code>add</code>.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#emptyRDD" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="emptyRDD[T](implicitevidence$8:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="emptyRDD[T](ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#emptyRDD[T](implicitevidence$8:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">emptyRDD</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.emptyRDD.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.emptyRDD.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Get an RDD that has no partitions or elements.</p></li><li class="indented0 " name="scala.AnyRef#eq" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="eq(x$1:AnyRef):Boolean" class="anchorToMember"></a><a id="eq(AnyRef):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#eq(x$1:AnyRef):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">eq</span><span class="params">(<span name="arg0">arg0: <span name="scala.AnyRef" class="extype">AnyRef</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd></dl></div></li><li class="indented0 " name="scala.AnyRef#equals" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="equals(x$1:Object):Boolean" class="anchorToMember"></a><a id="equals(AnyRef):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#equals(x$1:Object):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">equals</span><span class="params">(<span name="arg0">arg0: <span name="scala.AnyRef" class="extype">AnyRef</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#files" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="files:Seq[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#files:Seq[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span 
class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">files</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#getAllPools" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getAllPools:Seq[org.apache.spark.scheduler.Schedulable]" class="anchorToMember"></a><a id="getAllPools:Seq[Schedulable]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getAllPools:Seq[org.apache.spark.scheduler.Schedulable]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getAllPools</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="org.apache.spark.scheduler.Schedulable" class="extype">Schedulable</span>]</span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Return the pools used by the fair scheduler. |
| </p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Return the pools used by the fair scheduler. |
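| </p><p>For example, a sketch listing pool names (assuming the fair scheduler is configured): |
| </p><pre>sc.getAllPools.foreach(pool <span class="kw">=></span> println(pool.name))</pre><p> |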
| </p></div><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getCheckpointDir" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="getCheckpointDir:Option[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getCheckpointDir:Option[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getCheckpointDir</span><span class="result">: <span name="scala.Option" class="extype">Option</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span></li><li class="indented0 " name="scala.AnyRef#getClass" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getClass():Class[_]" class="anchorToMember"></a><a id="getClass():Class[_<:AnyRef]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getClass():Class[_]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getClass</span><span class="params">()</span><span class="result">: <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Class.html#java.lang.Class" name="java.lang.Class" id="java.lang.Class" class="extype">Class</a>[_ <: <span name="scala.AnyRef" class="extype">AnyRef</span>]</span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd><dt>Annotations</dt><dd><span class="name">@IntrinsicCandidate</span><span class="args">()</span> <span class="name">@native</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getConf" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getConf:org.apache.spark.SparkConf" class="anchorToMember"></a><a id="getConf:SparkConf" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getConf:org.apache.spark.SparkConf" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getConf</span><span class="result">: <a href="SparkConf.html" name="org.apache.spark.SparkConf" id="org.apache.spark.SparkConf" class="extype">SparkConf</a></span></span><p class="shortcomment cmt">Return a copy of this SparkContext's configuration.</p><div class="fullcomment"><div class="comment cmt"><p>Return a copy of this SparkContext's configuration. The configuration <i>cannot</i> be |
| changed at runtime. |
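| Because a copy is returned, mutating it has no effect on the running context; reading values is the intended use. A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p><pre><span class="kw">val</span> appName = sc.getConf.get("spark.app.name")</pre><p> |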
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getExecutorMemoryStatus" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="getExecutorMemoryStatus:scala.collection.Map[String,(Long,Long)]" class="anchorToMember"></a><a id="getExecutorMemoryStatus:Map[String,(Long,Long)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getExecutorMemoryStatus:scala.collection.Map[String,(Long,Long)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getExecutorMemoryStatus</span><span class="result">: <span name="scala.collection.Map" class="extype">Map</span>[<span name="scala.Predef.String" class="extype">String</span>, (<span name="scala.Long" class="extype">Long</span>, <span name="scala.Long" class="extype">Long</span>)]</span></span><p class="shortcomment cmt">Return a map from the block manager to the max memory available for caching and the remaining |
| memory available for caching.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#getJobTags" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getJobTags():Set[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getJobTags():Set[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getJobTags</span><span class="params">()</span><span class="result">: <span name="scala.Predef.Set" class="extype">Set</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span><p class="shortcomment cmt">Get the tags that are currently set to be assigned to all the jobs started by this thread.</p><div class="fullcomment"><div class="comment cmt"><p>Get the tags that are currently set to be assigned to all the jobs started by this thread. |
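| A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the tag name is illustrative):</p><pre>sc.addJobTag("nightly-etl"); <span class="kw">val</span> tags = sc.getJobTags()</pre><p> |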
| </p></div><dl class="attributes block"><dt>Since</dt><dd><p>3.5.0</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getLocalProperty" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getLocalProperty(key:String):String" class="anchorToMember"></a><a id="getLocalProperty(String):String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getLocalProperty(key:String):String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getLocalProperty</span><span class="params">(<span name="key">key: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span><p class="shortcomment cmt">Get a local property set in this thread, or null if it is missing.</p><div class="fullcomment"><div class="comment cmt"><p>Get a local property set in this thread, or null if it is missing. See |
| <code>org.apache.spark.SparkContext.setLocalProperty</code>. |
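| For example (assuming an active <code>SparkContext</code> named <code>sc</code>; the pool name is illustrative):</p><pre>sc.setLocalProperty("spark.scheduler.pool", "production"); <span class="kw">val</span> pool = sc.getLocalProperty("spark.scheduler.pool")</pre><p> |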
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getPersistentRDDs" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getPersistentRDDs:scala.collection.Map[Int,org.apache.spark.rdd.RDD[_]]" class="anchorToMember"></a><a id="getPersistentRDDs:Map[Int,RDD[_]]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getPersistentRDDs:scala.collection.Map[Int,org.apache.spark.rdd.RDD[_]]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getPersistentRDDs</span><span class="result">: <span name="scala.collection.Map" class="extype">Map</span>[<span name="scala.Int" class="extype">Int</span>, <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[_]]</span></span><p class="shortcomment cmt">Returns an immutable map of RDDs that have marked themselves as persistent via cache() call.</p><div class="fullcomment"><div class="comment cmt"><p>Returns an immutable map of RDDs that have marked themselves as persistent via cache() call. |
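| A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p><pre><span class="kw">val</span> nums = sc.parallelize(1 to 100).cache() |
| nums.count() // materialize the cache |
| sc.getPersistentRDDs.foreach { <span class="kw">case</span> (id, rdd) => println(s"$id -> ${rdd.getStorageLevel}") }</pre><p> |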
| </p></div><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>This does not necessarily mean the caching or computation was successful.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getPoolForName" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getPoolForName(pool:String):Option[org.apache.spark.scheduler.Schedulable]" class="anchorToMember"></a><a id="getPoolForName(String):Option[Schedulable]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getPoolForName(pool:String):Option[org.apache.spark.scheduler.Schedulable]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getPoolForName</span><span class="params">(<span name="pool">pool: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Option" class="extype">Option</span>[<span name="org.apache.spark.scheduler.Schedulable" class="extype">Schedulable</span>]</span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Return the pool associated with the given name, if one exists. |
| </p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Return the pool associated with the given name, if one exists. |
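| For example, a sketch that looks up the default pool under fair scheduling (assuming an active <code>SparkContext</code> named <code>sc</code>):</p><pre><span class="kw">val</span> maybePool = sc.getPoolForName("default"); maybePool.foreach(pool => println(pool.schedulingMode))</pre><p> |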
| </p></div><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getRDDStorageInfo" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="getRDDStorageInfo:Array[org.apache.spark.storage.RDDInfo]" class="anchorToMember"></a><a id="getRDDStorageInfo:Array[RDDInfo]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getRDDStorageInfo:Array[org.apache.spark.storage.RDDInfo]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getRDDStorageInfo</span><span class="result">: <span name="scala.Array" class="extype">Array</span>[<a href="storage/RDDInfo.html" name="org.apache.spark.storage.RDDInfo" id="org.apache.spark.storage.RDDInfo" class="extype">RDDInfo</a>]</span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Return information about which RDDs are cached, whether they are in memory or on disk, how much space |
| they take, etc.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Return information about which RDDs are cached, whether they are in memory or on disk, how much space |
| they take, etc. |
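| A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p><pre>sc.getRDDStorageInfo.foreach(info => println(s"${info.name}: ${info.memSize} bytes in memory, ${info.diskSize} bytes on disk"))</pre><p> |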
| </p></div><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#getSchedulingMode" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="getSchedulingMode:org.apache.spark.scheduler.SchedulingMode.SchedulingMode" class="anchorToMember"></a><a id="getSchedulingMode:SchedulingMode" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#getSchedulingMode:org.apache.spark.scheduler.SchedulingMode.SchedulingMode" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">getSchedulingMode</span><span class="result">: <a href="scheduler/SchedulingMode$.html#SchedulingMode=org.apache.spark.scheduler.SchedulingMode.Value" name="org.apache.spark.scheduler.SchedulingMode.SchedulingMode" id="org.apache.spark.scheduler.SchedulingMode.SchedulingMode" class="extmbr">SchedulingMode</a></span></span><p class="shortcomment cmt">Return current scheduling mode |
| </p></li><li class="indented0 " name="org.apache.spark.SparkContext#hadoopConfiguration" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="hadoopConfiguration:org.apache.hadoop.conf.Configuration" class="anchorToMember"></a><a id="hadoopConfiguration:Configuration" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#hadoopConfiguration:org.apache.hadoop.conf.Configuration" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">hadoopConfiguration</span><span class="result">: <span name="org.apache.hadoop.conf.Configuration" class="extype">Configuration</span></span></span><p class="shortcomment cmt">A default Hadoop Configuration for the Hadoop code (e.g.</p><div class="fullcomment"><div class="comment cmt"><p>A default Hadoop Configuration for the Hadoop code (e.g. file systems) that we reuse. |
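| For example, a sketch that sets one global option before any Hadoop RDD is created (the key is illustrative; assumes an active <code>SparkContext</code> named <code>sc</code>):</p><pre>sc.hadoopConfiguration.set("fs.s3a.connection.maximum", "64")</pre><p> |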
| </p></div><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>As it will be reused in all Hadoop RDDs, it's better not to modify it unless you |
| plan to set some global configurations for all Hadoop RDDs.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#hadoopFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="hadoopFile[K,V,F<:org.apache.hadoop.mapred.InputFormat[K,V]](path:String)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitfm:scala.reflect.ClassTag[F]):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="hadoopFile[K,V,F<:InputFormat[K,V]](String)(ClassTag[K],ClassTag[V],ClassTag[F]):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#hadoopFile[K,V,F<:org.apache.hadoop.mapred.InputFormat[K,V]](path:String)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitfm:scala.reflect.ClassTag[F]):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">hadoopFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>, <span name="F">F <: <span name="org.apache.hadoop.mapred.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>]</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="km">km: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>]</span>, <span name="vm">vm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>]</span>, <span name="fm">fm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.hadoopFile.F" class="extype">F</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, |
| values and the InputFormat so that users don't need to pass them directly.</p><div class="fullcomment"><div class="comment cmt"><p>Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, |
| values and the InputFormat so that users don't need to pass them directly. Instead, callers |
| can just write, for example,</p><pre><span class="kw">val</span> file = sparkContext.hadoopFile[LongWritable, Text, TextInputFormat](path)</pre></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory to the input data files, the path can be comma separated paths as |
| a list of inputs</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#hadoopFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="hadoopFile[K,V,F<:org.apache.hadoop.mapred.InputFormat[K,V]](path:String,minPartitions:Int)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitfm:scala.reflect.ClassTag[F]):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="hadoopFile[K,V,F<:InputFormat[K,V]](String,Int)(ClassTag[K],ClassTag[V],ClassTag[F]):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#hadoopFile[K,V,F<:org.apache.hadoop.mapred.InputFormat[K,V]](path:String,minPartitions:Int)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitfm:scala.reflect.ClassTag[F]):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">hadoopFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>, <span name="F">F <: <span name="org.apache.hadoop.mapred.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>]</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="km">km: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>]</span>, <span name="vm">vm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>]</span>, <span name="fm">fm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.hadoopFile.F" class="extype">F</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, |
| values and the InputFormat so that users don't need to pass them directly.</p><div class="fullcomment"><div class="comment cmt"><p>Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, |
| values and the InputFormat so that users don't need to pass them directly. Instead, callers |
| can just write, for example,</p><pre><span class="kw">val</span> file = sparkContext.hadoopFile[LongWritable, Text, TextInputFormat](path, minPartitions)</pre></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory to the input data files, the path can be comma separated paths |
| as a list of inputs</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>suggested minimum number of partitions for the resulting RDD</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#hadoopFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="hadoopFile[K,V](path:String,inputFormatClass:Class[_<:org.apache.hadoop.mapred.InputFormat[K,V]],keyClass:Class[K],valueClass:Class[V],minPartitions:Int):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="hadoopFile[K,V](String,Class[_<:InputFormat[K,V]],Class[K],Class[V],Int):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#hadoopFile[K,V](path:String,inputFormatClass:Class[_<:org.apache.hadoop.mapred.InputFormat[K,V]],keyClass:Class[K],valueClass:Class[V],minPartitions:Int):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">hadoopFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="inputFormatClass">inputFormatClass: <span name="scala.Predef.Class" class="extype">Class</span>[_ <: <span name="org.apache.hadoop.mapred.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>]]</span>, <span name="keyClass">keyClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>]</span>, <span name="valueClass">valueClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>]</span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.hadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Get an RDD for a Hadoop file with an arbitrary InputFormat |
| </p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a Hadoop file with an arbitrary InputFormat |
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory to the input data files, the path can be comma separated paths |
| as a list of inputs</p></dd><dt class="param">inputFormatClass</dt><dd class="cmt"><p>storage format of the data to be read</p></dd><dt class="param">keyClass</dt><dd class="cmt"><p><code>Class</code> of the key associated with the <code>inputFormatClass</code> parameter</p></dd><dt class="param">valueClass</dt><dd class="cmt"><p><code>Class</code> of the value associated with the <code>inputFormatClass</code> parameter</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>suggested minimum number of partitions for the resulting RDD</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#hadoopRDD" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="hadoopRDD[K,V](conf:org.apache.hadoop.mapred.JobConf,inputFormatClass:Class[_<:org.apache.hadoop.mapred.InputFormat[K,V]],keyClass:Class[K],valueClass:Class[V],minPartitions:Int):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="hadoopRDD[K,V](JobConf,Class[_<:InputFormat[K,V]],Class[K],Class[V],Int):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#hadoopRDD[K,V](conf:org.apache.hadoop.mapred.JobConf,inputFormatClass:Class[_<:org.apache.hadoop.mapred.InputFormat[K,V]],keyClass:Class[K],valueClass:Class[V],minPartitions:Int):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">hadoopRDD</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>]</span><span class="params">(<span name="conf">conf: <span name="org.apache.hadoop.mapred.JobConf" class="extype">JobConf</span></span>, <span name="inputFormatClass">inputFormatClass: <span name="scala.Predef.Class" class="extype">Class</span>[_ <: <span name="org.apache.hadoop.mapred.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.hadoopRDD.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopRDD.V" class="extype">V</span>]]</span>, <span name="keyClass">keyClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.hadoopRDD.K" class="extype">K</span>]</span>, <span name="valueClass">valueClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.hadoopRDD.V" class="extype">V</span>]</span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.hadoopRDD.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.hadoopRDD.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other |
| necessary info (e.g.</p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other |
| necessary info (e.g. file name for a filesystem-based dataset, table name for HyperTable), |
| using the older MapReduce API (<code>org.apache.hadoop.mapred</code>). |
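| A minimal sketch using the old MapReduce API (assuming an active <code>SparkContext</code> named <code>sc</code>; the input path is illustrative). The final <code>map</code> copies values out of the reused <code>Writable</code>, as the note below advises:</p><pre><span class="kw">import</span> org.apache.hadoop.io.{LongWritable, Text} |
| <span class="kw">import</span> org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat} |
| <span class="kw">val</span> jobConf = <span class="kw">new</span> JobConf(sc.hadoopConfiguration) |
| FileInputFormat.setInputPaths(jobConf, "/data/events") |
| <span class="kw">val</span> lines = sc.hadoopRDD(jobConf, classOf[TextInputFormat], classOf[LongWritable], classOf[Text]) |
|   .map { <span class="kw">case</span> (_, text) => text.toString } // copy out of the reused Writable |
| lines.take(5).foreach(println)</pre><p> |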
| </p></div><dl class="paramcmts block"><dt class="param">conf</dt><dd class="cmt"><p>JobConf for setting up the dataset. Note: This will be put into a Broadcast. |
| Therefore, if you plan to reuse this conf to create multiple RDDs, you need to make |
| sure you won't modify it. A safe approach is to always create a new conf for |
| a new RDD.</p></dd><dt class="param">inputFormatClass</dt><dd class="cmt"><p>storage format of the data to be read</p></dd><dt class="param">keyClass</dt><dd class="cmt"><p><code>Class</code> of the key associated with the <code>inputFormatClass</code> parameter</p></dd><dt class="param">valueClass</dt><dd class="cmt"><p><code>Class</code> of the value associated with the <code>inputFormatClass</code> parameter</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>Minimum number of Hadoop Splits to generate.</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="scala.AnyRef#hashCode" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="hashCode():Int" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#hashCode():Int" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">hashCode</span><span class="params">()</span><span class="result">: <span name="scala.Int" class="extype">Int</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd><dt>Annotations</dt><dd><span class="name">@IntrinsicCandidate</span><span class="args">()</span> <span class="name">@native</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#initializeLogIfNecessary" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="initializeLogIfNecessary(isInterpreter:Boolean,silent:Boolean):Boolean" class="anchorToMember"></a><a id="initializeLogIfNecessary(Boolean,Boolean):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#initializeLogIfNecessary(isInterpreter:Boolean,silent:Boolean):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">initializeLogIfNecessary</span><span class="params">(<span name="isInterpreter">isInterpreter: <span name="scala.Boolean" class="extype">Boolean</span></span>, <span name="silent">silent: <span name="scala.Boolean" class="extype">Boolean</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#initializeLogIfNecessary" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="initializeLogIfNecessary(isInterpreter:Boolean):Unit" class="anchorToMember"></a><a id="initializeLogIfNecessary(Boolean):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#initializeLogIfNecessary(isInterpreter:Boolean):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">initializeLogIfNecessary</span><span class="params">(<span name="isInterpreter">isInterpreter: <span name="scala.Boolean" class="extype">Boolean</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="scala.Any#isInstanceOf" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="isInstanceOf[T0]:Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#isInstanceOf[T0]:Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span 
class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">isInstanceOf</span><span class="tparams">[<span name="T0">T0</span>]</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>Any</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#isLocal" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="isLocal:Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#isLocal:Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">isLocal</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#isStopped" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="isStopped:Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#isStopped:Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">isStopped</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><p class="shortcomment cmt"></p><div class="fullcomment"><div class="comment cmt"></div><dl class="paramcmts block"><dt>returns</dt><dd class="cmt"><p>true if context is stopped or in the midst of stopping.</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#isTraceEnabled" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="isTraceEnabled():Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#isTraceEnabled():Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">isTraceEnabled</span><span class="params">()</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#jars" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="jars:Seq[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#jars:Seq[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">jars</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#killExecutor" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="killExecutor(executorId:String):Boolean" class="anchorToMember"></a><a id="killExecutor(String):Boolean" class="anchorToMember"></a> <span 
class="permalink"><a href="../../../org/apache/spark/SparkContext.html#killExecutor(executorId:String):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">killExecutor</span><span class="params">(<span name="executorId">executorId: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Request that the cluster manager kill the specified executor.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Request that the cluster manager kill the specified executor. |
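| A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the executor ID is illustrative and would normally come from the Spark UI or a listener):</p><pre><span class="kw">val</span> requested = sc.killExecutor("3")</pre><p> |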
| </p></div><dl class="paramcmts block"><dt>returns</dt><dd class="cmt"><p>whether the request is received.</p></dd></dl><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd><dt>Note</dt><dd><span class="cmt"><p>This is an indication to the cluster manager that the application wishes to adjust |
| its resource usage downwards. If the application wishes to replace the executor it kills |
| through this method with a new one, it should follow up explicitly with a call to |
| <code>SparkContext#requestExecutors</code>.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#killExecutors" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="killExecutors(executorIds:Seq[String]):Boolean" class="anchorToMember"></a><a id="killExecutors(Seq[String]):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#killExecutors(executorIds:Seq[String]):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">killExecutors</span><span class="params">(<span name="executorIds">executorIds: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Request that the cluster manager kill the specified executors.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Request that the cluster manager kill the specified executors.</p><p>This is not supported when dynamic allocation is turned on. |
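| A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the executor IDs are illustrative):</p><pre><span class="kw">val</span> requested = sc.killExecutors(Seq("4", "7"))</pre><p> |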
| </p></div><dl class="paramcmts block"><dt>returns</dt><dd class="cmt"><p>whether the request is received.</p></dd></dl><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd><dt>Note</dt><dd><span class="cmt"><p>This is an indication to the cluster manager that the application wishes to adjust |
| its resource usage downwards. If the application wishes to replace the executors it kills |
| through this method with new ones, it should follow up explicitly with a call to |
| <code>SparkContext#requestExecutors</code>.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#killTaskAttempt" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="killTaskAttempt(taskId:Long,interruptThread:Boolean,reason:String):Boolean" class="anchorToMember"></a><a id="killTaskAttempt(Long,Boolean,String):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#killTaskAttempt(taskId:Long,interruptThread:Boolean,reason:String):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">killTaskAttempt</span><span class="params">(<span name="taskId">taskId: <span name="scala.Long" class="extype">Long</span></span>, <span name="interruptThread">interruptThread: <span name="scala.Boolean" class="extype">Boolean</span> = <span class="symbol">true</span></span>, <span name="reason">reason: <span name="scala.Predef.String" class="extype">String</span> = <span class="defval">"killed via SparkContext.killTaskAttempt"</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><p class="shortcomment cmt">Kill and reschedule the given task attempt.</p><div class="fullcomment"><div class="comment cmt"><p>Kill and reschedule the given task attempt. Task ids can be obtained from the Spark UI |
| or through SparkListener.onTaskStart. |
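| A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the task ID is illustrative):</p><pre><span class="kw">val</span> killed = sc.killTaskAttempt(taskId = 12345L, interruptThread = false, reason = "straggler; speculative copy already running")</pre><p> |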
| </p></div><dl class="paramcmts block"><dt class="param">taskId</dt><dd class="cmt"><p>the task ID to kill. This id uniquely identifies the task attempt.</p></dd><dt class="param">interruptThread</dt><dd class="cmt"><p>whether to interrupt the thread running the task.</p></dd><dt class="param">reason</dt><dd class="cmt"><p>the reason for killing the task, which should be a short string. If a task |
| is killed multiple times with different reasons, only one reason will be reported.</p></dd><dt>returns</dt><dd class="cmt"><p>Whether the task was successfully killed.</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#listArchives" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="listArchives():Seq[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#listArchives():Seq[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">listArchives</span><span class="params">()</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span><p class="shortcomment cmt">:: Experimental :: |
| Returns a list of archive paths that are added to resources.</p><div class="fullcomment"><div class="comment cmt"><p>:: Experimental :: |
| Returns a list of archive paths that are added to resources. |
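| A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the archive path is illustrative):</p><pre>sc.addArchive("/tmp/deps.zip"); sc.listArchives().foreach(println)</pre><p> |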
| </p></div><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@Experimental</span><span class="args">()</span> </dd><dt>Since</dt><dd><p>3.1.0</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#listFiles" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="listFiles():Seq[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#listFiles():Seq[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">listFiles</span><span class="params">()</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span><p class="shortcomment cmt">Returns a list of file paths that are added to resources.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#listJars" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="listJars():Seq[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#listJars():Seq[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">listJars</span><span class="params">()</span><span class="result">: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span><p class="shortcomment cmt">Returns a list of jar files that are added to resources.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#localProperties" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="localProperties:InheritableThreadLocal[java.util.Properties]" class="anchorToMember"></a><a id="localProperties:InheritableThreadLocal[Properties]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#localProperties:InheritableThreadLocal[java.util.Properties]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">val</span></span> <span class="symbol"><span class="name">localProperties</span><span class="result">: <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/InheritableThreadLocal.html#java.lang.InheritableThreadLocal" name="java.lang.InheritableThreadLocal" id="java.lang.InheritableThreadLocal" class="extype">InheritableThreadLocal</a>[<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Properties.html#java.util.Properties" name="java.util.Properties" id="java.util.Properties" class="extype">Properties</a>]</span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected[<a href="index.html" name="org.apache.spark" id="org.apache.spark" class="extype">spark</a>] </dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#log" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="log:org.slf4j.Logger" class="anchorToMember"></a><a id="log:Logger" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#log:org.slf4j.Logger" title="Permalink"><i class="material-icons"></i></a></span> <span 
class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">log</span><span class="result">: <span name="org.slf4j.Logger" class="extype">Logger</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logDebug" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logDebug(msg:=>String,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logDebug(=>String,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logDebug(msg:=>String,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logDebug</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logDebug" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logDebug(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logDebug(LogEntry,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logDebug(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logDebug</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logDebug" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logDebug(entry:org.apache.spark.internal.LogEntry):Unit" class="anchorToMember"></a><a id="logDebug(LogEntry):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logDebug(entry:org.apache.spark.internal.LogEntry):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logDebug</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl 
class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logDebug" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logDebug(msg:=>String):Unit" class="anchorToMember"></a><a id="logDebug(=>String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logDebug(msg:=>String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logDebug</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logError" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logError(msg:=>String,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logError(=>String,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logError(msg:=>String,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logError</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logError" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logError(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logError(LogEntry,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logError(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logError</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logError" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logError(entry:org.apache.spark.internal.LogEntry):Unit" 
class="anchorToMember"></a><a id="logError(LogEntry):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logError(entry:org.apache.spark.internal.LogEntry):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logError</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logError" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logError(msg:=>String):Unit" class="anchorToMember"></a><a id="logError(=>String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logError(msg:=>String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logError</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logInfo" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logInfo(msg:=>String,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logInfo(=>String,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logInfo(msg:=>String,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logInfo</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logInfo" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logInfo(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logInfo(LogEntry,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logInfo(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logInfo</span><span class="params">(<span 
name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logInfo" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logInfo(entry:org.apache.spark.internal.LogEntry):Unit" class="anchorToMember"></a><a id="logInfo(LogEntry):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logInfo(entry:org.apache.spark.internal.LogEntry):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logInfo</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logInfo" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logInfo(msg:=>String):Unit" class="anchorToMember"></a><a id="logInfo(=>String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logInfo(msg:=>String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logInfo</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logName" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logName:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logName:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logName</span><span class="result">: <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html#java.lang.String" name="java.lang.String" id="java.lang.String" class="extype">String</a></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logTrace" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logTrace(msg:=>String,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logTrace(=>String,Throwable):Unit" class="anchorToMember"></a> 
<span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logTrace(msg:=>String,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logTrace</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logTrace" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logTrace(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logTrace(LogEntry,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logTrace(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logTrace</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logTrace" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logTrace(entry:org.apache.spark.internal.LogEntry):Unit" class="anchorToMember"></a><a id="logTrace(LogEntry):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logTrace(entry:org.apache.spark.internal.LogEntry):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logTrace</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logTrace" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logTrace(msg:=>String):Unit" class="anchorToMember"></a><a id="logTrace(=>String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logTrace(msg:=>String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span 
class="name">logTrace</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logWarning" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logWarning(msg:=>String,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logWarning(=>String,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logWarning(msg:=>String,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logWarning</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logWarning" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logWarning(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" class="anchorToMember"></a><a id="logWarning(LogEntry,Throwable):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logWarning(entry:org.apache.spark.internal.LogEntry,throwable:Throwable):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logWarning</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>, <span name="throwable">throwable: <span name="scala.Throwable" class="extype">Throwable</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logWarning" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logWarning(entry:org.apache.spark.internal.LogEntry):Unit" class="anchorToMember"></a><a id="logWarning(LogEntry):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logWarning(entry:org.apache.spark.internal.LogEntry):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logWarning</span><span class="params">(<span name="entry">entry: <span name="org.apache.spark.internal.LogEntry" class="extype">LogEntry</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div 
class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.internal.Logging#logWarning" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="prt"><a id="logWarning(msg:=>String):Unit" class="anchorToMember"></a><a id="logWarning(=>String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#logWarning(msg:=>String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">logWarning</span><span class="params">(<span name="msg">msg: => <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Attributes</dt><dd>protected </dd><dt>Definition Classes</dt><dd>Logging</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#longAccumulator" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="longAccumulator(name:String):org.apache.spark.util.LongAccumulator" class="anchorToMember"></a><a id="longAccumulator(String):LongAccumulator" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#longAccumulator(name:String):org.apache.spark.util.LongAccumulator" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">longAccumulator</span><span class="params">(<span name="name">name: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <a href="util/LongAccumulator.html" name="org.apache.spark.util.LongAccumulator" id="org.apache.spark.util.LongAccumulator" class="extype">LongAccumulator</a></span></span><p class="shortcomment cmt">Create and register a long accumulator, which starts with 0 and accumulates inputs by <code>add</code>.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#longAccumulator" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="longAccumulator:org.apache.spark.util.LongAccumulator" class="anchorToMember"></a><a id="longAccumulator:LongAccumulator" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#longAccumulator:org.apache.spark.util.LongAccumulator" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">longAccumulator</span><span class="result">: <a href="util/LongAccumulator.html" name="org.apache.spark.util.LongAccumulator" id="org.apache.spark.util.LongAccumulator" class="extype">LongAccumulator</a></span></span><p class="shortcomment cmt">Create and register a long accumulator, which starts with 0 and accumulates inputs by <code>add</code>.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#makeRDD" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="makeRDD[T](seq:Seq[(T,Seq[String])])(implicitevidence$3:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a 
id="makeRDD[T](Seq[(T,Seq[String])])(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#makeRDD[T](seq:Seq[(T,Seq[String])])(implicitevidence$3:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">makeRDD</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="seq">seq: <span name="scala.Seq" class="extype">Seq</span>[(<span name="org.apache.spark.SparkContext.makeRDD.T" class="extype">T</span>, <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Predef.String" class="extype">String</span>])]</span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.makeRDD.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.makeRDD.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Distribute a local Scala collection to form an RDD, with one or more |
| location preferences (hostnames of Spark nodes) for each object.</p><div class="fullcomment"><div class="comment cmt"><p>Distribute a local Scala collection to form an RDD, with one or more |
| location preferences (hostnames of Spark nodes) for each object. |
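| </p><p>For example, a minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the hostnames are hypothetical):</p> |
| <pre> |
| val data = Seq(("a", Seq("host1")), ("b", Seq("host2"))) |
| val rdd = sc.makeRDD(data) // one partition per element, placed near its preferred host |
| </pre><p> |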
| Create a new partition for each collection item.</p></div><dl class="paramcmts block"><dt class="param">seq</dt><dd class="cmt"><p>list of tuples of data and location preferences (hostnames of Spark nodes)</p></dd><dt>returns</dt><dd class="cmt"><p>RDD representing data partitioned according to location preferences</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#makeRDD" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="makeRDD[T](seq:Seq[T],numSlices:Int)(implicitevidence$2:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="makeRDD[T](Seq[T],Int)(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#makeRDD[T](seq:Seq[T],numSlices:Int)(implicitevidence$2:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">makeRDD</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="seq">seq: <span name="scala.Seq" class="extype">Seq</span>[<span name="org.apache.spark.SparkContext.makeRDD.T" class="extype">T</span>]</span>, <span name="numSlices">numSlices: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultParallelism:Int">defaultParallelism</a></span></span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.makeRDD.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.makeRDD.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Distribute a local Scala collection to form an RDD.</p><div class="fullcomment"><div class="comment cmt"><p>Distribute a local Scala collection to form an RDD.</p><p>This method is identical to <code>parallelize</code>.</p></div><dl class="paramcmts block"><dt class="param">seq</dt><dd class="cmt"><p>Scala collection to distribute</p></dd><dt class="param">numSlices</dt><dd class="cmt"><p>number of partitions to divide the collection into</p></dd><dt>returns</dt><dd class="cmt"><p>RDD representing distributed collection</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#master" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="master:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#master:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">master</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span></li><li class="indented0 " name="scala.AnyRef#ne" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="ne(x$1:AnyRef):Boolean" class="anchorToMember"></a><a id="ne(AnyRef):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#ne(x$1:AnyRef):Boolean" title="Permalink"><i 
class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">ne</span><span class="params">(<span name="arg0">arg0: <span name="scala.AnyRef" class="extype">AnyRef</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#newAPIHadoopFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="newAPIHadoopFile[K,V,F<:org.apache.hadoop.mapreduce.InputFormat[K,V]](path:String,fClass:Class[F],kClass:Class[K],vClass:Class[V],conf:org.apache.hadoop.conf.Configuration):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="newAPIHadoopFile[K,V,F<:InputFormat[K,V]](String,Class[F],Class[K],Class[V],Configuration):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#newAPIHadoopFile[K,V,F<:org.apache.hadoop.mapreduce.InputFormat[K,V]](path:String,fClass:Class[F],kClass:Class[K],vClass:Class[V],conf:org.apache.hadoop.conf.Configuration):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">newAPIHadoopFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>, <span name="F">F <: <span name="org.apache.hadoop.mapreduce.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.newAPIHadoopFile.V" class="extype">V</span>]</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="fClass">fClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.F" class="extype">F</span>]</span>, <span name="kClass">kClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.K" class="extype">K</span>]</span>, <span name="vClass">vClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.V" class="extype">V</span>]</span>, <span name="conf">conf: <span name="org.apache.hadoop.conf.Configuration" class="extype">Configuration</span> = <span class="symbol"><span class="name"><a href="#hadoopConfiguration:org.apache.hadoop.conf.Configuration">hadoopConfiguration</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.newAPIHadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.newAPIHadoopFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Get an RDD for a given Hadoop file with an arbitrary new API InputFormat |
| and extra configuration options to pass to the input format.</p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a given Hadoop file with an arbitrary new API InputFormat |
| and extra configuration options to pass to the input format. |
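| </p><p>For illustration, a sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the input path is hypothetical):</p> |
| <pre> |
| import org.apache.hadoop.io.{LongWritable, Text} |
| import org.apache.hadoop.mapreduce.lib.input.TextInputFormat |
| val rdd = sc.newAPIHadoopFile("/data/input", |
|   classOf[TextInputFormat], classOf[LongWritable], classOf[Text]) |
| </pre><p> |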
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory to the input data files, the path can be comma separated paths |
| as a list of inputs</p></dd><dt class="param">fClass</dt><dd class="cmt"><p>storage format of the data to be read</p></dd><dt class="param">kClass</dt><dd class="cmt"><p><code>Class</code> of the key associated with the <code>fClass</code> parameter</p></dd><dt class="param">vClass</dt><dd class="cmt"><p><code>Class</code> of the value associated with the <code>fClass</code> parameter</p></dd><dt class="param">conf</dt><dd class="cmt"><p>Hadoop configuration</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#newAPIHadoopFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="newAPIHadoopFile[K,V,F<:org.apache.hadoop.mapreduce.InputFormat[K,V]](path:String)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitfm:scala.reflect.ClassTag[F]):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="newAPIHadoopFile[K,V,F<:InputFormat[K,V]](String)(ClassTag[K],ClassTag[V],ClassTag[F]):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#newAPIHadoopFile[K,V,F<:org.apache.hadoop.mapreduce.InputFormat[K,V]](path:String)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitfm:scala.reflect.ClassTag[F]):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">newAPIHadoopFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>, <span name="F">F <: <span name="org.apache.hadoop.mapreduce.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.newAPIHadoopFile.V" class="extype">V</span>]</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="km">km: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.K" class="extype">K</span>]</span>, <span name="vm">vm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.V" class="extype">V</span>]</span>, <span name="fm">fm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopFile.F" class="extype">F</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.newAPIHadoopFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.newAPIHadoopFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Smarter version of <code>newApiHadoopFile</code> that uses class tags to figure out the classes of keys, |
| values and the <code>org.apache.hadoop.mapreduce.InputFormat</code> (new MapReduce API) so that users |
| don't need to pass them directly.</p><div class="fullcomment"><div class="comment cmt"><p>Smarter version of <code>newAPIHadoopFile</code> that uses class tags to figure out the classes of keys, |
| values and the <code>org.apache.hadoop.mapreduce.InputFormat</code> (new MapReduce API) so that users |
| don't need to pass them directly. Instead, callers can just write, for example:</p> |
| <pre> |
| val file = sparkContext.newAPIHadoopFile[LongWritable, Text, TextInputFormat](path) |
| </pre></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory of the input data files; multiple comma-separated paths can be given |
| as a list of inputs</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#newAPIHadoopRDD" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="newAPIHadoopRDD[K,V,F<:org.apache.hadoop.mapreduce.InputFormat[K,V]](conf:org.apache.hadoop.conf.Configuration,fClass:Class[F],kClass:Class[K],vClass:Class[V]):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="newAPIHadoopRDD[K,V,F<:InputFormat[K,V]](Configuration,Class[F],Class[K],Class[V]):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#newAPIHadoopRDD[K,V,F<:org.apache.hadoop.mapreduce.InputFormat[K,V]](conf:org.apache.hadoop.conf.Configuration,fClass:Class[F],kClass:Class[K],vClass:Class[V]):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">newAPIHadoopRDD</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>, <span name="F">F <: <span name="org.apache.hadoop.mapreduce.InputFormat" class="extype">InputFormat</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopRDD.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.newAPIHadoopRDD.V" class="extype">V</span>]</span>]</span><span class="params">(<span name="conf">conf: <span name="org.apache.hadoop.conf.Configuration" class="extype">Configuration</span> = <span class="symbol"><span class="name"><a href="#hadoopConfiguration:org.apache.hadoop.conf.Configuration">hadoopConfiguration</a></span></span></span>, <span name="fClass">fClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopRDD.F" class="extype">F</span>]</span>, <span name="kClass">kClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopRDD.K" class="extype">K</span>]</span>, <span name="vClass">vClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.newAPIHadoopRDD.V" class="extype">V</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.newAPIHadoopRDD.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.newAPIHadoopRDD.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Get an RDD for a given Hadoop file with an arbitrary new API InputFormat |
| and extra configuration options to pass to the input format.</p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a given Hadoop file with an arbitrary new API InputFormat |
| and extra configuration options to pass to the input format. |
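| </p><p>A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the input directory is hypothetical):</p> |
| <pre> |
| import org.apache.hadoop.conf.Configuration |
| import org.apache.hadoop.io.{LongWritable, Text} |
| import org.apache.hadoop.mapreduce.lib.input.TextInputFormat |
| val conf = new Configuration(sc.hadoopConfiguration) // fresh copy per RDD, per the conf note |
| conf.set("mapreduce.input.fileinputformat.inputdir", "/data/input") |
| val rdd = sc.newAPIHadoopRDD(conf, classOf[TextInputFormat], |
|   classOf[LongWritable], classOf[Text]) |
| </pre><p> |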
| </p></div><dl class="paramcmts block"><dt class="param">conf</dt><dd class="cmt"><p>Configuration for setting up the dataset. Note: This will be put into a Broadcast. |
| Therefore if you plan to reuse this conf to create multiple RDDs, you need to make |
| sure you won't modify the conf. A safe approach is always creating a new conf for |
| a new RDD.</p></dd><dt class="param">fClass</dt><dd class="cmt"><p>storage format of the data to be read</p></dd><dt class="param">kClass</dt><dd class="cmt"><p><code>Class</code> of the key associated with the <code>fClass</code> parameter</p></dd><dt class="param">vClass</dt><dd class="cmt"><p><code>Class</code> of the value associated with the <code>fClass</code> parameter</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="scala.AnyRef#notify" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="notify():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#notify():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">notify</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd><dt>Annotations</dt><dd><span class="name">@IntrinsicCandidate</span><span class="args">()</span> <span class="name">@native</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="scala.AnyRef#notifyAll" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="notifyAll():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#notifyAll():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">notifyAll</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd><dt>Annotations</dt><dd><span class="name">@IntrinsicCandidate</span><span class="args">()</span> <span class="name">@native</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#objectFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="objectFile[T](path:String,minPartitions:Int)(implicitevidence$4:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="objectFile[T](String,Int)(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#objectFile[T](path:String,minPartitions:Int)(implicitevidence$4:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">objectFile</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.objectFile.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.objectFile.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Load an RDD saved as a SequenceFile containing 
serialized objects, with NullWritable keys and |
| BytesWritable values that contain a serialized partition.</p><div class="fullcomment"><div class="comment cmt"><p>Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and |
| BytesWritable values that contain a serialized partition. This is still an experimental |
| storage format and may not be supported exactly as is in future Spark releases. It will also |
| be quite slow if you use the default serializer (Java serialization), |
| though the upside is that very little effort is required to save arbitrary |
| objects. |
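| </p><p>A minimal round-trip sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the path is hypothetical):</p> |
| <pre> |
| sc.parallelize(1 to 100).saveAsObjectFile("/tmp/ints") |
| val restored = sc.objectFile[Int]("/tmp/ints") |
| </pre><p> |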
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory to the input data files, the path can be comma separated paths |
| as a list of inputs</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>suggested minimum number of partitions for the resulting RDD</p></dd><dt>returns</dt><dd class="cmt"><p>RDD representing deserialized data from the file(s)</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#parallelize" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="parallelize[T](seq:Seq[T],numSlices:Int)(implicitevidence$1:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="parallelize[T](Seq[T],Int)(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#parallelize[T](seq:Seq[T],numSlices:Int)(implicitevidence$1:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">parallelize</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="seq">seq: <span name="scala.Seq" class="extype">Seq</span>[<span name="org.apache.spark.SparkContext.parallelize.T" class="extype">T</span>]</span>, <span name="numSlices">numSlices: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultParallelism:Int">defaultParallelism</a></span></span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.parallelize.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.parallelize.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Distribute a local Scala collection to form an RDD.</p><div class="fullcomment"><div class="comment cmt"><p>Distribute a local Scala collection to form an RDD. |
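| </p><p>A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p> |
| <pre> |
| val rdd = sc.parallelize(1 to 10, numSlices = 2) |
| rdd.count() // 10 |
| </pre><p> |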
| </p></div><dl class="paramcmts block"><dt class="param">seq</dt><dd class="cmt"><p>Scala collection to distribute</p></dd><dt class="param">numSlices</dt><dd class="cmt"><p>number of partitions to divide the collection into</p></dd><dt>returns</dt><dd class="cmt"><p>RDD representing distributed collection</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Parallelize acts lazily. If <code>seq</code> is a mutable collection and is altered after the call |
| to parallelize and before the first action on the RDD, the resultant RDD will reflect the |
| modified collection. Pass a copy of the argument to avoid this.</p></span>, <span class="cmt"><p>avoid using <code>parallelize(Seq())</code> to create an empty <code>RDD</code>. Consider <code>emptyRDD</code> for an |
| RDD with no partitions, or <code>parallelize(Seq[T]())</code> for an RDD of <code>T</code> with empty partitions.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#range" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="range(start:Long,end:Long,step:Long,numSlices:Int):org.apache.spark.rdd.RDD[Long]" class="anchorToMember"></a><a id="range(Long,Long,Long,Int):RDD[Long]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#range(start:Long,end:Long,step:Long,numSlices:Int):org.apache.spark.rdd.RDD[Long]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">range</span><span class="params">(<span name="start">start: <span name="scala.Long" class="extype">Long</span></span>, <span name="end">end: <span name="scala.Long" class="extype">Long</span></span>, <span name="step">step: <span name="scala.Long" class="extype">Long</span> = <span class="symbol">1</span></span>, <span name="numSlices">numSlices: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultParallelism:Int">defaultParallelism</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="scala.Long" class="extype">Long</span>]</span></span><p class="shortcomment cmt">Creates a new RDD[Long] containing elements from <code>start</code> to <code>end</code>(exclusive), increased by |
| <code>step</code> for each element.</p><div class="fullcomment"><div class="comment cmt"><p>Creates a new RDD[Long] containing elements from <code>start</code> to <code>end</code> (exclusive), increased by |
| <code>step</code> for each element. |
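| </p><p>A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p> |
| <pre> |
| val rdd = sc.range(0, 10, step = 2) // 0, 2, 4, 6, 8 |
| rdd.count() // 5 |
| </pre><p> |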
| </p></div><dl class="paramcmts block"><dt class="param">start</dt><dd class="cmt"><p>the start value.</p></dd><dt class="param">end</dt><dd class="cmt"><p>the end value.</p></dd><dt class="param">step</dt><dd class="cmt"><p>the incremental step</p></dd><dt class="param">numSlices</dt><dd class="cmt"><p>number of partitions to divide the collection into</p></dd><dt>returns</dt><dd class="cmt"><p>RDD representing distributed range</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>if we need to cache this RDD, we should make sure each partition does not exceed limit.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#register" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="register(acc:org.apache.spark.util.AccumulatorV2[_,_],name:String):Unit" class="anchorToMember"></a><a id="register(AccumulatorV2[_,_],String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#register(acc:org.apache.spark.util.AccumulatorV2[_,_],name:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">register</span><span class="params">(<span name="acc">acc: <a href="util/AccumulatorV2.html" name="org.apache.spark.util.AccumulatorV2" id="org.apache.spark.util.AccumulatorV2" class="extype">AccumulatorV2</a>[_, _]</span>, <span name="name">name: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Register the given accumulator with given name.</p><div class="fullcomment"><div class="comment cmt"><p>Register the given accumulator with given name. |
| </p></div><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Accumulators must be registered before use, or it will throw exception.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#register" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="register(acc:org.apache.spark.util.AccumulatorV2[_,_]):Unit" class="anchorToMember"></a><a id="register(AccumulatorV2[_,_]):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#register(acc:org.apache.spark.util.AccumulatorV2[_,_]):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">register</span><span class="params">(<span name="acc">acc: <a href="util/AccumulatorV2.html" name="org.apache.spark.util.AccumulatorV2" id="org.apache.spark.util.AccumulatorV2" class="extype">AccumulatorV2</a>[_, _]</span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Register the given accumulator.</p><div class="fullcomment"><div class="comment cmt"><p>Register the given accumulator. |
| </p></div><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Accumulators must be registered before use, or it will throw exception.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#removeJobTag" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="removeJobTag(tag:String):Unit" class="anchorToMember"></a><a id="removeJobTag(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#removeJobTag(tag:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">removeJobTag</span><span class="params">(<span name="tag">tag: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Remove a tag previously added to be assigned to all the jobs started by this thread.</p><div class="fullcomment"><div class="comment cmt"><p>Remove a tag previously added to be assigned to all the jobs started by this thread. |
| Noop if such a tag was not added earlier. |
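| </p><p>A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>; the tag is hypothetical):</p> |
| <pre> |
| sc.addJobTag("nightly-etl") |
| // ... jobs started by this thread now carry the tag ... |
| sc.removeJobTag("nightly-etl") // later jobs from this thread no longer carry it |
| </pre><p> |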
| </p></div><dl class="paramcmts block"><dt class="param">tag</dt><dd class="cmt"><p>The tag to be removed. Cannot contain ',' (comma) character.</p></dd></dl><dl class="attributes block"><dt>Since</dt><dd><p>3.5.0</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#removeSparkListener" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="removeSparkListener(listener:org.apache.spark.scheduler.SparkListenerInterface):Unit" class="anchorToMember"></a><a id="removeSparkListener(SparkListenerInterface):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#removeSparkListener(listener:org.apache.spark.scheduler.SparkListenerInterface):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">removeSparkListener</span><span class="params">(<span name="listener">listener: <span name="org.apache.spark.scheduler.SparkListenerInterface" class="extype">SparkListenerInterface</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Deregister the listener from Spark's listener bus.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Deregister the listener from Spark's listener bus. |
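| </p><p>A sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p> |
| <pre> |
| import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd} |
| val listener = new SparkListener { |
|   override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = |
|     println(s"job ${jobEnd.jobId} finished") |
| } |
| sc.addSparkListener(listener) |
| // ... later, deregister it ... |
| sc.removeSparkListener(listener) |
| </pre><p> |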
| </p></div><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#requestExecutors" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="requestExecutors(numAdditionalExecutors:Int):Boolean" class="anchorToMember"></a><a id="requestExecutors(Int):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#requestExecutors(numAdditionalExecutors:Int):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">requestExecutors</span><span class="params">(<span name="numAdditionalExecutors">numAdditionalExecutors: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Request an additional number of executors from the cluster manager.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Request an additional number of executors from the cluster manager.</p></div><dl class="paramcmts block"><dt>returns</dt><dd class="cmt"><p>whether the request is received.</p></dd></dl><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#requestTotalExecutors" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="requestTotalExecutors(numExecutors:Int,localityAwareTasks:Int,hostToLocalTaskCount:scala.collection.immutable.Map[String,Int]):Boolean" class="anchorToMember"></a><a id="requestTotalExecutors(Int,Int,Map[String,Int]):Boolean" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#requestTotalExecutors(numExecutors:Int,localityAwareTasks:Int,hostToLocalTaskCount:scala.collection.immutable.Map[String,Int]):Boolean" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">requestTotalExecutors</span><span class="params">(<span name="numExecutors">numExecutors: <span name="scala.Int" class="extype">Int</span></span>, <span name="localityAwareTasks">localityAwareTasks: <span name="scala.Int" class="extype">Int</span></span>, <span name="hostToLocalTaskCount">hostToLocalTaskCount: <span name="scala.collection.immutable.Map" class="extype">Map</span>[<span name="scala.Predef.String" class="extype">String</span>, <span name="scala.Int" class="extype">Int</span>]</span>)</span><span class="result">: <span name="scala.Boolean" class="extype">Boolean</span></span></span><p class="shortcomment cmt">Update the cluster manager on our scheduling needs.</p><div class="fullcomment"><div class="comment cmt"><p>Update the cluster manager on our scheduling needs. Three bits of information are included |
| to help it make decisions. This applies to the default ResourceProfile.</p></div><dl class="paramcmts block"><dt class="param">numExecutors</dt><dd class="cmt"><p>The total number of executors we'd like to have. The cluster manager |
| shouldn't kill any running executor to reach this number. However, |
| if all existing executors were to die, this is the number of executors |
| we'd want to be allocated.</p></dd><dt class="param">localityAwareTasks</dt><dd class="cmt"><p>The number of tasks in all active stages that have locality |
| preferences. This includes running, pending, and completed tasks.</p></dd><dt class="param">hostToLocalTaskCount</dt><dd class="cmt"><p>A map of hosts to the number of tasks from all active stages |
| that would like to run on that host. |
| This includes running, pending, and completed tasks.</p></dd><dt>returns</dt><dd class="cmt"><p>whether the request is acknowledged by the cluster manager.</p></dd></dl><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#resources" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="resources:scala.collection.Map[String,org.apache.spark.resource.ResourceInformation]" class="anchorToMember"></a><a id="resources:Map[String,ResourceInformation]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#resources:scala.collection.Map[String,org.apache.spark.resource.ResourceInformation]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">resources</span><span class="result">: <span name="scala.collection.Map" class="extype">Map</span>[<span name="scala.Predef.String" class="extype">String</span>, <a href="resource/ResourceInformation.html" name="org.apache.spark.resource.ResourceInformation" id="org.apache.spark.resource.ResourceInformation" class="extype">ResourceInformation</a>]</span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#runApproximateJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runApproximateJob[T,U,R](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U,evaluator:org.apache.spark.partial.ApproximateEvaluator[U,R],timeout:Long):org.apache.spark.partial.PartialResult[R]" class="anchorToMember"></a><a id="runApproximateJob[T,U,R](RDD[T],(TaskContext,Iterator[T])=>U,ApproximateEvaluator[U,R],Long):PartialResult[R]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runApproximateJob[T,U,R](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U,evaluator:org.apache.spark.partial.ApproximateEvaluator[U,R],timeout:Long):org.apache.spark.partial.PartialResult[R]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runApproximateJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>, <span name="R">R</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runApproximateJob.T" class="extype">T</span>]</span>, <span name="func">func: (<a href="TaskContext.html" name="org.apache.spark.TaskContext" id="org.apache.spark.TaskContext" class="extype">TaskContext</a>, <span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runApproximateJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runApproximateJob.U" class="extype">U</span></span>, <span name="evaluator">evaluator: <span name="org.apache.spark.partial.ApproximateEvaluator" class="extype">ApproximateEvaluator</span>[<span name="org.apache.spark.SparkContext.runApproximateJob.U" class="extype">U</span>, <span name="org.apache.spark.SparkContext.runApproximateJob.R" class="extype">R</span>]</span>, <span name="timeout">timeout: 
<span name="scala.Long" class="extype">Long</span></span>)</span><span class="result">: <a href="partial/PartialResult.html" name="org.apache.spark.partial.PartialResult" id="org.apache.spark.partial.PartialResult" class="extype">PartialResult</a>[<span name="org.apache.spark.SparkContext.runApproximateJob.R" class="extype">R</span>]</span></span><p class="shortcomment cmt">:: DeveloperApi :: |
| Run a job that can return approximate results.</p><div class="fullcomment"><div class="comment cmt"><p>:: DeveloperApi :: |
| Run a job that can return approximate results. |
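| </p><p>The evaluator machinery is low-level; as an indirect illustration, <code>RDD.countApprox</code> is a convenience built on this method (a sketch assuming an existing RDD named <code>rdd</code>):</p> |
| <pre> |
| val partial = rdd.countApprox(timeout = 1000L) |
| println(partial.initialValue) // estimate available within the timeout |
| </pre><p> |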
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">func</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">evaluator</dt><dd class="cmt"><p><code>ApproximateEvaluator</code> to receive the partial results</p></dd><dt class="param">timeout</dt><dd class="cmt"><p>maximum time to wait for the job, in milliseconds</p></dd><dt>returns</dt><dd class="cmt"><p>partial result (how partial depends on whether the job was finished before or |
| after timeout)</p></dd></dl><dl class="attributes block"><dt>Annotations</dt><dd><span class="name">@DeveloperApi</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],processPartition:Iterator[T]=>U,resultHandler:(Int,U)=>Unit)(implicitevidence$17:scala.reflect.ClassTag[U]):Unit" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(Iterator[T])=>U,(Int,U)=>Unit)(ClassTag[U]):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],processPartition:Iterator[T]=>U,resultHandler:(Int,U)=>Unit)(implicitevidence$17:scala.reflect.ClassTag[U]):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="processPartition">processPartition: (<span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>, <span name="resultHandler">resultHandler: (<span name="scala.Int" class="extype">Int</span>, <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>) => <span name="scala.Unit" class="extype">Unit</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Run a job on all partitions in an RDD and pass the results to a handler function.</p><div class="fullcomment"><div class="comment cmt"><p>Run a job on all partitions in an RDD and pass the results to a handler function. |
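| </p><p>A minimal sketch (assuming an active <code>SparkContext</code> named <code>sc</code>):</p> |
| <pre> |
| val rdd = sc.parallelize(1 to 100, 4) |
| sc.runJob(rdd, (it: Iterator[Int]) => it.sum, |
|   (partitionId: Int, sum: Int) => println(s"partition $partitionId: $sum")) |
| </pre><p> |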
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">processPartition</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">resultHandler</dt><dd class="cmt"><p>callback to pass each result to</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],processPartition:(org.apache.spark.TaskContext,Iterator[T])=>U,resultHandler:(Int,U)=>Unit)(implicitevidence$16:scala.reflect.ClassTag[U]):Unit" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(TaskContext,Iterator[T])=>U,(Int,U)=>Unit)(ClassTag[U]):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],processPartition:(org.apache.spark.TaskContext,Iterator[T])=>U,resultHandler:(Int,U)=>Unit)(implicitevidence$16:scala.reflect.ClassTag[U]):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="processPartition">processPartition: (<a href="TaskContext.html" name="org.apache.spark.TaskContext" id="org.apache.spark.TaskContext" class="extype">TaskContext</a>, <span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>, <span name="resultHandler">resultHandler: (<span name="scala.Int" class="extype">Int</span>, <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>) => <span name="scala.Unit" class="extype">Unit</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Run a job on all partitions in an RDD and pass the results to a handler function.</p><div class="fullcomment"><div class="comment cmt"><p>Run a job on all partitions in an RDD and pass the results to a handler function. The function |
| that is run against each partition additionally takes a <code>TaskContext</code> argument. |
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">processPartition</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">resultHandler</dt><dd class="cmt"><p>callback to pass each result to</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:Iterator[T]=>U)(implicitevidence$15:scala.reflect.ClassTag[U]):Array[U]" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(Iterator[T])=>U)(ClassTag[U]):Array[U]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:Iterator[T]=>U)(implicitevidence$15:scala.reflect.ClassTag[U]):Array[U]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="func">func: (<span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Array" class="extype">Array</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span></span><p class="shortcomment cmt">Run a job on all partitions in an RDD and return the results in an array.</p><div class="fullcomment"><div class="comment cmt"><p>Run a job on all partitions in an RDD and return the results in an array. |
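| For example (illustrative; assumes an existing <code>sc: SparkContext</code>):</p><pre>val rdd = sc.parallelize(1 to 100, 4) |
| val sums: Array[Int] = sc.runJob(rdd, (it: Iterator[Int]) => it.sum) // one sum per partition</pre><p> |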
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">func</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt>returns</dt><dd class="cmt"><p>in-memory collection with a result of the job (each collection element will contain |
| a result from one partition)</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U)(implicitevidence$14:scala.reflect.ClassTag[U]):Array[U]" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(TaskContext,Iterator[T])=>U)(ClassTag[U]):Array[U]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U)(implicitevidence$14:scala.reflect.ClassTag[U]):Array[U]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="func">func: (<a href="TaskContext.html" name="org.apache.spark.TaskContext" id="org.apache.spark.TaskContext" class="extype">TaskContext</a>, <span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Array" class="extype">Array</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span></span><p class="shortcomment cmt">Run a job on all partitions in an RDD and return the results in an array.</p><div class="fullcomment"><div class="comment cmt"><p>Run a job on all partitions in an RDD and return the results in an array. The function |
| that is run against each partition additionally takes a <code>TaskContext</code> argument. |
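| For example (illustrative; assumes <code>rdd: RDD[Int]</code>), the context exposes task metadata such as the partition id:</p><pre>// import org.apache.spark.TaskContext |
| val sizes = sc.runJob(rdd, (ctx: TaskContext, it: Iterator[Int]) => (ctx.partitionId(), it.size))</pre><p> |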
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">func</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt>returns</dt><dd class="cmt"><p>in-memory collection with a result of the job (each collection element will contain |
| a result from one partition)</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:Iterator[T]=>U,partitions:Seq[Int])(implicitevidence$13:scala.reflect.ClassTag[U]):Array[U]" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(Iterator[T])=>U,Seq[Int])(ClassTag[U]):Array[U]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:Iterator[T]=>U,partitions:Seq[Int])(implicitevidence$13:scala.reflect.ClassTag[U]):Array[U]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="func">func: (<span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>, <span name="partitions">partitions: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Int" class="extype">Int</span>]</span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Array" class="extype">Array</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span></span><p class="shortcomment cmt">Run a function on a given set of partitions in an RDD and return the results as an array.</p><div class="fullcomment"><div class="comment cmt"><p>Run a function on a given set of partitions in an RDD and return the results as an array. |
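| For example, computing only a subset of partitions (illustrative; assumes <code>rdd: RDD[Int]</code>):</p><pre>// touch only the first two partitions, much as first() does internally |
| val heads = sc.runJob(rdd, (it: Iterator[Int]) => it.take(3).toList, Seq(0, 1))</pre><p> |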
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">func</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">partitions</dt><dd class="cmt"><p>set of partitions to run on; some jobs may not want to compute on all |
| partitions of the target RDD, e.g. for operations like <code>first()</code></p></dd><dt>returns</dt><dd class="cmt"><p>in-memory collection with a result of the job (each collection element will contain |
| a result from one partition)</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U,partitions:Seq[Int])(implicitevidence$12:scala.reflect.ClassTag[U]):Array[U]" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(TaskContext,Iterator[T])=>U,Seq[Int])(ClassTag[U]):Array[U]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U,partitions:Seq[Int])(implicitevidence$12:scala.reflect.ClassTag[U]):Array[U]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="func">func: (<a href="TaskContext.html" name="org.apache.spark.TaskContext" id="org.apache.spark.TaskContext" class="extype">TaskContext</a>, <span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>, <span name="partitions">partitions: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Int" class="extype">Int</span>]</span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Array" class="extype">Array</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span></span><p class="shortcomment cmt">Run a function on a given set of partitions in an RDD and return the results as an array.</p><div class="fullcomment"><div class="comment cmt"><p>Run a function on a given set of partitions in an RDD and return the results as an array. |
| The function that is run against each partition additionally takes a <code>TaskContext</code> argument. |
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">func</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">partitions</dt><dd class="cmt"><p>set of partitions to run on; some jobs may not want to compute on all |
| partitions of the target RDD, e.g. for operations like <code>first()</code></p></dd><dt>returns</dt><dd class="cmt"><p>in-memory collection with a result of the job (each collection element will contain |
| a result from one partition)</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#runJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U,partitions:Seq[Int],resultHandler:(Int,U)=>Unit)(implicitevidence$11:scala.reflect.ClassTag[U]):Unit" class="anchorToMember"></a><a id="runJob[T,U](RDD[T],(TaskContext,Iterator[T])=>U,Seq[Int],(Int,U)=>Unit)(ClassTag[U]):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=>U,partitions:Seq[Int],resultHandler:(Int,U)=>Unit)(implicitevidence$11:scala.reflect.ClassTag[U]):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">runJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]</span>, <span name="func">func: (<a href="TaskContext.html" name="org.apache.spark.TaskContext" id="org.apache.spark.TaskContext" class="extype">TaskContext</a>, <span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.runJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span></span>, <span name="partitions">partitions: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Int" class="extype">Int</span>]</span>, <span name="resultHandler">resultHandler: (<span name="scala.Int" class="extype">Int</span>, <span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>) => <span name="scala.Unit" class="extype">Unit</span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.runJob.U" class="extype">U</span>]</span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Run a function on a given set of partitions in an RDD and pass the results to the given |
| handler function.</p><div class="fullcomment"><div class="comment cmt"><p>Run a function on a given set of partitions in an RDD and pass the results to the given |
| handler function. This is the main entry point for all actions in Spark. |
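| For example, a minimal end-to-end sketch (illustrative; assumes an existing <code>sc: SparkContext</code>):</p><pre>// import org.apache.spark.TaskContext |
| val rdd = sc.parallelize(1 to 100, 4) |
| sc.runJob(rdd, |
|   (ctx: TaskContext, it: Iterator[Int]) => it.sum, |
|   Seq(0, 2), // only partitions 0 and 2 |
|   (index: Int, sum: Int) => println(s"result $index: $sum"))</pre><p> |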
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">func</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">partitions</dt><dd class="cmt"><p>set of partitions to run on; some jobs may not want to compute on all |
| partitions of the target RDD, e.g. for operations like <code>first()</code></p></dd><dt class="param">resultHandler</dt><dd class="cmt"><p>callback to pass each result to</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#sequenceFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="sequenceFile[K,V](path:String,minPartitions:Int)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitkcf:()=>org.apache.spark.WritableConverter[K],implicitvcf:()=>org.apache.spark.WritableConverter[V]):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="sequenceFile[K,V](String,Int)(ClassTag[K],ClassTag[V],()=>WritableConverter[K],()=>WritableConverter[V]):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#sequenceFile[K,V](path:String,minPartitions:Int)(implicitkm:scala.reflect.ClassTag[K],implicitvm:scala.reflect.ClassTag[V],implicitkcf:()=>org.apache.spark.WritableConverter[K],implicitvcf:()=>org.apache.spark.WritableConverter[V]):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">sequenceFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="params">(<span class="implicit">implicit </span><span name="km">km: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>]</span>, <span name="vm">vm: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>]</span>, <span name="kcf">kcf: () => <span name="org.apache.spark.WritableConverter" class="extype">WritableConverter</span>[<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>]</span>, <span name="vcf">vcf: () => <span name="org.apache.spark.WritableConverter" class="extype">WritableConverter</span>[<span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Version of sequenceFile() for types implicitly convertible to Writables through a |
| WritableConverter.</p><div class="fullcomment"><div class="comment cmt"><p>Version of sequenceFile() for types implicitly convertible to Writables through a |
| WritableConverter. For example, to access a SequenceFile where the keys are Text and the |
| values are IntWritable, you could simply write</p><pre>sparkContext.sequenceFile[<span class="std">String</span>, <span class="std">Int</span>](path, ...)</pre><p>WritableConverters are provided in a somewhat strange way (by an implicit function) to support |
| both subclasses of Writable and types for which we define a converter (e.g. Int to |
| IntWritable). The most natural thing would've been to have implicit objects for the |
| converters, but then we couldn't have an object for every subclass of Writable (you can't |
| have a parameterized singleton object). We use functions instead to create a new converter |
| for the appropriate type. In addition, we pass the converter a ClassTag of its type to |
| allow it to figure out the Writable class to use in the subclass case. |
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory of the input data files; the path can be a comma-separated |
| list of paths</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>suggested minimum number of partitions for the resulting RDD</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#sequenceFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="sequenceFile[K,V](path:String,keyClass:Class[K],valueClass:Class[V]):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="sequenceFile[K,V](String,Class[K],Class[V]):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#sequenceFile[K,V](path:String,keyClass:Class[K],valueClass:Class[V]):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">sequenceFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="keyClass">keyClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>]</span>, <span name="valueClass">valueClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Get an RDD for a Hadoop SequenceFile with given key and value types.</p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a Hadoop SequenceFile with given key and value types. |
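| For example (the path is illustrative):</p><pre>import org.apache.hadoop.io.{IntWritable, Text} |
| val pairs = sc.sequenceFile("hdfs://nn/data/pairs", classOf[Text], classOf[IntWritable]) |
| // copy values out of Hadoop's reused Writable objects before caching (see the note below) |
| val safe = pairs.map { case (k, v) => (k.toString, v.get) }</pre><p> |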
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory of the input data files; the path can be a comma-separated |
| list of paths</p></dd><dt class="param">keyClass</dt><dd class="cmt"><p><code>Class</code> of the key associated with <code>SequenceFileInputFormat</code></p></dd><dt class="param">valueClass</dt><dd class="cmt"><p><code>Class</code> of the value associated with <code>SequenceFileInputFormat</code></p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#sequenceFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="sequenceFile[K,V](path:String,keyClass:Class[K],valueClass:Class[V],minPartitions:Int):org.apache.spark.rdd.RDD[(K,V)]" class="anchorToMember"></a><a id="sequenceFile[K,V](String,Class[K],Class[V],Int):RDD[(K,V)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#sequenceFile[K,V](path:String,keyClass:Class[K],valueClass:Class[V],minPartitions:Int):org.apache.spark.rdd.RDD[(K,V)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">sequenceFile</span><span class="tparams">[<span name="K">K</span>, <span name="V">V</span>]</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="keyClass">keyClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>]</span>, <span name="valueClass">valueClass: <span name="scala.Predef.Class" class="extype">Class</span>[<span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>]</span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="org.apache.spark.SparkContext.sequenceFile.K" class="extype">K</span>, <span name="org.apache.spark.SparkContext.sequenceFile.V" class="extype">V</span>)]</span></span><p class="shortcomment cmt">Get an RDD for a Hadoop SequenceFile with given key and value types.</p><div class="fullcomment"><div class="comment cmt"><p>Get an RDD for a Hadoop SequenceFile with given key and value types. |
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>directory of the input data files; the path can be a comma-separated |
| list of paths</p></dd><dt class="param">keyClass</dt><dd class="cmt"><p><code>Class</code> of the key associated with <code>SequenceFileInputFormat</code></p></dd><dt class="param">valueClass</dt><dd class="cmt"><p><code>Class</code> of the value associated with <code>SequenceFileInputFormat</code></p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>suggested minimum number of partitions for the resulting RDD</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of tuples of key and corresponding value</p></dd></dl><dl class="attributes block"><dt>Note</dt><dd><span class="cmt"><p>Because Hadoop's RecordReader class re-uses the same Writable object for each |
| record, directly caching the returned RDD or directly passing it to an aggregation or shuffle |
| operation will create many references to the same object. |
| If you plan to directly cache, sort, or aggregate Hadoop writable objects, you should first |
| copy them using a <code>map</code> function.</p></span></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#setCallSite" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="setCallSite(shortCallSite:String):Unit" class="anchorToMember"></a><a id="setCallSite(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setCallSite(shortCallSite:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setCallSite</span><span class="params">(<span name="shortCallSite">shortCallSite: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Set the thread-local property for overriding the call sites |
| of actions and RDDs.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#setCheckpointDir" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="setCheckpointDir(directory:String):Unit" class="anchorToMember"></a><a id="setCheckpointDir(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setCheckpointDir(directory:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setCheckpointDir</span><span class="params">(<span name="directory">directory: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Set the directory under which RDDs are going to be checkpointed.</p><div class="fullcomment"><div class="comment cmt"><p>Set the directory under which RDDs are going to be checkpointed.</p></div><dl class="paramcmts block"><dt class="param">directory</dt><dd class="cmt"><p>path to the directory where checkpoint files will be stored |
| (must be an HDFS path if running on a cluster)</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#setInterruptOnCancel" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="setInterruptOnCancel(interruptOnCancel:Boolean):Unit" class="anchorToMember"></a><a id="setInterruptOnCancel(Boolean):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setInterruptOnCancel(interruptOnCancel:Boolean):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setInterruptOnCancel</span><span class="params">(<span name="interruptOnCancel">interruptOnCancel: <span name="scala.Boolean" class="extype">Boolean</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Set the behavior of job cancellation from jobs started in this thread.</p><div class="fullcomment"><div class="comment cmt"><p>Set the behavior of job cancellation from jobs started in this thread. |
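| For example (illustrative):</p><pre>sc.setInterruptOnCancel(true) // later cancellations may interrupt this thread's running tasks</pre><p> |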
| </p></div><dl class="paramcmts block"><dt class="param">interruptOnCancel</dt><dd class="cmt"><p>If true, then job cancellation will result in <code>Thread.interrupt()</code> |
| being called on the job's executor threads. This is useful to help ensure that the tasks |
| are actually stopped in a timely manner, but is off by default due to HDFS-1208, where HDFS |
| may respond to Thread.interrupt() by marking nodes as dead.</p></dd></dl><dl class="attributes block"><dt>Since</dt><dd><p>3.5.0</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#setJobDescription" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="setJobDescription(value:String):Unit" class="anchorToMember"></a><a id="setJobDescription(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setJobDescription(value:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setJobDescription</span><span class="params">(<span name="value">value: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Set a human readable description of the current job.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#setJobGroup" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="setJobGroup(groupId:String,description:String,interruptOnCancel:Boolean):Unit" class="anchorToMember"></a><a id="setJobGroup(String,String,Boolean):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setJobGroup(groupId:String,description:String,interruptOnCancel:Boolean):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setJobGroup</span><span class="params">(<span name="groupId">groupId: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="description">description: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="interruptOnCancel">interruptOnCancel: <span name="scala.Boolean" class="extype">Boolean</span> = <span class="symbol">false</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Assigns a group ID to all the jobs started by this thread until the group ID is set to a |
| different value or cleared.</p><div class="fullcomment"><div class="comment cmt"><p>Assigns a group ID to all the jobs started by this thread until the group ID is set to a |
| different value or cleared.</p><p>Often, a unit of execution in an application consists of multiple Spark actions or jobs. |
| Application programmers can use this method to group all those jobs together and give a |
| group description. Once set, the Spark web UI will associate such jobs with this group.</p><p>The application can also use <code>org.apache.spark.SparkContext.cancelJobGroup</code> to cancel all |
| running jobs in this group. For example,</p><pre><span class="cmt">// In the main thread:</span> |
| sc.setJobGroup(<span class="lit">"some_job_to_cancel"</span>, <span class="lit">"some job description"</span>) |
| sc.parallelize(<span class="num">1</span> to <span class="num">10000</span>, <span class="num">2</span>).map { i <span class="kw">=></span> Thread.sleep(<span class="num">10</span>); i }.count() |
| |
| <span class="cmt">// In a separate thread:</span> |
| sc.cancelJobGroup(<span class="lit">"some_job_to_cancel"</span>)</pre></div><dl class="paramcmts block"><dt class="param">interruptOnCancel</dt><dd class="cmt"><p>If true, then job cancellation will result in <code>Thread.interrupt()</code> |
| being called on the job's executor threads. This is useful to help ensure that the tasks |
| are actually stopped in a timely manner, but is off by default due to HDFS-1208, where HDFS |
| may respond to Thread.interrupt() by marking nodes as dead.</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#setLocalProperty" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="setLocalProperty(key:String,value:String):Unit" class="anchorToMember"></a><a id="setLocalProperty(String,String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setLocalProperty(key:String,value:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setLocalProperty</span><span class="params">(<span name="key">key: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="value">value: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Set a local property that affects jobs submitted from this thread, such as the Spark fair |
| scheduler pool.</p><div class="fullcomment"><div class="comment cmt"><p>Set a local property that affects jobs submitted from this thread, such as the Spark fair |
| scheduler pool. User-defined properties may also be set here. These properties are propagated |
| through to worker tasks and can be accessed there via |
| <a href="TaskContext.html#getLocalProperty(key:String):String" name="org.apache.spark.TaskContext#getLocalProperty" id="org.apache.spark.TaskContext#getLocalProperty" class="extmbr">org.apache.spark.TaskContext#getLocalProperty</a>.</p><p>These properties are inherited by child threads spawned from this thread. This |
| may have unexpected consequences when working with thread pools. The standard Java |
| thread-pool implementations have worker threads spawn other worker threads. |
| As a result, local properties may propagate unpredictably.</p><p>To remove/unset a property, simply set <code>value</code> to null, e.g. <code>sc.setLocalProperty("key", null)</code>. |
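| A minimal sketch (the pool name is illustrative):</p><pre>sc.setLocalProperty("spark.scheduler.pool", "production") |
| sc.parallelize(1 to 4, 2).foreachPartition { _ => |
|   // the property is visible inside tasks on the executors |
|   val pool = org.apache.spark.TaskContext.get().getLocalProperty("spark.scheduler.pool") |
|   println(pool) |
| }</pre><p> |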
| </p></div></div></li><li class="indented0 " name="org.apache.spark.SparkContext#setLogLevel" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="setLogLevel(logLevel:String):Unit" class="anchorToMember"></a><a id="setLogLevel(String):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#setLogLevel(logLevel:String):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">setLogLevel</span><span class="params">(<span name="logLevel">logLevel: <span name="scala.Predef.String" class="extype">String</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Control our logLevel.</p><div class="fullcomment"><div class="comment cmt"><p>Control our logLevel. This overrides any user-defined log settings.</p></div><dl class="paramcmts block"><dt class="param">logLevel</dt><dd class="cmt"><p>The desired log level as a string. |
| Valid log levels include: ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#sparkUser" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="sparkUser:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#sparkUser:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">val</span></span> <span class="symbol"><span class="name">sparkUser</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#startTime" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="startTime:Long" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#startTime:Long" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">val</span></span> <span class="symbol"><span class="name">startTime</span><span class="result">: <span name="scala.Long" class="extype">Long</span></span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#statusTracker" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="statusTracker:org.apache.spark.SparkStatusTracker" class="anchorToMember"></a><a id="statusTracker:SparkStatusTracker" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#statusTracker:org.apache.spark.SparkStatusTracker" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">statusTracker</span><span class="result">: <a href="SparkStatusTracker.html" name="org.apache.spark.SparkStatusTracker" id="org.apache.spark.SparkStatusTracker" class="extype">SparkStatusTracker</a></span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#stop" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="stop(exitCode:Int):Unit" class="anchorToMember"></a><a id="stop(Int):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#stop(exitCode:Int):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">stop</span><span class="params">(<span name="exitCode">exitCode: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Shut down the SparkContext with an exit code that will be passed to the scheduler backend.</p><div class="fullcomment"><div class="comment cmt"><p>Shut down the SparkContext with an exit code that will be passed to the scheduler backend. |
| In client mode, the client side may call <code>SparkContext.stop()</code> to clean up but then exit with |
| a non-zero code. This behavior causes a resource scheduler such as <code>ApplicationMaster</code> |
| to exit with a success status even though the client side exited with a failed status. Spark can call |
| this method to stop the SparkContext and pass the client side's correct exit code to the scheduler backend. |
| The scheduler backend should then send the exit code on to the corresponding resource scheduler |
| to keep them consistent. |
| </p></div><dl class="paramcmts block"><dt class="param">exitCode</dt><dd class="cmt"><p>The exit code that will be passed to the scheduler backend in client mode.</p></dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#stop" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="stop():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#stop():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">stop</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><p class="shortcomment cmt">Shut down the SparkContext.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#submitJob" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="submitJob[T,U,R](rdd:org.apache.spark.rdd.RDD[T],processPartition:Iterator[T]=>U,partitions:Seq[Int],resultHandler:(Int,U)=>Unit,resultFunc:=>R):org.apache.spark.SimpleFutureAction[R]" class="anchorToMember"></a><a id="submitJob[T,U,R](RDD[T],(Iterator[T])=>U,Seq[Int],(Int,U)=>Unit,=>R):SimpleFutureAction[R]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#submitJob[T,U,R](rdd:org.apache.spark.rdd.RDD[T],processPartition:Iterator[T]=>U,partitions:Seq[Int],resultHandler:(Int,U)=>Unit,resultFunc:=>R):org.apache.spark.SimpleFutureAction[R]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">submitJob</span><span class="tparams">[<span name="T">T</span>, <span name="U">U</span>, <span name="R">R</span>]</span><span class="params">(<span name="rdd">rdd: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.submitJob.T" class="extype">T</span>]</span>, <span name="processPartition">processPartition: (<span name="scala.Iterator" class="extype">Iterator</span>[<span name="org.apache.spark.SparkContext.submitJob.T" class="extype">T</span>]) => <span name="org.apache.spark.SparkContext.submitJob.U" class="extype">U</span></span>, <span name="partitions">partitions: <span name="scala.Seq" class="extype">Seq</span>[<span name="scala.Int" class="extype">Int</span>]</span>, <span name="resultHandler">resultHandler: (<span name="scala.Int" class="extype">Int</span>, <span name="org.apache.spark.SparkContext.submitJob.U" class="extype">U</span>) => <span name="scala.Unit" class="extype">Unit</span></span>, <span name="resultFunc">resultFunc: => <span name="org.apache.spark.SparkContext.submitJob.R" class="extype">R</span></span>)</span><span class="result">: <a href="SimpleFutureAction.html" name="org.apache.spark.SimpleFutureAction" id="org.apache.spark.SimpleFutureAction" class="extype">SimpleFutureAction</a>[<span name="org.apache.spark.SparkContext.submitJob.R" class="extype">R</span>]</span></span><p class="shortcomment cmt">Submit a job for execution and return a FutureJob holding the result.</p><div class="fullcomment"><div class="comment cmt"><p>Submit a job for execution and return a FutureJob holding the result. |
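| A minimal sketch (illustrative; assumes an existing <code>sc: SparkContext</code>):</p><pre>import scala.concurrent.ExecutionContext.Implicits.global |
| val rdd = sc.parallelize(1 to 100, 4) |
| val future = sc.submitJob(rdd, (it: Iterator[Int]) => it.sum, |
|   Seq(0, 1, 2, 3), (_: Int, _: Int) => (), ()) // returns without blocking |
| future.onComplete(result => println(result))</pre><p> |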
| </p></div><dl class="paramcmts block"><dt class="param">rdd</dt><dd class="cmt"><p>target RDD to run tasks on</p></dd><dt class="param">processPartition</dt><dd class="cmt"><p>a function to run on each partition of the RDD</p></dd><dt class="param">partitions</dt><dd class="cmt"><p>set of partitions to run on; some jobs may not want to compute on all |
| partitions of the target RDD, e.g. for operations like <code>first()</code></p></dd><dt class="param">resultHandler</dt><dd class="cmt"><p>callback to pass each result to</p></dd><dt class="param">resultFunc</dt><dd class="cmt"><p>function to be executed when the result is ready</p></dd></dl></div></li><li class="indented0 " name="scala.AnyRef#synchronized" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="synchronized[T0](x$1:=>T0):T0" class="anchorToMember"></a><a id="synchronized[T0](=>T0):T0" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#synchronized[T0](x$1:=>T0):T0" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">synchronized</span><span class="tparams">[<span name="T0">T0</span>]</span><span class="params">(<span name="arg0">arg0: => <span name="java.lang.AnyRef.synchronized.T0" class="extype">T0</span></span>)</span><span class="result">: <span name="java.lang.AnyRef.synchronized.T0" class="extype">T0</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#textFile" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="textFile(path:String,minPartitions:Int):org.apache.spark.rdd.RDD[String]" class="anchorToMember"></a><a id="textFile(String,Int):RDD[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#textFile(path:String,minPartitions:Int):org.apache.spark.rdd.RDD[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">textFile</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="scala.Predef.String" class="extype">String</span>]</span></span><p class="shortcomment cmt">Read a text file from HDFS, a local file system (available on all nodes), or any |
| Hadoop-supported file system URI, and return it as an RDD of Strings.</p><div class="fullcomment"><div class="comment cmt"><p>Read a text file from HDFS, a local file system (available on all nodes), or any |
| Hadoop-supported file system URI, and return it as an RDD of Strings. |
| The text files must be encoded as UTF-8. |
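| For example (the path is illustrative):</p><pre>val lines = sc.textFile("hdfs://nn/apps/logs/app.log", minPartitions = 8) |
| println(lines.count())</pre><p> |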
| </p></div><dl class="paramcmts block"><dt class="param">path</dt><dd class="cmt"><p>path to the text file on a supported file system</p></dd><dt class="param">minPartitions</dt><dd class="cmt"><p>suggested minimum number of partitions for the resulting RDD</p></dd><dt>returns</dt><dd class="cmt"><p>RDD of lines of the text file</p></dd></dl></div></li><li class="indented0 " name="scala.AnyRef#toString" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="toString():String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#toString():String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">toString</span><span class="params">()</span><span class="result">: <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html#java.lang.String" name="java.lang.String" id="java.lang.String" class="extype">String</a></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef → Any</dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#uiWebUrl" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="uiWebUrl:Option[String]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#uiWebUrl:Option[String]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">uiWebUrl</span><span class="result">: <span name="scala.Option" class="extype">Option</span>[<span name="scala.Predef.String" class="extype">String</span>]</span></span></li><li class="indented0 " name="org.apache.spark.SparkContext#union" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="union[T](first:org.apache.spark.rdd.RDD[T],rest:org.apache.spark.rdd.RDD[T]*)(implicitevidence$7:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="union[T](RDD[T],RDD[T]*)(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#union[T](first:org.apache.spark.rdd.RDD[T],rest:org.apache.spark.rdd.RDD[T]*)(implicitevidence$7:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">union</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="first">first: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]</span>, <span name="rest">rest: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]*</span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" 
id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Build the union of a list of RDDs passed as variable-length arguments.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#union" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="union[T](rdds:Seq[org.apache.spark.rdd.RDD[T]])(implicitevidence$6:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" class="anchorToMember"></a><a id="union[T](Seq[RDD[T]])(ClassTag[T]):RDD[T]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#union[T](rdds:Seq[org.apache.spark.rdd.RDD[T]])(implicitevidence$6:scala.reflect.ClassTag[T]):org.apache.spark.rdd.RDD[T]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">union</span><span class="tparams">[<span name="T">T</span>]</span><span class="params">(<span name="rdds">rdds: <span name="scala.Seq" class="extype">Seq</span>[<a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]]</span>)</span><span class="params">(<span class="implicit">implicit </span><span name="arg0">arg0: <span name="scala.reflect.ClassTag" class="extype">ClassTag</span>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]</span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[<span name="org.apache.spark.SparkContext.union.T" class="extype">T</span>]</span></span><p class="shortcomment cmt">Build the union of a list of RDDs.</p></li><li class="indented0 " name="org.apache.spark.SparkContext#version" group="Ungrouped" fullComment="no" data-isabs="false" visbl="pub"><a id="version:String" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#version:String" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">version</span><span class="result">: <span name="scala.Predef.String" class="extype">String</span></span></span><p class="shortcomment cmt">The version of Spark on which this application is running.</p></li><li class="indented0 " name="scala.AnyRef#wait" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="wait(x$1:Long,x$2:Int):Unit" class="anchorToMember"></a><a id="wait(Long,Int):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#wait(x$1:Long,x$2:Int):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">wait</span><span class="params">(<span name="arg0">arg0: <span name="scala.Long" class="extype">Long</span></span>, <span name="arg1">arg1: <span name="scala.Int" class="extype">Int</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd><dt>Annotations</dt><dd><span 
class="name">@throws</span><span class="args">(<span><span class="defval">classOf[java.lang.InterruptedException]</span></span>)</span> </dd></dl></div></li><li class="indented0 " name="scala.AnyRef#wait" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="wait(x$1:Long):Unit" class="anchorToMember"></a><a id="wait(Long):Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#wait(x$1:Long):Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">wait</span><span class="params">(<span name="arg0">arg0: <span name="scala.Long" class="extype">Long</span></span>)</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd><dt>Annotations</dt><dd><span class="name">@throws</span><span class="args">(<span><span class="defval">classOf[java.lang.InterruptedException]</span></span>)</span> <span class="name">@native</span><span class="args">()</span> </dd></dl></div></li><li class="indented0 " name="scala.AnyRef#wait" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="wait():Unit" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#wait():Unit" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier">final </span> <span class="kind">def</span></span> <span class="symbol"><span class="name">wait</span><span class="params">()</span><span class="result">: <span name="scala.Unit" class="extype">Unit</span></span></span><div class="fullcomment"><dl class="attributes block"><dt>Definition Classes</dt><dd>AnyRef</dd><dt>Annotations</dt><dd><span class="name">@throws</span><span class="args">(<span><span class="defval">classOf[java.lang.InterruptedException]</span></span>)</span> </dd></dl></div></li><li class="indented0 " name="org.apache.spark.SparkContext#wholeTextFiles" group="Ungrouped" fullComment="yes" data-isabs="false" visbl="pub"><a id="wholeTextFiles(path:String,minPartitions:Int):org.apache.spark.rdd.RDD[(String,String)]" class="anchorToMember"></a><a id="wholeTextFiles(String,Int):RDD[(String,String)]" class="anchorToMember"></a> <span class="permalink"><a href="../../../org/apache/spark/SparkContext.html#wholeTextFiles(path:String,minPartitions:Int):org.apache.spark.rdd.RDD[(String,String)]" title="Permalink"><i class="material-icons"></i></a></span> <span class="modifier_kind"><span class="modifier"></span> <span class="kind">def</span></span> <span class="symbol"><span class="name">wholeTextFiles</span><span class="params">(<span name="path">path: <span name="scala.Predef.String" class="extype">String</span></span>, <span name="minPartitions">minPartitions: <span name="scala.Int" class="extype">Int</span> = <span class="symbol"><span class="name"><a href="#defaultMinPartitions:Int">defaultMinPartitions</a></span></span></span>)</span><span class="result">: <a href="rdd/RDD.html" name="org.apache.spark.rdd.RDD" id="org.apache.spark.rdd.RDD" class="extype">RDD</a>[(<span name="scala.Predef.String" class="extype">String</span>, <span name="scala.Predef.String" class="extype">String</span>)]</span></span><p class="shortcomment cmt">Read a directory of text files from HDFS, a local file system (available on all 
nodes), or any |
  Each file is read as a single record and returned as a key-value pair, where the key is the
  path of the file and the value is its content. The text files must be encoded as UTF-8.

  For example, if you have the following files:

    hdfs://a-hdfs-path/part-00000
    hdfs://a-hdfs-path/part-00001
    ...
    hdfs://a-hdfs-path/part-nnnnn

  then after val rdd = sparkContext.wholeTextFiles("hdfs://a-hdfs-path"), rdd contains:

    (a-hdfs-path/part-00000, its content)
    (a-hdfs-path/part-00001, its content)
    ...
    (a-hdfs-path/part-nnnnn, its content)
  path           Directory of the input data files; the path can be a comma-separated list
                 of input paths.
  minPartitions  A suggested minimum number of partitions for the input data.
  returns        RDD representing tuples of file path and the corresponding file content.

  Note:
    - Small files are preferred; large files are also allowed, but may cause poor performance.
    - On some filesystems, .../path/* can be a more efficient way to read all files in a
      directory than .../path/ or .../path.
    - Partitioning is determined by data locality, which may result in too few partitions
      by default.
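A short usage sketch of wholeTextFiles; the directory /tmp/docs and its contents are hypothetical, and the partition count is only a hint, as noted above.

    import org.apache.spark.{SparkConf, SparkContext}

    object WholeTextFilesExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("whole-text-files-example").setMaster("local[2]"))

        // "/tmp/docs" is a hypothetical directory of small UTF-8 text files.
        // minPartitions is a suggestion; actual partitioning follows data locality.
        val files = sc.wholeTextFiles("/tmp/docs", minPartitions = 4)

        // Each record is (file path, entire file content).
        files.collect().foreach { case (path, content) =>
          println(s"$path -> ${content.length} chars")
        }

        sc.stop()
      }
    }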
def withLogContext(context: java.util.HashMap[String, String])(body: => Unit): Unit
  Attributes: protected
  Definition Classes: Logging

Deprecated Value Members

def finalize(): Unit
  Attributes: protected[lang]
  Definition Classes: AnyRef
  Annotations: @throws(classOf[java.lang.Throwable]) @Deprecated
  Deprecated: (Since version 9)

Inherited from Logging
Inherited from AnyRef
Inherited from Any