| # Apache Spark |
| |
| > Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for incremental computation and stream processing. |
| |
| Documentation home: https://spark.apache.org/docs/latest/ |
| |
| ## Programming Guides |
| |
| - [Quick Start](https://spark.apache.org/docs/latest/quick-start.html) |
| - [Spark SQL](https://spark.apache.org/docs/latest/sql-programming-guide.html) |
| - [PySpark](https://spark.apache.org/docs/latest/api/python/getting_started/index.html) |
| - [RDD Programming Guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html) |
| - [Structured Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html) |
| - [MLlib](https://spark.apache.org/docs/latest/ml-guide.html) |
| - [GraphX](https://spark.apache.org/docs/latest/graphx-programming-guide.html) |
| - [SparkR](https://spark.apache.org/docs/latest/sparkr.html) |
| - [Spark SQL CLI](https://spark.apache.org/docs/latest/sql-distributed-sql-engine-spark-sql-cli.html) |
| |
| ## API Docs |
| |
| - [Spark Python API](https://spark.apache.org/docs/latest/api/python/index.html) |
| - [Spark Scala API](https://spark.apache.org/docs/latest/api/scala/org/apache/spark/index.html) |
| - [Spark Java API](https://spark.apache.org/docs/latest/api/java/index.html) |
| - [Spark R API](https://spark.apache.org/docs/latest/api/R/index.html) |
| - [Spark SQL Built-in Functions](https://spark.apache.org/docs/latest/api/sql/index.html) |
| |
| ## Deployment Guides |
| |
| - [Cluster Overview](https://spark.apache.org/docs/latest/cluster-overview.html) |
| - [Submitting Applications](https://spark.apache.org/docs/latest/submitting-applications.html) |
| - [Standalone Deploy Mode](https://spark.apache.org/docs/latest/spark-standalone.html) |
| - [YARN](https://spark.apache.org/docs/latest/running-on-yarn.html) |
| - [Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html) |
| |
| ## Other Documents |
| |
| - [Configuration](https://spark.apache.org/docs/latest/configuration.html) |
| - [Monitoring](https://spark.apache.org/docs/latest/monitoring.html) |
| - [Web UI](https://spark.apache.org/docs/latest/web-ui.html) |
| - [Tuning Guide](https://spark.apache.org/docs/latest/tuning.html) |
| - [Job Scheduling](https://spark.apache.org/docs/latest/job-scheduling.html) |
| - [Security](https://spark.apache.org/docs/latest/security.html) |
| - [Hardware Provisioning](https://spark.apache.org/docs/latest/hardware-provisioning.html) |
| - [Cloud Infrastructures](https://spark.apache.org/docs/latest/cloud-integration.html) |
| - [Migration Guide](https://spark.apache.org/docs/latest/migration-guide.html) |
| |
| ## External Resources |
| |
| - [Apache Spark Home](https://spark.apache.org/) |
| - [Downloads](https://spark.apache.org/downloads.html) |
| - [GitHub Repository](https://github.com/apache/spark) |
| - [Issue Tracker (JIRA)](https://issues.apache.org/jira/projects/SPARK) |
| - [Mailing Lists](https://spark.apache.org/mailing-lists.html) |
| - [Community](https://spark.apache.org/community.html) |
| - [Contributing](https://spark.apache.org/contributing.html) |