# Apache Spark
> Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for incremental computation and stream processing.
Documentation home: https://spark.apache.org/docs/latest/
## Programming Guides
- [Quick Start](https://spark.apache.org/docs/latest/quick-start.html)
- [Spark SQL](https://spark.apache.org/docs/latest/sql-programming-guide.html)
- [PySpark](https://spark.apache.org/docs/latest/api/python/getting_started/index.html)
- [RDD Programming Guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html)
- [Structured Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html)
- [MLlib](https://spark.apache.org/docs/latest/ml-guide.html)
- [GraphX](https://spark.apache.org/docs/latest/graphx-programming-guide.html)
- [SparkR](https://spark.apache.org/docs/latest/sparkr.html)
- [Spark SQL CLI](https://spark.apache.org/docs/latest/sql-distributed-sql-engine-spark-sql-cli.html)
## API Docs
- [Spark Python API](https://spark.apache.org/docs/latest/api/python/index.html)
- [Spark Scala API](https://spark.apache.org/docs/latest/api/scala/org/apache/spark/index.html)
- [Spark Java API](https://spark.apache.org/docs/latest/api/java/index.html)
- [Spark R API](https://spark.apache.org/docs/latest/api/R/index.html)
- [Spark SQL Built-in Functions](https://spark.apache.org/docs/latest/api/sql/index.html)
## Deployment Guides
- [Cluster Overview](https://spark.apache.org/docs/latest/cluster-overview.html)
- [Submitting Applications](https://spark.apache.org/docs/latest/submitting-applications.html)
- [Standalone Deploy Mode](https://spark.apache.org/docs/latest/spark-standalone.html)
- [YARN](https://spark.apache.org/docs/latest/running-on-yarn.html)
- [Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html)
## Other Documents
- [Configuration](https://spark.apache.org/docs/latest/configuration.html)
- [Monitoring](https://spark.apache.org/docs/latest/monitoring.html)
- [Web UI](https://spark.apache.org/docs/latest/web-ui.html)
- [Tuning Guide](https://spark.apache.org/docs/latest/tuning.html)
- [Job Scheduling](https://spark.apache.org/docs/latest/job-scheduling.html)
- [Security](https://spark.apache.org/docs/latest/security.html)
- [Hardware Provisioning](https://spark.apache.org/docs/latest/hardware-provisioning.html)
- [Cloud Infrastructures](https://spark.apache.org/docs/latest/cloud-integration.html)
- [Migration Guide](https://spark.apache.org/docs/latest/migration-guide.html)
## External Resources
- [Apache Spark Home](https://spark.apache.org/)
- [Downloads](https://spark.apache.org/downloads.html)
- [GitHub Repository](https://github.com/apache/spark)
- [Issue Tracker (JIRA)](https://issues.apache.org/jira/projects/SPARK)
- [Mailing Lists](https://spark.apache.org/mailing-lists.html)
- [Community](https://spark.apache.org/community.html)
- [Contributing](https://spark.apache.org/contributing.html)