 # Scaling Hamilton on Spark
 ## PySpark

 If you're using PySpark, Hamilton lets you manipulate PySpark dataframes directly,
 and provides some special constructs for managing DAGs of UDFs.

 See the example in `pyspark` to learn more.

 ## Pandas
 If you're using pandas, Hamilton scales by running on Koalas on Spark.
 Koalas officially became part of Spark in Spark 3.2, where it was renamed the pandas API on Spark.
 The example in `pandas_on_spark` assumes Spark 3.2 or later.

 ## PySpark UDFs
 If you're not using pandas, you can still use Hamilton to manage and organize your PySpark UDFs.
 See the example in `pyspark_udfs`.
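As a rough illustration of what "managing UDFs" can look like, here is a minimal sketch, not the code from `pyspark_udfs`. It assumes that each UDF is a plain, type-annotated Python function whose parameter names refer to dataframe columns or to other functions. The column names `spend` and `signups`, the helper `append_columns`, and the import path `hamilton.plugins.h_spark` are assumptions made for illustration (older releases may expose the adapter under `hamilton.experimental`); check the example directory for the wiring that matches your Hamilton version.

```python
"""Hamilton-style UDF definitions: one plain function per output column."""


def spend_per_signup(spend: float, signups: float) -> float:
    """Cost of one signup. Parameter names refer to input columns."""
    return spend / signups


def spend_per_signup_rounded(spend_per_signup: float) -> float:
    """Depends on the function above, so the two form a small DAG."""
    return round(spend_per_signup, 2)


def append_columns(df, udf_module):
    """Hypothetical helper: append the UDF outputs to a Spark dataframe.

    Assumes `udf_module` is a module that contains only UDF functions such as
    the two above (keep this helper in a separate file), and that `df` has
    `spend` and `signups` columns.
    """
    # Local imports keep the functions above usable without Spark installed.
    from hamilton import driver
    from hamilton.plugins import h_spark  # assumption: path varies by release

    adapter = h_spark.PySparkUDFGraphAdapter()
    dr = driver.Driver({}, udf_module, adapter=adapter)
    # Assumed convention: every input column name maps to the dataframe itself.
    inputs = {col: df for col in df.columns}
    return dr.execute(
        ["spend_per_signup", "spend_per_signup_rounded"], inputs=inputs
    )
```

Because the UDFs are ordinary functions, they can be unit tested without a Spark session, for example `spend_per_signup(10.0, 4.0)`.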

 Note: we're looking to expand coverage and support for more Spark use cases. Please come find us, or open an issue,
 if you have a use case that you'd like to see supported!
