learning/prompts/documentation-lookup-nolinks/06_basic_schema.md - beam - Git at Google

 Prompt:
 What are schemas in Apache Beam?

 Response:
 A Schema in Apache Beam is a language-independent type definition for a `PCollection`. The schema defines elements of that `PCollection` as an ordered list of named fields.

 In many cases, the element type in a `PCollection` has a structure that can be introspected. Some examples are JSON, Protocol Buffer, Avro, and database row objects. All of these formats can be converted to Beam Schemas.

 To take advantage of schemas, your `PCollection` objects must have a schema attached. Often, the source itself attaches a schema to the `PCollection`.

 Although schemas are language-independent, they're designed to embed naturally into the Beam SDK programming languages. This feature allows you to continue using native types while leveraging the advantage of having Beam understand your element schemas.

 Apache Beam provides a collection of schema transforms that operate on schemas. For an example, see `SqlTransform` in the 'Beam SQL overview' section of the Apache Beam documentation.

 The following list includes key capabilities provided by schema transforms:
 * field selection
 * grouping and aggregation
 * join operations
 * filtering data
 * adding and removing fields
 * renaming fields
 * type conversion
 * enhanced ParDo functionality

 For more information about schemas and the Apache Beam best practices when working with structured data, see the 'Schema Usage Patterns' section in the Apache Beam documentation.
	Prompt:
	What are schemas in Apache Beam?

	Response:
	A Schema in Apache Beam is a language-independent type definition for a `PCollection`. The schema defines elements of that `PCollection` as an ordered list of named fields.

	In many cases, the element type in a `PCollection` has a structure that can be introspected. Some examples are JSON, Protocol Buffer, Avro, and database row objects. All of these formats can be converted to Beam Schemas.

	To take advantage of schemas, your `PCollection` objects must have a schema attached. Often, the source itself attaches a schema to the `PCollection`.

	Although schemas are language-independent, they're designed to embed naturally into the Beam SDK programming languages. This feature allows you to continue using native types while leveraging the advantage of having Beam understand your element schemas.

	Apache Beam provides a collection of schema transforms that operate on schemas. For an example, see `SqlTransform` in the 'Beam SQL overview' section of the Apache Beam documentation.

	The following list includes key capabilities provided by schema transforms:
	* field selection
	* grouping and aggregation
	* join operations
	* filtering data
	* adding and removing fields
	* renaming fields
	* type conversion
	* enhanced ParDo functionality

	For more information about schemas and the Apache Beam best practices when working with structured data, see the 'Schema Usage Patterns' section in the Apache Beam documentation.