blob: 42a6c7a708a5d0a076be3115e843163a9d796ffa [file] [log] [blame] [view]
Prompt:
What does this code do?
```java
package gcs;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.json.JsonIO;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.Validation;
import org.apache.beam.sdk.schemas.JavaFieldSchema;
import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
import org.apache.beam.sdk.transforms.Create;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
public class WriteJsonIO {
@DefaultSchema(JavaFieldSchema.class)
public static class SampleRecord implements Serializable {
public int id;
public String month;
public String amount
public SampleRecord() {
}
public SampleRecord(int id, String month, String amount) {
this.id = id;
this.month = month;
this.amount = amount;
}
}
public interface WriteJsonOptions extends PipelineOptions {
@Description("A file path to write JSON files to")
@Validation.Required
String getFilePath();
void setFilePath(String filePath);
}
public static void main(String[] args) {
.
WriteJsonOptions options =
PipelineOptionsFactory.fromArgs(args)
.withValidation().as(WriteJsonOptions.class);
Pipeline p = Pipeline.create(options);
List<SampleRecord> rows =
Arrays.asList(
new SampleRecord(1, "January", "$1000"),
new SampleRecord(2, "February", "$2000"),
new SampleRecord(3, "March", "$3000"));
p.apply("Create Records", Create.of(rows))
.apply(
"Write Records to JSON File",
JsonIO.<SampleRecord>write(options.getFilePath())
.withNumShards(1));
p.run();
}
}
```
Response:
This code snippet uses the Apache Beam JsonIO connector to write a list of `SampleRecord` objects to a JSON file.
The `SampleRecord` class defines the schema for the JSON records and is annotated with `@DefaultSchema(JavaFieldSchema.class)` to indicate that its fields should be used as the schema for the JSON records.
```java
@DefaultSchema(JavaFieldSchema.class)
public static class SampleRecord implements Serializable {
public int id;
public String month;
public String amount
public SampleRecord() {
}
public SampleRecord(int id, String month, String amount) {
this.id = id;
this.month = month;
this.amount = amount;
}
}
```
The `WriteJsonOptions` interface specifies command-line options for configuring the output file path. The `@Description` annotation provides a description of the option, and `@Validation.Required` indicates that the option is required.
```java
public interface WriteJsonOptions extends PipelineOptions {
@Description("A file path to write JSON files to")
@Validation.Required
String getFilePath();
void setFilePath(String filePath);
}
```
The `PipelineOptionsFactory` class generates `PipelineOptions` from command-line arguments. The `Pipeline.create` method creates a data pipeline, defined as a sequence of transformations to be applied to the data.
```java
WriteJsonOptions options =
PipelineOptionsFactory.fromArgs(args)
.withValidation().as(WriteJsonOptions.class);
Pipeline p = Pipeline.create(options);
```
The pipeline generates a list of `SampleRecord` objects and applies the `Create` transform to create a `PCollection` from this list. Subsequently, the `JsonIO.write` transform is used to write the `PCollection` to a JSON file, with the `withNumShards` method specifying the number of output shards.
```java
List<SampleRecord> rows =
Arrays.asList(
new SampleRecord(1, "January", "$1000"),
new SampleRecord(2, "February", "$2000"),
new SampleRecord(3, "March", "$3000"));
p.apply("Create Records", Create.of(rows))
.apply(
"Write Records to JSON File",
JsonIO.<SampleRecord>write(options.getFilePath())
.withNumShards(1));
```
Finally, the code snippet invokes the `Pipeline.run` method to execute the pipeline.
```java
p.run();
```