Introduction

A RestApiSource is a QueryBasedSource that queries its data through a RESTful API. A RestApiExtractor is a QueryBasedExtractor that communicates with the source over REST. Establishing that communication requires a RestApiConnector.

Constructs

RestApiSource

Coming soon...

RestApiExtractor

A RestApiExtractor sets up the common routines for querying a REST source, such as extractMetadata, getMaxWatermark, getSourceCount, and getRecordSet, which are described in the QueryBasedSource chapter. Constructing the actual query and extracting the data from the response are left to the source-specific layer, for example SalesforceExtractor.

A simplified general flow of routines is depicted in Figure 1:

Depending on the routine, [getX], [constructGetXQuery], and [extractXFromResponse] are:

Description                              | [getX]          | [constructGetXQuery]     | [extractXFromResponse]
Get data schema                          | extractMetadata | getSchemaMetadata        | getSchema
Calculate latest high watermark          | getMaxWatermark | getHighWatermarkMetadata | getHighWatermark
Get total counts of records to be pulled | getSourceCount  | getCountMetadata         | getCount
Get records                              | getRecordSet    | getDataMetadata          | getData

There are other interactions between the RestApiExtractor layer and SourceSpecificLayer. The key points are:

  • A ProtocolSpecificLayer, such as RestApiExtractor, understands the protocol and sets up a routine to communicate with the source
  • A SourceSpecificLayer, such as SalesforceExtractor, knows the source and fits into the routine by providing and analyzing source specific information
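The split between the two layers can be sketched as a template method. This is a minimal illustration, not the actual Gobblin classes. The class names SketchRestExtractor and SketchAccountExtractor, the executeRequest helper, the canned JSON response, and the simplified String-based signatures are all assumptions made for this example. Only the routine names getSourceCount, getCountMetadata, and getCount come from the table above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the protocol-specific layer (the role RestApiExtractor plays).
// Signatures are simplified for illustration and do not match the real Gobblin API.
abstract class SketchRestExtractor {
    // Records each step so the order of the routine can be inspected.
    final List<String> trace = new ArrayList<>();

    // [getX]: the protocol layer owns the overall routine.
    public final long getSourceCount() {
        String query = getCountMetadata();        // [constructGetXQuery]
        trace.add("query: " + query);
        String response = executeRequest(query);  // protocol-level communication
        trace.add("response: " + response);
        return getCount(response);                // [extractXFromResponse]
    }

    // Assumption: a canned response replaces the real HTTP call made through a connector.
    protected String executeRequest(String query) {
        return "{\"totalSize\": 42}";
    }

    // Supplied by the source-specific layer.
    protected abstract String getCountMetadata();

    protected abstract long getCount(String response);
}

// Hypothetical source-specific layer (the role SalesforceExtractor plays).
class SketchAccountExtractor extends SketchRestExtractor {
    @Override
    protected String getCountMetadata() {
        // The source-specific layer knows how to phrase the count query.
        return "SELECT COUNT() FROM Account";
    }

    @Override
    protected long getCount(String response) {
        // Naive parsing for the sketch: keep only the digits of the canned response.
        return Long.parseLong(response.replaceAll("[^0-9]", ""));
    }
}

class RestFlowDemo {
    public static void main(String[] args) {
        SketchAccountExtractor extractor = new SketchAccountExtractor();
        System.out.println("count = " + extractor.getSourceCount());
        extractor.trace.forEach(System.out::println);
    }
}
```

Making getSourceCount final keeps the routine in the protocol layer, so a new source only needs to supply the query and the response parsing.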

Configuration

Configuration Key                  | Default Value | Description
source.querybased.query            | Optional      | The query that the extractor should execute to pull data.
source.querybased.excluded.columns | Optional      | Names of columns excluded while pulling data.
extract.delta.fields               | Optional      | List of columns that are associated with the watermark.
extract.primary.key.fields         | Optional      | List of columns that will be used as the primary key for the data.
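As a rough illustration of how these keys might appear in a job configuration, the sketch below loads them with java.util.Properties. The four keys come from the table above. The query, the column names, the comma-separated list format, and the QueryConfigSketch class itself are assumptions chosen for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class QueryConfigSketch {
    // Hypothetical values; only the keys are taken from the configuration table.
    static final String JOB_CONFIG = String.join("\n",
        "source.querybased.query=SELECT Id, Name, LastModifiedDate FROM Account",
        "source.querybased.excluded.columns=Description",
        "extract.delta.fields=LastModifiedDate",
        "extract.primary.key.fields=Id");

    // Parses the sample configuration text into a Properties object.
    static Properties load() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(JOB_CONFIG));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumption: list-valued keys are comma-separated; a missing key yields an empty list.
    static List<String> listOf(Properties props, String key) {
        List<String> values = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Properties props = load();
        System.out.println(props.getProperty("source.querybased.query"));
        System.out.println(listOf(props, "extract.delta.fields"));
        System.out.println(listOf(props, "extract.primary.key.fields"));
    }
}
```

Because every key is optional, the helper treats an absent key as an empty list rather than an error.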