A RestApiSource is a QueryBasedSource that issues its queries through a RESTful API. A RestApiExtractor is a QueryBasedExtractor that uses REST to communicate with the source. Establishing that communication requires a RestApiConnector.
RestApiSource
Coming soon...
RestApiExtractor
A RestApiExtractor sets up the common routines for querying a REST source, for example extractMetadata, getMaxWatermark, getSourceCount, and getRecordSet, which are described in the QueryBasedSource chapter. Constructing the actual query and extracting the data from the response are left to the source-specific layer, for example SalesforceExtractor.
A simplified general flow of routines is depicted in Figure 1:
Depending on the routine, the placeholders [getX], [constructGetXQuery], and [extractXFromResponse] stand for the following methods:
Description | [getX] | [constructGetXQuery] | [extractXFromResponse] |
---|---|---|---|
Get data schema | extractMetadata | getSchemaMetadata | getSchema |
Calculate latest high watermark | getMaxWatermark | getHighWatermarkMetadata | getHighWatermark |
Get total counts of records to be pulled | getSourceCount | getCountMetadata | getCount |
Get records | getRecordSet | getDataMetadata | getData |
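
The three-step pattern in the table can be sketched as a template method: the protocol layer fixes the order of the steps, and the source layer supplies the query and parses the response. This is a minimal illustration only. The classes `SketchRestExtractor`, `SketchSourceExtractor`, and `LayerSketch`, the `transport` function standing in for a connector, the simplified method signatures, and the query strings are all assumptions made for this example; they are not the actual Gobblin classes or signatures.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Demo entry point; canned responses replace a real REST endpoint. */
class LayerSketch {
    public static void main(String[] args) {
        Function<String, String> fakeTransport =
            query -> query.startsWith("/count") ? "3" : "a,b,c";
        SketchRestExtractor extractor = new SketchSourceExtractor(fakeTransport);
        System.out.println(extractor.getSourceCount());
        System.out.println(extractor.getRecordSet());
    }
}

/** Protocol-specific layer (hypothetical stand-in, not Gobblin's real class). */
abstract class SketchRestExtractor {
    // Stands in for a connector: maps a query string to a raw response body.
    private final Function<String, String> transport;

    SketchRestExtractor(Function<String, String> transport) {
        this.transport = transport;
    }

    /** [getX] for the record count: build the query, send it, parse the reply. */
    public final long getSourceCount() {
        String query = getCountMetadata();        // [constructGetXQuery]
        String response = transport.apply(query); // protocol-level call
        return getCount(response);                // [extractXFromResponse]
    }

    /** [getX] for the records themselves, following the same three steps. */
    public final List<String> getRecordSet() {
        String query = getDataMetadata();         // [constructGetXQuery]
        String response = transport.apply(query); // protocol-level call
        return getData(response);                 // [extractXFromResponse]
    }

    // Hooks that the source-specific layer must implement.
    protected abstract String getCountMetadata();
    protected abstract long getCount(String response);
    protected abstract String getDataMetadata();
    protected abstract List<String> getData(String response);
}

/** Source-specific layer for an imaginary source with a trivial text format. */
class SketchSourceExtractor extends SketchRestExtractor {
    SketchSourceExtractor(Function<String, String> transport) {
        super(transport);
    }

    @Override protected String getCountMetadata() { return "/count?entity=Account"; }
    @Override protected long getCount(String response) { return Long.parseLong(response.trim()); }
    @Override protected String getDataMetadata() { return "/records?entity=Account"; }
    @Override protected List<String> getData(String response) { return Arrays.asList(response.split(",")); }
}
```

Keeping the [getX] methods `final` in the sketch makes the division of labor explicit: a new source only has to implement the query-building and response-parsing hooks.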
There are other interactions between the RestApiExtractor layer and the SourceSpecificLayer. The key points are:

- A ProtocolSpecificLayer, such as RestApiExtractor, understands the protocol and sets up a routine to communicate with the source.
- A SourceSpecificLayer, such as SalesforceExtractor, knows the source and fits into the routine by providing and analyzing source-specific information.

Configuration Key | Default Value | Description |
---|---|---|
source.querybased.query | Optional | The query that the extractor should execute to pull data. |
source.querybased.excluded.columns | Optional | Names of columns excluded while pulling data. |
extract.delta.fields | Optional | List of columns that are associated with the watermark. |
extract.primary.key.fields | Optional | List of columns that will be used as the primary key for the data. |
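
As a rough illustration of how these keys might be set, the sketch below builds them with `java.util.Properties`. Only the four key names come from the table above. The class name `QueryConfigSketch`, the query text, and the column names are hypothetical placeholders, and the expected query syntax depends on the source.

```java
import java.util.Properties;

class QueryConfigSketch {
    /** Builds a property set using the keys from the configuration table. */
    static Properties buildJobProperties() {
        Properties props = new Properties();
        // All values are made-up placeholders for an imaginary "orders" entity.
        props.setProperty("source.querybased.query", "SELECT * FROM orders");
        props.setProperty("source.querybased.excluded.columns", "internal_notes");
        props.setProperty("extract.delta.fields", "last_modified");
        props.setProperty("extract.primary.key.fields", "order_id");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildJobProperties();
        // Every key is optional, so read each one with an empty fallback.
        for (String key : new String[] {
                "source.querybased.query",
                "source.querybased.excluded.columns",
                "extract.delta.fields",
                "extract.primary.key.fields"}) {
            System.out.println(key + " = " + props.getProperty(key, ""));
        }
    }
}
```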