To read a OneTable synced target table (regardless of the table format) in Amazon Redshift, users have to create an external schema and refer to the external data catalog that contains the table. Redshift infers the table's schema and format from the external catalog/database directly. For more information on creating external schemas, refer to Redshift docs.
The following query creates an external schema onetable_synced_schema using the Glue database glue_database_name
CREATE EXTERNAL SCHEMA onetable_synced_schema FROM DATA CATALOG DATABASE <glue_database_name> IAM_ROLE 'arn:aws:iam::<accountId>:role/<roleName>' CREATE EXTERNAL DATABASE IF NOT EXISTS;
:::danger Note: The IAM role needs to have minimum access to Amazon S3 and AWS Glue Data Catalog. For more information refer to AWS docs. :::
Redshift can infer the tables present in the Glue database automatically. You can then query the tables using:
SELECT * FROM onetable_synced_schema.<table_name>;
For Delta Lake, steps slightly vary because Redshift Spectrum relies on Delta Lake's manifest file - a text file containing the list of data files to read for querying a Delta table.
You have two options to create and query Delta tables in Redshift Spectrum:
Create Symlink tables option in Add data source pop-up window. This will add _symlink_format_manifest folder with manifest files in the table root path.You can then use a similar approach to query the Hudi and Iceberg tables mentioned above.
CREATE EXTERNAL SCHEMA onetable_synced_schema_delta FROM DATA CATALOG DATABASE <delta_glue_database_name> IAM_ROLE 'arn:aws:iam::<accountId>:role/<roleName>' CREATE EXTERNAL DATABASE IF NOT EXISTS;
SELECT * FROM onetable_synced_schema_delta.<table_name>;