Doris JDBC Catalog supports connecting to PostgreSQL databases through the standard JDBC interface. This document describes how to configure a PostgreSQL database connection.
To connect to a PostgreSQL database, you need
PostgreSQL 11.x or higher
JDBC driver for PostgreSQL database, which you can download from the Maven repository Download the latest or specified version of the PostgreSQL JDBC driver. **It is recommended to use PostgreSQL JDBC Driver 42.5.x and above. **
Doris Network connection between each FE and BE node and the PostgreSQL server, default port is 5432.
CREATE CATALOG postgresql PROPERTIES ( "type"="jdbc", "user"="root", "password"="secret", "jdbc_url" = "jdbc:postgresql://example.net:5432/postgres", "driver_url" = "postgresql-42.5.6.jar", "driver_class" = "org.postgresql.Driver" )
:::info remarks jdbc_url defines the connection information and parameters to be passed to the PostgreSQL JDBC driver. Parameters for supported URLs can be found in the PostgreSQL JDBC Driver Documentation. :::
If you configured TLS with a globally trusted certificate installed on the data source, you can enable TLS between the cluster and the data source by appending the parameter to the JDBC connection string set in the jdbc_url property.
For example, for version 42 of the PostgreSQL JDBC driver, enable TLS by adding the ssl=true parameter to the jdbc_url configuration property:
"jdbc_url"="jdbc:postgresql://example.net:5432/database?ssl=true"
For more information about TLS configuration options, see the PostgreSQL JDBC Driver documentation.
When mapping PostgreSQL, a Database in Doris corresponds to a Schema under the specified database in PostgreSQL (such as the schemas under postgres in the jdbc_url parameter in the example). The Table under Doris' Database corresponds to the Tables under the Schema in PostgreSQL. That is, the mapping relationship is as follows:
| Doris | PostgreSQL |
|---|---|
| Catalog | Database |
| Database | Schema |
| Table | Table |
| PostgreSQL Type | Doris Type | Comment |
|---|---|---|
| boolean | BOOLEAN | |
| smallint/int2 | SMALLINT | |
| integer/int4 | INT | |
| bigint/int8 | BIGINT | |
| decimal/numeric | DECIMAL | |
| real/float4 | FLOAT | |
| double precision | DOUBLE | |
| smallserial | SMALLINT | |
| serial | INT | |
| bigserial | BIGINT | |
| char | CHAR | |
| varchar/text | STRING | |
| timestamp/timestampz | DATETIME | |
| date | DATE | |
| json/jsonb | STRING | |
| time | STRING | |
| interval | STRING | |
| point/line/lseg/box/path/polygon/circle | STRING | |
| cidr/inet/macaddr | STRING | |
| bit | BOOLEAN/STRING | |
| uuid | STRING | |
| Other | UNSUPPORTED |
:::tip
Since Doris does not support timestamp types with time zones, when reading the timestampz type of PostgreSQL, Doris will map it to the DATETIME type and convert it to the time in the local time zone when reading.
And since the Java part of BE uses the JVM time zone when reading data from the JDBC type Catalog. The JVM time zone defaults to the time zone of the BE deployment machine, which affects the time zone conversion when JDBC reads data.
In order to ensure time zone consistency, it is recommended to set the JVM time zone in JAVA_OPTS of be.conf to be consistent with the time_zone of the Doris session.
When executing a query like where dt = '2022-01-01', Doris can push these filtering conditions down to the external data source, thereby directly excluding data that does not meet the conditions at the data source level, reducing inaccuracies. Necessary data acquisition and transmission. This greatly improves query performance while also reducing the load on external data sources.
If you have the limit keyword in the query, Doris will push the limit down to PostgreSQL to reduce the amount of data transfer.
Doris will automatically add the escape character ("") to the field names and table names in the query statements sent to PostgreSQL to avoid conflicts between the field names and table names and PostgreSQL internal keywords.