blob: 18d00810a060a73f9d72c630d845a675a9e8ee6b [file] [log] [blame]
////
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
////
:documentationPath: /plugins/transforms/
:language: en_US
:page-alternativeEditUrl: https://github.com/apache/incubator-hop/edit/master/plugins/transforms/pgbulkloader/src/main/doc/pgbulkloader.adoc
= PostgreSQL Bulk Loader
== Description
The PostgreSQL bulk loader is a transform in which we will stream data from Hop to the psql command using "COPY DATA FROM STDIN" into the database.
== Options
[width="90%", options="header"]
|===
|Option|Description
|Transform name|Name of the transform.
|Connection|Name of the database connection on which the target table resides.
|Target schema|The name of the Schema for the table to write data to. This is important for data sources that allow for table names with dots '.' in it.
|Target table|Name of the target table.
|psql path|Full path to the psql utility.
|Load action|Insert, Truncate. Insert inserts, truncate first truncates the table.
|Fields to load a|This table contains a list of fields to load data from, properties include:
* Table field: Table field to be loaded in the PostgreSQL table;
* Stream field: Field to be taken from the incoming rows;
* Date mask: Either "Pass through, "Date" or "DateTime", determines how date/timestamps will be loaded in PostgreSQL.
|===
== Metadata Injection Support
All fields of this transform support metadata injection. You can use this transform with Metadata Injection to pass metadata to your pipeline at runtime.
== Set Up Authentication
"psql" doesn't allow you to specify the password. Here is a part of the connection options:
[source,bash]
----
Connection options:
-h HOSTNAME database server host or socket directory (default: "/var/run/postgresql")
-p PORT database server port (default: "5432")
-U NAME database user name
-W prompt for password (should happen automatically)
----
As you can see there is no way to specify a password for the database. It will always prompt for a password on the console no matter what.
To overcome this you need to set up trusted authentication on the PostgreSQL server.
To make this happen, change the pg_hba.conf file (on my box this is /etc/postgresql/8.2/main/pg_hba.conf) and add a line like this:
[source,bash]
----
host all all 192.168.1.0/24 trust
----
This basically means that everyone from the 192.168.1.0 network (mask 255.255.255.0) can log into postgres on all databases with any username. If you are running Hop on the same server, change it to localhost:
[source,bash]
----
host all all 127.0.0.1/32 trust
----
This is much safer of-course. Make sure you don't invite any strangers onto your PostgreSQL database!