GCP and BigQuery connection updates. fixes #5721 (#5722)
diff --git a/docs/hop-user-manual/modules/ROOT/pages/database/databases/googlebigquery.adoc b/docs/hop-user-manual/modules/ROOT/pages/database/databases/googlebigquery.adoc
index 596917b..47955de 100644
--- a/docs/hop-user-manual/modules/ROOT/pages/database/databases/googlebigquery.adoc
+++ b/docs/hop-user-manual/modules/ROOT/pages/database/databases/googlebigquery.adoc
@@ -34,6 +34,8 @@
|Driver folder | <Hop Installation>/lib/jdbc
|===
+== Driver Installation
+
The Simba driver is packaged as a .zip containing many jars. Only a subset of the jars included with the Driver are necessary to use Bigquery JDBC with Apache Hop. Furthermore, some of the jars may conflict with those packaged with Hop and *must* be excluded.
**SIMBA DRIVER JARS TO EXCLUDE (THESE JARS ARE INCLUDED IN THE SIMBA DRIVER, BUT CONFLICT WITH HOP LIBRARIES AND MUST BE EXCLUDED**
@@ -61,3 +63,82 @@
threetenbp-<VERSION>.jar
pass:[*] Tested with Hop 2.5.0 and Simba 1.3.3.1004. Not all authentication methods tested either, so this list may not be exhaustive
+
+== Connection Configuration
+
+=== Basic Connection Settings
+
+When creating a Google BigQuery connection in Apache Hop, configure the following basic settings:
+
+[cols="2*",options="header"]
+|===
+| Setting | Value
+| Connection name | Your preferred connection name (e.g., "BQ")
+| Connection type | Google BigQuery
+| Access | Native (JDBC)
+| Server host name | https://www.googleapis.com/bigquery/v2
+| Database name | Your Google Cloud Project ID
+| Port number | 443
+|===
+
+=== Authentication Options
+
+Google BigQuery connections in Apache Hop support multiple authentication methods. The most common and recommended approach is using Service Account authentication with a JSON key file.
+
+In the *Options* tab of the connection dialog, configure the following parameters:
+
+[cols="3*",options="header"]
+|===
+| Parameter | Value | Description
+| OAuthType | 0 | Service Account authentication
+| ProjectId | your-project-id | Your Google Cloud Project ID
+| OAuthServiceAcctEmail | service-account@your-project.iam.gserviceaccount.com | Service account email address
+| OAuthPvtKeyPath | /path/to/service-account-key.json | Path to the service account JSON key file
+| TimeOut | 3600 | Connection timeout in seconds (optional)
+|===
+
+=== Example Configuration
+
+Here's a complete example of a working BigQuery connection configuration:
+
+* **Connection name**: BQ
+* **Server host name**: https://www.googleapis.com/bigquery/v2
+* **Database name**: your-project-id
+* **Port number**: 443
+
+**Options tab parameters**:
+
+* **OAuthType**: 0
+* **ProjectId**: your-project-id
+* **OAuthServiceAcctEmail**: your-service-account@your-project-id.iam.gserviceaccount.com
+* **OAuthPvtKeyPath**: /path/to/your-service-account-key.json
+* **TimeOut**: 3600
+
+include::../../snippets/gcp-service-account-setup.adoc[]
+
+== Testing the Connection
+
+After configuring your connection:
+
+1. Click the "Test" button in the connection dialog
+2. If successful, you should see a confirmation message
+3. If the test fails, verify:
+ - Your service account has the necessary BigQuery permissions
+ - The JSON key file path is correct and accessible
+ - Your project ID matches the one in the service account
+ - Network connectivity to Google APIs is available
+
+== Troubleshooting
+
+=== Common Issues
+
+* **Authentication errors**: Verify that your service account has the required BigQuery roles and that the JSON key file path is correct
+* **Project not found**: Ensure the ProjectId parameter matches your actual Google Cloud Project ID
+* **Connection timeout**: Increase the TimeOut value if you're experiencing slow connections
+* **Driver conflicts**: Ensure you've excluded the conflicting GRPC jars as listed in the driver installation section
+
+=== Performance Considerations
+
+* Use appropriate timeouts for large query operations
+* Consider using BigQuery's standard SQL dialect for better performance
+* Implement proper error handling in your workflows when working with large datasets
diff --git a/docs/hop-user-manual/modules/ROOT/pages/snippets/gcp-service-account-setup.adoc b/docs/hop-user-manual/modules/ROOT/pages/snippets/gcp-service-account-setup.adoc
new file mode 100644
index 0000000..f8774bf
--- /dev/null
+++ b/docs/hop-user-manual/modules/ROOT/pages/snippets/gcp-service-account-setup.adoc
@@ -0,0 +1,74 @@
+////
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+////
+
+== Google Cloud Platform Service Account Setup
+
+To connect to Google Cloud services, you need to set up a service account in your Google Cloud Platform (GCP) project and download the authentication credentials.
+
+=== Creating a Service Account
+
+1. **Navigate to the Google Cloud Console**:
+ - Go to https://console.cloud.google.com/
+ - Select your project or create a new one
+
+2. **Create a Service Account**:
+ - Navigate to "IAM & Admin" > "Service Accounts"
+ - Click "Create Service Account"
+ - Provide a name and description for your service account
+ - Click "Create and Continue"
+
+3. **Assign Roles**:
+ - Assign the appropriate roles to your service account based on the service you're connecting to:
+ * For BigQuery:
+ ** `BigQuery Data Viewer` (`roles/bigquery.dataViewer`) - for read access
+ ** `BigQuery Job User` (`roles/bigquery.jobUser`) - for running queries (required)
+ ** `BigQuery Data Editor` (`roles/bigquery.dataEditor`) - if write access needed
+ ** `BigQuery Data Owner` (`roles/bigquery.dataOwner`) - if full control needed
+ * For Cloud Storage: `Storage Object Viewer`, `Storage Object Admin` (if write access needed)
+ * For other services: Consult the specific service documentation for required roles
+ - Click "Continue" and then "Done"
+
+4. **Generate and Download Key**:
+ - Click on the created service account
+ - Go to the "Keys" tab
+ - Click "Add Key" > "Create new key"
+ - Select "JSON" format
+ - Click "Create" to download the key file
+
+5. **Secure the Key File**:
+ - Store the downloaded JSON key file in a secure location
+ - Note the full path to this file - you'll need it for authentication configuration
+ - Ensure the file has appropriate permissions (readable only by the user running Hop)
+
+=== Alternative: Using Application Default Credentials
+
+Apache Hop can also use Google Cloud's Application Default Credentials (ADC) if you're running Hop on Google Cloud Platform or have configured the Google Cloud SDK locally.
+
+To use ADC:
+1. Install and configure the Google Cloud SDK
+2. Run `gcloud auth application-default login` to set up default credentials
+3. In your Hop connection, you can omit service account key file parameters
+
+This method is particularly useful for development environments or when running Hop on Google Cloud Platform services.
+
+=== Security Best Practices
+
+* **Principle of Least Privilege**: Only assign the minimum roles necessary for your use case
+* **Key Rotation**: Regularly rotate service account keys (recommended every 90 days)
+* **Environment Variables**: Consider using environment variables to store key file paths instead of hardcoding them
+* **Access Control**: Restrict access to service account key files using appropriate file system permissions
+* **Monitoring**: Enable audit logging to monitor service account usage