BYOC Documentation
GCP BYOC Setup Guide
Deploy OptimaFlo's BYOC (Bring Your Own Cloud) agent to your Google Cloud Platform project. Your data never leaves your infrastructure.
Quick Start Overview
Step 1: Enable Required GCP APIs
The BYOC agent requires several GCP APIs to be enabled in your project. These APIs provide the foundation for Cloud Run deployment, networking, data services, and monitoring.
| API | Purpose | Status |
|---|---|---|
| compute.googleapis.com (Compute Engine API) | VPC networks, firewall rules, networking | Required |
| run.googleapis.com (Cloud Run Admin API) | Deploy and manage Cloud Run services | Required |
| vpcaccess.googleapis.com (Serverless VPC Access API) | VPC connectors for private networking | Required |
| iam.googleapis.com (IAM API) | Service accounts and IAM bindings | Required |
| secretmanager.googleapis.com (Secret Manager API) | Store agent tokens securely | Required |
| cloudresourcemanager.googleapis.com (Cloud Resource Manager API) | Project-level IAM policies | Required |
| storage.googleapis.com (Cloud Storage API) | GCS bucket operations | Required |
| bigquery.googleapis.com (BigQuery API) | BigQuery dataset and table operations | Required |
| composer.googleapis.com (Cloud Composer API) | Managed Apache Airflow for pipeline orchestration | Required |
| container.googleapis.com (Kubernetes Engine API) | GKE clusters (Composer dependency) | Required |
| sqladmin.googleapis.com (Cloud SQL Admin API) | Composer metadata database | Required |
| pubsub.googleapis.com (Pub/Sub API) | Composer internal messaging | Required |
| logging.googleapis.com (Cloud Logging API) | Application logging | Required |
| monitoring.googleapis.com (Cloud Monitoring API) | Metrics and alerting | Required |
| cloudtrace.googleapis.com (Cloud Trace API) | Distributed tracing | Required |
| iamcredentials.googleapis.com (IAM Service Account Credentials API) | Service account impersonation tokens | Required |
| artifactregistry.googleapis.com (Artifact Registry API) | Container image storage | Required |
| cloudbuild.googleapis.com (Cloud Build API) | Infrastructure provisioning builds | Required |
| servicenetworking.googleapis.com (Service Networking API) | VPC peering for Cloud SQL | Required |
| sql-component.googleapis.com (Cloud SQL Component API) | Cloud SQL internal components | Required |
| bigqueryconnection.googleapis.com (BigQuery Connection API) | BigLake connections for Iceberg | Required |
| bigquerystorage.googleapis.com (BigQuery Storage API) | High-throughput BigQuery reads/writes | Required |
| dataproc.googleapis.com (Dataproc API) | Spark clusters for large-scale processing (>10TB) | Required |
Enable All APIs at Once from the Command Line
gcloud services enable \
compute.googleapis.com \
run.googleapis.com \
vpcaccess.googleapis.com \
servicenetworking.googleapis.com \
iam.googleapis.com \
iamcredentials.googleapis.com \
secretmanager.googleapis.com \
cloudresourcemanager.googleapis.com \
storage.googleapis.com \
bigquery.googleapis.com \
bigqueryconnection.googleapis.com \
bigquerystorage.googleapis.com \
composer.googleapis.com \
container.googleapis.com \
sqladmin.googleapis.com \
sql-component.googleapis.com \
pubsub.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com \
dataproc.googleapis.com \
logging.googleapis.com \
monitoring.googleapis.com \
cloudtrace.googleapis.com \
--project=YOUR_PROJECT_ID
Note: APIs can take 30-60 seconds to propagate after enabling. Wait before deploying the BYOC agent.
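Before moving on, it can help to confirm that every API in the list above is actually enabled. The following is a minimal sketch, not part of OptimaFlo's tooling: the helper name `missing_apis` is hypothetical, and it assumes an authenticated gcloud CLI when used against a real project.

```shell
# Required APIs, copied from the list above (whitespace-separated).
REQUIRED_APIS="
compute.googleapis.com run.googleapis.com vpcaccess.googleapis.com
servicenetworking.googleapis.com iam.googleapis.com iamcredentials.googleapis.com
secretmanager.googleapis.com cloudresourcemanager.googleapis.com
storage.googleapis.com bigquery.googleapis.com bigqueryconnection.googleapis.com
bigquerystorage.googleapis.com composer.googleapis.com container.googleapis.com
sqladmin.googleapis.com sql-component.googleapis.com pubsub.googleapis.com
artifactregistry.googleapis.com cloudbuild.googleapis.com dataproc.googleapis.com
logging.googleapis.com monitoring.googleapis.com cloudtrace.googleapis.com
"

# missing_apis (hypothetical helper): reads enabled API names on stdin,
# one per line, and prints each entry of REQUIRED_APIS that is absent.
missing_apis() {
  enabled="$(cat)"
  for api in $REQUIRED_APIS; do
    if ! printf '%s\n' "$enabled" | grep -qxF "$api"; then
      printf '%s\n' "$api"
    fi
  done
}

# Usage (YOUR_PROJECT_ID is a placeholder):
#   gcloud services list --enabled --project=YOUR_PROJECT_ID \
#     --format="value(config.name)" | missing_apis
```

If the usage command prints nothing, all listed APIs are enabled. Any names it prints are still missing or have not finished propagating.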
Step 2: Configure IAM Permissions
The OptimaFlo platform service account needs access to browse, preview, and write processed data (Bronze/Silver/Gold layers) in your BigQuery datasets and GCS buckets. The roles below follow the principle of least privilege.
The platform service account in the commands below is optimaflo-sa@production-optimaflo.iam.gserviceaccount.com. This is OptimaFlo's fixed identity, not a placeholder. Use it as-is when granting roles in your project. Authentication from the platform to your project uses Workload Identity Federation under the hood, so no key files are exchanged.
| IAM Role | Scope | Purpose | Permissions |
|---|---|---|---|
| roles/bigquery.dataEditor | Project | Read and write datasets, tables, and table data (Bronze/Silver/Gold layers) | bigquery.datasets.get, bigquery.tables.get, bigquery.tables.list, bigquery.tables.getData, bigquery.tables.create, bigquery.tables.updateData |
| roles/bigquery.jobUser | Project | Execute queries for data preview and transformation | bigquery.jobs.create |
| roles/storage.objectViewer | Bucket | Read files and list objects in your data buckets | storage.objects.get, storage.objects.list |
| roles/run.invoker | Service | Invoke BYOC Cloud Run services (Polaris, Agent API) | run.routes.invoke |
| roles/iam.serviceAccountTokenCreator | Service Account | Generate access tokens for impersonation (on Agent SA) | iam.serviceAccounts.getAccessToken, iam.serviceAccounts.signJwt |
| roles/composer.worker | Project | Execute Airflow commands via Cloud Composer API (Agent SA) | composer.environments.executeAirflowCommand, composer.environments.get |
Grant Permissions via gcloud CLI
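One way the roles in the table above could be granted is sketched below. It is an illustration under assumptions, not OptimaFlo's official script: `YOUR_PROJECT_ID`, `YOUR_BUCKET`, `YOUR_REGION`, `YOUR_SERVICE`, and `YOUR_AGENT_SA_EMAIL` are placeholders, the `run` and `grant_platform_roles` helpers are hypothetical, and the sketch reads "(Agent SA)" in the table as meaning that `roles/composer.worker` goes to your agent service account. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Fixed platform identity from this guide.
PLATFORM_SA="optimaflo-sa@production-optimaflo.iam.gserviceaccount.com"

# Placeholders (assumptions): replace with your own values.
PROJECT_ID="${PROJECT_ID:-YOUR_PROJECT_ID}"
BUCKET="${BUCKET:-YOUR_BUCKET}"
REGION="${REGION:-YOUR_REGION}"
SERVICE="${SERVICE:-YOUR_SERVICE}"
AGENT_SA="${AGENT_SA:-YOUR_AGENT_SA_EMAIL}"
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

grant_platform_roles() {
  member="serviceAccount:${PLATFORM_SA}"
  # Project-scoped BigQuery roles for the platform service account.
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="$member" --role="roles/bigquery.dataEditor"
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="$member" --role="roles/bigquery.jobUser"
  # Bucket-scoped read access.
  run gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
    --member="$member" --role="roles/storage.objectViewer"
  # Service-scoped invoke permission on one Cloud Run service.
  run gcloud run services add-iam-policy-binding "$SERVICE" \
    --region="$REGION" --project="$PROJECT_ID" \
    --member="$member" --role="roles/run.invoker"
  # Allow the platform account to mint tokens for the agent service account.
  run gcloud iam service-accounts add-iam-policy-binding "$AGENT_SA" \
    --project="$PROJECT_ID" \
    --member="$member" --role="roles/iam.serviceAccountTokenCreator"
  # Assumption: composer.worker is granted to the agent service account.
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:${AGENT_SA}" --role="roles/composer.worker"
}

# Usage: review the printed commands first, then rerun with DRY_RUN=0.
#   grant_platform_roles
```

Repeat the bucket and service bindings for each data bucket and Cloud Run service the platform needs to reach.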
Optional: Dataproc for Spark Workloads
If you're processing datasets larger than 10TB, you may want to enable Dataproc for distributed Spark processing. Datasets under 100GB use DuckDB, and 100GB–10TB use BigQuery automatically.
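The size thresholds above can be summarized as a small function. This is only an illustration of the stated ranges, not OptimaFlo's actual routing logic: the name `select_engine` is hypothetical, and the sketch assumes 1TB = 1000GB and that a dataset of exactly 10TB stays on BigQuery.

```shell
# select_engine SIZE_GB: print the engine implied by the thresholds above.
# Assumptions: whole-number sizes in GB, 1TB = 1000GB, 10TB inclusive for BigQuery.
select_engine() {
  size_gb="$1"
  if [ "$size_gb" -lt 100 ]; then
    echo "duckdb"
  elif [ "$size_gb" -le 10000 ]; then
    echo "bigquery"
  else
    echo "dataproc"
  fi
}

# Example: select_engine 25000   (a 25TB dataset falls in the Dataproc range)
```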
| API | Purpose | Status |
|---|---|---|
| dataproc.googleapis.com (Cloud Dataproc API) | Create clusters, submit Spark/PySpark jobs | Required |
| dataproc-control.googleapis.com (Cloud Dataproc Control API) | Internal control plane (auto-enabled) | Optional |
| dataprocrm.googleapis.com (Dataproc Resource Manager API) | Advanced: Dataproc on GKE | Optional |
| metastore.googleapis.com (Dataproc Metastore API) | Managed Hive Metastore for persistent catalog | Optional |
# Enable Dataproc API (only if needed for >10TB workloads)
gcloud services enable dataproc.googleapis.com --project=YOUR_PROJECT_ID
# Optional: Enable Dataproc Metastore for persistent Hive catalog
gcloud services enable metastore.googleapis.com --project=YOUR_PROJECT_ID
Step 3: Deploy via Dashboard
Once APIs are enabled and permissions are granted, deploy the BYOC agent using the OptimaFlo dashboard. The guided wizard walks you through the deployment process.
1. Connect GCP Project: Enter your project ID and authenticate with OAuth
2. Configure Agent: Select region, network settings, and data sources
3. Deploy: Click deploy and watch real-time progress as your agent provisions
4. Verify: The dashboard shows agent health and connectivity status
Common Issues