Pipeline Builder
Build Pipelines Visually
A drag-and-drop canvas for building medallion-architecture pipelines. Add Ingestion, Cleaning, Aggregation, and Destination nodes, connect them, write SQL (or let AI write it), and deploy to Airflow.
Node Types
Each node represents a step in your data pipeline.
Ingestion Node
Raw: Ingests raw data from a connected data source into an Apache Iceberg table. Zero transformations: stores an exact copy with full history.
Configuration
- Linked data source (GCS, BigQuery, REST API)
- Target Iceberg table name
- Write mode (append or overwrite)
- Schema evolution settings
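As a rough illustration, the settings above could be modeled as a small validated config object. This is a minimal sketch, not OptimaFlo's actual schema: the class name, field names, and the string values for source types are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the source types and write modes listed above.
ALLOWED_SOURCES = {"gcs", "bigquery", "rest_api"}
ALLOWED_WRITE_MODES = {"append", "overwrite"}


@dataclass(frozen=True)
class IngestionConfig:
    source_type: str
    target_table: str
    write_mode: str = "append"
    allow_schema_evolution: bool = True

    def __post_init__(self):
        # Reject values outside the documented options before anything runs.
        if self.source_type not in ALLOWED_SOURCES:
            raise ValueError(f"unsupported source type: {self.source_type}")
        if self.write_mode not in ALLOWED_WRITE_MODES:
            raise ValueError(f"unsupported write mode: {self.write_mode}")


config = IngestionConfig(source_type="gcs", target_table="raw.orders")
print(config)
```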
Cleaning Node
Clean: Runs SQL transformations against Ingestion data. Let the Analytics Engineer generate cleaning SQL from natural language.
Configuration
- SQL transformation (manual or AI-generated)
- Preview results before saving
- Target table name
- Deduplication keys
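One common way to apply deduplication keys is to keep only the newest row per key with a window function. The sketch below assumes that approach and runs it against an in-memory SQLite table for illustration. The table, the columns, and the build_dedup_sql helper are hypothetical, and the SQL the platform actually generates may differ.

```python
import sqlite3


def build_dedup_sql(source, columns, keys, order_by):
    """Build a query that keeps the newest row for each key combination.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    cols = ", ".join(columns)
    partition = ", ".join(keys)
    return (
        f"SELECT {cols} FROM ("
        f"SELECT {cols}, ROW_NUMBER() OVER ("
        f"PARTITION BY {partition} ORDER BY {order_by} DESC) AS rn "
        f"FROM {source}) AS ranked WHERE rn = 1"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, status TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [
        (1, "pending", "2024-01-01"),
        (1, "shipped", "2024-01-02"),  # newer duplicate of order 1
        (2, "pending", "2024-01-01"),
    ],
)

sql = build_dedup_sql(
    "raw_orders", ["order_id", "status", "updated_at"], ["order_id"], "updated_at"
)
deduped = conn.execute(sql + " ORDER BY order_id").fetchall()
print(deduped)
```

Ordering by a timestamp column makes the choice of surviving row explicit. Without it, which duplicate is kept would be arbitrary.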
Aggregation Node
Metrics: Aggregates Cleaning data into business-ready metrics, including star schemas, KPIs, and dimensional models.
Configuration
- Aggregation SQL
- Incremental vs. full refresh
- Target metrics table
- Preview aggregated results
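The difference between a full and an incremental refresh can be sketched as follows. This is an illustration under stated assumptions, using in-memory SQLite rather than Iceberg: a full refresh rebuilds the whole metrics table, while an incremental refresh recomputes only partitions on or after a cutoff date. Table names and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clean_orders (order_date TEXT, amount REAL)")
conn.execute("CREATE TABLE daily_revenue (order_date TEXT PRIMARY KEY, revenue REAL)")

AGG_SQL = "SELECT order_date, SUM(amount) FROM clean_orders {where} GROUP BY order_date"


def full_refresh(conn):
    """Rebuild the metrics table from all upstream rows."""
    conn.execute("DELETE FROM daily_revenue")
    conn.execute("INSERT INTO daily_revenue " + AGG_SQL.format(where=""))


def incremental_refresh(conn, since):
    """Recompute only the dates on or after `since`; older rows stay untouched."""
    conn.execute("DELETE FROM daily_revenue WHERE order_date >= ?", (since,))
    conn.execute(
        "INSERT INTO daily_revenue " + AGG_SQL.format(where="WHERE order_date >= ?"),
        (since,),
    )


conn.executemany(
    "INSERT INTO clean_orders VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.0)],
)
full_refresh(conn)

# A late-arriving order lands on 2024-01-02; only that date is recomputed.
conn.execute("INSERT INTO clean_orders VALUES ('2024-01-02', 3.0)")
incremental_refresh(conn, "2024-01-02")

metrics = conn.execute("SELECT * FROM daily_revenue ORDER BY order_date").fetchall()
print(metrics)
```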
Destination Node
Export: Exports Aggregation layer data to an external system. Configure the target warehouse, storage bucket, or database for downstream consumption.
Configuration
- Destination type (BigQuery, Snowflake, GCS, S3)
- Output table or file path
- Write mode (append or overwrite)
- File format for storage destinations
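Connected on the canvas, the four node types above form a directed graph, and each node can only run after its upstream nodes finish. A minimal sketch of deriving a valid run order with Python's standard graphlib module follows. The node names are hypothetical, and how the platform orders nodes internally is an assumption.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each node maps to the set of upstream nodes it depends on.
pipeline = {
    "ingest_orders": set(),
    "clean_orders": {"ingest_orders"},
    "daily_revenue": {"clean_orders"},
    "export_to_warehouse": {"daily_revenue"},
}


def execution_order(graph):
    """Return node names ordered so every upstream node comes first."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
print(order)
```

If the connections form a cycle, static_order() raises graphlib.CycleError, which is a convenient way to reject an invalid canvas before anything runs.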
Canvas Features
Everything you need to build, test, and iterate on pipelines.
How SQL Generation Works
When you describe a transformation in natural language, the platform runs a multi-step pipeline to produce correct, safe SQL:
Schema Context
The Analytics Engineer reads your upstream table schema — column names, types, and sample data — so it knows what's available.
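To make the idea concrete, here is a minimal sketch of collecting column names, types, and a few sample rows into a text block that could accompany a prompt. It uses SQLite's PRAGMA table_info for illustration only. The schema_context function and its output format are hypothetical, not the platform's actual context format.

```python
import sqlite3


def schema_context(conn, table, sample_rows=3):
    """Describe a table's columns, types, and a few sample rows as plain text.

    The table name is interpolated directly, so pass trusted names only.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]
    samples = conn.execute(f"SELECT * FROM {table} LIMIT {int(sample_rows)}").fetchall()
    lines = [f"Table {table}:"]
    lines += [f"  {name} {ctype}" for name, ctype in columns]
    lines.append(f"Sample rows: {samples}")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, amount REAL)")
conn.execute("INSERT INTO raw_orders VALUES (1, 9.5)")

context = schema_context(conn, "raw_orders")
print(context)
```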
SQL Generation
Your natural language request is sent to the LLM with the full schema context. It generates a SQL transformation.
Validation
Every generated query goes through automated validation — column references, type safety, security rules, and syntax are all checked before execution.
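A simplified sketch of this kind of pre-execution check is shown below. It assumes two rules: only single read-only statements are allowed, and the query must compile against the known schema. Compiling with EXPLAIN on an empty in-memory SQLite database catches syntax errors and unknown columns without running the query. The validate_sql function is hypothetical, its prefix check is deliberately crude, and it does not cover type safety.

```python
import sqlite3


def validate_sql(sql, schema_ddl):
    """Return a list of problems; an empty list means the query passed."""
    problems = []
    statement = sql.strip().rstrip(";").strip()

    # Security rules (crude): one statement only, and it must be a read.
    if ";" in statement:
        problems.append("multiple statements are not allowed")
    if not statement.lower().startswith(("select", "with")):
        problems.append("only SELECT queries are allowed")
    if problems:
        return problems

    # Syntax and column references: compile the query against an empty copy
    # of the schema. EXPLAIN prepares the statement without executing it.
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute("EXPLAIN " + statement)
    except sqlite3.Error as exc:
        problems.append(str(exc))
    finally:
        conn.close()
    return problems


DDL = "CREATE TABLE orders (order_id INTEGER, amount REAL);"
ok = validate_sql("SELECT order_id, amount FROM orders", DDL)
bad_column = validate_sql("SELECT amout FROM orders", DDL)
unsafe = validate_sql("DROP TABLE orders", DDL)
print(ok, bad_column, unsafe)
```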
Auto-Correction
If a query fails at runtime, the platform detects the error, corrects the SQL, and retries. Most errors resolve without manual intervention.
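The detect, correct, and retry loop can be sketched as follows. In this illustration the correction step is a plain callback that receives the failing SQL and the error message. On the platform that role is played by the AI, so the fix_typo stub here is purely hypothetical, as is the attempt limit.

```python
import sqlite3


def run_with_correction(conn, sql, correct, max_attempts=3):
    """Run sql; on failure, ask `correct` for a revised query and retry.

    Returns (rows, final_sql, attempts). Re-raises the last error if the
    attempt limit is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return conn.execute(sql).fetchall(), sql, attempt
        except sqlite3.Error as exc:
            if attempt == max_attempts:
                raise
            sql = correct(sql, str(exc))


def fix_typo(sql, error):
    # Stand-in for the AI correction step: repairs one known misspelling.
    return sql.replace("amout", "amount")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.5)")

rows, final_sql, attempts = run_with_correction(
    conn, "SELECT order_id, amout FROM orders", fix_typo
)
print(rows, final_sql, attempts)
```

Capping the number of attempts keeps a query that cannot be repaired from retrying forever.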
Preview
You see sample results before committing. Edit the SQL manually if needed, or re-prompt the Analytics Engineer.
Scheduling & Execution
Run a pipeline once or schedule it to run on a cadence.
One-Click Execution
Run the full pipeline immediately from the canvas toolbar.
Scheduled Runs
Set hourly, daily, or weekly schedules. OptimaFlo generates an Apache Airflow DAG behind the scenes.
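Airflow schedules are commonly expressed as cron strings, so the three cadences could map to schedules as sketched below. The cron values match Airflow's @hourly, @daily, and @weekly presets. Whether OptimaFlo uses these exact expressions is an assumption, and the cron_for helper is hypothetical.

```python
# Cron equivalents of Airflow's @hourly, @daily, and @weekly presets.
CRON_BY_CADENCE = {
    "hourly": "0 * * * *",  # minute 0 of every hour
    "daily": "0 0 * * *",   # midnight every day
    "weekly": "0 0 * * 0",  # midnight every Sunday
}


def cron_for(cadence):
    """Translate a cadence name into a cron expression."""
    try:
        return CRON_BY_CADENCE[cadence]
    except KeyError:
        raise ValueError(f"unsupported cadence: {cadence!r}") from None


print(cron_for("daily"))
```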
Backfills
Re-process historical date ranges when you update transformation logic. Backfill runs execute sequentially to avoid Iceberg write conflicts.
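A sequential backfill can be sketched as a loop that processes one date partition at a time, oldest first, so no two writes to the same table overlap. Daily partitioning and the function names below are assumptions made for illustration.

```python
from datetime import date, timedelta


def daily_partitions(start, end):
    """Yield each date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def backfill(start, end, run_partition):
    """Process partitions one at a time; an exception stops the backfill."""
    completed = []
    for day in daily_partitions(start, end):
        run_partition(day)  # returns only after this partition is done
        completed.append(day)
    return completed


processed = []
completed = backfill(date(2024, 1, 30), date(2024, 2, 1), processed.append)
print(completed)
```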
Execution Monitoring
Watch each node complete in real time. View execution time, row counts, and errors per node.
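The per-node figures mentioned above (execution time, row count, and error) could be captured by a thin wrapper around each node's task. This is only a sketch: the record fields and the convention that a task returns its row count are hypothetical.

```python
import time


def run_node(name, task):
    """Run one node and record its duration, row count, and any error."""
    started = time.perf_counter()
    record = {"node": name, "status": "success", "rows": None, "error": None}
    try:
        record["rows"] = task()  # by convention here, a task returns rows written
    except Exception as exc:
        record["status"] = "failed"
        record["error"] = str(exc)
    record["seconds"] = round(time.perf_counter() - started, 3)
    return record


def failing_task():
    raise RuntimeError("column not found: amout")


results = [
    run_node("clean_orders", lambda: 120),
    run_node("daily_revenue", failing_task),
]
for result in results:
    print(result)
```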
What's Next
Explore the AI that powers the builder.