Features
Data Work, Done.
You Approve It.
AI Engine
Your Team Runs on Your Own AI Key
Every role runs on your own LLM key, so there is no per-query or per-token tax from us. The AI sees your full schema and keeps its own SQL correct and safe.
Data Architecture
Your Data, Organized Automatically
Data flows through clean layers such as raw, cleaned, and business-ready. Each step is schema-enforced, auditable, and stored in open formats you own.
Open Lakehouse
Your Data Stays Yours
Your data is versioned, queryable as of any point in time, and portable, with no lock-in. Apache Iceberg, the open table format, sits under everything to make that true.
Time Travel
Query any table as it looked at any point in time. Audit changes, debug, or reproduce a past report.
Snapshots
Every write creates an immutable snapshot. Full history, with rollback to any version.
Schema Evolution
Add, rename, or drop columns without rewriting tables or breaking downstream queries.
ACID Transactions
Concurrent reads and writes stay consistent. No half-written tables, no corrupt data.
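To make the snapshot and time-travel ideas above concrete, here is a toy in-memory model. It is only an illustration of the concept, not how Apache Iceberg or OptimaFlo implements it; the names `Snapshot`, `ToyTable`, `as_of`, and `rollback` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: datetime
    rows: tuple  # immutable copy of the table contents at commit time


@dataclass
class ToyTable:
    """Hypothetical table where every write appends an immutable snapshot."""
    snapshots: list = field(default_factory=list)

    def write(self, rows, committed_at=None):
        # Earlier snapshots are never modified; a write only appends.
        when = committed_at or datetime.now(timezone.utc)
        snap = Snapshot(len(self.snapshots) + 1, when, tuple(rows))
        self.snapshots.append(snap)
        return snap

    def as_of(self, when):
        # Time travel: return the rows from the latest snapshot at or before `when`.
        visible = [s for s in self.snapshots if s.committed_at <= when]
        if not visible:
            raise LookupError("no snapshot exists at or before that time")
        return list(visible[-1].rows)

    def rollback(self, snapshot_id, committed_at=None):
        # Rollback keeps history: the old rows are committed as a new snapshot.
        old = next(s for s in self.snapshots if s.snapshot_id == snapshot_id)
        return self.write(old.rows, committed_at)
```

In this sketch a rollback never deletes history, so a past report can still be reproduced with `as_of` after the table has been rolled back.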
Pipeline Canvas
See Your Entire Data Flow
The visual pipeline canvas shows every node in your data pipeline as an interactive graph. Add nodes, preview data at each layer, edit SQL, and connect sources to destinations.
Drag & Drop Nodes
Add source, transform, and destination nodes to your pipeline canvas with a click.
Live Data Preview
Preview data at any step in your pipeline before deploying to production.
Inline SQL Editor
Edit transform SQL directly on the canvas with AI copilot assistance and schema context.
Dependency Tracking
See upstream and downstream dependencies for every node in your pipeline.
Compute & Scheduling
The right engine, every time.
Small dataset? It runs instantly. Big dataset? It scales up automatically. You don't manage servers or pick engines. Databricks and Snowflake are built for companies that have a data platform team. OptimaFlo is for the ones that don't.
DuckDB
≤ 100 GB
Runs right in the app. No servers to manage. Answers in under a second.
Warehouse
100 GB - 10 TB
A cloud warehouse that grows when you need it. Pay per query.
Apache Spark
> 10 TB
Splits big jobs across many servers. Runs in your own cloud.
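The size tiers above amount to a simple routing rule. A minimal sketch, assuming decimal units (1 GB = 10^9 bytes) and a hypothetical `pick_engine` function; the real selection logic may weigh more than dataset size.

```python
GB = 10**9   # assumption: decimal gigabytes
TB = 10**12  # assumption: decimal terabytes


def pick_engine(size_bytes: int) -> str:
    """Map a dataset size to the engine tier listed above (hypothetical helper)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= 100 * GB:
        return "duckdb"      # up to and including 100 GB
    if size_bytes <= 10 * TB:
        return "warehouse"   # above 100 GB, up to 10 TB
    return "spark"           # above 10 TB
```

For example, `pick_engine(5 * GB)` returns `"duckdb"` and `pick_engine(50 * TB)` returns `"spark"`.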
Automatic orchestration
Every pipeline runs on managed Apache Airflow in your own cloud. Scheduling, retries, and monitoring built in.
Cron-based scheduling with configurable intervals. Set it once, and your pipelines run on autopilot.
Failed tasks retry automatically. Run backfills across historical date ranges with sequential triggering.
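A backfill with sequential triggering and automatic retries can be pictured roughly as follows. This is a plain-Python sketch and not Apache Airflow's API; the daily interval, the `max_retries` default, and the function names are assumptions made for illustration.

```python
from datetime import date, timedelta


def backfill_dates(start: date, end: date):
    """Yield each day from start to end inclusive, oldest first."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def run_with_retries(task, run_date, max_retries=3):
    """Call task(run_date), retrying on failure up to max_retries extra times."""
    for attempt in range(max_retries + 1):
        try:
            return task(run_date)
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error


def backfill(task, start: date, end: date, max_retries=3):
    """Trigger one run per date, strictly in order, each finishing before the next starts."""
    return [run_with_retries(task, d, max_retries) for d in backfill_dates(start, end)]
```

Running dates one at a time keeps later runs from starting before the data they depend on has been rebuilt.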
Semantic Layer & BI
One Source of Truth for Every Metric
Define metrics once and use them everywhere: dashboards, AI queries, exports. No more conflicting definitions across tools.
Define revenue, churn, ARR, and any custom metric once. Tag them as certified so everyone queries the same number.
Organize dimensions into hierarchies like region → country → city. Drill-down and roll-up just work.
Map joins between tables in your semantic layer. AI uses these relationships to write correct multi-table queries automatically.
Define business terms and metrics in one shared glossary, so every person and every AI uses the same definitions.
Describe the dashboard you need. The BI Developer analyzes your tables and builds widgets with the right chart types, filters, and metrics.
Scheduled LLM-narrated insights delivered to email, Slack, or webhooks. Your team gets actionable summaries, not raw data.
Share interactive dashboards with your team. Role-based access controls keep the right people on the right data.
Ask a question in plain English or write SQL directly. Query any connected table and get instant charts and tables back.
Data Quality
Trust Your Data Before Anyone Sees It
Your Quality Engineer runs automatic scoring, AI-generated validation rules, and real-time alerts so bad data gets caught before it reaches your dashboards.
Every table is scored on completeness, accuracy, freshness, validity, and consistency. Quality gates pause pipelines when scores drop below threshold, so bad data never reaches downstream tables.
AI analyzes your data and generates validation rules automatically: null checks, range bounds, format patterns, and custom SQL rules. Mix LLM-generated and hand-crafted rules.
Get notified when quality drops. Route alerts to Slack, email, or webhooks. Profile any table with one click: distributions, outliers, null rates, and cardinality.
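The quality gate described above can be sketched as a score check across the five dimensions. The equal weighting, the 0 to 1 scale, the 0.9 default threshold, and the function names are all assumptions; the page does not say how scores are combined.

```python
DIMENSIONS = ("completeness", "accuracy", "freshness", "validity", "consistency")


def quality_score(scores: dict) -> float:
    """Average the five dimension scores (assumed to be on a 0-1 scale, equally weighted)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise KeyError(f"missing dimension scores: {missing}")
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


def gate_allows_run(scores: dict, threshold: float = 0.9) -> bool:
    """Return True if the pipeline may continue, False if the gate should pause it."""
    return quality_score(scores) >= threshold
```

A table that scores 1.0 on four dimensions and 0.0 on freshness averages 0.8 here, so a 0.9 gate would pause the pipeline.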
Connectors
Connect, Transform, Export
One-click OAuth or service account auth, with automatic schema inference. Ingest from any source, export to any destination.
Google's serverless data warehouse
Object storage for files and data lakes
Any REST API with JSON or CSV responses
AWS object storage
AWS data warehouse
Popular relational database
Cloud data warehouse
Popular relational database
Any GraphQL API endpoint
Write processed results to your warehouse for BI tools
Export as Parquet, CSV, or JSON to any GCS bucket
Pipeline completion and schema change notifications via Slack, email, or custom webhooks
Your Cloud
Your infrastructure. Our orchestration.
OptimaFlo runs in your own cloud, so your data never leaves. You pay one flat plan price and bring your own LLM key. No per-query or per-DBU bills, no surprise cloud spend.
Your GCP project
Everything runs inside your own GCP project. We set it up. You own it.
Data never leaves
Your raw data, processed tables, and query results stay in your storage. We orchestrate, never store.
Managed orchestration
Pipeline scheduling set up and managed for you. New workflows sync automatically.
Polaris catalog
Each workspace gets its own data catalog. Full isolation between teams and projects.
Automated provisioning
One-click setup. Networking, permissions, storage, and compute configured automatically.
No data lock-in
Your data stays portable, wherever you run it. Open standards under the hood keep it that way.
Enterprise security
RBAC, workspace-level permissions, and encryption at rest. Built in from day one, not sold as an upgrade.
Execution audit trail
Every pipeline run is tracked with status, timing, and errors. Iceberg keeps full snapshot history and schema versions for compliance.
More data than people? Put an AI data team on it.
From raw data to live dashboards in one conversation.
Now in early beta. One flat plan, no per-query tax. Runs in your cloud. Your data never leaves.