pgpg_orca
ORCA, the optimizer behind Greenplum & Apache Cloudberry

Plug ORCA into PostgreSQL 18 & 19-devel.

pg_orca is a PostgreSQL 18 & 19-devel extension that swaps in the ORCA cost-based query optimizer from Greenplum / Apache Cloudberry — exhaustive join enumeration, correlated subquery decorrelation, and dynamic partition pruning. Installed with a single CREATE EXTENSION.

MIT-style license · PostgreSQL 18 & 19-devel · macOS & Linux · Auto-fallback on unsupported queries

Speedup vs PG 18
≥254×
TPC-DS Q1 — ORCA 0.5 s vs PG timeout (120 s)
Measured on a single-node PG 18 vs pg_orca at par=0. See benchmarks for methodology.
Why pg_orca

Three limits of the PostgreSQL query planner.

ORCA was built at Greenplum for exactly this class of problem. pg_orca brings that engine to single-node PostgreSQL 18 & 19-devel — without replacing the planner you already have.

Heuristic join enumeration

PostgreSQL switches to a genetic algorithm at 12+ relations (geqo_threshold). Below that, it greedily prefers left-deep trees. Neither path searches the bushy plan space — and both are sensitive to the planner's row estimates.

-- 6+ way joins like TPC-DS Q25
-- left-deep tree picked even when bushy is dramatically cheaper
SELECT ... FROM store_sales
  JOIN store_returns ON ...
  JOIN catalog_sales ON ...
  JOIN customer    ON ...
  JOIN item        ON ...
  JOIN date_dim    ON ...
GROUP BY ...;

Correlated subqueries become SubPlan

Anything beyond plain EXISTS / NOT EXISTS — equality SELECTs, IN-subqueries, nested correlation, subqueries with aggregates — drops to SubPlan / InitPlan: re-executed per outer row instead of being rewritten into a single join.

-- TPC-H Q17 pattern
SELECT SUM(l_extendedprice) / 7.0
FROM lineitem, part
WHERE p_partkey = l_partkey
  AND l_quantity < (
    SELECT 0.2 * AVG(l_quantity)
    FROM lineitem
    WHERE l_partkey = p_partkey   -- correlated
  );

Partition pruning happens too late

PostgreSQL prunes partitions from constant predicates at plan time, but join-key pruning depends on runtime values the planner can't see. ORCA's DynamicTableScan + PartitionSelector decides which partitions to read after the join key is known — pruning that PG simply does not perform.

-- Star-join: prune sales partitions by date_dim.d_year = 2002
-- PG: scans every sales partition, joins, filters after
-- ORCA: PartitionSelector pushes the year filter to the scan
SELECT SUM(ss_ext_sales_price)
FROM store_sales            -- partitioned by ss_sold_date_sk
JOIN date_dim ON ss_sold_date_sk = d_date_sk
WHERE d_year = 2002;        -- prunes ~95% of partitions at runtime
Benchmarks

TPC-H and TPC-DS benchmarks: pg_orca vs PostgreSQL 18.

ORCA's wins cluster on complex queries with correlated subqueries, multi-CTE joins, and 6+ way roll-ups — the cases PostgreSQL's planner treats heuristically.

Speedup vs PostgreSQL 18 (×, log scale)par = 0
  • Q1
    correlated subquery + multi-CTE · PG TIMEOUT
    ≥254×
  • Q30
    correlated aggregate
    107×
  • Q6
    correlated EXISTS over multi-join · PG TIMEOUT
    ≥91×
  • Q81
    correlated subquery
    77.0×
  • Q41
    correlated IN + EXISTS
    37.4×
  • Q74
    multi-CTE self-join · PG TIMEOUT
    ≥29×
  • Q54
    correlated nested subquery
    27.8×
  • Q11
    multi-CTE self-join · PG TIMEOUT
    ≥23×
  • Q37
    short query — planning overhead
    0.05×

Top 8 wins plus the largest regression. Bars use a log scale so 254× and 20× both stay readable. "≥N×" rows are lower bounds where PG hit the 120s timeout.

Overall

On the 94 queries both finished: ORCA 380 s vs PG 453 s.

0.77×
geomean
0.71×
median
17
ORCA wins >5%
of 94
Coverage edge

ORCA finished 5 queries PG timed out at 120 s — Q1, Q4, Q6, Q11, Q74. Counting the timeout floor → ORCA ≥ 2.77× faster overall.

Methodology

Single-node PG 18 vs same binary with pg_orca loaded (build b17245d, adaptive join-order downshift). 3-run median of EXPLAIN ANALYZE execution time.
max_parallel_workers_per_gather = 0 · statement_timeout = 120 s. Generated 2026-05-27.

Key facts

pg_orca benchmark facts (2026-05-27, build b17245d)

  • On TPC-DS sf=1, pg_orca finishes 99/99 queries; PostgreSQL 18 times out (120 s) on 5 queries: Q1, Q4, Q6, Q11, Q74.
  • Counting the timeout floor, pg_orca is ≥ 2.77× faster than PostgreSQL 18 across all 99 TPC-DS sf=1 queries (ORCA 380 s vs PG ≥ 1,053 s).
  • Largest single-query win: TPC-DS Q1 at ≥ 254× (pg_orca 473 ms vs PG timeout at 120 s).
  • Largest single-query win where both planners finish: TPC-DS Q30 at 107× (pg_orca 221 ms vs PG 23,533 ms).
  • On TPC-H sf=10, pg_orca shows a 1.32× geomean speedup across 22 queries; biggest win is Q17 at 20.83× (correlated subquery onl_quantity < 0.2 × AVG(l_quantity)).
  • All numbers are 3-run median EXPLAIN ANALYZE execution time, with max_parallel_workers_per_gather = 0 and statement_timeout = 120 s.
  • Raw CSV reports: tpch_report_20260527.csv and tpcds_report_20260527.csv.
  • ORCA wins concentrate on correlated subqueries, multi-CTE joins, and 6+ way roll-ups — structural plan-shape decisions that PostgreSQL's planner treats heuristically.
Features

Optimizer features ORCA brings to PostgreSQL.

Four structural plan-shape capabilities — not micro-optimizations.

ORCA's signature capability

Correlated subquery decorrelation

ORCA's CXformApply2Join family algebraically rewrites Apply into a regular Join — so correlated subqueries become a single optimizable plan instead of a per-row SubPlan.

TPC-H Q17
l_quantity < 0.2 × AVG(l_quantity) WHERE l_partkey = p_partkey
20.7×
Bushy plans, cost-driven

Exhaustive join-order enumeration (DPv2)

CJoinOrderDPv2 + commutativity / associativity xforms enumerate the full join space under ORCA's own cardinality model — bushy and left-deep alike. PG's GEQO is a fallback above 12 relations; this is dynamic programming all the way through.

TPC-DS Q25
6-way roll-up join across store_sales × store_returns × catalog_sales × …
9.0×
Stats survive GROUP BY / CTE

Robust statistics propagation

CGroupByStatsProcessor preserves the full input histogram (NDV included) on grouping columns. CTE consumers inherit stats via colid mapping — so downstream joins still see real cardinalities, not defaults.

TPC-DS Q31
2 CTEs, 6-way self-join with stadistinct preserved across grouping
5.8×
Dynamic partition pruning

Partitioned tables

DXL's PartitionSelector is wired through pg_orca's translation layer end-to-end. Partition pruning happens at optimization time and at runtime, on the same plan ORCA generates for non-partitioned scans.

DDL
PARTITION BY RANGE / LIST / HASH — all selector types supported
Architecture

Architecture: how pg_orca turns a query into a plan.

pg_orca registers a planner_hook. When ORCA is enabled, the query takes a detour through DXL — and comes back as a regular PG plan.

PG Query ASTparser/analyzerplanner_hookpg_orca.cppQuery → DXLtranslate/ORCA Optimizerlibgpopt + xformsMemogroups + costStatisticshistograms · NDV · MCVrelcache adaptergpopt/relcache/DXL Planbest-cost alternativePlannedStmtPG Executoron failure: standard_planner
01

DXL — language between two worlds

ORCA was built optimizer-first, database-agnostic. DXL (Data eXchange Language, XML-based) is its IR. pg_orca translates a PostgreSQL Query AST into DXL and the resulting DXL plan back into PG's PlannedStmt.

02

Memo + Xforms — cost-based search

Plans live in a Memo (Volcano/Cascades-style group structure). Transformation rules (xforms) generate alternatives — join associativity, Apply-to-Join, GROUP BY pushdown, index selection — and the cost model picks a winner over the whole space.

03

Failsafe — always returns a plan

Anything ORCA can't handle (unsupported feature, internal error, timeout) falls back to PostgreSQL's standard_planner automatically. Set pg_orca.trace_fallback = on to log each fallback with reason.

Install

Install pg_orca on PostgreSQL 18 or 19-devel in three steps.

Requires PostgreSQL 18 or 19-devel, xerces-c, ICU, CMake ≥ 3.20, and a C++17 compiler. No patches to PostgreSQL itself.

1

Install dependencies

brew install postgresql@18 xerces-c cmake ninja icu4c
2

Clone & build

# 2. Clone & build
git clone https://github.com/quantumiodb/pgorca.git
cd pgorca && mkdir build && cd build
cmake .. -DPG_CONFIG=$(which pg_config) -DCMAKE_BUILD_TYPE=Release -GNinja
ninja -j$(nproc)
ninja install
3

Install, preload, enable, run

psqlsql
-- 3. Install the extension in the target database.
--    CREATE EXTENSION loads the library into the current session,
--    so pg_orca.* GUCs and planner_hook are live immediately.
CREATE EXTENSION pg_orca;

-- 4. (Recommended) Auto-load pg_orca for every new connection
--    to this database. Per-database scope, no server restart;
--    takes effect for sessions opened from this point on.
ALTER DATABASE mydb SET session_preload_libraries = 'pg_orca';

-- 5. Enable ORCA — per session, or persistently with
--    ALTER DATABASE mydb SET pg_orca.enable_orca = on
SET pg_orca.enable_orca = on;

-- 6. Run a query — ORCA optimizes it.
EXPLAIN SELECT * FROM lineitem WHERE l_quantity < 5;

Why session_preload_libraries, not shared_preload_libraries

session_preload_libraries loads pg_orca into each new backend at connection time — no postmaster restart, scoped to one database (or one role), and easily reverted with RESET. Existing sessions stay on the standard planner until they reconnect.

Alternative scopes

Pick the scope that matches how broadly you want pg_orca enabled.

psqlsql
-- Cluster-wide (every database, every role):
ALTER SYSTEM SET session_preload_libraries = 'pg_orca';
SELECT pg_reload_conf();

-- Single role only:
ALTER ROLE bench SET session_preload_libraries = 'pg_orca';

-- Co-exist with sibling preload libraries — list them explicitly:
ALTER DATABASE mydb SET session_preload_libraries = 'pg_orca,pg_stat_statements';

-- Roll back:
ALTER DATABASE mydb RESET session_preload_libraries;
DROP EXTENSION pg_orca;
Honest

What pg_orca is not good at.

Optimizer quality cuts both ways. The same exhaustive search that produces 254× wins on complex analytics costs measurable planning time on simple queries — and ORCA hasn't learned PG's parallel executor yet.

13.7 ms vs 0.25 ms

Planning overhead is high

ORCA's CBO is consistently heavier than PG's planner. On a 1-row point lookup, planning alone takes ~14 ms — where PG finishes in under 1 ms. For OLTP / latency-sensitive paths, planning dominates total wall time.

Single-worker plans only

No parallel query yet

ORCA generates serial plans — Gather / Parallel Seq Scan / Parallel Hash Join nodes are not emitted. Benchmarks are run with max_parallel_workers_per_gather = 0 for an apples-to-apples comparison. On hardware where PG benefits from parallelism, ORCA's serial plan can lose on wall time even when its plan shape is structurally better.

Failsafe: always returns a plan

Any query ORCA can't handle — unsupported feature, internal error, timeout — falls back to PostgreSQL's standard_planner automatically. Enable pg_orca.trace_fallback to log every fallback with reason.

FAQ

pg_orca FAQ — install, compatibility, license.

No. It registers a planner_hook and is opt-in via SET pg_orca.enable_orca = on — either per session, or persistently with ALTER DATABASE mydb SET pg_orca.enable_orca = on. When disabled, PostgreSQL behaves exactly as it would without the extension loaded.
Get in touch

Questions, integration help, or commercial support?

Reach the team for benchmark questions, deployment advice, or to discuss using pg_orca in your stack.