@avamingli

This commit extends AQUMV (answering queries using materialized views) so that a query can be answered directly from a materialized view whose definition already contains a GROUP BY clause. When the query's grouping matches the view's, the planner reads the precomputed groups from the view and skips the GROUP BY step at execution time.

For example, with a materialized view defined as follows:

CREATE MATERIALIZED VIEW mv_group_1 AS
SELECT c, b, COUNT(b) AS count_b FROM t0 WHERE a > 3 GROUP BY c, b;

An original query like:

SELECT COUNT(b), b, c FROM t0 WHERE a > 3 GROUP BY b, c;

is rewritten to:

SELECT count_b, b, c FROM mv_group_1;

The plan looks like:

explain(costs off, verbose)
select count(b), b, c from t0 where a > 3 group by b, c;
                      QUERY PLAN
---------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)
   Output: count, b, c
   ->  Seq Scan on aqumv.mv_group_1
         Output: count, b, c
 Settings: enable_answer_query_using_materialized_views = 'on',
optimizer = 'off'
 Optimizer: Postgres query optimizer
(6 rows)
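
To reproduce a plan like the one above, a minimal setup might look like the sketch below. The column types of t0 and the sample data are assumptions, since the description only shows that t0 has columns a, b and c. The plan above also places the view in a schema named aqumv, which this sketch does not create.

```sql
-- Assumed schema: the description only shows that t0 has columns a, b and c.
CREATE TABLE t0 (a int, b int, c int);

-- Hypothetical sample data, loaded before the view is created so that the
-- view reflects the current contents of t0.
INSERT INTO t0 SELECT i, i % 10, i % 5 FROM generate_series(1, 1000) AS i;

CREATE MATERIALIZED VIEW mv_group_1 AS
SELECT c, b, COUNT(b) AS count_b FROM t0 WHERE a > 3 GROUP BY c, b;

ANALYZE t0;
ANALYZE mv_group_1;

-- Both settings are taken from the "Settings" line of the plan above.
SET optimizer = off;
SET enable_answer_query_using_materialized_views = on;

EXPLAIN (COSTS OFF, VERBOSE)
SELECT COUNT(b), b, c FROM t0 WHERE a > 3 GROUP BY b, c;
```

If the rewrite applies, the plan should scan mv_group_1 rather than t0. The exact plan text may differ from the one shown above.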

The two queries return the same rows; only the order of the columns in the select list differs. mv_group_1 already stores the aggregated result, and it was computed only from rows of t0 where a > 3, so neither the filter nor the GROUP BY has to be applied again.
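
One way to check this equivalence by hand, assuming t0 and mv_group_1 from the example exist and the view is up to date, is to compare the two result sets in both directions. This is only a sanity-check sketch, not part of the PR's test suite. AQUMV is switched off first so that the query against t0 is really computed from the base table.

```sql
-- Turn the rewrite off so the t0 query is evaluated against the base table.
SET enable_answer_query_using_materialized_views = off;

-- Rows produced by one query but not the other, in both directions.
-- An empty result means the two queries return the same multiset of rows.
(SELECT COUNT(b), b, c FROM t0 WHERE a > 3 GROUP BY b, c
 EXCEPT ALL
 SELECT count_b, b, c FROM mv_group_1)
UNION ALL
(SELECT count_b, b, c FROM mv_group_1
 EXCEPT ALL
 SELECT COUNT(b), b, c FROM t0 WHERE a > 3 GROUP BY b, c);
```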

Answering the query from the view avoids rescanning t0 and recomputing the aggregate on every execution. The saving is most noticeable on large base tables, where the view typically holds far fewer rows than the table it summarizes.

The feature also applies to Dynamic Tables and Incremental Materialized Views.

Authored-by: Zhang Mingli [email protected]

@avamingli avamingli force-pushed the dev0 branch 4 times, most recently from 75d3b26 to 7e9b277 Compare June 18, 2025 07:55
@my-ship-it my-ship-it merged commit d0e9daf into apache:main Jun 19, 2025
26 checks passed
@avamingli avamingli deleted the dev0 branch June 19, 2025 07:46