Incremental view maintenance #1

avamingli · 2023-07-28T08:22:57Z

closes: #ISSUE_Number

Change logs

Describe your change clearly, including what problem is being solved or what feature is being added.

If it has some breaking backward or forward compatibility, please clary.

Why are the changes needed?

Describe why the changes are necessary.

Does this PR introduce any user-facing change?

If yes, please clarify the previous behavior and the change this PR proposes.

How was this patch tested?

Please detail how the changes were tested, including manual tests and any relevant unit or integration tests.

Contributor's Checklist

Here are some reminders and checklists before/when submitting your pull request, please check them:

Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
Sign the Contributor License Agreement as prompted for your first-time contribution.
List your communication in the GitHub Issues or Discussions (if has or needed).
Document changes.
Add tests for the change
Pass make installcheck
Pass make -C src/test installcheck-cbdb-parallel
Feel free to @cloudberrydb/dev team for review and approval when your PR is ready🥳

Allow to create Incrementally Maintainable Materialized View (IMMV) by using INCREMENTAL option in CREATE MATERIALIZED VIEW command as follow: CREATE [INCREMANTAL] MATERIALIZED VIEW xxxxx AS SELECT ....;

If this boolean column is true, a relations is Incrementally Maintainable Materialized View (IMMV). This is set when IMMV is created.

Originally, tuplestores of AFTER trigger's transition tables were freed for each query depth. For our IVM implementation, we would like to prolong life of the tuplestores because we have to preserve them for a whole query assuming that some base tables might be changed in some trigger functions.

Add tab completion and meta-command output for IVM.

In this implementation, AFTER triggers are used to collect tuplestores containing transition table contents. When multiple tables are changed, multiple AFTER triggers are invoked, then the final AFTER trigger performs actual update of the matview. In addition, BEFORE triggers are also used to handle global information for view maintenance. To calculate view deltas, we need both pre-state and post-state of base tables. Post-update states are available in AFTER trigger, and pre-update states can be calculated by removing inserted tuples and appending deleted tuples. Insterted tuples are filtered using the snapshot taken before table modiication, and deleted tuples are contained in the old transition table. Incrementally Maintainable Materialized Views (IMMV) can contain duplicated tuples. This patch also allows self-join, simultaneous updates of more than one base table, and multiple updates of the same base table.

We used to rename index-backed constraints in the EXCHANGE PARTITION command in 6X. Now we don't. We've decided to keep that behavior in 7X after looking into the opposing arguments: Argument #1. It might cause problem during upgrade. - Firstly, we won't be using legacy syntax in the dump scripts so we just need to worry about the existing tables produced by EXCHANGE PARTITION. I.e. whether or not they can be restored correctly. - For upgrading from 6X->7X, since those tables already have matched constraint and index names with the table names, we should be OK. - For upgrading 7X->onward, since we implement EXCHANGE PARTIITON simply as a combination of upstream-syntax commands (see AtExecGPExchangePartition()), pg_upgrade should be able to handle them. We've verified that manually and the automated test should cover that too. In fact, this gives another point that we shouldn't do our own hacky things as part of EXCHANGE PARTITION which might confuse pg_upgrade. Argument #2. It might surprise the users and their existing workloads. - The indexed constraint names are all implicitly generated and shouldn't be directly used in most cases. - They are not the only thing that will appear changed. E.g. the normal indexes (e.g. B-tree ones) will be too. So the decision to change one should be made with changing all of them. More details see: https://docs.google.com/document/d/1enJdKYxkk5WFRV1WoqIgLgRxCGxOqI2nglJVE_Wglec

## Problem An error occurs in python lib when a plpython function is executed. After our analysis, in the user's cluster, a plpython UDF was running with the unstable network, and got a timeout error: `failed to acquire resources on one or more segments`. Then a plpython UDF was run in the same session, and the UDF failed with GC error. Here is the core dump: ``` 2023-11-24 10:15:18.945507 CST,,,p2705198,th2081832064,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log: #0 0x7f7c68b6d55b in frame_dealloc /home/cc/repo/cpython/Objects/frameobject.c:509:5 #1 0x7f7c68b5109d in gen_send_ex /home/cc/repo/cpython/Objects/genobject.c:108:9 #2 0x7f7c68af9ddd in PyIter_Next /home/cc/repo/cpython/Objects/abstract.c:3118:14 #3 0x7f7c78caa5c0 in PLy_exec_function /home/cc/repo/gpdb6/src/pl/plpython/plpy_exec.c:134:11 #4 0x7f7c78cb5ffb in plpython_call_handler /home/cc/repo/gpdb6/src/pl/plpython/plpy_main.c:387:13 #5 0x562f5e008bb5 in ExecMakeTableFunctionResult /home/cc/repo/gpdb6/src/backend/executor/execQual.c:2395:13 #6 0x562f5e0dddec in FunctionNext_guts /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:142:5 #7 0x562f5e0da094 in FunctionNext /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:350:11 apache#8 0x562f5e03d4b0 in ExecScanFetch /home/cc/repo/gpdb6/src/backend/executor/execScan.c:84:9 apache#9 0x562f5e03cd8f in ExecScan /home/cc/repo/gpdb6/src/backend/executor/execScan.c:154:10 #10 0x562f5e0da072 in ExecFunctionScan /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:380:9 apache#11 0x562f5e001a1c in ExecProcNode /home/cc/repo/gpdb6/src/backend/executor/execProcnode.c:1071:13 apache#12 0x562f5dfe6377 in ExecutePlan /home/cc/repo/gpdb6/src/backend/executor/execMain.c:3202:10 apache#13 0x562f5dfe5bf4 in standard_ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:1171:5 apache#14 0x562f5dfe4877 in ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:992:4 apache#15 0x562f5e857e69 in PortalRunSelect /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1164:4 apache#16 0x562f5e856d3f in PortalRun /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1005:18 apache#17 0x562f5e84607a in exec_simple_query /home/cc/repo/gpdb6/src/backend/tcop/postgres.c:1848:10 ``` ## Reproduce We can use a simple procedure to reproduce the above problem: - set timeout GUC: `gpconfig -c gp_segment_connect_timeout -v 5` and `gpstop -ari` - prepare function: ``` CREATE EXTENSION plpythonu; CREATE OR REPLACE FUNCTION test_func() RETURNS SETOF int AS $$ plpy.execute("select pg_backend_pid()") for i in range(0, 5): yield (i) $$ LANGUAGE plpythonu; ``` - exit from the current psql session. - stop the postmaster of segment: `gdb -p "the pid of segment postmaster"` - enter a psql session. - call `SELECT test_func();` and get error ``` gpadmin=# select test_func(); ERROR: function "test_func" error fetching next item from iterator (plpy_elog.c:121) DETAIL: Exception: failed to acquire resources on one or more segments CONTEXT: Traceback (most recent call last): PL/Python function "test_func" ``` - quit gdb and make postmaster runnable. - call `SELECT test_func();` again and get panic ``` gpadmin=# SELECT test_func(); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> ``` ## Analysis - There is an SPI call in test_func(): `plpy.execute()`. - Then coordinator will start a subtransaction by PLy_spi_subtransaction_begin(); - Meanwhile, if the segment cannot receive the instruction from the coordinator, the subtransaction beginning procedure return fails. - BUT! The Python processor does not know whether an error happened and does not clean its environment. - Then the next plpython UDF in the same session will fail due to the wrong Python environment. ## Solution - Use try-catch to catch the exception caused by PLy_spi_subtransaction_begin() - set the python error indicator by PLy_spi_exception_set() Co-authored-by: Chen Mulong <[email protected]>

In commit a0684da we made a change to use the fixed TOAST_TUPLE_TARGET instead of custom toast_tuple_target that's calculated based on the the blocksize of the AO table for toasting AO table tuples. This caused an issue that some tuples that are well beyond the toast_tuple_target are not being toasted (because they're smaller than TOAST_TUPLE_TARGET). This is ok when the tuples can still fit into the AO table's blocksize. But if not, an error would occur. E.g.: postgres=# create table t (a int, b int[]) WITH (appendonly=true, blocksize=8192); CREATE TABLE postgres=# insert into t select 1, array_agg(x) from generate_series(1, 2030) x; ERROR: item too long (check #1): length 8160, maxBufferLen 8168 CONTEXT: Append-Only table 't' Therefore, we should revert those changes, including: 1. Still use the toast_tuple_target for toasting 'x' and 'e' columns for AO tables. There's also a GPDB_12_MERGE_FIXME in the original comment suggesting to use RelationGetToastTupleTarget for AO table (in order to unify the logic to calculate maxDataLen for heap and AO tuples, I suppose). But that's not possible currently because the calculated toast_tuple_target is stored in AppendOnlyInsertDesc and we cannot get it from the Relation struct. And it seems to me that we won't have a way to do that easily. Therefore, keep this FIXME comment being removed. 2. Still use the toast_tuple_target for toasting 'm' columns for AO tables. Also restore the FIXME comment suggesting that we should use a bigger target size for 'm' columns. This should be something that worth investigating more into. Commit a0684da also includes a change to use custom toast_tuple_target for heap tables, which should be a valid change. I think that fixed an oversight when merging the upstream commit c251336.

yjhjstz added 8 commits July 24, 2023 13:18

Add a syntax to create Incrementally Maintainable Materialized Views

4bca308

Allow to create Incrementally Maintainable Materialized View (IMMV) by using INCREMENTAL option in CREATE MATERIALIZED VIEW command as follow: CREATE [INCREMANTAL] MATERIALIZED VIEW xxxxx AS SELECT ....;

Add relisivm column to pg_class system catalog

ce7deab

If this boolean column is true, a relations is Incrementally Maintainable Materialized View (IMMV). This is set when IMMV is created.

Add Incremental View Maintenance support to pg_dump

77bebdd

Add Incremental View Maintenance support to psql

8f43ca8

Add tab completion and meta-command output for IVM.

Fix: dispatch ivm trigger

6b239f9

test ivm utility mode

e30ffe9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incremental view maintenance #1

Incremental view maintenance #1

avamingli commented Jul 28, 2023

Incremental view maintenance #1

Are you sure you want to change the base?

Incremental view maintenance #1

Conversation

avamingli commented Jul 28, 2023

Change logs

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Contributor's Checklist