gpdb: FOR over SELECT fails on temporary tablespace in PL/pgSQL

Bug Report

Greenplum version or build

6.18.2, 6X_STABLE (current top e3e95865e77)

The issue is very strange, different versions behave quite differently. See details in the following sections.

Expected behavior

The following code won’t fail:

create tablespace myspace location '/tmp/xxx123';
set temp_tablespaces = myspace; -- this is important to show the bug, without this it works.
DO
$$
    DECLARE l_rec RECORD;
    BEGIN
        DROP TABLE IF EXISTS tmp_table;
        CREATE TEMPORARY TABLE tmp_table
        (
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
        )   DISTRIBUTED RANDOMLY;
        INSERT INTO tmp_table VALUES ('a',1),('b',2);
        FOR l_rec IN ( SELECT key, value
                       FROM tmp_table
                       )
            LOOP
                RAISE NOTICE '%:%', l_rec.key, l_rec.value;
            END LOOP;
    END;
$$;

Actual behavior

Current 6X (top commit: e3e95865e), I hit two kinds of errors:

In a newly created cluster, run the case:

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------
 PostgreSQL 9.4.26 (Greenplum Database 6.18.2+dev.171.ge3e95865e7 build dev) 
(1 row)

gpadmin=# DO
$$
    DECLARE l_rec RECORD;
    BEGIN
        DROP TABLE IF EXISTS tmp_table;
        CREATE TEMPORARY TABLE tmp_table
        (
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
        )   DISTRIBUTED RANDOMLY;
        INSERT INTO tmp_table VALUES ('a',1),('b',2);
        FOR l_rec IN ( SELECT key, value
                       FROM tmp_table
                       )
            LOOP
                RAISE NOTICE '%:%', l_rec.key, l_rec.value;
            END LOOP;
    END;
$$;
NOTICE:  table "tmp_table" does not exist, skipping
CONTEXT:  SQL statement "DROP TABLE IF EXISTS tmp_table"
PL/pgSQL function inline_code_block line 4 at SQL statement
ERROR:  too many LWLocks taken (lwlock.c:559)  (seg0 slice1 127.0.1.1:6002 pid=228592) (lwlock.c:559)
CONTEXT:  PL/pgSQL function inline_code_block line 11 at FOR over SELECT rows
gpadmin=#

log out and then log in to re-run:

gpadmin=# set temp_tablespaces = myspace;
SET
gpadmin=# DO
$$
    DECLARE l_rec RECORD;
    BEGIN
        DROP TABLE IF EXISTS tmp_table;
        CREATE TEMPORARY TABLE tmp_table
        (
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
        )   DISTRIBUTED RANDOMLY;
        INSERT INTO tmp_table VALUES ('a',1),('b',2);
        FOR l_rec IN ( SELECT key, value
                       FROM tmp_table
                       )
            LOOP
                RAISE NOTICE '%:%', l_rec.key, l_rec.value;
            END LOOP;
    END;
$$;
NOTICE:  table "tmp_table" does not exist, skipping
CONTEXT:  SQL statement "DROP TABLE IF EXISTS tmp_table"
PL/pgSQL function inline_code_block line 4 at SQL statement
ERROR:  could not create directory "pg_tblspc/16414/GPDB_6_301908232/16410": No such file or directory
CONTEXT:  SQL statement "CREATE TEMPORARY TABLE tmp_table
        (
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
        )   DISTRIBUTED RANDOMLY"
PL/pgSQL function inline_code_block line 5 at SQL statement

Then drop tablespace and re-create and rerun:

gpadmin=# drop tablespace myspace ;
DROP TABLESPACE
gpadmin=# create tablespace myspace location '/tmp/xxx123';
CREATE TABLESPACE
gpadmin=# set temp_tablespaces = myspace;
SET
gpadmin=# DO
$$
    DECLARE l_rec RECORD;
    BEGIN
        DROP TABLE IF EXISTS tmp_table;
        CREATE TEMPORARY TABLE tmp_table
        (
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
        )   DISTRIBUTED RANDOMLY;
        INSERT INTO tmp_table VALUES ('a',1),('b',2);
        FOR l_rec IN ( SELECT key, value
                       FROM tmp_table
                       )
            LOOP
                RAISE NOTICE '%:%', l_rec.key, l_rec.value;
            END LOOP;
    END;
$$;
NOTICE:  table "tmp_table" does not exist, skipping
CONTEXT:  SQL statement "DROP TABLE IF EXISTS tmp_table"
PL/pgSQL function inline_code_block line 4 at SQL statement
ERROR:  relation not found (OID 16453)  (seg0 slice1 127.0.1.1:6002 pid=228752)
DETAIL:  This can be validly caused by a concurrent delete operation on this object.
CONTEXT:  PL/pgSQL function inline_code_block line 11 at FOR over SELECT rows

Test using 6.18.2:

psql:a.sql:25: ERROR:  could not open existing temporary file "base/pgsql_tmp/pgsql_tmp_sess15_w3326427_qdxid7_sync8": No such file or directory (fd.c:1532)  (seg0 slice1 10.117.190.60:6002 pid=3326433) (fd.c:1532)

Step to reproduce the behavior

Run the code:

create tablespace myspace location '/tmp/xxx123';
set temp_tablespaces = myspace; -- this is important to show the bug, without this it works.
DO
$$
    DECLARE l_rec RECORD;
    BEGIN
        DROP TABLE IF EXISTS tmp_table;
        CREATE TEMPORARY TABLE tmp_table
        (
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
        )   DISTRIBUTED RANDOMLY;
        INSERT INTO tmp_table VALUES ('a',1),('b',2);
        FOR l_rec IN ( SELECT key, value
                       FROM tmp_table
                       )
            LOOP
                RAISE NOTICE '%:%', l_rec.key, l_rec.value;
            END LOOP;
    END;
$$;

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 17 (14 by maintainers)

Commits related to this issue

Most upvoted comments

To provide clarity on the issue with #12755, here is stack where recursion happens (will post other update for relation not found error)

High-level

  • a reading shared snapshot in reader gang needs to read catalog to get tablespace oid
  • which in turn needs to load shared combo cids
  • for loading combo cids again needs BufFile as well
#2274 0x000055fcf1539219 in get_tablespace_oid (tablespacename=0x55fcf3e7a778 "myspace", missing_ok=1 '\001') at tablespace.c:1770                                                                                                                                                   
#2275 0x000055fcf1539104 in PrepareTempTablespaces () at tablespace.c:1717                                                                                                                                                                                                           
#2276 0x000055fcf1797e01 in BufFileOpenNamedTemp (fileName=0x7ffc2d56a070 "sess8_w200292_combocid_map", interXact=1 '\001') at buffile.c:302                                                                                                                                         
#2277 0x000055fcf19eee32 in loadSharedComboCommandId (xmin=746, combocid=0, cmin=0x7ffc2d56a4b0, cmax=0x7ffc2d56a4b4) at combocid.c:548                                                                                                                                              
#2278 0x000055fcf19ee85b in getSharedComboCidEntry (xmin=746, combocid=0, min_or_max=CMIN) at combocid.c:411                                                                                                                                                                         
#2279 0x000055fcf19ee912 in GetRealCmin (xmin=746, combocid=0) at combocid.c:426                                                                                                                                                                                                     
#2280 0x000055fcf19ee16b in HeapTupleHeaderGetCmin (tup=0x7fe358e08458) at combocid.c:144                                                                                                                                                                                            
#2281 0x000055fcf19f0cc5 in HeapTupleSatisfiesMVCC (relation=0x7fe35862d608, htup=0x7ffc2d56a640, snapshot=0x55fcf1e10f20 <CatalogSnapshotData>, buffer=3) at tqual.c:1055                                                                                                           
#2282 0x000055fcf12e142a in heapgetpage (scan=0x55fcf3e6b158, page=0) at heapam.c:441                                                                                                                                                                                                
#2283 0x000055fcf12e257f in heapgettup_pagemode (scan=0x55fcf3e6b158, dir=ForwardScanDirection, nkeys=1, key=0x55fcf3da0a18) at heapam.c:800                                                                                                                                         
#2284 0x000055fcf12e43c6 in heap_getnext (scan=0x55fcf3e6b158, direction=ForwardScanDirection) at heapam.c:1769                                                                                                                                                                      
#2285 0x000055fcf1305b19 in systable_getnext (sysscan=0x55fcf3da09c0) at genam.c:422                                                                                                                                                                                                 
#2286 0x000055fcf19362d1 in ScanPgRelation (targetRelId=2662, indexOK=0 '\000', force_non_historic=0 '\000') at relcache.c:357                                                                                                                                                       
#2287 0x000055fcf1938ef1 in RelationReloadIndexInfo (relation=0x55fcf3d8ae48) at relcache.c:1927                                                                                                                                                                                     
#2288 0x000055fcf1938be5 in RelationIdGetRelation (relationId=2662) at relcache.c:1757                                                                                                                                                                                               
#2289 0x000055fcf12e32e0 in relation_open (relationId=2662, lockmode=1) at heapam.c:1086                                                                                                                                                                                             
#2290 0x000055fcf130628d in index_open (relationId=2662, lockmode=1) at indexam.c:167                                                                                                                                                                                                
#2291 0x000055fcf1305841 in systable_beginscan (heapRelation=0x7fe35862d608, indexId=2662, indexOK=1 '\001', snapshot=0x55fcf1e10f20 <CatalogSnapshotData>, nkeys=1, key=0x7ffc2d56a970) at genam.c:322                                                                              
#2292 0x000055fcf19362c1 in ScanPgRelation (targetRelId=1259, indexOK=1 '\001', force_non_historic=0 '\000') at relcache.c:352                                                                                                                                                       
#2293 0x000055fcf1939240 in RelationReloadNailed (relation=0x7fe35862d608) at relcache.c:2049                                                                                                                                                                                        
#2294 0x000055fcf1939664 in RelationClearRelation (relation=0x7fe35862d608, rebuild=1 '\001') at relcache.c:2177                                                                                                                                                                     
#2295 0x000055fcf1938bf8 in RelationIdGetRelation (relationId=1259) at relcache.c:1759                                                                                                                                                                                               
#2296 0x000055fcf12e32e0 in relation_open (relationId=1259, lockmode=1) at heapam.c:1086                                                                                                                                                                                             
#2297 0x000055fcf12e3b00 in heap_open (relationId=1259, lockmode=1) at heapam.c:1452                                                                                                                                                                                                 
#2298 0x000055fcf193624f in ScanPgRelation (targetRelId=1213, indexOK=1 '\001', force_non_historic=0 '\000') at relcache.c:338                                                                                                                                                       
#2299 0x000055fcf1937198 in RelationBuildDesc (targetRelId=1213, insertIt=1 '\001') at relcache.c:886                                                                                                                                                                                
#2300 0x000055fcf1938c5a in RelationIdGetRelation (relationId=1213) at relcache.c:1778                                                                                                                                                                                               
#2301 0x000055fcf12e32e0 in relation_open (relationId=1213, lockmode=1) at heapam.c:1086                                                                                                                                                                                             
#2302 0x000055fcf12e3b00 in heap_open (relationId=1213, lockmode=1) at heapam.c:1452                                                                                                                                                                                                 
#2303 0x000055fcf1539219 in get_tablespace_oid (tablespacename=0x55fcf3da08c8 "myspace", missing_ok=1 '\001') at tablespace.c:1770                                                                                                                                                   
#2304 0x000055fcf1539104 in PrepareTempTablespaces () at tablespace.c:1717                                                                                                                                                                                                           
#2305 0x000055fcf1797e01 in BufFileOpenNamedTemp (fileName=0x55fcf1e72f60 <filename> "sess8_w200292_qdxid15_sync33", interXact=0 '\000') at buffile.c:302                                                                                                                            
#2306 0x000055fcf19f6456 in readSharedLocalSnapshot_forCursor (snapshot=0x55fcf1e10e20 <CurrentSnapshotData>, distributedTransactionContext=DTX_CONTEXT_QE_READER) at sharedsnapshot.c:744                                                                                           
#2307 0x000055fcf17a50d7 in GetSnapshotData (snapshot=0x55fcf1e10e20 <CurrentSnapshotData>, distributedTransactionContext=DTX_CONTEXT_QE_READER) at procarray.c:2107                                                                                                                 
#2308 0x000055fcf19f2582 in GetTransactionSnapshot () at snapmgr.c:220                                                                                                                                                                                                               
#2309 0x000055fcf17de112 in PortalStart (portal=0x55fcf3db0058, params=0x0, eflags=0, snapshot=0x0, ddesc=0x55fcf3d9b910) at pquery.c:649                                                                                                                                            
#2310 0x000055fcf17d54df in exec_mpp_query (query_string=0x55fcf3d9a360 "( SELECT key, value\n", ' ' <repeats 23 times>, "FROM tmp_table\n", ' ' <repeats 23 times>, ")", serializedQuerytree=0x0, serializedQuerytreelen=0, serializedPlantree=0x55fcf3d9a3b3 "(\265/\375`\210\001\\
065\a", serializedPlantreelen=240, serializedParams=0x0, serializedParamslen=0, serializedQueryDispatchDesc=0x55fcf3d9a4a3 "(\265/\375 \362\025\004", serializedQueryDispatchDesclen=139) at postgres.c:1358                                                                         
#2311 0x000055fcf17dc0fd in PostgresMain (argc=1, argv=0x55fcf3cfad98, dbname=0x55fcf3cfac30 "postgres", username=0x55fcf3cfac10 "ashwin") at postgres.c:5400                                                                                                                        
#2312 0x000055fcf173ab2f in BackendRun (port=0x55fcf3d23ac0) at postmaster.c:4815                                                                                                                                                                                                    
#2313 0x000055fcf173a0d2 in BackendStartup (port=0x55fcf3d23ac0) at postmaster.c:4472                                                                                                                                                                                                
#2314 0x000055fcf1735b6b in ServerLoop () at postmaster.c:1948                                                                                                                                                                                                                       
#2315 0x000055fcf1734f15 in PostmasterMain (argc=5, argv=0x55fcf3cf8cf0) at postmaster.c:1518                                                                                                                                                                                        
#2316 0x000055fcf161d21f in main (argc=5, argv=0x55fcf3cf8cf0) at main.c:245                                                                                                                                                                                                         

I am feeling for shared snapshot and for combocids for sure we should always place them in default locations. temp_tablespaces are really more for spill files and such, not for such small meta-data files. Though need to see how that can possibly be achieved from code.

One of the workarounds for this issue could be to avoid PrepareTempTablespaces() if file outlives the transaction boundaries as in that case anyways OpenNamedTemporaryFile() is going to place the file in database’s default location. For combocids we always pass interXact to true currently.

diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 3b287b7ad10..65b3b68db4b 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -258,8 +258,13 @@ BufFileCreateNamedTemp(const char *fileName, bool interXact, workfile_set *work_
         * pretty hard-to-detect bug.  Callers may prefer to do it earlier if they
         * want to be sure that any required catalog access is done in some other
         * resource context.
+        *
+        * if the temp file is slated to outlive the current transaction, then its
+        * always placed in database's default tablespace, hence no need to
+        * PrepareTempTablespaces() for it.
         */
-       PrepareTempTablespaces();
+       if (!interXact)
+               PrepareTempTablespaces();
 
        pfile = OpenNamedTemporaryFile(fileName,
                                                                   true, /* create */
@@ -298,8 +303,14 @@ BufFileOpenNamedTemp(const char *fileName, bool interXact)
         * pretty hard-to-detect bug.  Callers may prefer to do it earlier if they
         * want to be sure that any required catalog access is done in some other
         * resource context.
+        *
+        * if the temp file is slated to outlive the current transaction, then its
+        * always placed in database's default tablespace, hence no need to
+        * PrepareTempTablespaces() for it.
         */
-       PrepareTempTablespaces();
+       if (!interXact)
+               PrepareTempTablespaces();
+
        pfile = OpenNamedTemporaryFile(fileName,
                                                                   false,       /* create */
                                                                   false,       /* delOnClose */

Note: gpdb master branch doesn’t face this issue or challenge because we moved away from writing shared snapshot and combocid to on-disk file. Instead we use DSM segment shared by the QE writer and reader process for this.

Why 6X_STABLE (top fc39bc9cd35972) now will hit infinite loop?

Toast table and comb cid

See tmp_table’s definition:

CREATE TEMPORARY TABLE tmp_table
(
            key VARCHAR NOT NULL,
            value   VARCHAR NOT NULL
)   DISTRIBUTED RANDOMLY;

The table’s fields is toast-able, so while creating the table, it will also create a related toast table, and a heap_update will be invoked to set the pg_class entry’s reltoastrelid column.

In the same transaction, it creates a tuple and update a tuple, so the old tuple’s cmin_cmax will be a comb cid. (You can use pageinspect and gp_select_invisible to do some local experiment).

Reader Gang of Cursor

It will invoke readSharedLocalSnapshot_forCursor to try to fetch the buf-file and load the snapshot, and if the GUC temp_tablespaces is set, it will try to get the information of the tablespace, this leads to catalog access (detailed please refer above comments call stack https://github.com/greenplum-db/gpdb/issues/12871#issuecomment-982205797):

  1. it (the cursor’s reader gang) want to open the catalog pg_tablespace (oid = 1213)
  2. it then by a call chain seems a littlecompliated, it will try to scan pg_class using seq scan
  3. if it encounter the old tuple of tmp_table in pg_class (here by old, I mean the one before heap_update, the one that not set toastrelid), that tuple is created and updated by current transaction, so it will try to compare command id
  4. the command id is a comb cid, so it will try to fetch it from the hash table
  5. the hash table is in wrtiable gang’s local memory (not shared), so the cursor’s reader gang will invoke loadSharedComboCommandId (we are cursor, we are reader gang, let’s go to read the buf-file and load it)
  6. BufFileOpenNamedTemp is invoked, chain: GetRealCmin -> getSharedComboCidEntry -> loadSharedComboCommandId -> BufFileOpenNamedTemp
  7. infinite loop happens…

Misc: not toast-able

So if the table cannot be toast-able, we can never hit this issue, because there is no comb cid.

I try this code and never hit the issue:

create tablespace myspace location '/tmp/xxx123';
set temp_tablespaces = myspace; -- this is important to show the bug, without this it works.
DO
$$
    DECLARE l_rec RECORD;
    BEGIN
        DROP TABLE IF EXISTS tmp_table;
        CREATE TEMPORARY TABLE tmp_table
        (
            key int NOT NULL,
            value   int NOT NULL
        )   DISTRIBUTED RANDOMLY;
        INSERT INTO tmp_table VALUES (1,1),(2,2);
        FOR l_rec IN ( SELECT key, value
                       FROM tmp_table
                       )
            LOOP
                RAISE NOTICE 'ok';
            END LOOP;
    END;
$$;

Un-answered Questions

  1. why later calls does not fall into infinite loop?
  2. why some later calls neither infinite loop nor throw oid not found, instead, it throws ERROR: could not create directory "pg_tblspc/16414/GPDB_6_301908232/16410": No such file or directory?
  3. return the reproduce script, several times, it will PANIC (the coredump is generated by lwlock is already held error, it is seen in dev machine in Beijing office, I cannot reproduce now in my laptop), will try tomorrow.

Let me for sake of completeness post the RCA for the original issue that was reported for 6.18.2 release, the error psql:a.sql:25: ERROR: could not open existing temporary file "base/pgsql_tmp/pgsql_tmp_sess15_w3326427_qdxid7_sync8": No such file or directory (fd.c:

In writer gang during DefineRelation() when GetDefaultTablespace() is called, it internally calls PrepareTempTablespaces() which sets up the tablespaces for writer gang. So, the shared snapshot file gets created in temp_tablespaces location. Though in reader gang, no PrepareTempTablespaces() call happens hence it continues to look for the file in default location causing the error.