LCOV - differential code coverage report
Current view: top level - src/backend/replication/logical - reorderbuffer.c (source / functions) Coverage Total Hit UNC LBC UBC GBC GNC CBC EUB ECB DCB
Current: b45a8d7d8b306b43f31a002f1b3f1dddc8defeaf vs 8767b449a3a1e75626dfb08f24da54933171d4c5 Lines: 93.5 % 1775 1660 3 112 1 27 1632 14
Current Date: 2025-10-28 08:26:42 +0900 Functions: 100.0 % 94 94 12 82
Baseline: lcov-20251028-005825-baseline Branches: 68.8 % 1173 807 1 5 360 1 9 797 3 3
Baseline Date: 2025-10-27 06:37:35 +0000 Line coverage date bins:
Legend: Lines:     hit not hit
Branches: + taken - not taken # not executed
(1,7] days: 100.0 % 1 1 1
(7,30] days: 100.0 % 15 15 15
(30,360] days: 89.3 % 169 151 18 11 140
(360..) days: 93.9 % 1590 1493 3 94 1 1492
Function coverage date bins:
(30,360] days: 100.0 % 14 14 14
(360..) days: 100.0 % 80 80 12 68
Branch coverage date bins:
(7,30] days: 100.0 % 6 6 6
(30,360] days: 66.7 % 108 72 1 35 3 69
(360..) days: 68.5 % 1065 729 5 325 1 728 3 3

 Age         Owner                    Branch data    TLA  Line data    Source code
                                  1                 :                : /*-------------------------------------------------------------------------
                                  2                 :                :  *
                                  3                 :                :  * reorderbuffer.c
                                  4                 :                :  *    PostgreSQL logical replay/reorder buffer management
                                  5                 :                :  *
                                  6                 :                :  *
                                  7                 :                :  * Copyright (c) 2012-2025, PostgreSQL Global Development Group
                                  8                 :                :  *
                                  9                 :                :  *
                                 10                 :                :  * IDENTIFICATION
                                 11                 :                :  *    src/backend/replication/logical/reorderbuffer.c
                                 12                 :                :  *
                                 13                 :                :  * NOTES
                                 14                 :                :  *    This module gets handed individual pieces of transactions in the order
                                 15                 :                :  *    they are written to the WAL and is responsible for reassembling them
                                 16                 :                :  *    into toplevel transaction sized pieces. When a transaction is completely
                                 17                 :                :  *    reassembled - signaled by reading the transaction commit record - it
                                 18                 :                :  *    will then call the output plugin (cf. ReorderBufferCommit()) with the
                                 19                 :                :  *    individual changes. The output plugins rely on snapshots built by
                                 20                 :                :  *    snapbuild.c which hands them to us.
                                 21                 :                :  *
                                 22                 :                :  *    Transactions and subtransactions/savepoints in postgres are not
                                 23                 :                :  *    immediately linked to each other from outside the performing
                                 24                 :                :  *    backend. Only at commit/abort (or special xact_assignment records) are
                                 25                 :                :  *    they linked together, which means that we will have to splice together a
                                 26                 :                :  *    toplevel transaction from its subtransactions. To do that efficiently we
                                 27                 :                :  *    build a binary heap indexed by the smallest current lsn of the individual
                                 28                 :                :  *    subtransactions' changestreams. As the individual streams are inherently
                                 29                 :                :  *    ordered by LSN - since that is where we build them from - the transaction
                                 30                 :                :  *    can easily be reassembled by always using the subtransaction with the
                                 31                 :                :  *    smallest current LSN from the heap.
                                 32                 :                :  *
                                 33                 :                :  *    In order to cope with large transactions - which can be several times as
                                 34                 :                :  *    big as the available memory - this module supports spooling the contents
                                 35                 :                :  *    of large transactions to disk. When the transaction is replayed the
                                 36                 :                :  *    contents of individual (sub-)transactions will be read from disk in
                                 37                 :                :  *    chunks.
                                 38                 :                :  *
                                 39                 :                :  *    This module also has to deal with reassembling toast records from the
                                 40                 :                :  *    individual chunks stored in WAL. When a new (or initial) version of a
                                 41                 :                :  *    tuple is stored in WAL it will always be preceded by the toast chunks
                                 42                 :                :  *    emitted for the columns stored out of line. Within a single toplevel
                                 43                 :                :  *    transaction there will be no other data carrying records between a row's
                                 44                 :                :  *    toast chunks and the row data itself. See ReorderBufferToast* for
                                 45                 :                :  *    details.
                                 46                 :                :  *
                                 47                 :                :  *    ReorderBuffer uses two special memory context types - SlabContext for
                                 48                 :                :  *    allocations of fixed-length structures (changes and transactions), and
                                 49                 :                :  *    GenerationContext for the variable-length transaction data (allocated
                                 50                 :                :  *    and freed in groups with similar lifespans).
                                 51                 :                :  *
                                 52                 :                :  *    To limit the amount of memory used by decoded changes, we track memory
                                 53                 :                :  *    used at the reorder buffer level (i.e. total amount of memory), and for
                                 54                 :                :  *    each transaction. When the total amount of used memory exceeds the
                                 55                 :                :  *    limit, the transaction consuming the most memory is then serialized to
                                 56                 :                :  *    disk.
                                 57                 :                :  *
                                 58                 :                :  *    Only decoded changes are evicted from memory (spilled to disk), not the
                                 59                 :                :  *    transaction records. The number of toplevel transactions is limited,
                                 60                 :                :  *    but a transaction with many subtransactions may still consume significant
                                 61                 :                :  *    amounts of memory. However, the transaction records are fairly small and
                                 62                 :                :  *    are not included in the memory limit.
                                 63                 :                :  *
                                 64                 :                :  *    The current eviction algorithm is very simple - the transaction is
                                 65                 :                :  *    picked merely by size, while it might be useful to also consider age
                                 66                 :                :  *    (LSN) of the changes for example. With the new Generational memory
                                 67                 :                :  *    allocator, evicting the oldest changes would make it more likely the
                                 68                 :                :  *    memory gets actually freed.
                                 69                 :                :  *
                                 70                 :                :  *    We use a max-heap with transaction size as the key to efficiently find
                                 71                 :                :  *    the largest transaction. We update the max-heap whenever the memory
                                 72                 :                :  *    counter is updated; however transactions with size 0 are not stored in
                                 73                 :                :  *    the heap, because they have no changes to evict.
                                 74                 :                :  *
                                 75                 :                :  *    We still rely on max_changes_in_memory when loading serialized changes
                                 76                 :                :  *    back into memory. At that point we can't use the memory limit directly
                                 77                 :                :  *    as we load the subxacts independently. One option to deal with this
                                 78                 :                :  *    would be to count the subxacts, and allow each to allocate 1/N of the
                                 79                 :                :  *    memory limit. That however does not seem very appealing, because with
                                 80                 :                :  *    many subtransactions it may easily cause thrashing (short cycles of
                                 81                 :                :  *    deserializing and applying very few changes). We probably should give
                                 82                 :                :  *    a bit more memory to the oldest subtransactions, because it's likely
                                 83                 :                :  *    they are the source for the next sequence of changes.
                                 84                 :                :  *
                                 85                 :                :  * -------------------------------------------------------------------------
                                 86                 :                :  */
                                 87                 :                : #include "postgres.h"
                                 88                 :                : 
                                 89                 :                : #include <unistd.h>
                                 90                 :                : #include <sys/stat.h>
                                 91                 :                : 
                                 92                 :                : #include "access/detoast.h"
                                 93                 :                : #include "access/heapam.h"
                                 94                 :                : #include "access/rewriteheap.h"
                                 95                 :                : #include "access/transam.h"
                                 96                 :                : #include "access/xact.h"
                                 97                 :                : #include "access/xlog_internal.h"
                                 98                 :                : #include "catalog/catalog.h"
                                 99                 :                : #include "common/int.h"
                                100                 :                : #include "lib/binaryheap.h"
                                101                 :                : #include "miscadmin.h"
                                102                 :                : #include "pgstat.h"
                                103                 :                : #include "replication/logical.h"
                                104                 :                : #include "replication/reorderbuffer.h"
                                105                 :                : #include "replication/slot.h"
                                106                 :                : #include "replication/snapbuild.h"    /* just for SnapBuildSnapDecRefcount */
                                107                 :                : #include "storage/bufmgr.h"
                                108                 :                : #include "storage/fd.h"
                                109                 :                : #include "storage/procarray.h"
                                110                 :                : #include "storage/sinval.h"
                                111                 :                : #include "utils/builtins.h"
                                112                 :                : #include "utils/inval.h"
                                113                 :                : #include "utils/memutils.h"
                                114                 :                : #include "utils/rel.h"
                                115                 :                : #include "utils/relfilenumbermap.h"
                                116                 :                : 
                                117                 :                : /*
                                118                 :                :  * Each transaction has an 8MB limit for invalidation messages distributed from
                                119                 :                :  * other transactions. This limit is set considering scenarios with many
                                120                 :                :  * concurrent logical decoding operations. When the distributed invalidation
                                121                 :                :  * messages reach this threshold, the transaction is marked as
                                122                 :                :  * RBTXN_DISTR_INVAL_OVERFLOWED to invalidate the complete cache as we have lost
                                123                 :                :  * some inval messages and hence don't know what needs to be invalidated.
                                124                 :                :  */
                                125                 :                : #define MAX_DISTR_INVAL_MSG_PER_TXN \
                                126                 :                :     ((8 * 1024 * 1024) / sizeof(SharedInvalidationMessage))
                                127                 :                : 
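To make the arithmetic behind this limit concrete, the sketch below evaluates the same expression against a stand-in message type. `SketchInvalMsg` is hypothetical and its 16-byte size is an assumption of this example; the real limit depends on `sizeof(SharedInvalidationMessage)` on the build platform. The helper name and the "reach or exceed" comparison are likewise illustrative, chosen to match the comment's wording.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte stand-in; NOT the real SharedInvalidationMessage. */
typedef struct SketchInvalMsg
{
    int8_t      id;
    uint8_t     pad[3];
    uint32_t    payload[3];
} SketchInvalMsg;

_Static_assert(sizeof(SketchInvalMsg) == 16, "sketch assumes 16-byte messages");

/* Same shape as the macro above: an 8MB budget divided by the message size. */
#define SKETCH_MAX_DISTR_INVAL_MSG_PER_TXN \
    ((8 * 1024 * 1024) / sizeof(SketchInvalMsg))

/*
 * Returns 1 if holding ncurrent messages and adding nmsgs more would reach
 * or exceed the cap. Written without computing ncurrent + nmsgs so that the
 * check itself cannot overflow.
 */
static int
sketch_would_reach_limit(size_t ncurrent, size_t nmsgs)
{
    assert(SKETCH_MAX_DISTR_INVAL_MSG_PER_TXN > 0);

    return nmsgs >= SKETCH_MAX_DISTR_INVAL_MSG_PER_TXN ||
        ncurrent >= SKETCH_MAX_DISTR_INVAL_MSG_PER_TXN - nmsgs;
}
```

Under the 16-byte assumption the cap works out to 8388608 / 16 = 524288 messages per transaction.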
                                128                 :                : /* entry for a hash table we use to map from xid to our transaction state */
                                129                 :                : typedef struct ReorderBufferTXNByIdEnt
                                130                 :                : {
                                131                 :                :     TransactionId xid;
                                132                 :                :     ReorderBufferTXN *txn;
                                133                 :                : } ReorderBufferTXNByIdEnt;
                                134                 :                : 
                                135                 :                : /* data structures for (relfilelocator, ctid) => (cmin, cmax) mapping */
                                136                 :                : typedef struct ReorderBufferTupleCidKey
                                137                 :                : {
                                138                 :                :     RelFileLocator rlocator;
                                139                 :                :     ItemPointerData tid;
                                140                 :                : } ReorderBufferTupleCidKey;
                                141                 :                : 
                                142                 :                : typedef struct ReorderBufferTupleCidEnt
                                143                 :                : {
                                144                 :                :     ReorderBufferTupleCidKey key;
                                145                 :                :     CommandId   cmin;
                                146                 :                :     CommandId   cmax;
                                147                 :                :     CommandId   combocid;       /* just for debugging */
                                148                 :                : } ReorderBufferTupleCidEnt;
                                149                 :                : 
                                150                 :                : /* Virtual file descriptor with file offset tracking */
                                151                 :                : typedef struct TXNEntryFile
                                152                 :                : {
                                153                 :                :     File        vfd;            /* -1 when the file is closed */
                                154                 :                :     off_t       curOffset;      /* offset for next write or read. Reset to 0
                                155                 :                :                                  * when vfd is opened. */
                                156                 :                : } TXNEntryFile;
                                157                 :                : 
                                158                 :                : /* k-way in-order change iteration support structures */
                                159                 :                : typedef struct ReorderBufferIterTXNEntry
                                160                 :                : {
                                161                 :                :     XLogRecPtr  lsn;
                                162                 :                :     ReorderBufferChange *change;
                                163                 :                :     ReorderBufferTXN *txn;
                                164                 :                :     TXNEntryFile file;
                                165                 :                :     XLogSegNo   segno;
                                166                 :                : } ReorderBufferIterTXNEntry;
                                167                 :                : 
                                168                 :                : typedef struct ReorderBufferIterTXNState
                                169                 :                : {
                                170                 :                :     binaryheap *heap;
                                171                 :                :     Size        nr_txns;
                                172                 :                :     dlist_head  old_change;
                                173                 :                :     ReorderBufferIterTXNEntry entries[FLEXIBLE_ARRAY_MEMBER];
                                174                 :                : } ReorderBufferIterTXNState;
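The k-way merge these structures support can be sketched with plain arrays: each (sub)transaction contributes a stream already sorted by LSN, and a min-heap keyed on each stream's current LSN always yields the globally smallest next change. All names below (`SketchStream`, `SketchMerge`, `sketch_merge_*`) are hypothetical, LSNs are plain `uint64_t` values, and spilled-to-disk streams are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_STREAMS 8

/* One (sub)transaction's changes, already sorted by LSN. */
typedef struct SketchStream
{
    const uint64_t *lsns;
    size_t      nchanges;
    size_t      next;           /* index of the next unconsumed change */
} SketchStream;

typedef struct SketchMerge
{
    SketchStream *streams;
    int         heap[SKETCH_MAX_STREAMS];   /* stream indexes, min-heap */
    int         nheap;
} SketchMerge;

/* Current (smallest unconsumed) LSN of the stream in heap slot 'slot'. */
static uint64_t
sketch_cur_lsn(const SketchMerge *m, int slot)
{
    const SketchStream *s = &m->streams[m->heap[slot]];

    return s->lsns[s->next];
}

/* Restore the min-heap property below slot i. */
static void
sketch_sift_down(SketchMerge *m, int i)
{
    for (;;)
    {
        int         l = 2 * i + 1;
        int         r = l + 1;
        int         min = i;
        int         tmp;

        if (l < m->nheap && sketch_cur_lsn(m, l) < sketch_cur_lsn(m, min))
            min = l;
        if (r < m->nheap && sketch_cur_lsn(m, r) < sketch_cur_lsn(m, min))
            min = r;
        if (min == i)
            return;
        tmp = m->heap[i];
        m->heap[i] = m->heap[min];
        m->heap[min] = tmp;
        i = min;
    }
}

/* Put every non-empty stream into the heap and heapify. */
static void
sketch_merge_init(SketchMerge *m, SketchStream *streams, int nstreams)
{
    assert(nstreams <= SKETCH_MAX_STREAMS);
    m->streams = streams;
    m->nheap = 0;
    for (int i = 0; i < nstreams; i++)
    {
        streams[i].next = 0;
        if (streams[i].nchanges > 0)
            m->heap[m->nheap++] = i;
    }
    for (int i = m->nheap / 2 - 1; i >= 0; i--)
        sketch_sift_down(m, i);
}

/* Store the next LSN in global order; returns 0 once all streams are drained. */
static int
sketch_merge_next(SketchMerge *m, uint64_t *lsn)
{
    SketchStream *s;

    if (m->nheap == 0)
        return 0;
    s = &m->streams[m->heap[0]];
    *lsn = s->lsns[s->next++];
    if (s->next == s->nchanges)
        m->heap[0] = m->heap[--m->nheap];   /* stream exhausted: drop it */
    sketch_sift_down(m, 0);
    return 1;
}
```

Each call costs O(log k) for k streams, which is why a heap is preferable to rescanning every subtransaction for the smallest LSN on each step.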
                                175                 :                : 
                                176                 :                : /* toast data structures */
                                177                 :                : typedef struct ReorderBufferToastEnt
                                178                 :                : {
                                179                 :                :     Oid         chunk_id;       /* toast_table.chunk_id */
                                180                 :                :     int32       last_chunk_seq; /* toast_table.chunk_seq of the last chunk we
                                181                 :                :                                  * have seen */
                                182                 :                :     Size        num_chunks;     /* number of chunks we've already seen */
                                183                 :                :     Size        size;           /* combined size of chunks seen */
                                184                 :                :     dlist_head  chunks;         /* linked list of chunks */
                                185                 :                :     struct varlena *reconstructed;  /* reconstructed varlena now pointed to in
                                186                 :                :                                      * main tup */
                                187                 :                : } ReorderBufferToastEnt;
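The bookkeeping fields above (`last_chunk_seq`, `num_chunks`, `size`) suggest how chunks are validated as they arrive. The sketch below is a simplified illustration under stated assumptions: it concatenates chunks eagerly into one buffer rather than keeping a linked list for later reconstruction, it rejects out-of-order chunks with a return code rather than raising an error, and every `Sketch*`/`sketch_*` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified counterpart of a per-value toast entry. */
typedef struct SketchToastEnt
{
    int32_t     last_chunk_seq; /* seq of the last chunk seen, -1 if none */
    size_t      num_chunks;     /* number of chunks seen so far */
    size_t      size;           /* combined size of chunks seen */
    char       *data;           /* chunks concatenated in arrival order */
} SketchToastEnt;

static void
sketch_toast_init(SketchToastEnt *ent)
{
    ent->last_chunk_seq = -1;
    ent->num_chunks = 0;
    ent->size = 0;
    ent->data = NULL;
}

/*
 * Append one chunk. Chunks must arrive as seq 0, 1, 2, ... with no gaps.
 * Returns 0 on success, -1 on an empty or out-of-order chunk or on
 * allocation failure (the entry is left unchanged in that case).
 */
static int
sketch_toast_append(SketchToastEnt *ent, int32_t chunk_seq,
                    const void *chunk, size_t len)
{
    char       *grown;

    assert(ent != NULL && chunk != NULL);

    if (len == 0 || chunk_seq != ent->last_chunk_seq + 1)
        return -1;

    grown = realloc(ent->data, ent->size + len);
    if (grown == NULL)
        return -1;

    memcpy(grown + ent->size, chunk, len);
    ent->data = grown;
    ent->size += len;
    ent->num_chunks++;
    ent->last_chunk_seq = chunk_seq;
    return 0;
}

static void
sketch_toast_free(SketchToastEnt *ent)
{
    free(ent->data);
    sketch_toast_init(ent);
}
```

Checking the sequence number on every append is what lets a reader trust that `size` bytes form one contiguous value once the owning row arrives.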
                                188                 :                : 
                                189                 :                : /* Disk serialization support data structures */
                                190                 :                : typedef struct ReorderBufferDiskChange
                                191                 :                : {
                                192                 :                :     Size        size;
                                193                 :                :     ReorderBufferChange change;
                                194                 :                :     /* data follows */
                                195                 :                : } ReorderBufferDiskChange;
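The "size, fixed part, data follows" layout above is a length-prefixed record. The following sketch shows one way such a record could be written to and read back from a byte buffer. It assumes, for illustration only, that `size` counts the header plus the trailing payload; `SketchDiskRecord`, its `kind` field and both helper functions are hypothetical and do not mirror the module's actual serialization code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record header; payload bytes follow it directly. */
typedef struct SketchDiskRecord
{
    size_t      size;           /* header plus payload, in bytes */
    int         kind;           /* stand-in for the fixed-size part */
    /* data follows */
} SketchDiskRecord;

/* Write one record into buf. Returns bytes written, or 0 if it won't fit. */
static size_t
sketch_record_write(char *buf, size_t buflen, int kind,
                    const void *payload, size_t paylen)
{
    SketchDiskRecord hdr;

    assert(buf != NULL);

    if (buflen < sizeof(hdr) || paylen > buflen - sizeof(hdr))
        return 0;

    memset(&hdr, 0, sizeof(hdr));   /* avoid writing uninitialized padding */
    hdr.size = sizeof(hdr) + paylen;
    hdr.kind = kind;
    memcpy(buf, &hdr, sizeof(hdr));
    if (paylen > 0)
        memcpy(buf + sizeof(hdr), payload, paylen);
    return hdr.size;
}

/*
 * Read the record at buf. Returns a pointer to its payload and fills *kind
 * and *paylen, or returns NULL if the buffer holds a truncated record.
 */
static const char *
sketch_record_read(const char *buf, size_t buflen, int *kind, size_t *paylen)
{
    SketchDiskRecord hdr;

    assert(buf != NULL && kind != NULL && paylen != NULL);

    if (buflen < sizeof(hdr))
        return NULL;
    memcpy(&hdr, buf, sizeof(hdr)); /* copy out: buf may be unaligned */
    if (hdr.size < sizeof(hdr) || hdr.size > buflen)
        return NULL;

    *kind = hdr.kind;
    *paylen = hdr.size - sizeof(hdr);
    return buf + sizeof(hdr);
}
```

Storing the total length first lets a reader fetch the fixed-size header, learn how many more bytes belong to the record, and detect a short read before touching the payload.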
                                196                 :                : 
                                197                 :                : #define IsSpecInsert(action) \
                                198                 :                : ( \
                                199                 :                :     ((action) == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT) \
                                200                 :                : )
                                201                 :                : #define IsSpecConfirmOrAbort(action) \
                                202                 :                : ( \
                                203                 :                :     (((action) == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM) || \
                                204                 :                :     ((action) == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT)) \
                                205                 :                : )
                                206                 :                : #define IsInsertOrUpdate(action) \
                                207                 :                : ( \
                                208                 :                :     (((action) == REORDER_BUFFER_CHANGE_INSERT) || \
                                209                 :                :     ((action) == REORDER_BUFFER_CHANGE_UPDATE) || \
                                210                 :                :     ((action) == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT)) \
                                211                 :                : )
                                212                 :                : 
                                213                 :                : /*
                                214                 :                :  * Maximum number of changes kept in memory, per transaction. After that,
                                215                 :                :  * changes are spooled to disk.
                                216                 :                :  *
                                217                 :                :  * The current value should be sufficient to decode the entire transaction
                                218                 :                :  * without hitting disk in OLTP workloads, while starting to spool to disk in
                                219                 :                :  * other workloads reasonably fast.
                                220                 :                :  *
                                221                 :                :  * At some point in the future it probably makes sense to have a more elaborate
                                222                 :                :  * resource management here, but it's not entirely clear what that would look
                                223                 :                :  * like.
                                224                 :                :  */
                                225                 :                : int         logical_decoding_work_mem;
                                226                 :                : static const Size max_changes_in_memory = 4096; /* XXX for restore only */
                                227                 :                : 
                                228                 :                : /* GUC variable */
                                229                 :                : int         debug_logical_replication_streaming = DEBUG_LOGICAL_REP_STREAMING_BUFFERED;
                                230                 :                : 
                                231                 :                : /* ---------------------------------------
                                232                 :                :  * primary reorderbuffer support routines
                                233                 :                :  * ---------------------------------------
                                234                 :                :  */
                                235                 :                : static ReorderBufferTXN *ReorderBufferAllocTXN(ReorderBuffer *rb);
                                236                 :                : static void ReorderBufferFreeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                237                 :                : static ReorderBufferTXN *ReorderBufferTXNByXid(ReorderBuffer *rb,
                                238                 :                :                                                TransactionId xid, bool create, bool *is_new,
                                239                 :                :                                                XLogRecPtr lsn, bool create_as_top);
                                240                 :                : static void ReorderBufferTransferSnapToParent(ReorderBufferTXN *txn,
                                241                 :                :                                               ReorderBufferTXN *subtxn);
                                242                 :                : 
                                243                 :                : static void AssertTXNLsnOrder(ReorderBuffer *rb);
                                244                 :                : 
                                245                 :                : /* ---------------------------------------
                                246                 :                :  * support functions for lsn-order iterating over the ->changes of a
                                247                 :                :  * transaction and its subtransactions
                                248                 :                :  *
                                249                 :                :  * used for iteration over the k-way heap merge of a transaction and its
                                250                 :                :  * subtransactions
                                251                 :                :  * ---------------------------------------
                                252                 :                :  */
                                253                 :                : static void ReorderBufferIterTXNInit(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                254                 :                :                                      ReorderBufferIterTXNState *volatile *iter_state);
                                255                 :                : static ReorderBufferChange *ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state);
                                256                 :                : static void ReorderBufferIterTXNFinish(ReorderBuffer *rb,
                                257                 :                :                                        ReorderBufferIterTXNState *state);
                                258                 :                : static void ReorderBufferExecuteInvalidations(uint32 nmsgs, SharedInvalidationMessage *msgs);
                                259                 :                : 
                                260                 :                : /*
                                261                 :                :  * ---------------------------------------
                                262                 :                :  * Disk serialization support functions
                                263                 :                :  * ---------------------------------------
                                264                 :                :  */
                                265                 :                : static void ReorderBufferCheckMemoryLimit(ReorderBuffer *rb);
                                266                 :                : static void ReorderBufferSerializeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                267                 :                : static void ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                268                 :                :                                          int fd, ReorderBufferChange *change);
                                269                 :                : static Size ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                270                 :                :                                         TXNEntryFile *file, XLogSegNo *segno);
                                271                 :                : static void ReorderBufferRestoreChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                272                 :                :                                        char *data);
                                273                 :                : static void ReorderBufferRestoreCleanup(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                274                 :                : static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                275                 :                :                                      bool txn_prepared);
                                276                 :                : static void ReorderBufferMaybeMarkTXNStreamed(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                277                 :                : static bool ReorderBufferCheckAndTruncateAbortedTXN(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                278                 :                : static void ReorderBufferCleanupSerializedTXNs(const char *slotname);
                                279                 :                : static void ReorderBufferSerializedPath(char *path, ReplicationSlot *slot,
                                280                 :                :                                         TransactionId xid, XLogSegNo segno);
                                281                 :                : static int  ReorderBufferTXNSizeCompare(const pairingheap_node *a, const pairingheap_node *b, void *arg);
                                282                 :                : 
                                283                 :                : static void ReorderBufferFreeSnap(ReorderBuffer *rb, Snapshot snap);
                                284                 :                : static Snapshot ReorderBufferCopySnap(ReorderBuffer *rb, Snapshot orig_snap,
                                285                 :                :                                       ReorderBufferTXN *txn, CommandId cid);
                                286                 :                : 
                                287                 :                : /*
                                288                 :                :  * ---------------------------------------
                                289                 :                :  * Streaming support functions
                                290                 :                :  * ---------------------------------------
                                291                 :                :  */
                                292                 :                : static inline bool ReorderBufferCanStream(ReorderBuffer *rb);
                                293                 :                : static inline bool ReorderBufferCanStartStreaming(ReorderBuffer *rb);
                                294                 :                : static void ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                295                 :                : static void ReorderBufferStreamCommit(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                296                 :                : 
                                297                 :                : /* ---------------------------------------
                                 298                 :                :  * TOAST reassembly support
                                299                 :                :  * ---------------------------------------
                                300                 :                :  */
                                301                 :                : static void ReorderBufferToastInitHash(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                302                 :                : static void ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn);
                                303                 :                : static void ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                304                 :                :                                       Relation relation, ReorderBufferChange *change);
                                305                 :                : static void ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                306                 :                :                                           Relation relation, ReorderBufferChange *change);
                                307                 :                : 
                                308                 :                : /*
                                309                 :                :  * ---------------------------------------
                                310                 :                :  * memory accounting
                                311                 :                :  * ---------------------------------------
                                312                 :                :  */
                                313                 :                : static Size ReorderBufferChangeSize(ReorderBufferChange *change);
                                314                 :                : static void ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
                                315                 :                :                                             ReorderBufferChange *change,
                                316                 :                :                                             ReorderBufferTXN *txn,
                                317                 :                :                                             bool addition, Size sz);
                                318                 :                : 
                                319                 :                : /*
                                320                 :                :  * Allocate a new ReorderBuffer and clean out any old serialized state from
                                321                 :                :  * prior ReorderBuffer instances for the same slot.
                                322                 :                :  */
                                323                 :                : ReorderBuffer *
 4257 rhaas@postgresql.org      324                 :CBC        1055 : ReorderBufferAllocate(void)
                                325                 :                : {
                                326                 :                :     ReorderBuffer *buffer;
                                327                 :                :     HASHCTL     hash_ctl;
                                328                 :                :     MemoryContext new_ctx;
                                329                 :                : 
 2793 alvherre@alvh.no-ip.      330         [ -  + ]:           1055 :     Assert(MyReplicationSlot != NULL);
                                331                 :                : 
                                 332                 :                :     /* allocate memory in its own context, to make accounting easier */
 4257 rhaas@postgresql.org      333                 :           1055 :     new_ctx = AllocSetContextCreate(CurrentMemoryContext,
                                334                 :                :                                     "ReorderBuffer",
                                335                 :                :                                     ALLOCSET_DEFAULT_SIZES);
                                336                 :                : 
                                337                 :                :     buffer =
                                338                 :           1055 :         (ReorderBuffer *) MemoryContextAlloc(new_ctx, sizeof(ReorderBuffer));
                                339                 :                : 
                                340                 :           1055 :     memset(&hash_ctl, 0, sizeof(hash_ctl));
                                341                 :                : 
                                342                 :           1055 :     buffer->context = new_ctx;
                                343                 :                : 
 3165 andres@anarazel.de        344                 :           1055 :     buffer->change_context = SlabContextCreate(new_ctx,
                                345                 :                :                                                "Change",
                                346                 :                :                                                SLAB_DEFAULT_BLOCK_SIZE,
                                347                 :                :                                                sizeof(ReorderBufferChange));
                                348                 :                : 
                                349                 :           1055 :     buffer->txn_context = SlabContextCreate(new_ctx,
                                350                 :                :                                             "TXN",
                                351                 :                :                                             SLAB_DEFAULT_BLOCK_SIZE,
                                352                 :                :                                             sizeof(ReorderBufferTXN));
                                353                 :                : 
                                 354                 :                :     /*
                                 355                 :                :      * To minimize memory fragmentation caused by long-running transactions
                                 356                 :                :      * with changes spanning multiple memory blocks, we use a single fixed
                                 357                 :                :      * block size for decoded tuple storage. Performance testing showed
                                 358                 :                :      * that the default block size maintains logical decoding performance
                                 359                 :                :      * without causing fragmentation due to concurrent transactions. One
                                 360                 :                :      * might think that SLAB_LARGE_BLOCK_SIZE could be used as the maximum
                                 361                 :                :      * size, but the same testing showed that it doesn't help resolve the
                                 362                 :                :      * memory fragmentation.
                                 363                 :                :      */
 2896 simon@2ndQuadrant.co      364                 :           1055 :     buffer->tup_context = GenerationContextCreate(new_ctx,
                                365                 :                :                                                   "Tuples",
                                366                 :                :                                                   SLAB_DEFAULT_BLOCK_SIZE,
                                367                 :                :                                                   SLAB_DEFAULT_BLOCK_SIZE,
                                368                 :                :                                                   SLAB_DEFAULT_BLOCK_SIZE);
                                369                 :                : 
 4257 rhaas@postgresql.org      370                 :           1055 :     hash_ctl.keysize = sizeof(TransactionId);
                                371                 :           1055 :     hash_ctl.entrysize = sizeof(ReorderBufferTXNByIdEnt);
                                372                 :           1055 :     hash_ctl.hcxt = buffer->context;
                                373                 :                : 
                                374                 :           1055 :     buffer->by_txn = hash_create("ReorderBufferByXid", 1000, &hash_ctl,
                                375                 :                :                                  HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
                                376                 :                : 
                                377                 :           1055 :     buffer->by_txn_last_xid = InvalidTransactionId;
                                378                 :           1055 :     buffer->by_txn_last_txn = NULL;
                                379                 :                : 
                                380                 :           1055 :     buffer->outbuf = NULL;
                                381                 :           1055 :     buffer->outbufsize = 0;
 2173 akapila@postgresql.o      382                 :           1055 :     buffer->size = 0;
                                383                 :                : 
                                384                 :                :     /* txn_heap is ordered by transaction size */
  565 msawada@postgresql.o      385                 :           1055 :     buffer->txn_heap = pairingheap_allocate(ReorderBufferTXNSizeCompare, NULL);
                                386                 :                : 
 1846 akapila@postgresql.o      387                 :           1055 :     buffer->spillTxns = 0;
                                388                 :           1055 :     buffer->spillCount = 0;
                                389                 :           1055 :     buffer->spillBytes = 0;
 1825                           390                 :           1055 :     buffer->streamTxns = 0;
                                391                 :           1055 :     buffer->streamCount = 0;
                                392                 :           1055 :     buffer->streamBytes = 0;
   20 msawada@postgresql.o      393                 :GNC        1055 :     buffer->memExceededCount = 0;
 1656 akapila@postgresql.o      394                 :CBC        1055 :     buffer->totalTxns = 0;
                                395                 :           1055 :     buffer->totalBytes = 0;
                                396                 :                : 
 4257 rhaas@postgresql.org      397                 :           1055 :     buffer->current_restart_decoding_lsn = InvalidXLogRecPtr;
                                398                 :                : 
                                399                 :           1055 :     dlist_init(&buffer->toplevel_by_lsn);
 2681 alvherre@alvh.no-ip.      400                 :           1055 :     dlist_init(&buffer->txns_by_base_snapshot_lsn);
 1091 drowley@postgresql.o      401                 :           1055 :     dclist_init(&buffer->catchange_txns);
                                402                 :                : 
                                403                 :                :     /*
                                404                 :                :      * Ensure there's no stale data from prior uses of this slot, in case some
                                405                 :                :      * prior exit avoided calling ReorderBufferFree. Failure to do this can
                                406                 :                :      * produce duplicated txns, and it's very cheap if there's nothing there.
                                407                 :                :      */
 2793 alvherre@alvh.no-ip.      408                 :           1055 :     ReorderBufferCleanupSerializedTXNs(NameStr(MyReplicationSlot->data.name));
                                409                 :                : 
 4257 rhaas@postgresql.org      410                 :           1055 :     return buffer;
                                411                 :                : }
                                412                 :                : 
                                413                 :                : /*
                                414                 :                :  * Free a ReorderBuffer
                                415                 :                :  */
                                416                 :                : void
                                417                 :            841 : ReorderBufferFree(ReorderBuffer *rb)
                                418                 :                : {
                                419                 :            841 :     MemoryContext context = rb->context;
                                420                 :                : 
                                421                 :                :     /*
                                422                 :                :      * We free separately allocated data by entirely scrapping reorderbuffer's
                                423                 :                :      * memory context.
                                424                 :                :      */
                                425                 :            841 :     MemoryContextDelete(context);
                                426                 :                : 
                                427                 :                :     /* Free disk space used by unconsumed reorder buffers */
 2793 alvherre@alvh.no-ip.      428                 :            841 :     ReorderBufferCleanupSerializedTXNs(NameStr(MyReplicationSlot->data.name));
 4257 rhaas@postgresql.org      429                 :            841 : }
                                430                 :                : 
                                431                 :                : /*
                                432                 :                :  * Allocate a new ReorderBufferTXN.
                                433                 :                :  */
                                434                 :                : static ReorderBufferTXN *
  230 heikki.linnakangas@i      435                 :           3965 : ReorderBufferAllocTXN(ReorderBuffer *rb)
                                436                 :                : {
                                437                 :                :     ReorderBufferTXN *txn;
                                438                 :                : 
                                439                 :                :     txn = (ReorderBufferTXN *)
 3165 andres@anarazel.de        440                 :           3965 :         MemoryContextAlloc(rb->txn_context, sizeof(ReorderBufferTXN));
                                441                 :                : 
 4257 rhaas@postgresql.org      442                 :           3965 :     memset(txn, 0, sizeof(ReorderBufferTXN));
                                443                 :                : 
                                444                 :           3965 :     dlist_init(&txn->changes);
                                445                 :           3965 :     dlist_init(&txn->tuplecids);
                                446                 :           3965 :     dlist_init(&txn->subtxns);
                                447                 :                : 
                                448                 :                :     /* InvalidCommandId is not zero, so set it explicitly */
 1907 akapila@postgresql.o      449                 :           3965 :     txn->command_id = InvalidCommandId;
 1806                           450                 :           3965 :     txn->output_plugin_private = NULL;
                                451                 :                : 
 4257 rhaas@postgresql.org      452                 :           3965 :     return txn;
                                453                 :                : }
                                454                 :                : 
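The allocator above zeroes the whole struct and then sets command_id explicitly, because the "invalid" sentinel is not the all-zero bit pattern. A minimal standalone sketch of that pattern follows. SketchTxn, SketchCommandId, SKETCH_INVALID_COMMAND_ID and sketch_alloc_txn are hypothetical stand-ins, and malloc replaces the memory-context allocator, so this is an illustration rather than the PostgreSQL implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for CommandId; the sentinel is all bits set. */
typedef uint32_t SketchCommandId;
#define SKETCH_INVALID_COMMAND_ID (~(SketchCommandId) 0)

/* Hypothetical, heavily reduced stand-in for ReorderBufferTXN. */
typedef struct SketchTxn
{
    SketchCommandId command_id;
    void       *plugin_private;
    uint64_t    size;
} SketchTxn;

/*
 * Allocate a zeroed transaction, then fix up the one field whose "unset"
 * value is not zero. Returns NULL if the allocation fails.
 */
static SketchTxn *
sketch_alloc_txn(void)
{
    SketchTxn  *txn = malloc(sizeof(SketchTxn));

    if (txn == NULL)
        return NULL;

    memset(txn, 0, sizeof(SketchTxn));

    /* memset left command_id at 0, which would look like a valid id */
    txn->command_id = SKETCH_INVALID_COMMAND_ID;

    return txn;
}
```

Without the explicit assignment, a freshly allocated transaction would appear to carry command id 0 rather than "no command id yet".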
                                455                 :                : /*
                                456                 :                :  * Free a ReorderBufferTXN.
                                457                 :                :  */
                                458                 :                : static void
  230 heikki.linnakangas@i      459                 :           3905 : ReorderBufferFreeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
                                460                 :                : {
                                 461                 :                :     /* clear the lookup cache if this txn is the cached entry (quite likely) */
 4257 rhaas@postgresql.org      462         [ +  + ]:           3905 :     if (rb->by_txn_last_xid == txn->xid)
                                463                 :                :     {
                                464                 :           3720 :         rb->by_txn_last_xid = InvalidTransactionId;
                                465                 :           3720 :         rb->by_txn_last_txn = NULL;
                                466                 :                :     }
                                467                 :                : 
                                468                 :                :     /* free data that's contained */
                                469                 :                : 
 1758 akapila@postgresql.o      470         [ +  + ]:           3905 :     if (txn->gid != NULL)
                                471                 :                :     {
                                472                 :             43 :         pfree(txn->gid);
                                473                 :             43 :         txn->gid = NULL;
                                474                 :                :     }
                                475                 :                : 
 4257 rhaas@postgresql.org      476         [ +  + ]:           3905 :     if (txn->tuplecid_hash != NULL)
                                477                 :                :     {
                                478                 :            676 :         hash_destroy(txn->tuplecid_hash);
                                479                 :            676 :         txn->tuplecid_hash = NULL;
                                480                 :                :     }
                                481                 :                : 
                                482         [ +  + ]:           3905 :     if (txn->invalidations)
                                483                 :                :     {
                                484                 :           1253 :         pfree(txn->invalidations);
                                485                 :           1253 :         txn->invalidations = NULL;
                                486                 :                :     }
                                487                 :                : 
  134 msawada@postgresql.o      488         [ +  + ]:           3905 :     if (txn->invalidations_distributed)
                                489                 :                :     {
                                490                 :             21 :         pfree(txn->invalidations_distributed);
                                491                 :             21 :         txn->invalidations_distributed = NULL;
                                492                 :                :     }
                                493                 :                : 
                                494                 :                :     /* Reset the toast hash */
 1596 akapila@postgresql.o      495                 :           3905 :     ReorderBufferToastReset(rb, txn);
                                496                 :                : 
                                497                 :                :     /* All changes must be deallocated */
  428 msawada@postgresql.o      498         [ -  + ]:           3905 :     Assert(txn->size == 0);
                                499                 :                : 
 3165 andres@anarazel.de        500                 :           3905 :     pfree(txn);
 4257 rhaas@postgresql.org      501                 :           3905 : }
                                502                 :                : 
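The first step of the function above resets a one-entry lookup cache before the transaction is freed, so the cache can never hand back a dangling pointer. The sketch below isolates that idea. CacheBuffer, CachedTxn and the cache_* functions are hypothetical names, 0 is used as the invalid xid, and free stands in for pfree; it is a sketch under those assumptions, not the actual lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for TransactionId and ReorderBufferTXN. */
typedef uint32_t CacheXid;
#define CACHE_INVALID_XID ((CacheXid) 0)

typedef struct CachedTxn
{
    CacheXid    xid;
} CachedTxn;

/* One-entry cache remembering the most recently looked-up transaction. */
typedef struct CacheBuffer
{
    CacheXid    last_xid;
    CachedTxn  *last_txn;
} CacheBuffer;

/* Remember txn as the most recent lookup result. */
static void
cache_store(CacheBuffer *rb, CachedTxn *txn)
{
    rb->last_xid = txn->xid;
    rb->last_txn = txn;
}

/* Return the cached transaction for xid, or NULL on a cache miss. */
static CachedTxn *
cache_lookup(const CacheBuffer *rb, CacheXid xid)
{
    if (xid != CACHE_INVALID_XID && rb->last_xid == xid)
        return rb->last_txn;
    return NULL;
}

/* Invalidate the cache entry first, then release the transaction. */
static void
cache_free_txn(CacheBuffer *rb, CachedTxn *txn)
{
    assert(txn != NULL);

    if (rb->last_xid == txn->xid)
    {
        rb->last_xid = CACHE_INVALID_XID;
        rb->last_txn = NULL;
    }

    free(txn);
}
```

The "quite likely" remark in the original comment fits this layout: a transaction is usually freed right after it was last looked up, so it tends to be the cached entry.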
                                503                 :                : /*
                                504                 :                :  * Allocate a ReorderBufferChange.
                                505                 :                :  */
                                506                 :                : ReorderBufferChange *
  230 heikki.linnakangas@i      507                 :        1574052 : ReorderBufferAllocChange(ReorderBuffer *rb)
                                508                 :                : {
                                509                 :                :     ReorderBufferChange *change;
                                510                 :                : 
                                511                 :                :     change = (ReorderBufferChange *)
 3165 andres@anarazel.de        512                 :        1574052 :         MemoryContextAlloc(rb->change_context, sizeof(ReorderBufferChange));
                                513                 :                : 
 4257 rhaas@postgresql.org      514                 :        1574052 :     memset(change, 0, sizeof(ReorderBufferChange));
                                515                 :        1574052 :     return change;
                                516                 :                : }
                                517                 :                : 
                                518                 :                : /*
                                519                 :                :  * Free a ReorderBufferChange and update memory accounting, if requested.
                                520                 :                :  */
                                521                 :                : void
  230 heikki.linnakangas@i      522                 :        1572597 : ReorderBufferFreeChange(ReorderBuffer *rb, ReorderBufferChange *change,
                                523                 :                :                         bool upd_mem)
                                524                 :                : {
                                525                 :                :     /* update memory accounting info */
 1907 akapila@postgresql.o      526         [ +  + ]:        1572597 :     if (upd_mem)
  573 msawada@postgresql.o      527                 :         197568 :         ReorderBufferChangeMemoryUpdate(rb, change, NULL, false,
                                528                 :                :                                         ReorderBufferChangeSize(change));
                                529                 :                : 
                                530                 :                :     /* free contained data */
 4253 tgl@sss.pgh.pa.us         531   [ +  +  +  +  :        1572597 :     switch (change->action)
                                           +  +  - ]
                                532                 :                :     {
                                533                 :        1497420 :         case REORDER_BUFFER_CHANGE_INSERT:
                                534                 :                :         case REORDER_BUFFER_CHANGE_UPDATE:
                                535                 :                :         case REORDER_BUFFER_CHANGE_DELETE:
                                536                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT:
                                537         [ +  + ]:        1497420 :             if (change->data.tp.newtuple)
                                538                 :                :             {
  230 heikki.linnakangas@i      539                 :        1282819 :                 ReorderBufferFreeTupleBuf(change->data.tp.newtuple);
 4253 tgl@sss.pgh.pa.us         540                 :        1282819 :                 change->data.tp.newtuple = NULL;
                                541                 :                :             }
                                542                 :                : 
                                543         [ +  + ]:        1497420 :             if (change->data.tp.oldtuple)
                                544                 :                :             {
  230 heikki.linnakangas@i      545                 :         146178 :                 ReorderBufferFreeTupleBuf(change->data.tp.oldtuple);
 4253 tgl@sss.pgh.pa.us         546                 :         146178 :                 change->data.tp.oldtuple = NULL;
                                547                 :                :             }
 4257 rhaas@postgresql.org      548                 :        1497420 :             break;
 3492 simon@2ndQuadrant.co      549                 :             40 :         case REORDER_BUFFER_CHANGE_MESSAGE:
                                550         [ +  - ]:             40 :             if (change->data.msg.prefix != NULL)
                                551                 :             40 :                 pfree(change->data.msg.prefix);
                                552                 :             40 :             change->data.msg.prefix = NULL;
                                553         [ +  - ]:             40 :             if (change->data.msg.message != NULL)
                                554                 :             40 :                 pfree(change->data.msg.message);
                                555                 :             40 :             change->data.msg.message = NULL;
                                556                 :             40 :             break;
 1839 akapila@postgresql.o      557                 :           5246 :         case REORDER_BUFFER_CHANGE_INVALIDATION:
                                558         [ +  - ]:           5246 :             if (change->data.inval.invalidations)
                                559                 :           5246 :                 pfree(change->data.inval.invalidations);
                                560                 :           5246 :             change->data.inval.invalidations = NULL;
                                561                 :           5246 :             break;
 4257 rhaas@postgresql.org      562                 :           1277 :         case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT:
 4253 tgl@sss.pgh.pa.us         563         [ +  - ]:           1277 :             if (change->data.snapshot)
                                564                 :                :             {
                                565                 :           1277 :                 ReorderBufferFreeSnap(rb, change->data.snapshot);
                                566                 :           1277 :                 change->data.snapshot = NULL;
                                567                 :                :             }
 4257 rhaas@postgresql.org      568                 :           1277 :             break;
                                569                 :                :             /* no data in addition to the struct itself */
 2612 tomas.vondra@postgre      570                 :             48 :         case REORDER_BUFFER_CHANGE_TRUNCATE:
                                571         [ +  - ]:             48 :             if (change->data.truncate.relids != NULL)
                                572                 :                :             {
  230 heikki.linnakangas@i      573                 :             48 :                 ReorderBufferFreeRelids(rb, change->data.truncate.relids);
 2612 tomas.vondra@postgre      574                 :             48 :                 change->data.truncate.relids = NULL;
                                575                 :                :             }
                                576                 :             48 :             break;
 3826 andres@anarazel.de        577                 :          68566 :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
                                578                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
                                579                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
                                580                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
 4257 rhaas@postgresql.org      581                 :          68566 :             break;
                                582                 :                :     }
                                583                 :                : 
 3165 andres@anarazel.de        584                 :        1572597 :     pfree(change);
 4257 rhaas@postgresql.org      585                 :        1572597 : }
                                586                 :                : 
                                587                 :                : /*
                                588                 :                :  * Allocate a HeapTuple fitting a tuple of size tuple_len (excluding header
                                589                 :                :  * overhead).
                                590                 :                :  */
                                591                 :                : HeapTuple
  230 heikki.linnakangas@i      592                 :        1430256 : ReorderBufferAllocTupleBuf(ReorderBuffer *rb, Size tuple_len)
                                593                 :                : {
                                594                 :                :     HeapTuple   tuple;
                                595                 :                :     Size        alloc_len;
                                596                 :                : 
 3524 andres@anarazel.de        597                 :        1430256 :     alloc_len = tuple_len + SizeofHeapTupleHeader;
                                598                 :                : 
  638 msawada@postgresql.o      599                 :        1430256 :     tuple = (HeapTuple) MemoryContextAlloc(rb->tup_context,
                                600                 :                :                                            HEAPTUPLESIZE + alloc_len);
                                601                 :        1430256 :     tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
                                602                 :                : 
 4257 rhaas@postgresql.org      603                 :        1430256 :     return tuple;
                                604                 :                : }
                                605                 :                : 
                                606                 :                : /*
                                607                 :                :  * Free a HeapTuple returned by ReorderBufferAllocTupleBuf().
                                608                 :                :  */
                                609                 :                : void
  230 heikki.linnakangas@i      610                 :        1428997 : ReorderBufferFreeTupleBuf(HeapTuple tuple)
                                611                 :                : {
 2896 simon@2ndQuadrant.co      612                 :        1428997 :     pfree(tuple);
 4257 rhaas@postgresql.org      613                 :        1428997 : }
                                614                 :                : 
                                615                 :                : /*
                                616                 :                :  * Allocate an array for relids of truncated relations.
                                617                 :                :  *
                                618                 :                :  * We use the global memory context (for the whole reorder buffer), because
                                619                 :                :  * none of the existing ones seems like a good match (some are SLAB, so we
                                620                 :                :  * can't use those, and tup_context is meant for tuple data, not relids). We
                                621                 :                :  * could add yet another context, but that seems like overkill: TRUNCATE is
                                622                 :                :  * not a particularly common operation, so it does not seem worth it.
                                623                 :                :  */
                                624                 :                : Oid *
  230 heikki.linnakangas@i      625                 :             53 : ReorderBufferAllocRelids(ReorderBuffer *rb, int nrelids)
                                626                 :                : {
                                627                 :                :     Oid        *relids;
                                628                 :                :     Size        alloc_len;
                                629                 :                : 
 2612 tomas.vondra@postgre      630                 :             53 :     alloc_len = sizeof(Oid) * nrelids;
                                631                 :                : 
                                632                 :             53 :     relids = (Oid *) MemoryContextAlloc(rb->context, alloc_len);
                                633                 :                : 
                                634                 :             53 :     return relids;
                                635                 :                : }
                                636                 :                : 
                                637                 :                : /*
                                638                 :                :  * Free an array of relids.
                                639                 :                :  */
                                640                 :                : void
  230 heikki.linnakangas@i      641                 :             48 : ReorderBufferFreeRelids(ReorderBuffer *rb, Oid *relids)
                                642                 :                : {
 2612 tomas.vondra@postgre      643                 :             48 :     pfree(relids);
                                644                 :             48 : }
                                645                 :                : 
                                646                 :                : /*
                                647                 :                :  * Return the ReorderBufferTXN from the given buffer, specified by Xid.
                                648                 :                :  * If create is true, and a transaction doesn't already exist, create it
                                649                 :                :  * (with the given LSN, and as top transaction if that's specified);
                                650                 :                :  * when this happens, is_new is set to true.
                                651                 :                :  */
                                652                 :                : static ReorderBufferTXN *
 4257 rhaas@postgresql.org      653                 :        5187309 : ReorderBufferTXNByXid(ReorderBuffer *rb, TransactionId xid, bool create,
                                654                 :                :                       bool *is_new, XLogRecPtr lsn, bool create_as_top)
                                655                 :                : {
                                656                 :                :     ReorderBufferTXN *txn;
                                657                 :                :     ReorderBufferTXNByIdEnt *ent;
                                658                 :                :     bool        found;
                                659                 :                : 
                                660         [ -  + ]:        5187309 :     Assert(TransactionIdIsValid(xid));
                                661                 :                : 
                                662                 :                :     /*
                                663                 :                :      * Check the one-entry lookup cache first
                                664                 :                :      */
                                665         [ +  + ]:        5187309 :     if (TransactionIdIsValid(rb->by_txn_last_xid) &&
                                666         [ +  + ]:        5183549 :         rb->by_txn_last_xid == xid)
                                667                 :                :     {
                                668                 :        4413810 :         txn = rb->by_txn_last_txn;
                                669                 :                : 
                                670         [ +  + ]:        4413810 :         if (txn != NULL)
                                671                 :                :         {
                                672                 :                :             /* found it, and it's valid */
                                673         [ +  + ]:        4413782 :             if (is_new)
                                674                 :           3241 :                 *is_new = false;
                                675                 :        4413782 :             return txn;
                                676                 :                :         }
                                677                 :                : 
                                678                 :                :         /*
                                679                 :                :          * cached as non-existent, and asked not to create? Then nothing else
                                680                 :                :          * to do.
                                681                 :                :          */
                                682         [ +  + ]:             28 :         if (!create)
                                683                 :             25 :             return NULL;
                                684                 :                :         /* otherwise fall through to create it */
                                685                 :                :     }
                                686                 :                : 
                                687                 :                :     /*
                                688                 :                :      * Either the cache wasn't hit, or it yielded a "does-not-exist" and we
                                689                 :                :      * want to create an entry.
                                690                 :                :      */
                                691                 :                : 
                                692                 :                :     /* search the lookup table */
                                693                 :                :     ent = (ReorderBufferTXNByIdEnt *)
                                694                 :         773502 :         hash_search(rb->by_txn,
                                695                 :                :                     &xid,
                                696                 :                :                     create ? HASH_ENTER : HASH_FIND,
                                697                 :                :                     &found);
                                698         [ +  + ]:         773502 :     if (found)
                                699                 :         768238 :         txn = ent->txn;
                                700         [ +  + ]:           5264 :     else if (create)
                                701                 :                :     {
                                702                 :                :         /* initialize the new entry, if creation was requested */
                                703         [ -  + ]:           3965 :         Assert(ent != NULL);
 2681 alvherre@alvh.no-ip.      704         [ -  + ]:           3965 :         Assert(lsn != InvalidXLogRecPtr);
                                705                 :                : 
  230 heikki.linnakangas@i      706                 :           3965 :         ent->txn = ReorderBufferAllocTXN(rb);
 4257 rhaas@postgresql.org      707                 :           3965 :         ent->txn->xid = xid;
                                708                 :           3965 :         txn = ent->txn;
                                709                 :           3965 :         txn->first_lsn = lsn;
                                710                 :           3965 :         txn->restart_decoding_lsn = rb->current_restart_decoding_lsn;
                                711                 :                : 
                                712         [ +  + ]:           3965 :         if (create_as_top)
                                713                 :                :         {
                                714                 :           3310 :             dlist_push_tail(&rb->toplevel_by_lsn, &txn->node);
                                715                 :           3310 :             AssertTXNLsnOrder(rb);
                                716                 :                :         }
                                717                 :                :     }
                                718                 :                :     else
                                719                 :           1299 :         txn = NULL;             /* not found and not asked to create */
                                720                 :                : 
                                721                 :                :     /* update cache */
                                722                 :         773502 :     rb->by_txn_last_xid = xid;
                                723                 :         773502 :     rb->by_txn_last_txn = txn;
                                724                 :                : 
                                725         [ +  + ]:         773502 :     if (is_new)
                                726                 :           1727 :         *is_new = !found;
                                727                 :                : 
 3502 andres@anarazel.de        728   [ +  +  -  + ]:         773502 :     Assert(!create || txn != NULL);
 4257 rhaas@postgresql.org      729                 :         773502 :     return txn;
                                730                 :                : }
                                731                 :                : 
                                732                 :                : /*
                                733                 :                :  * Record the partial change for the streaming of in-progress transactions.  We
                                734                 :                :  * can stream only complete changes, so if we have a partial change such as a
                                735                 :                :  * toast table insert or a speculative insert, we mark the 'txn' so that it
                                736                 :                :  * can't be streamed.  We also ensure that if the changes in such a 'txn' can
                                737                 :                :  * be streamed and are above the logical_decoding_work_mem threshold, we stream
                                738                 :                :  * them as soon as we have a complete change.
                                739                 :                :  */
                                740                 :                : static void
 1907 akapila@postgresql.o      741                 :        1366970 : ReorderBufferProcessPartialChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                742                 :                :                                   ReorderBufferChange *change,
                                743                 :                :                                   bool toast_insert)
                                744                 :                : {
                                745                 :                :     ReorderBufferTXN *toptxn;
                                746                 :                : 
                                747                 :                :     /*
                                748                 :                :      * The partial changes need to be processed only while streaming
                                749                 :                :      * in-progress transactions.
                                750                 :                :      */
                                751         [ +  + ]:        1366970 :     if (!ReorderBufferCanStream(rb))
                                752                 :        1049596 :         return;
                                753                 :                : 
                                754                 :                :     /* Get the top transaction. */
  956                           755         [ +  + ]:         317374 :     toptxn = rbtxn_get_toptxn(txn);
                                756                 :                : 
                                757                 :                :     /*
                                758                 :                :      * Indicate a partial change for toast inserts.  The change will be
                                759                 :                :      * considered as complete once we get the insert or update on the main
                                760                 :                :      * table and we are sure that the pending toast chunks are not required
                                761                 :                :      * anymore.
                                762                 :                :      *
                                763                 :                :      * If we allow streaming when there are pending toast chunks then such
                                764                 :                :      * chunks won't be released till the insert (multi_insert) is complete and
                                765                 :                :      * we expect the txn to have streamed all changes after streaming.  This
                                766                 :                :      * restriction is mainly to ensure the correctness of streamed
                                767                 :                :      * transactions and it doesn't seem worth lifting such a restriction
                                768                 :                :      * just to allow this case because anyway we will stream the transaction
                                769                 :                :      * once such an insert is complete.
                                770                 :                :      */
 1907                           771         [ +  + ]:         317374 :     if (toast_insert)
 1615                           772                 :           1666 :         toptxn->txn_flags |= RBTXN_HAS_PARTIAL_CHANGE;
                                773         [ +  + ]:         315708 :     else if (rbtxn_has_partial_change(toptxn) &&
                                774   [ +  +  -  +  :             63 :              IsInsertOrUpdate(change->action) &&
                                              -  - ]
                                775         [ +  + ]:             63 :              change->data.tp.clear_toast_afterwards)
                                776                 :             43 :         toptxn->txn_flags &= ~RBTXN_HAS_PARTIAL_CHANGE;
                                777                 :                : 
                                778                 :                :     /*
                                779                 :                :      * Indicate a partial change for speculative inserts.  The change will be
                                780                 :                :      * considered as complete once we get the speculative confirm or abort
                                781                 :                :      * token.
                                782                 :                :      */
 1907                           783         [ -  + ]:         317374 :     if (IsSpecInsert(change->action))
 1615 akapila@postgresql.o      784                 :UBC           0 :         toptxn->txn_flags |= RBTXN_HAS_PARTIAL_CHANGE;
 1615 akapila@postgresql.o      785         [ +  + ]:CBC      317374 :     else if (rbtxn_has_partial_change(toptxn) &&
 1581                           786   [ +  -  -  + ]:           1686 :              IsSpecConfirmOrAbort(change->action))
 1615 akapila@postgresql.o      787                 :UBC           0 :         toptxn->txn_flags &= ~RBTXN_HAS_PARTIAL_CHANGE;
                                788                 :                : 
                                789                 :                :     /*
                                790                 :                :      * Stream the transaction if it was serialized before and the changes are
                                791                 :                :      * now complete in the top-level transaction.
                                792                 :                :      *
                                793                 :                :      * The reason for doing the streaming of such a transaction as soon as we
                                794                 :                :      * get the complete change for it is that previously it would have reached
                                795                 :                :      * the memory threshold and wouldn't get streamed because of incomplete
                                796                 :                :      * changes.  Delaying such transactions would increase apply lag for them.
                                797                 :                :      */
 1907 akapila@postgresql.o      798         [ +  + ]:CBC      317374 :     if (ReorderBufferCanStartStreaming(rb) &&
 1615                           799         [ +  + ]:         176338 :         !(rbtxn_has_partial_change(toptxn)) &&
 1055                           800         [ +  + ]:         174802 :         rbtxn_is_serialized(txn) &&
                                801         [ +  + ]:             38 :         rbtxn_has_streamable_change(toptxn))
 1907                           802                 :              8 :         ReorderBufferStreamTXN(rb, toptxn);
                                803                 :                : }
                                804                 :                : 
                                805                 :                : /*
                                806                 :                :  * Queue a change into a transaction so it can be replayed upon commit, or
                                807                 :                :  * streamed once we reach the logical_decoding_work_mem threshold.
                                808                 :                :  */
                                809                 :                : void
 4257 rhaas@postgresql.org      810                 :        1376379 : ReorderBufferQueueChange(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn,
                                811                 :                :                          ReorderBufferChange *change, bool toast_insert)
                                812                 :                : {
                                813                 :                :     ReorderBufferTXN *txn;
                                814                 :                : 
                                815                 :        1376379 :     txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
                                816                 :                : 
                                817                 :                :     /*
                                818                 :                :      * If we have detected that the transaction is aborted while streaming the
                                819                 :                :      * previous changes or by checking its CLOG, there is no point in
                                820                 :                :      * collecting further changes for it.
                                821                 :                :      */
  258 msawada@postgresql.o      822         [ +  + ]:        1376379 :     if (rbtxn_is_aborted(txn))
                                823                 :                :     {
                                824                 :                :         /*
                                825                 :                :          * We don't need to update memory accounting for this change as we
                                826                 :                :          * have not added it to the queue yet.
                                827                 :                :          */
  230 heikki.linnakangas@i      828                 :           9409 :         ReorderBufferFreeChange(rb, change, false);
 1907 akapila@postgresql.o      829                 :           9409 :         return;
                                830                 :                :     }
                                831                 :                : 
                                832                 :                :     /*
                                833                 :                :      * The changes that are sent downstream are considered streamable.  We
                                834                 :                :      * remember such transactions so that only those will later be considered
                                835                 :                :      * for streaming.
                                836                 :                :      */
 1055                           837         [ +  + ]:        1366970 :     if (change->action == REORDER_BUFFER_CHANGE_INSERT ||
                                838         [ +  + ]:         429175 :         change->action == REORDER_BUFFER_CHANGE_UPDATE ||
                                839         [ +  + ]:         269417 :         change->action == REORDER_BUFFER_CHANGE_DELETE ||
                                840         [ +  + ]:          66780 :         change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT ||
                                841         [ +  + ]:          48864 :         change->action == REORDER_BUFFER_CHANGE_TRUNCATE ||
                                842         [ +  + ]:          48821 :         change->action == REORDER_BUFFER_CHANGE_MESSAGE)
                                843                 :                :     {
  956                           844         [ +  + ]:        1318188 :         ReorderBufferTXN *toptxn = rbtxn_get_toptxn(txn);
                                845                 :                : 
 1055                           846                 :        1318188 :         toptxn->txn_flags |= RBTXN_HAS_STREAMABLE_CHANGE;
                                847                 :                :     }
                                848                 :                : 
 4257 rhaas@postgresql.org      849                 :        1366970 :     change->lsn = lsn;
 2173 akapila@postgresql.o      850                 :        1366970 :     change->txn = txn;
                                851                 :                : 
 4257 rhaas@postgresql.org      852         [ -  + ]:        1366970 :     Assert(InvalidXLogRecPtr != lsn);
                                853                 :        1366970 :     dlist_push_tail(&txn->changes, &change->node);
                                854                 :        1366970 :     txn->nentries++;
                                855                 :        1366970 :     txn->nentries_mem++;
                                856                 :                : 
                                857                 :                :     /* update memory accounting information */
  573 msawada@postgresql.o      858                 :        1366970 :     ReorderBufferChangeMemoryUpdate(rb, change, NULL, true,
                                859                 :                :                                     ReorderBufferChangeSize(change));
                                860                 :                : 
                                861                 :                :     /* process partial change */
 1907 akapila@postgresql.o      862                 :        1366970 :     ReorderBufferProcessPartialChange(rb, txn, change, toast_insert);
                                863                 :                : 
                                864                 :                :     /* check the memory limits and evict something if needed */
 2173                           865                 :        1366970 :     ReorderBufferCheckMemoryLimit(rb);
                                866                 :                : }
                                867                 :                : 
                                868                 :                : /*
                                869                 :                :  * A transactional message is queued to be processed upon commit and a
                                870                 :                :  * non-transactional message gets processed immediately.
                                871                 :                :  */
                                872                 :                : void
 3492 simon@2ndQuadrant.co      873                 :             47 : ReorderBufferQueueMessage(ReorderBuffer *rb, TransactionId xid,
                                874                 :                :                           Snapshot snap, XLogRecPtr lsn,
                                875                 :                :                           bool transactional, const char *prefix,
                                876                 :                :                           Size message_size, const char *message)
                                877                 :                : {
                                878         [ +  + ]:             47 :     if (transactional)
                                879                 :                :     {
                                880                 :                :         MemoryContext oldcontext;
                                881                 :                :         ReorderBufferChange *change;
                                882                 :                : 
                                883         [ -  + ]:             39 :         Assert(xid != InvalidTransactionId);
                                884                 :                : 
                                885                 :                :         /*
                                886                 :                :          * We don't expect snapshots for transactional changes - we'll use the
                                887                 :                :          * snapshot derived later during apply (unless the change gets
                                888                 :                :          * skipped).
                                889                 :                :          */
  979 tomas.vondra@postgre      890         [ -  + ]:             39 :         Assert(!snap);
                                891                 :                : 
 3492 simon@2ndQuadrant.co      892                 :             39 :         oldcontext = MemoryContextSwitchTo(rb->context);
                                893                 :                : 
  230 heikki.linnakangas@i      894                 :             39 :         change = ReorderBufferAllocChange(rb);
 3492 simon@2ndQuadrant.co      895                 :             39 :         change->action = REORDER_BUFFER_CHANGE_MESSAGE;
                                896                 :             39 :         change->data.msg.prefix = pstrdup(prefix);
                                897                 :             39 :         change->data.msg.message_size = message_size;
                                898                 :             39 :         change->data.msg.message = palloc(message_size);
                                899                 :             39 :         memcpy(change->data.msg.message, message, message_size);
                                900                 :                : 
 1907 akapila@postgresql.o      901                 :             39 :         ReorderBufferQueueChange(rb, xid, lsn, change, false);
                                902                 :                : 
 3492 simon@2ndQuadrant.co      903                 :             39 :         MemoryContextSwitchTo(oldcontext);
                                904                 :                :     }
                                905                 :                :     else
                                906                 :                :     {
 3428 rhaas@postgresql.org      907                 :              8 :         ReorderBufferTXN *txn = NULL;
 1137 pg@bowt.ie                908                 :              8 :         volatile Snapshot snapshot_now = snap;
                                909                 :                : 
                                910                 :                :         /* Non-transactional changes require a valid snapshot. */
  979 tomas.vondra@postgre      911         [ -  + ]:              8 :         Assert(snapshot_now);
                                912                 :                : 
 3492 simon@2ndQuadrant.co      913         [ +  + ]:              8 :         if (xid != InvalidTransactionId)
                                914                 :              3 :             txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
                                915                 :                : 
                                 916                 :                :         /* set up snapshot to allow catalog access */
                                917                 :              8 :         SetupHistoricSnapshot(snapshot_now, NULL);
                                918         [ +  - ]:              8 :         PG_TRY();
                                919                 :                :         {
                                920                 :              8 :             rb->message(rb, txn, lsn, false, prefix, message_size, message);
                                921                 :                : 
                                922                 :              8 :             TeardownHistoricSnapshot(false);
                                923                 :                :         }
 3492 simon@2ndQuadrant.co      924                 :UBC           0 :         PG_CATCH();
                                925                 :                :         {
                                926                 :              0 :             TeardownHistoricSnapshot(true);
                                927                 :              0 :             PG_RE_THROW();
                                928                 :                :         }
 3492 simon@2ndQuadrant.co      929         [ -  + ]:CBC           8 :         PG_END_TRY();
                                930                 :                :     }
                                931                 :             47 : }
                                932                 :                : 
                                933                 :                : /*
                                934                 :                :  * AssertTXNLsnOrder
                                935                 :                :  *      Verify LSN ordering of transaction lists in the reorderbuffer
                                936                 :                :  *
                                937                 :                :  * Other LSN-related invariants are checked too.
                                938                 :                :  *
                                939                 :                :  * No-op if assertions are not in use.
                                940                 :                :  */
                                941                 :                : static void
 4257 rhaas@postgresql.org      942                 :           8092 : AssertTXNLsnOrder(ReorderBuffer *rb)
                                943                 :                : {
                                944                 :                : #ifdef USE_ASSERT_CHECKING
 1104 akapila@postgresql.o      945                 :           8092 :     LogicalDecodingContext *ctx = rb->private_data;
                                946                 :                :     dlist_iter  iter;
 4257 rhaas@postgresql.org      947                 :           8092 :     XLogRecPtr  prev_first_lsn = InvalidXLogRecPtr;
 2681 alvherre@alvh.no-ip.      948                 :           8092 :     XLogRecPtr  prev_base_snap_lsn = InvalidXLogRecPtr;
                                949                 :                : 
                                950                 :                :     /*
                                 951                 :                :      * Skip the verification if we have not yet reached the LSN at which we
                                 952                 :                :      * start decoding the contents of transactions.  Until we reach that LSN,
                                 953                 :                :      * there may be transactions whose top-level transaction and
                                 954                 :                :      * subtransaction are not yet associated with each other, and which
                                 955                 :                :      * consequently have the same LSN.  We don't guarantee this association
                                 956                 :                :      * until we try to decode the actual contents of the transaction.  The
                                 957                 :                :      * ordering of the records prior to the start_decoding_at LSN should have
                                 958                 :                :      * been checked before the restart.
                                959                 :                :      */
 1104 akapila@postgresql.o      960         [ +  + ]:           8092 :     if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr))
                                961                 :           3827 :         return;
                                962                 :                : 
 4257 rhaas@postgresql.org      963   [ +  -  +  + ]:           8042 :     dlist_foreach(iter, &rb->toplevel_by_lsn)
                                964                 :                :     {
 2681 alvherre@alvh.no-ip.      965                 :           3777 :         ReorderBufferTXN *cur_txn = dlist_container(ReorderBufferTXN, node,
                                966                 :                :                                                     iter.cur);
                                967                 :                : 
                                968                 :                :         /* start LSN must be set */
 4257 rhaas@postgresql.org      969         [ -  + ]:           3777 :         Assert(cur_txn->first_lsn != InvalidXLogRecPtr);
                                970                 :                : 
                                 971                 :                :         /* If there is an end LSN, it must not be lower than the start LSN */
                                972         [ +  + ]:           3777 :         if (cur_txn->end_lsn != InvalidXLogRecPtr)
                                973         [ -  + ]:             22 :             Assert(cur_txn->first_lsn <= cur_txn->end_lsn);
                                974                 :                : 
                                975                 :                :         /* Current initial LSN must be strictly higher than previous */
                                976         [ +  + ]:           3777 :         if (prev_first_lsn != InvalidXLogRecPtr)
                                977         [ -  + ]:            230 :             Assert(prev_first_lsn < cur_txn->first_lsn);
                                978                 :                : 
                                979                 :                :         /* known-as-subtxn txns must not be listed */
 2118 alvherre@alvh.no-ip.      980         [ -  + ]:           3777 :         Assert(!rbtxn_is_known_subxact(cur_txn));
                                981                 :                : 
 4257 rhaas@postgresql.org      982                 :           3777 :         prev_first_lsn = cur_txn->first_lsn;
                                983                 :                :     }
                                984                 :                : 
 2681 alvherre@alvh.no-ip.      985   [ +  -  +  + ]:           6348 :     dlist_foreach(iter, &rb->txns_by_base_snapshot_lsn)
                                986                 :                :     {
                                987                 :           2083 :         ReorderBufferTXN *cur_txn = dlist_container(ReorderBufferTXN,
                                988                 :                :                                                     base_snapshot_node,
                                989                 :                :                                                     iter.cur);
                                990                 :                : 
                                991                 :                :         /* base snapshot (and its LSN) must be set */
                                992         [ -  + ]:           2083 :         Assert(cur_txn->base_snapshot != NULL);
                                993         [ -  + ]:           2083 :         Assert(cur_txn->base_snapshot_lsn != InvalidXLogRecPtr);
                                994                 :                : 
                                995                 :                :         /* current LSN must be strictly higher than previous */
                                996         [ +  + ]:           2083 :         if (prev_base_snap_lsn != InvalidXLogRecPtr)
                                997         [ -  + ]:            175 :             Assert(prev_base_snap_lsn < cur_txn->base_snapshot_lsn);
                                998                 :                : 
                                999                 :                :         /* known-as-subtxn txns must not be listed */
 2118                          1000         [ -  + ]:           2083 :         Assert(!rbtxn_is_known_subxact(cur_txn));
                               1001                 :                : 
 2681                          1002                 :           2083 :         prev_base_snap_lsn = cur_txn->base_snapshot_lsn;
                               1003                 :                :     }
                               1004                 :                : #endif
                               1005                 :                : }
                               1006                 :                : 
                               1007                 :                : /*
                               1008                 :                :  * AssertChangeLsnOrder
                               1009                 :                :  *
                               1010                 :                :  * Check ordering of changes in the (sub)transaction.
                               1011                 :                :  */
                               1012                 :                : static void
 1907 akapila@postgresql.o     1013                 :           2645 : AssertChangeLsnOrder(ReorderBufferTXN *txn)
                               1014                 :                : {
                               1015                 :                : #ifdef USE_ASSERT_CHECKING
                               1016                 :                :     dlist_iter  iter;
                               1017                 :           2645 :     XLogRecPtr  prev_lsn = txn->first_lsn;
                               1018                 :                : 
                               1019   [ +  -  +  + ]:         191219 :     dlist_foreach(iter, &txn->changes)
                               1020                 :                :     {
                               1021                 :                :         ReorderBufferChange *cur_change;
                               1022                 :                : 
                               1023                 :         188574 :         cur_change = dlist_container(ReorderBufferChange, node, iter.cur);
                               1024                 :                : 
                               1025         [ -  + ]:         188574 :         Assert(txn->first_lsn != InvalidXLogRecPtr);
                               1026         [ -  + ]:         188574 :         Assert(cur_change->lsn != InvalidXLogRecPtr);
                               1027         [ -  + ]:         188574 :         Assert(txn->first_lsn <= cur_change->lsn);
                               1028                 :                : 
                               1029         [ +  + ]:         188574 :         if (txn->end_lsn != InvalidXLogRecPtr)
                               1030         [ -  + ]:          30444 :             Assert(cur_change->lsn <= txn->end_lsn);
                               1031                 :                : 
                               1032         [ -  + ]:         188574 :         Assert(prev_lsn <= cur_change->lsn);
                               1033                 :                : 
                               1034                 :         188574 :         prev_lsn = cur_change->lsn;
                               1035                 :                :     }
                               1036                 :                : #endif
                               1037                 :           2645 : }
                               1038                 :                : 
                               1039                 :                : /*
                               1040                 :                :  * ReorderBufferGetOldestTXN
                               1041                 :                :  *      Return oldest transaction in reorderbuffer
                               1042                 :                :  */
                               1043                 :                : ReorderBufferTXN *
 4257 rhaas@postgresql.org     1044                 :            412 : ReorderBufferGetOldestTXN(ReorderBuffer *rb)
                               1045                 :                : {
                               1046                 :                :     ReorderBufferTXN *txn;
                               1047                 :                : 
 2681 alvherre@alvh.no-ip.     1048                 :            412 :     AssertTXNLsnOrder(rb);
                               1049                 :                : 
 4257 rhaas@postgresql.org     1050         [ +  + ]:            412 :     if (dlist_is_empty(&rb->toplevel_by_lsn))
                               1051                 :            357 :         return NULL;
                               1052                 :                : 
                               1053                 :             55 :     txn = dlist_head_element(ReorderBufferTXN, node, &rb->toplevel_by_lsn);
                               1054                 :                : 
 2118 alvherre@alvh.no-ip.     1055         [ -  + ]:             55 :     Assert(!rbtxn_is_known_subxact(txn));
 4257 rhaas@postgresql.org     1056         [ -  + ]:             55 :     Assert(txn->first_lsn != InvalidXLogRecPtr);
                               1057                 :             55 :     return txn;
                               1058                 :                : }
                               1059                 :                : 
                               1060                 :                : /*
                               1061                 :                :  * ReorderBufferGetOldestXmin
                               1062                 :                :  *      Return oldest Xmin in reorderbuffer
                               1063                 :                :  *
                               1064                 :                :  * Returns oldest possibly running Xid from the point of view of snapshots
                               1065                 :                :  * used in the transactions kept by reorderbuffer, or InvalidTransactionId if
                               1066                 :                :  * there are none.
                               1067                 :                :  *
                               1068                 :                :  * Since snapshots are assigned monotonically, this equals the Xmin of the
                               1069                 :                :  * base snapshot with minimal base_snapshot_lsn.
                               1070                 :                :  */
                               1071                 :                : TransactionId
 2681 alvherre@alvh.no-ip.     1072                 :            429 : ReorderBufferGetOldestXmin(ReorderBuffer *rb)
                               1073                 :                : {
                               1074                 :                :     ReorderBufferTXN *txn;
                               1075                 :                : 
                               1076                 :            429 :     AssertTXNLsnOrder(rb);
                               1077                 :                : 
                               1078         [ +  + ]:            429 :     if (dlist_is_empty(&rb->txns_by_base_snapshot_lsn))
                               1079                 :            383 :         return InvalidTransactionId;
                               1080                 :                : 
                               1081                 :             46 :     txn = dlist_head_element(ReorderBufferTXN, base_snapshot_node,
                               1082                 :                :                              &rb->txns_by_base_snapshot_lsn);
                               1083                 :             46 :     return txn->base_snapshot->xmin;
                               1084                 :                : }
                               1085                 :                : 
                               1086                 :                : void
 4257 rhaas@postgresql.org     1087                 :            475 : ReorderBufferSetRestartPoint(ReorderBuffer *rb, XLogRecPtr ptr)
                               1088                 :                : {
                               1089                 :            475 :     rb->current_restart_decoding_lsn = ptr;
                               1090                 :            475 : }
                               1091                 :                : 
                               1092                 :                : /*
                               1093                 :                :  * ReorderBufferAssignChild
                               1094                 :                :  *
                                1095                 :                :  * Record that subxid is known to be a subtransaction of xid, as seen at
                                1096                 :                :  * the given lsn.
                               1097                 :                :  */
                               1098                 :                : void
                               1099                 :            841 : ReorderBufferAssignChild(ReorderBuffer *rb, TransactionId xid,
                               1100                 :                :                          TransactionId subxid, XLogRecPtr lsn)
                               1101                 :                : {
                               1102                 :                :     ReorderBufferTXN *txn;
                               1103                 :                :     ReorderBufferTXN *subtxn;
                               1104                 :                :     bool        new_top;
                               1105                 :                :     bool        new_sub;
                               1106                 :                : 
                               1107                 :            841 :     txn = ReorderBufferTXNByXid(rb, xid, true, &new_top, lsn, true);
                               1108                 :            841 :     subtxn = ReorderBufferTXNByXid(rb, subxid, true, &new_sub, lsn, false);
                               1109                 :                : 
 2681 alvherre@alvh.no-ip.     1110         [ +  + ]:            841 :     if (!new_sub)
                               1111                 :                :     {
 2118                          1112         [ +  - ]:            186 :         if (rbtxn_is_known_subxact(subtxn))
                               1113                 :                :         {
                               1114                 :                :             /* already associated, nothing to do */
 2681                          1115                 :            186 :             return;
                               1116                 :                :         }
                               1117                 :                :         else
                               1118                 :                :         {
                               1119                 :                :             /*
                               1120                 :                :              * We already saw this transaction, but initially added it to the
                               1121                 :                :              * list of top-level txns.  Now that we know it's not top-level,
                               1122                 :                :              * remove it from there.
                               1123                 :                :              */
 2681 alvherre@alvh.no-ip.     1124                 :UBC           0 :             dlist_delete(&subtxn->node);
                               1125                 :                :         }
                               1126                 :                :     }
                               1127                 :                : 
 2118 alvherre@alvh.no-ip.     1128                 :CBC         655 :     subtxn->txn_flags |= RBTXN_IS_SUBXACT;
 2681                          1129                 :            655 :     subtxn->toplevel_xid = xid;
                               1130         [ -  + ]:            655 :     Assert(subtxn->nsubtxns == 0);
                               1131                 :                : 
                               1132                 :                :     /* set the reference to top-level transaction */
 1923 akapila@postgresql.o     1133                 :            655 :     subtxn->toptxn = txn;
                               1134                 :                : 
                               1135                 :                :     /* add to subtransaction list */
 2681 alvherre@alvh.no-ip.     1136                 :            655 :     dlist_push_tail(&txn->subtxns, &subtxn->node);
                               1137                 :            655 :     txn->nsubtxns++;
                               1138                 :                : 
                               1139                 :                :     /* Possibly transfer the subtxn's snapshot to its top-level txn. */
                               1140                 :            655 :     ReorderBufferTransferSnapToParent(txn, subtxn);
                               1141                 :                : 
                               1142                 :                :     /* Verify LSN-ordering invariant */
                               1143                 :            655 :     AssertTXNLsnOrder(rb);
                               1144                 :                : }
                               1145                 :                : 
                               1146                 :                : /*
                               1147                 :                :  * ReorderBufferTransferSnapToParent
                               1148                 :                :  *      Transfer base snapshot from subtxn to top-level txn, if needed
                               1149                 :                :  *
                               1150                 :                :  * This is done if the top-level txn doesn't have a base snapshot, or if the
                               1151                 :                :  * subtxn's base snapshot has an earlier LSN than the top-level txn's base
                               1152                 :                :  * snapshot's LSN.  This can happen if there are no changes in the toplevel
                               1153                 :                :  * txn but there are some in the subtxn, or the first change in subtxn has
                               1154                 :                :  * earlier LSN than first change in the top-level txn and we learned about
                               1155                 :                :  * their kinship only now.
                               1156                 :                :  *
                               1157                 :                :  * The subtransaction's snapshot is cleared regardless of the transfer
                               1158                 :                :  * happening, since it's not needed anymore in either case.
                               1159                 :                :  *
                               1160                 :                :  * We do this as soon as we become aware of their kinship, to avoid queueing
                               1161                 :                :  * extra snapshots to txns known-as-subtxns -- only top-level txns will
                               1162                 :                :  * receive further snapshots.
                               1163                 :                :  */
                               1164                 :                : static void
                               1165                 :            659 : ReorderBufferTransferSnapToParent(ReorderBufferTXN *txn,
                               1166                 :                :                                   ReorderBufferTXN *subtxn)
                               1167                 :                : {
                               1168         [ -  + ]:            659 :     Assert(subtxn->toplevel_xid == txn->xid);
                               1169                 :                : 
                               1170         [ -  + ]:            659 :     if (subtxn->base_snapshot != NULL)
                               1171                 :                :     {
 2681 alvherre@alvh.no-ip.     1172         [ #  # ]:UBC           0 :         if (txn->base_snapshot == NULL ||
                               1173         [ #  # ]:              0 :             subtxn->base_snapshot_lsn < txn->base_snapshot_lsn)
                               1174                 :                :         {
                               1175                 :                :             /*
                               1176                 :                :              * If the toplevel transaction already has a base snapshot but
                               1177                 :                :              * it's newer than the subxact's, purge it.
                               1178                 :                :              */
                               1179         [ #  # ]:              0 :             if (txn->base_snapshot != NULL)
                               1180                 :                :             {
                               1181                 :              0 :                 SnapBuildSnapDecRefcount(txn->base_snapshot);
                               1182                 :              0 :                 dlist_delete(&txn->base_snapshot_node);
                               1183                 :                :             }
                               1184                 :                : 
                               1185                 :                :             /*
                               1186                 :                :              * The snapshot is now the top transaction's; transfer it, and
                                1187                 :                :              * adjust the position of the top transaction in the list by
                               1188                 :                :              * moving it to where the subtransaction is.
                               1189                 :                :              */
                               1190                 :              0 :             txn->base_snapshot = subtxn->base_snapshot;
                               1191                 :              0 :             txn->base_snapshot_lsn = subtxn->base_snapshot_lsn;
                               1192                 :              0 :             dlist_insert_before(&subtxn->base_snapshot_node,
                               1193                 :                :                                 &txn->base_snapshot_node);
                               1194                 :                : 
                               1195                 :                :             /*
                               1196                 :                :              * The subtransaction doesn't have a snapshot anymore (so it
                               1197                 :                :              * mustn't be in the list.)
                               1198                 :                :              */
                               1199                 :              0 :             subtxn->base_snapshot = NULL;
                               1200                 :              0 :             subtxn->base_snapshot_lsn = InvalidXLogRecPtr;
                               1201                 :              0 :             dlist_delete(&subtxn->base_snapshot_node);
                               1202                 :                :         }
                               1203                 :                :         else
                               1204                 :                :         {
                               1205                 :                :             /* Base snap of toplevel is fine, so subxact's is not needed */
                               1206                 :              0 :             SnapBuildSnapDecRefcount(subtxn->base_snapshot);
                               1207                 :              0 :             dlist_delete(&subtxn->base_snapshot_node);
                               1208                 :              0 :             subtxn->base_snapshot = NULL;
                               1209                 :              0 :             subtxn->base_snapshot_lsn = InvalidXLogRecPtr;
                               1210                 :                :         }
                               1211                 :                :     }
 4257 rhaas@postgresql.org     1212                 :CBC         659 : }
                               1213                 :                : 
                               1214                 :                : /*
                               1215                 :                :  * Associate a subtransaction with its toplevel transaction at commit
                               1216                 :                :  * time. No further changes may be added after this.
                               1217                 :                :  */
                               1218                 :                : void
                               1219                 :            267 : ReorderBufferCommitChild(ReorderBuffer *rb, TransactionId xid,
                               1220                 :                :                          TransactionId subxid, XLogRecPtr commit_lsn,
                               1221                 :                :                          XLogRecPtr end_lsn)
                               1222                 :                : {
                               1223                 :                :     ReorderBufferTXN *subtxn;
                               1224                 :                : 
                               1225                 :            267 :     subtxn = ReorderBufferTXNByXid(rb, subxid, false, NULL,
                               1226                 :                :                                    InvalidXLogRecPtr, false);
                               1227                 :                : 
                               1228                 :                :     /*
                               1229                 :                :      * No need to do anything if that subtxn didn't contain any changes
                               1230                 :                :      */
                               1231         [ +  + ]:            267 :     if (!subtxn)
                               1232                 :             81 :         return;
                               1233                 :                : 
                               1234                 :            186 :     subtxn->final_lsn = commit_lsn;
                               1235                 :            186 :     subtxn->end_lsn = end_lsn;
                               1236                 :                : 
                               1237                 :                :     /*
                               1238                 :                :      * Assign this subxact as a child of the toplevel xact (no-op if already
                               1239                 :                :      * done.)
                               1240                 :                :      */
 2681 alvherre@alvh.no-ip.     1241                 :            186 :     ReorderBufferAssignChild(rb, xid, subxid, InvalidXLogRecPtr);
                               1242                 :                : }
                               1243                 :                : 
                               1244                 :                : 
                               1245                 :                : /*
                               1246                 :                :  * Support for efficiently iterating over a transaction's and its
                               1247                 :                :  * subtransactions' changes.
                               1248                 :                :  *
                               1249                 :                :  * We do so by doing a k-way merge between transactions/subtransactions. For that
                               1250                 :                :  * we model the current heads of the different transactions as a binary heap
                               1251                 :                :  * so we easily know which (sub-)transaction has the change with the smallest
                               1252                 :                :  * lsn next.
                               1253                 :                :  *
                               1254                 :                :  * We assume the changes in individual transactions are already sorted by LSN.
                               1255                 :                :  */
                               1256                 :                : 
                               1257                 :                : /*
                               1258                 :                :  * Binary heap comparison function.
                               1259                 :                :  */
                               1260                 :                : static int
 4257 rhaas@postgresql.org     1261                 :          51568 : ReorderBufferIterCompare(Datum a, Datum b, void *arg)
                               1262                 :                : {
                               1263                 :          51568 :     ReorderBufferIterTXNState *state = (ReorderBufferIterTXNState *) arg;
                               1264                 :          51568 :     XLogRecPtr  pos_a = state->entries[DatumGetInt32(a)].lsn;
                               1265                 :          51568 :     XLogRecPtr  pos_b = state->entries[DatumGetInt32(b)].lsn;
                               1266                 :                : 
                               1267         [ +  + ]:          51568 :     if (pos_a < pos_b)
                               1268                 :          50712 :         return 1;
                               1269         [ -  + ]:            856 :     else if (pos_a == pos_b)
 4257 rhaas@postgresql.org     1270                 :UBC           0 :         return 0;
 4257 rhaas@postgresql.org     1271                 :CBC         856 :     return -1;
                               1272                 :                : }
                               1273                 :                : 
                               1274                 :                : /*
                               1275                 :                :  * Allocate & initialize an iterator which iterates in lsn order over a
                               1276                 :                :  * transaction and all its subtransactions.
                               1277                 :                :  *
                               1278                 :                :  * Note: The iterator state is returned through the iter_state parameter rather
                               1279                 :                :  * than the function's return value.  This is because the state gets cleaned up
                               1280                 :                :  * in a PG_CATCH block in the caller, so we want to make sure the caller gets
                               1281                 :                :  * back the state even if this function throws an exception.
                               1282                 :                :  */
                               1283                 :                : static void
 2145 akapila@postgresql.o     1284                 :           2182 : ReorderBufferIterTXNInit(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               1285                 :                :                          ReorderBufferIterTXNState *volatile *iter_state)
                               1286                 :                : {
 4257 rhaas@postgresql.org     1287                 :           2182 :     Size        nr_txns = 0;
                               1288                 :                :     ReorderBufferIterTXNState *state;
                               1289                 :                :     dlist_iter  cur_txn_i;
                               1290                 :                :     int32       off;
                               1291                 :                : 
 2145 akapila@postgresql.o     1292                 :           2182 :     *iter_state = NULL;
                               1293                 :                : 
                               1294                 :                :     /* Check ordering of changes in the toplevel transaction. */
 1907                          1295                 :           2182 :     AssertChangeLsnOrder(txn);
                               1296                 :                : 
                               1297                 :                :     /*
                               1298                 :                :      * Calculate the size of our heap: one element for every transaction that
                               1299                 :                :      * contains changes.  (Besides the transactions already in the reorder
                               1300                 :                :      * buffer, we count the one we were directly passed.)
                               1301                 :                :      */
 4257 rhaas@postgresql.org     1302         [ +  + ]:           2182 :     if (txn->nentries > 0)
                               1303                 :           2000 :         nr_txns++;
                               1304                 :                : 
                               1305   [ +  -  +  + ]:           2645 :     dlist_foreach(cur_txn_i, &txn->subtxns)
                               1306                 :                :     {
                               1307                 :                :         ReorderBufferTXN *cur_txn;
                               1308                 :                : 
                               1309                 :            463 :         cur_txn = dlist_container(ReorderBufferTXN, node, cur_txn_i.cur);
                               1310                 :                : 
                               1311                 :                :         /* Check ordering of changes in this subtransaction. */
 1907 akapila@postgresql.o     1312                 :            463 :         AssertChangeLsnOrder(cur_txn);
                               1313                 :                : 
 4257 rhaas@postgresql.org     1314         [ +  + ]:            463 :         if (cur_txn->nentries > 0)
                               1315                 :            301 :             nr_txns++;
                               1316                 :                :     }
                               1317                 :                : 
                               1318                 :                :     /* allocate iteration state */
                               1319                 :                :     state = (ReorderBufferIterTXNState *)
                               1320                 :           2182 :         MemoryContextAllocZero(rb->context,
                               1321                 :                :                                sizeof(ReorderBufferIterTXNState) +
                               1322                 :           2182 :                                sizeof(ReorderBufferIterTXNEntry) * nr_txns);
                               1323                 :                : 
                               1324                 :           2182 :     state->nr_txns = nr_txns;
                               1325                 :           2182 :     dlist_init(&state->old_change);
                               1326                 :                : 
                               1327         [ +  + ]:           4483 :     for (off = 0; off < state->nr_txns; off++)
                               1328                 :                :     {
 2145 akapila@postgresql.o     1329                 :           2301 :         state->entries[off].file.vfd = -1;
 4257 rhaas@postgresql.org     1330                 :           2301 :         state->entries[off].segno = 0;
                               1331                 :                :     }
                               1332                 :                : 
                               1333                 :                :     /* allocate heap */
                               1334                 :           2182 :     state->heap = binaryheap_allocate(state->nr_txns,
                               1335                 :                :                                       ReorderBufferIterCompare,
                               1336                 :                :                                       state);
                               1337                 :                : 
                               1338                 :                :     /* Now that the state fields are initialized, it is safe to return it. */
 2145 akapila@postgresql.o     1339                 :           2182 :     *iter_state = state;
                               1340                 :                : 
                               1341                 :                :     /*
                               1342                 :                :      * Now insert items into the binary heap, in an unordered fashion.  (We
                               1343                 :                :      * will run a heap assembly step at the end; this is more efficient.)
                               1344                 :                :      */
                               1345                 :                : 
 4257 rhaas@postgresql.org     1346                 :           2182 :     off = 0;
                               1347                 :                : 
                               1348                 :                :     /* add toplevel transaction if it contains changes */
                               1349         [ +  + ]:           2182 :     if (txn->nentries > 0)
                               1350                 :                :     {
                               1351                 :                :         ReorderBufferChange *cur_change;
                               1352                 :                : 
 2118 alvherre@alvh.no-ip.     1353         [ +  + ]:           2000 :         if (rbtxn_is_serialized(txn))
                               1354                 :                :         {
                               1355                 :                :             /* serialize remaining changes */
 3312 andres@anarazel.de       1356                 :             22 :             ReorderBufferSerializeTXN(rb, txn);
 2145 akapila@postgresql.o     1357                 :             22 :             ReorderBufferRestoreChanges(rb, txn, &state->entries[off].file,
                               1358                 :                :                                         &state->entries[off].segno);
                               1359                 :                :         }
                               1360                 :                : 
 4257 rhaas@postgresql.org     1361                 :           2000 :         cur_change = dlist_head_element(ReorderBufferChange, node,
                               1362                 :                :                                         &txn->changes);
                               1363                 :                : 
                               1364                 :           2000 :         state->entries[off].lsn = cur_change->lsn;
                               1365                 :           2000 :         state->entries[off].change = cur_change;
                               1366                 :           2000 :         state->entries[off].txn = txn;
                               1367                 :                : 
                               1368                 :           2000 :         binaryheap_add_unordered(state->heap, Int32GetDatum(off++));
                               1369                 :                :     }
                               1370                 :                : 
                               1371                 :                :     /* add subtransactions if they contain changes */
                               1372   [ +  -  +  + ]:           2645 :     dlist_foreach(cur_txn_i, &txn->subtxns)
                               1373                 :                :     {
                               1374                 :                :         ReorderBufferTXN *cur_txn;
                               1375                 :                : 
                               1376                 :            463 :         cur_txn = dlist_container(ReorderBufferTXN, node, cur_txn_i.cur);
                               1377                 :                : 
                               1378         [ +  + ]:            463 :         if (cur_txn->nentries > 0)
                               1379                 :                :         {
                               1380                 :                :             ReorderBufferChange *cur_change;
                               1381                 :                : 
 2118 alvherre@alvh.no-ip.     1382         [ +  + ]:            301 :             if (rbtxn_is_serialized(cur_txn))
                               1383                 :                :             {
                               1384                 :                :                 /* serialize remaining changes */
 3312 andres@anarazel.de       1385                 :             17 :                 ReorderBufferSerializeTXN(rb, cur_txn);
 4257 rhaas@postgresql.org     1386                 :             17 :                 ReorderBufferRestoreChanges(rb, cur_txn,
                               1387                 :                :                                             &state->entries[off].file,
                               1388                 :                :                                             &state->entries[off].segno);
                               1389                 :                :             }
                               1390                 :            301 :             cur_change = dlist_head_element(ReorderBufferChange, node,
                               1391                 :                :                                             &cur_txn->changes);
                               1392                 :                : 
                               1393                 :            301 :             state->entries[off].lsn = cur_change->lsn;
                               1394                 :            301 :             state->entries[off].change = cur_change;
                               1395                 :            301 :             state->entries[off].txn = cur_txn;
                               1396                 :                : 
                               1397                 :            301 :             binaryheap_add_unordered(state->heap, Int32GetDatum(off++));
                               1398                 :                :         }
                               1399                 :                :     }
                               1400                 :                : 
                               1401                 :                :     /* assemble a valid binary heap */
                               1402                 :           2182 :     binaryheap_build(state->heap);
                               1403                 :           2182 : }
                               1404                 :                : 
                               1405                 :                : /*
                               1406                 :                :  * Return the next change when iterating over a transaction and its
                               1407                 :                :  * subtransactions.
                               1408                 :                :  *
                               1409                 :                :  * Returns NULL when no further changes exist.
                               1410                 :                :  */
                               1411                 :                : static ReorderBufferChange *
                               1412                 :         359036 : ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state)
                               1413                 :                : {
                               1414                 :                :     ReorderBufferChange *change;
                               1415                 :                :     ReorderBufferIterTXNEntry *entry;
                               1416                 :                :     int32       off;
                               1417                 :                : 
                               1418                 :                :     /* nothing there anymore */
  119 nathan@postgresql.or     1419         [ +  + ]:GNC      359036 :     if (binaryheap_empty(state->heap))
 4257 rhaas@postgresql.org     1420                 :CBC        2171 :         return NULL;
                               1421                 :                : 
                               1422                 :         356865 :     off = DatumGetInt32(binaryheap_first(state->heap));
                               1423                 :         356865 :     entry = &state->entries[off];
                               1424                 :                : 
                               1425                 :                :     /* free memory we might have "leaked" in the previous *Next call */
                               1426         [ +  + ]:         356865 :     if (!dlist_is_empty(&state->old_change))
                               1427                 :                :     {
                               1428                 :             44 :         change = dlist_container(ReorderBufferChange, node,
                               1429                 :                :                                  dlist_pop_head_node(&state->old_change));
  230 heikki.linnakangas@i     1430                 :             44 :         ReorderBufferFreeChange(rb, change, true);
 4257 rhaas@postgresql.org     1431         [ -  + ]:             44 :         Assert(dlist_is_empty(&state->old_change));
                               1432                 :                :     }
                               1433                 :                : 
                               1434                 :         356865 :     change = entry->change;
                               1435                 :                : 
                               1436                 :                :     /*
                               1437                 :                :      * update heap with information about which transaction has the next
                               1438                 :                :      * relevant change in LSN order
                               1439                 :                :      */
                               1440                 :                : 
                               1441                 :                :     /* there are in-memory changes */
                               1442         [ +  + ]:         356865 :     if (dlist_has_next(&entry->txn->changes, &entry->change->node))
                               1443                 :                :     {
                               1444                 :         354532 :         dlist_node *next = dlist_next_node(&entry->txn->changes, &change->node);
                               1445                 :         354532 :         ReorderBufferChange *next_change =
  893 tgl@sss.pgh.pa.us        1446                 :         354532 :             dlist_container(ReorderBufferChange, node, next);
                               1447                 :                : 
                               1448                 :                :         /* txn stays the same */
 4257 rhaas@postgresql.org     1449                 :         354532 :         state->entries[off].lsn = next_change->lsn;
                               1450                 :         354532 :         state->entries[off].change = next_change;
                               1451                 :                : 
                               1452                 :         354532 :         binaryheap_replace_first(state->heap, Int32GetDatum(off));
                               1453                 :         354532 :         return change;
                               1454                 :                :     }
                               1455                 :                : 
                               1456                 :                :     /* try to load changes from disk */
                               1457         [ +  + ]:           2333 :     if (entry->txn->nentries != entry->txn->nentries_mem)
                               1458                 :                :     {
                               1459                 :                :         /*
                               1460                 :                :          * Ugly: restoring changes will reuse *Change records, thus delete the
                               1461                 :                :          * current one from the per-tx list and only free in the next call.
                               1462                 :                :          */
                               1463                 :             63 :         dlist_delete(&change->node);
                               1464                 :             63 :         dlist_push_tail(&state->old_change, &change->node);
                               1465                 :                : 
                               1466                 :                :         /*
                               1467                 :                :          * Update the total bytes processed by the txn for which we are
                               1468                 :                :          * releasing the current set of changes and restoring the new set of
                               1469                 :                :          * changes.
                               1470                 :                :          */
 1639 akapila@postgresql.o     1471                 :             63 :         rb->totalBytes += entry->txn->size;
 2145                          1472         [ +  + ]:             63 :         if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
                               1473                 :                :                                         &state->entries[off].segno))
                               1474                 :                :         {
                               1475                 :                :             /* successfully restored changes from disk */
                               1476                 :                :             ReorderBufferChange *next_change =
  893 tgl@sss.pgh.pa.us        1477                 :             35 :                 dlist_head_element(ReorderBufferChange, node,
                               1478                 :                :                                    &entry->txn->changes);
                               1479                 :                : 
 4257 rhaas@postgresql.org     1480         [ -  + ]:             35 :             elog(DEBUG2, "restored %u/%u changes from disk",
                               1481                 :                :                  (uint32) entry->txn->nentries_mem,
                               1482                 :                :                  (uint32) entry->txn->nentries);
                               1483                 :                : 
                               1484         [ -  + ]:             35 :             Assert(entry->txn->nentries_mem);
                               1485                 :                :             /* txn stays the same */
                               1486                 :             35 :             state->entries[off].lsn = next_change->lsn;
                               1487                 :             35 :             state->entries[off].change = next_change;
                               1488                 :             35 :             binaryheap_replace_first(state->heap, Int32GetDatum(off));
                               1489                 :                : 
                               1490                 :             35 :             return change;
                               1491                 :                :         }
                               1492                 :                :     }
                               1493                 :                : 
                               1494                 :                :     /* ok, no changes there anymore, remove */
                               1495                 :           2298 :     binaryheap_remove_first(state->heap);
                               1496                 :                : 
                               1497                 :           2298 :     return change;
                               1498                 :                : }
                               1499                 :                : 
                               1500                 :                : /*
                               1501                 :                :  * Deallocate the iterator
                               1502                 :                :  */
                               1503                 :                : static void
                               1504                 :           2180 : ReorderBufferIterTXNFinish(ReorderBuffer *rb,
                               1505                 :                :                            ReorderBufferIterTXNState *state)
                               1506                 :                : {
                               1507                 :                :     int32       off;
                               1508                 :                : 
                               1509         [ +  + ]:           4479 :     for (off = 0; off < state->nr_txns; off++)
                               1510                 :                :     {
 2145 akapila@postgresql.o     1511         [ -  + ]:           2299 :         if (state->entries[off].file.vfd != -1)
 2145 akapila@postgresql.o     1512                 :UBC           0 :             FileClose(state->entries[off].file.vfd);
                               1513                 :                :     }
                               1514                 :                : 
                               1515                 :                :     /* free memory we might have "leaked" in the last *Next call */
 4257 rhaas@postgresql.org     1516         [ +  + ]:CBC        2180 :     if (!dlist_is_empty(&state->old_change))
                               1517                 :                :     {
                               1518                 :                :         ReorderBufferChange *change;
                               1519                 :                : 
                               1520                 :             18 :         change = dlist_container(ReorderBufferChange, node,
                               1521                 :                :                                  dlist_pop_head_node(&state->old_change));
  230 heikki.linnakangas@i     1522                 :             18 :         ReorderBufferFreeChange(rb, change, true);
 4257 rhaas@postgresql.org     1523         [ -  + ]:             18 :         Assert(dlist_is_empty(&state->old_change));
                               1524                 :                :     }
                               1525                 :                : 
                               1526                 :           2180 :     binaryheap_free(state->heap);
                               1527                 :           2180 :     pfree(state);
                               1528                 :           2180 : }
                               1529                 :                : 
                               1530                 :                : /*
                               1531                 :                :  * Cleanup the contents of a transaction, usually after the transaction
                               1532                 :                :  * committed or aborted.
                               1533                 :                :  */
                               1534                 :                : static void
                               1535                 :           3905 : ReorderBufferCleanupTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               1536                 :                : {
                               1537                 :                :     bool        found;
                               1538                 :                :     dlist_mutable_iter iter;
  428 msawada@postgresql.o     1539                 :           3905 :     Size        mem_freed = 0;
                               1540                 :                : 
                               1541                 :                :     /* cleanup subtransactions & their changes */
 4257 rhaas@postgresql.org     1542   [ +  -  +  + ]:           4090 :     dlist_foreach_modify(iter, &txn->subtxns)
                               1543                 :                :     {
                               1544                 :                :         ReorderBufferTXN *subtxn;
                               1545                 :                : 
                               1546                 :            185 :         subtxn = dlist_container(ReorderBufferTXN, node, iter.cur);
                               1547                 :                : 
                               1548                 :                :         /*
                               1549                 :                :          * Subtransactions are always associated with the toplevel TXN, even if
                               1550                 :                :          * they originally were happening inside another subtxn, so we won't
                               1551                 :                :          * ever recurse more than one level deep here.
                               1552                 :                :          */
 2118 alvherre@alvh.no-ip.     1553         [ -  + ]:            185 :         Assert(rbtxn_is_known_subxact(subtxn));
 4257 rhaas@postgresql.org     1554         [ -  + ]:            185 :         Assert(subtxn->nsubtxns == 0);
                               1555                 :                : 
                               1556                 :            185 :         ReorderBufferCleanupTXN(rb, subtxn);
                               1557                 :                :     }
                               1558                 :                : 
                               1559                 :                :     /* cleanup changes in the txn */
                               1560   [ +  -  +  + ]:          71604 :     dlist_foreach_modify(iter, &txn->changes)
                               1561                 :                :     {
                               1562                 :                :         ReorderBufferChange *change;
                               1563                 :                : 
                               1564                 :          67699 :         change = dlist_container(ReorderBufferChange, node, iter.cur);
                               1565                 :                : 
                               1566                 :                :         /* Check we're not mixing changes from different transactions. */
 2173 akapila@postgresql.o     1567         [ -  + ]:          67699 :         Assert(change->txn == txn);
                               1568                 :                : 
                               1569                 :                :         /*
                               1570                 :                :          * Instead of updating the memory counter for individual changes, we
                               1571                 :                :          * sum up the size of memory to free so we can update the memory
                               1572                 :                :          * counter all together below. This saves the cost of maintaining the
                               1573                 :                :          * max-heap.
                               1574                 :                :          */
  428 msawada@postgresql.o     1575                 :          67699 :         mem_freed += ReorderBufferChangeSize(change);
                               1576                 :                : 
  230 heikki.linnakangas@i     1577                 :          67699 :         ReorderBufferFreeChange(rb, change, false);
                               1578                 :                :     }
                               1579                 :                : 
                               1580                 :                :     /* Update the memory counter */
  428 msawada@postgresql.o     1581                 :           3905 :     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, mem_freed);
                               1582                 :                : 
                               1583                 :                :     /*
                               1584                 :                :      * Cleanup the tuplecids we stored for decoding catalog snapshot access.
                               1585                 :                :      * They are always stored in the toplevel transaction.
                               1586                 :                :      */
 4257 rhaas@postgresql.org     1587   [ +  -  +  + ]:          28116 :     dlist_foreach_modify(iter, &txn->tuplecids)
                               1588                 :                :     {
                               1589                 :                :         ReorderBufferChange *change;
                               1590                 :                : 
                               1591                 :          24211 :         change = dlist_container(ReorderBufferChange, node, iter.cur);
                               1592                 :                : 
                               1593                 :                :         /* Check we're not mixing changes from different transactions. */
 2173 akapila@postgresql.o     1594         [ -  + ]:          24211 :         Assert(change->txn == txn);
 4253 tgl@sss.pgh.pa.us        1595         [ -  + ]:          24211 :         Assert(change->action == REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID);
                               1596                 :                : 
  230 heikki.linnakangas@i     1597                 :          24211 :         ReorderBufferFreeChange(rb, change, true);
                               1598                 :                :     }
                               1599                 :                : 
                               1600                 :                :     /*
                               1601                 :                :      * Cleanup the base snapshot, if set.
                               1602                 :                :      */
 4257 rhaas@postgresql.org     1603         [ +  + ]:           3905 :     if (txn->base_snapshot != NULL)
                               1604                 :                :     {
                               1605                 :           3235 :         SnapBuildSnapDecRefcount(txn->base_snapshot);
 2681 alvherre@alvh.no-ip.     1606                 :           3235 :         dlist_delete(&txn->base_snapshot_node);
                               1607                 :                :     }
                               1608                 :                : 
                               1609                 :                :     /*
                               1610                 :                :      * Cleanup the snapshot for the last streamed run.
                               1611                 :                :      */
 1907 akapila@postgresql.o     1612         [ +  + ]:           3905 :     if (txn->snapshot_now != NULL)
                               1613                 :                :     {
                               1614         [ -  + ]:             66 :         Assert(rbtxn_is_streamed(txn));
                               1615                 :             66 :         ReorderBufferFreeSnap(rb, txn->snapshot_now);
                               1616                 :                :     }
                               1617                 :                : 
                               1618                 :                :     /*
                               1619                 :                :      * Remove TXN from its containing lists.
                               1620                 :                :      *
                               1621                 :                :      * Note: if txn is known as subxact, we are deleting the TXN from its
                               1622                 :                :      * parent's list of known subxacts; this leaves the parent's nsubtxns
                               1623                 :                :      * count too high, but we don't care.  Otherwise, we are deleting the TXN
                               1624                 :                :      * from the LSN-ordered list of toplevel TXNs. We remove the TXN from the
                               1625                 :                :      * list of catalog modifying transactions as well.
                               1626                 :                :      */
 3341 tgl@sss.pgh.pa.us        1627                 :           3905 :     dlist_delete(&txn->node);
 1174 akapila@postgresql.o     1628         [ +  + ]:           3905 :     if (rbtxn_has_catalog_changes(txn))
 1091 drowley@postgresql.o     1629                 :           1307 :         dclist_delete_from(&rb->catchange_txns, &txn->catchange_node);
                               1630                 :                : 
                               1631                 :                :     /* now remove reference from buffer */
  893 tgl@sss.pgh.pa.us        1632                 :           3905 :     hash_search(rb->by_txn, &txn->xid, HASH_REMOVE, &found);
 4257 rhaas@postgresql.org     1633         [ -  + ]:           3905 :     Assert(found);
                               1634                 :                : 
                               1635                 :                :     /* remove entries spilled to disk */
 2118 alvherre@alvh.no-ip.     1636         [ +  + ]:           3905 :     if (rbtxn_is_serialized(txn))
 4257 rhaas@postgresql.org     1637                 :            244 :         ReorderBufferRestoreCleanup(rb, txn);
                               1638                 :                : 
                               1639                 :                :     /* deallocate */
  230 heikki.linnakangas@i     1640                 :           3905 :     ReorderBufferFreeTXN(rb, txn);
 4257 rhaas@postgresql.org     1641                 :           3905 : }
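
The cleanup loop above sums the size of each freed change into mem_freed and adjusts the memory counter once afterwards, rather than once per change. The following is a minimal standalone sketch of that batching idea only. ToyChange, ToyTxn, toy_add_change and toy_free_changes are hypothetical stand-ins invented for illustration, not PostgreSQL types or functions, and the max-heap that the real counter update maintains is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL types. */
typedef struct ToyChange
{
    struct ToyChange *next;
    size_t      size;           /* bytes attributed to this change */
} ToyChange;

typedef struct ToyTxn
{
    ToyChange  *changes;        /* singly linked list of buffered changes */
    size_t      size;           /* memory currently accounted to the txn */
    int         counter_updates;    /* how often the counter was adjusted */
} ToyTxn;

/* Prepend a change and account for its size right away. */
static void
toy_add_change(ToyTxn *txn, size_t size)
{
    ToyChange  *c = malloc(sizeof(ToyChange));

    assert(c != NULL);
    c->size = size;
    c->next = txn->changes;
    txn->changes = c;
    txn->size += size;
}

/*
 * Free every change, summing the sizes as we go, and adjust the
 * accounting counter a single time at the end. Returns the bytes released.
 */
static size_t
toy_free_changes(ToyTxn *txn)
{
    size_t      mem_freed = 0;
    ToyChange  *c = txn->changes;

    while (c != NULL)
    {
        ToyChange  *next = c->next; /* read the link before freeing */

        mem_freed += c->size;
        free(c);
        c = next;
    }
    txn->changes = NULL;

    /* One counter update for the whole batch. */
    assert(mem_freed <= txn->size);
    txn->size -= mem_freed;
    txn->counter_updates++;

    return mem_freed;
}
```

In this sketch the counter is touched once regardless of how many changes are freed. That matters in the real code only because each counter update there may also have to reposition the transaction in a size-ordered heap.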
                               1642                 :                : 
                               1643                 :                : /*
                               1644                 :                :  * Discard changes from a transaction (and subtransactions), either after
                               1645                 :                :  * streaming, decoding them at PREPARE, or detecting the transaction abort.
                               1646                 :                :  * Keep the remaining info - transactions, tuplecids, invalidations and
                               1647                 :                :  * snapshots.
                               1648                 :                :  *
                               1649                 :                :  * We additionally remove tuplecids after decoding the transaction at prepare
                               1650                 :                :  * time as we only need to perform invalidation at rollback or commit prepared.
                               1651                 :                :  *
                               1652                 :                :  * 'txn_prepared' indicates that we have decoded the transaction at prepare
                               1653                 :                :  * time.
                               1654                 :                :  */
                               1655                 :                : static void
 1758 akapila@postgresql.o     1656                 :           1076 : ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, bool txn_prepared)
                               1657                 :                : {
                               1658                 :                :     dlist_mutable_iter iter;
  428 msawada@postgresql.o     1659                 :           1076 :     Size        mem_freed = 0;
                               1660                 :                : 
                               1661                 :                :     /* cleanup subtransactions & their changes */
 1907 akapila@postgresql.o     1662   [ +  -  +  + ]:           1373 :     dlist_foreach_modify(iter, &txn->subtxns)
                               1663                 :                :     {
                               1664                 :                :         ReorderBufferTXN *subtxn;
                               1665                 :                : 
                               1666                 :            297 :         subtxn = dlist_container(ReorderBufferTXN, node, iter.cur);
                               1667                 :                : 
                               1668                 :                :         /*
                               1669                 :                :          * Subtransactions are always associated with the toplevel TXN, even if
                               1670                 :                :          * they originally were happening inside another subtxn, so we won't
                               1671                 :                :          * ever recurse more than one level deep here.
                               1672                 :                :          */
                               1673         [ -  + ]:            297 :         Assert(rbtxn_is_known_subxact(subtxn));
                               1674         [ -  + ]:            297 :         Assert(subtxn->nsubtxns == 0);
                               1675                 :                : 
  258 msawada@postgresql.o     1676                 :            297 :         ReorderBufferMaybeMarkTXNStreamed(rb, subtxn);
 1758 akapila@postgresql.o     1677                 :            297 :         ReorderBufferTruncateTXN(rb, subtxn, txn_prepared);
                               1678                 :                :     }
                               1679                 :                : 
                               1680                 :                :     /* cleanup changes in the txn */
 1907                          1681   [ +  -  +  + ]:         163966 :     dlist_foreach_modify(iter, &txn->changes)
                               1682                 :                :     {
                               1683                 :                :         ReorderBufferChange *change;
                               1684                 :                : 
                               1685                 :         162890 :         change = dlist_container(ReorderBufferChange, node, iter.cur);
                               1686                 :                : 
                               1687                 :                :         /* Check we're not mixing changes from different transactions. */
                               1688         [ -  + ]:         162890 :         Assert(change->txn == txn);
                               1689                 :                : 
                               1690                 :                :         /* remove the change from its containing list */
                               1691                 :         162890 :         dlist_delete(&change->node);
                               1692                 :                : 
                               1693                 :                :         /*
                               1694                 :                :          * Instead of updating the memory counter for individual changes, we
                               1695                 :                :          * sum up the size of memory to free so we can update the memory
                               1696                 :                :          * counter all together below. This saves the cost of maintaining the
                               1697                 :                :          * max-heap.
                               1698                 :                :          */
  428 msawada@postgresql.o     1699                 :         162890 :         mem_freed += ReorderBufferChangeSize(change);
                               1700                 :                : 
  230 heikki.linnakangas@i     1701                 :         162890 :         ReorderBufferFreeChange(rb, change, false);
                               1702                 :                :     }
                               1703                 :                : 
                               1704                 :                :     /* Update the memory counter */
  428 msawada@postgresql.o     1705                 :           1076 :     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, mem_freed);
                               1706                 :                : 
 1758 akapila@postgresql.o     1707         [ +  + ]:           1076 :     if (txn_prepared)
                               1708                 :                :     {
                               1709                 :                :         /*
                               1710                 :                :          * If this is a prepared txn, cleanup the tuplecids we stored for
                               1711                 :                :          * decoding catalog snapshot access. They are always stored in the
                               1712                 :                :          * toplevel transaction.
                               1713                 :                :          */
                               1714   [ +  -  +  + ]:            185 :         dlist_foreach_modify(iter, &txn->tuplecids)
                               1715                 :                :         {
                               1716                 :                :             ReorderBufferChange *change;
                               1717                 :                : 
                               1718                 :            123 :             change = dlist_container(ReorderBufferChange, node, iter.cur);
                               1719                 :                : 
                               1720                 :                :             /* Check we're not mixing changes from different transactions. */
                               1721         [ -  + ]:            123 :             Assert(change->txn == txn);
                               1722         [ -  + ]:            123 :             Assert(change->action == REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID);
                               1723                 :                : 
                               1724                 :                :             /* Remove the change from its containing list. */
                               1725                 :            123 :             dlist_delete(&change->node);
                               1726                 :                : 
  230 heikki.linnakangas@i     1727                 :            123 :             ReorderBufferFreeChange(rb, change, true);
                               1728                 :                :         }
                               1729                 :                :     }
                               1730                 :                : 
                               1731                 :                :     /*
                               1732                 :                :      * Destroy the (relfilelocator, ctid) hashtable, so that we don't leak any
                               1733                 :                :      * memory. We could also keep the hash table and update it with new ctid
                               1734                 :                :      * values, but this seems simpler and good enough for now.
                               1735                 :                :      */
 1907 akapila@postgresql.o     1736         [ +  + ]:           1076 :     if (txn->tuplecid_hash != NULL)
                               1737                 :                :     {
                               1738                 :             51 :         hash_destroy(txn->tuplecid_hash);
                               1739                 :             51 :         txn->tuplecid_hash = NULL;
                               1740                 :                :     }
                               1741                 :                : 
                               1742                 :                :     /* If this txn is serialized then clean the disk space. */
                               1743         [ +  + ]:           1076 :     if (rbtxn_is_serialized(txn))
                               1744                 :                :     {
                               1745                 :              8 :         ReorderBufferRestoreCleanup(rb, txn);
                               1746                 :              8 :         txn->txn_flags &= ~RBTXN_IS_SERIALIZED;
                               1747                 :                : 
                               1748                 :                :         /*
                               1749                 :                :          * This flag records that the transaction was serialized at some point.
                               1750                 :                :          * We need this to accurately update the stats as otherwise the same
                               1751                 :                :          * transaction can be counted as serialized multiple times.
                               1752                 :                :          */
 1846                          1753                 :              8 :         txn->txn_flags |= RBTXN_IS_SERIALIZED_CLEAR;
                               1754                 :                :     }
                               1755                 :                : 
                               1756                 :                :     /* also reset the number of entries in the transaction */
 1907                          1757                 :           1076 :     txn->nentries_mem = 0;
                               1758                 :           1076 :     txn->nentries = 0;
                               1759                 :           1076 : }
                               1760                 :                : 
                               1761                 :                : /*
                               1762                 :                :  * Check the transaction status by CLOG lookup and discard all changes if
                               1763                 :                :  * the transaction is aborted. The transaction status is cached in
                               1764                 :                :  * txn->txn_flags so we can skip future changes and avoid CLOG lookups on the
                               1765                 :                :  * next call.
                               1766                 :                :  *
                               1767                 :                :  * Return true if the transaction is aborted, otherwise return false.
                               1768                 :                :  *
                               1769                 :                :  * When 'debug_logical_replication_streaming' is set to "immediate", we
                               1770                 :                :  * don't check the transaction status, meaning the caller will always process
                               1771                 :                :  * this transaction.
                               1772                 :                :  */
                               1773                 :                : static bool
  258 msawada@postgresql.o     1774                 :           3974 : ReorderBufferCheckAndTruncateAbortedTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               1775                 :                : {
                               1776                 :                :     /* Quick return for regression tests */
                               1777         [ +  + ]:           3974 :     if (unlikely(debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE))
                               1778                 :            962 :         return false;
                               1779                 :                : 
                               1780                 :                :     /*
                               1781                 :                :      * Quick return if the transaction status is already known.
                               1782                 :                :      */
                               1783                 :                : 
                               1784         [ +  + ]:           3012 :     if (rbtxn_is_committed(txn))
                               1785                 :           2557 :         return false;
                               1786         [ -  + ]:            455 :     if (rbtxn_is_aborted(txn))
                               1787                 :                :     {
                               1788                 :                :         /* Already-aborted transactions should not have any changes */
  258 msawada@postgresql.o     1789         [ #  # ]:UBC           0 :         Assert(txn->size == 0);
                               1790                 :                : 
                               1791                 :              0 :         return true;
                               1792                 :                :     }
                               1793                 :                : 
                               1794                 :                :     /* Otherwise, check the transaction status using CLOG lookup */
                               1795                 :                : 
  258 msawada@postgresql.o     1796         [ +  + ]:CBC         455 :     if (TransactionIdIsInProgress(txn->xid))
                               1797                 :            245 :         return false;
                               1798                 :                : 
                               1799         [ +  + ]:            210 :     if (TransactionIdDidCommit(txn->xid))
                               1800                 :                :     {
                               1801                 :                :         /*
                               1802                 :                :          * Remember the transaction is committed so that we can skip the CLOG
                               1803                 :                :          * check next time, reducing pressure on CLOG lookups.
                               1804                 :                :          */
                               1805         [ -  + ]:            201 :         Assert(!rbtxn_is_aborted(txn));
                               1806                 :            201 :         txn->txn_flags |= RBTXN_IS_COMMITTED;
                               1807                 :            201 :         return false;
                               1808                 :                :     }
                               1809                 :                : 
                               1810                 :                :     /*
                               1811                 :                :      * The transaction aborted. We discard both the changes collected so far
                               1812                 :                :      * and the toast reconstruction data. The full cleanup will happen as part
                               1813                 :                :      * of decoding the ABORT record of this transaction.
                               1814                 :                :      */
                               1815                 :              9 :     ReorderBufferTruncateTXN(rb, txn, rbtxn_is_prepared(txn));
                               1816                 :              9 :     ReorderBufferToastReset(rb, txn);
                               1817                 :                : 
                               1818                 :                :     /* All changes should be discarded */
                               1819         [ -  + ]:              9 :     Assert(txn->size == 0);
                               1820                 :                : 
                               1821                 :                :     /*
                               1822                 :                :      * Mark the transaction as aborted so we can ignore future changes of this
                               1823                 :                :      * transaction.
                               1824                 :                :      */
                               1825         [ -  + ]:              9 :     Assert(!rbtxn_is_committed(txn));
                               1826                 :              9 :     txn->txn_flags |= RBTXN_IS_ABORTED;
                               1827                 :                : 
                               1828                 :              9 :     return true;
                               1829                 :                : }
                               1830                 :                : 
                               1831                 :                : /*
                               1832                 :                :  * Build a hash with a (relfilelocator, ctid) -> (cmin, cmax) mapping for use by
                               1833                 :                :  * HeapTupleSatisfiesHistoricMVCC.
                               1834                 :                :  */
                               1835                 :                : static void
 4257 rhaas@postgresql.org     1836                 :           2182 : ReorderBufferBuildTupleCidHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               1837                 :                : {
                               1838                 :                :     dlist_iter  iter;
                               1839                 :                :     HASHCTL     hash_ctl;
                               1840                 :                : 
 2118 alvherre@alvh.no-ip.     1841   [ +  +  +  + ]:           2182 :     if (!rbtxn_has_catalog_changes(txn) || dlist_is_empty(&txn->tuplecids))
 4257 rhaas@postgresql.org     1842                 :           1455 :         return;
                               1843                 :                : 
                               1844                 :            727 :     hash_ctl.keysize = sizeof(ReorderBufferTupleCidKey);
                               1845                 :            727 :     hash_ctl.entrysize = sizeof(ReorderBufferTupleCidEnt);
                               1846                 :            727 :     hash_ctl.hcxt = rb->context;
                               1847                 :                : 
                               1848                 :                :     /*
                               1849                 :                :      * create the hash with the exact number of to-be-stored tuplecids from
                               1850                 :                :      * the start
                               1851                 :                :      */
                               1852                 :            727 :     txn->tuplecid_hash =
                               1853                 :            727 :         hash_create("ReorderBufferTupleCid", txn->ntuplecids, &hash_ctl,
                               1854                 :                :                     HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
                               1855                 :                : 
                               1856   [ +  -  +  + ]:          12904 :     dlist_foreach(iter, &txn->tuplecids)
                               1857                 :                :     {
                               1858                 :                :         ReorderBufferTupleCidKey key;
                               1859                 :                :         ReorderBufferTupleCidEnt *ent;
                               1860                 :                :         bool        found;
                               1861                 :                :         ReorderBufferChange *change;
                               1862                 :                : 
                               1863                 :          12177 :         change = dlist_container(ReorderBufferChange, node, iter.cur);
                               1864                 :                : 
 4253 tgl@sss.pgh.pa.us        1865         [ -  + ]:          12177 :         Assert(change->action == REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID);
                               1866                 :                : 
                               1867                 :                :         /* be careful about padding */
 4257 rhaas@postgresql.org     1868                 :          12177 :         memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
                               1869                 :                : 
 1210                          1870                 :          12177 :         key.rlocator = change->data.tuplecid.locator;
                               1871                 :                : 
 4253 tgl@sss.pgh.pa.us        1872                 :          12177 :         ItemPointerCopy(&change->data.tuplecid.tid,
                               1873                 :                :                         &key.tid);
                               1874                 :                : 
                               1875                 :                :         ent = (ReorderBufferTupleCidEnt *)
  995 peter@eisentraut.org     1876                 :          12177 :             hash_search(txn->tuplecid_hash, &key, HASH_ENTER, &found);
 4257 rhaas@postgresql.org     1877         [ +  + ]:          12177 :         if (!found)
                               1878                 :                :         {
 4253 tgl@sss.pgh.pa.us        1879                 :          10540 :             ent->cmin = change->data.tuplecid.cmin;
                               1880                 :          10540 :             ent->cmax = change->data.tuplecid.cmax;
                               1881                 :          10540 :             ent->combocid = change->data.tuplecid.combocid;
                               1882                 :                :         }
                               1883                 :                :         else
                               1884                 :                :         {
                               1885                 :                :             /*
                               1886                 :                :              * Maybe we already saw this tuple before in this transaction, but
                               1887                 :                :              * if so it must have the same cmin.
                               1888                 :                :              */
                               1889         [ -  + ]:           1637 :             Assert(ent->cmin == change->data.tuplecid.cmin);
                               1890                 :                : 
                               1891                 :                :             /*
                               1892                 :                :              * cmax may be initially invalid, but once set it can only grow,
                               1893                 :                :              * and never become invalid again.
                               1894                 :                :              */
 2450 alvherre@alvh.no-ip.     1895   [ +  +  +  -  :           1637 :             Assert((ent->cmax == InvalidCommandId) ||
                                              -  + ]
                               1896                 :                :                    ((change->data.tuplecid.cmax != InvalidCommandId) &&
                               1897                 :                :                     (change->data.tuplecid.cmax > ent->cmax)));
 4253 tgl@sss.pgh.pa.us        1898                 :           1637 :             ent->cmax = change->data.tuplecid.cmax;
                               1899                 :                :         }
                               1900                 :                :     }
                               1901                 :                : }
                               1902                 :                : 
                               1903                 :                : /*
                               1904                 :                :  * Copy a provided snapshot so we can modify it privately. This is needed so
                               1905                 :                :  * that catalog modifying transactions can look into intermediate catalog
                               1906                 :                :  * states.
                               1907                 :                :  */
                               1908                 :                : static Snapshot
 4257 rhaas@postgresql.org     1909                 :           2086 : ReorderBufferCopySnap(ReorderBuffer *rb, Snapshot orig_snap,
                               1910                 :                :                       ReorderBufferTXN *txn, CommandId cid)
                               1911                 :                : {
                               1912                 :                :     Snapshot    snap;
                               1913                 :                :     dlist_iter  iter;
                               1914                 :           2086 :     int         i = 0;
                               1915                 :                :     Size        size;
                               1916                 :                : 
                               1917                 :           2086 :     size = sizeof(SnapshotData) +
                               1918                 :           2086 :         sizeof(TransactionId) * orig_snap->xcnt +
                               1919                 :           2086 :         sizeof(TransactionId) * (txn->nsubtxns + 1);
                               1920                 :                : 
                               1921                 :           2086 :     snap = MemoryContextAllocZero(rb->context, size);
                               1922                 :           2086 :     memcpy(snap, orig_snap, sizeof(SnapshotData));
                               1923                 :                : 
                               1924                 :           2086 :     snap->copied = true;
 3848 heikki.linnakangas@i     1925                 :           2086 :     snap->active_count = 1;      /* mark as active so nobody frees it */
                               1926                 :           2086 :     snap->regd_count = 0;
 4257 rhaas@postgresql.org     1927                 :           2086 :     snap->xip = (TransactionId *) (snap + 1);
                               1928                 :                : 
                               1929                 :           2086 :     memcpy(snap->xip, orig_snap->xip, sizeof(TransactionId) * snap->xcnt);
                               1930                 :                : 
                               1931                 :                :     /*
                               1932                 :                :      * snap->subxip contains all txids that belong to our transaction which we
                               1933                 :                :      * need to check via cmin/cmax. That's why we store the toplevel
                               1934                 :                :      * transaction in there as well.
                               1935                 :                :      */
                               1936                 :           2086 :     snap->subxip = snap->xip + snap->xcnt;
                               1937                 :           2086 :     snap->subxip[i++] = txn->xid;
                               1938                 :                : 
                               1939                 :                :     /*
                               1940                 :                :      * txn->nsubtxns isn't decreased when subtransactions abort, so count
                               1941                 :                :      * manually. Since it's an upper bound, it is safe to use it for the
                               1942                 :                :      * allocation above.
                               1943                 :                :      */
                               1944                 :           2086 :     snap->subxcnt = 1;
                               1945                 :                : 
                               1946   [ +  -  +  + ]:           2395 :     dlist_foreach(iter, &txn->subtxns)
                               1947                 :                :     {
                               1948                 :                :         ReorderBufferTXN *sub_txn;
                               1949                 :                : 
                               1950                 :            309 :         sub_txn = dlist_container(ReorderBufferTXN, node, iter.cur);
                               1951                 :            309 :         snap->subxip[i++] = sub_txn->xid;
                               1952                 :            309 :         snap->subxcnt++;
                               1953                 :                :     }
                               1954                 :                : 
                               1955                 :                :     /* sort so we can bsearch() later */
                               1956                 :           2086 :     qsort(snap->subxip, snap->subxcnt, sizeof(TransactionId), xidComparator);
                               1957                 :                : 
                               1958                 :                :     /* store the specified current CommandId */
                               1959                 :           2086 :     snap->curcid = cid;
                               1960                 :                : 
                               1961                 :           2086 :     return snap;
                               1962                 :                : }
                               1963                 :                : 
                               1964                 :                : /*
                               1965                 :                :  * Free a previously ReorderBufferCopySnap'ed snapshot
                               1966                 :                :  */
                               1967                 :                : static void
                               1968                 :           3357 : ReorderBufferFreeSnap(ReorderBuffer *rb, Snapshot snap)
                               1969                 :                : {
                               1970         [ +  + ]:           3357 :     if (snap->copied)
                               1971                 :           2082 :         pfree(snap);
                               1972                 :                :     else
                               1973                 :           1275 :         SnapBuildSnapDecRefcount(snap);
                               1974                 :           3357 : }
                               1975                 :                : 
                               1976                 :                : /*
                               1977                 :                :  * If the transaction was (partially) streamed, we need to prepare or commit
                               1978                 :                :  * it in a 'streamed' way.  That is, we first stream the remaining part of the
                               1979                 :                :  * transaction, and then invoke the stream_prepare or stream_commit callback,
                               1980                 :                :  * as appropriate.
                               1981                 :                :  */
                               1982                 :                : static void
 1907 akapila@postgresql.o     1983                 :             66 : ReorderBufferStreamCommit(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               1984                 :                : {
                               1985                 :                :     /* we should only call this for previously streamed transactions */
                               1986         [ -  + ]:             66 :     Assert(rbtxn_is_streamed(txn));
                               1987                 :                : 
                               1988                 :             66 :     ReorderBufferStreamTXN(rb, txn);
                               1989                 :                : 
  258 msawada@postgresql.o     1990         [ +  + ]:             66 :     if (rbtxn_is_prepared(txn))
                               1991                 :                :     {
                               1992                 :                :         /*
                               1993                 :                :          * Note, we send stream prepare even if a concurrent abort is
                               1994                 :                :          * detected. See DecodePrepare for more information.
                               1995                 :                :          */
                               1996         [ -  + ]:             15 :         Assert(!rbtxn_sent_prepare(txn));
 1758 akapila@postgresql.o     1997                 :             15 :         rb->stream_prepare(rb, txn, txn->final_lsn);
  258 msawada@postgresql.o     1998                 :             15 :         txn->txn_flags |= RBTXN_SENT_PREPARE;
                               1999                 :                : 
                               2000                 :                :         /*
                               2001                 :                :          * This is a PREPARED transaction, part of a two-phase commit. The
                               2002                 :                :          * full cleanup will happen as part of the COMMIT PREPAREDs, so now
                               2003                 :                :          * just truncate txn by removing changes and tuplecids.
                               2004                 :                :          */
 1758 akapila@postgresql.o     2005                 :             15 :         ReorderBufferTruncateTXN(rb, txn, true);
                               2006                 :                :         /* Reset the CheckXidAlive */
                               2007                 :             15 :         CheckXidAlive = InvalidTransactionId;
                               2008                 :                :     }
                               2009                 :                :     else
                               2010                 :                :     {
                               2011                 :             51 :         rb->stream_commit(rb, txn, txn->final_lsn);
                               2012                 :             51 :         ReorderBufferCleanupTXN(rb, txn);
                               2013                 :                :     }
 1907                          2014                 :             66 : }
                               2015                 :                : 
                               2016                 :                : /*
                               2017                 :                :  * Set xid to detect concurrent aborts.
                               2018                 :                :  *
                               2019                 :                :  * While streaming an in-progress transaction or decoding a prepared
                               2020                 :                :  * transaction there is a possibility that the (sub)transaction might get
                               2021                 :                :  * aborted concurrently.  In that case, if the (sub)transaction has catalog
                               2022                 :                :  * updates, we might decode the tuple using the wrong catalog version.  For
                               2023                 :                :  * example, suppose there is one catalog tuple with (xmin: 500, xmax: 0).  Now,
                               2024                 :                :  * the transaction 501 updates the catalog tuple and after that we will have
                               2025                 :                :  * two tuples (xmin: 500, xmax: 501) and (xmin: 501, xmax: 0).  Now, if 501 is
                               2026                 :                :  * aborted and some other transaction say 502 updates the same catalog tuple
                               2027                 :                :  * then the first tuple will be changed to (xmin: 500, xmax: 502).  So, the
                               2028                 :                :  * problem is that when we try to decode the tuple inserted/updated in 501
                               2029                 :                :  * after the catalog update, we will see the catalog tuple with (xmin: 500,
                               2030                 :                :  * xmax: 502) as visible, because that tuple is considered deleted by xid
                               2031                 :                :  * 502, which is not visible to our snapshot.  When we then try to decode
                               2032                 :                :  * with that catalog tuple, it can lead to a wrong result or a crash.
                               2033                 :                :  * So, it is necessary to detect concurrent aborts to allow streaming of
                               2034                 :                :  * in-progress transactions or decoding of prepared transactions.
                               2035                 :                :  *
                               2036                 :                :  * To detect a concurrent abort, we set CheckXidAlive to the xid of the
                               2037                 :                :  * (sub)transaction to which the current change belongs.  During a catalog
                               2038                 :                :  * scan we check the status of that xid and, if it has aborted, report a
                               2039                 :                :  * specific error so that we can stop streaming the current transaction
                               2040                 :                :  * and discard the already streamed changes on such an error.  We might have
                               2041                 :                :  * already streamed some of the changes for the aborted (sub)transaction, but
                               2042                 :                :  * that is fine because when we decode the abort we will stream an abort
                               2043                 :                :  * message to truncate the changes on the subscriber.  Similarly, for prepared
                               2044                 :                :  * transactions, we stop decoding if a concurrent abort is detected and then
                               2045                 :                :  * roll back the changes when ROLLBACK PREPARED is encountered.  See
                               2046                 :                :  * DecodePrepare.
                               2047                 :                :  */
                               2048                 :                : static inline void
                               2049                 :         177874 : SetupCheckXidLive(TransactionId xid)
                               2050                 :                : {
                               2051                 :                :     /*
                               2052                 :                :      * If CheckXidAlive is already set to the input transaction id, there is
                               2053                 :                :      * nothing to do.
                               2054                 :                :      */
                               2055         [ +  + ]:         177874 :     if (TransactionIdEquals(CheckXidAlive, xid))
 4257 rhaas@postgresql.org     2056                 :          98900 :         return;
                               2057                 :                : 
                               2058                 :                :     /*
                               2059                 :                :      * Set up CheckXidAlive if the xid has not committed yet.  We don't check
                               2060                 :                :      * whether the xid has aborted; that happens during catalog access.
                               2061                 :                :      */
 1907 akapila@postgresql.o     2062         [ +  + ]:          78974 :     if (!TransactionIdDidCommit(xid))
                               2063                 :            411 :         CheckXidAlive = xid;
                               2064                 :                :     else
                               2065                 :          78563 :         CheckXidAlive = InvalidTransactionId;
                               2066                 :                : }
                               2067                 :                : 
                               2068                 :                : /*
                               2069                 :                :  * Helper function for ReorderBufferProcessTXN for applying a change.
                               2070                 :                :  */
                               2071                 :                : static inline void
                               2072                 :         334084 : ReorderBufferApplyChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               2073                 :                :                          Relation relation, ReorderBufferChange *change,
                               2074                 :                :                          bool streaming)
                               2075                 :                : {
                               2076         [ +  + ]:         334084 :     if (streaming)
                               2077                 :         176006 :         rb->stream_change(rb, txn, relation, change);
                               2078                 :                :     else
                               2079                 :         158078 :         rb->apply_change(rb, txn, relation, change);
                               2080                 :         334081 : }
                               2081                 :                : 
                               2082                 :                : /*
                               2083                 :                :  * Helper function for ReorderBufferProcessTXN for applying the truncate.
                               2084                 :                :  */
                               2085                 :                : static inline void
                               2086                 :             22 : ReorderBufferApplyTruncate(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               2087                 :                :                            int nrelations, Relation *relations,
                               2088                 :                :                            ReorderBufferChange *change, bool streaming)
                               2089                 :                : {
                               2090         [ -  + ]:             22 :     if (streaming)
 1907 akapila@postgresql.o     2091                 :UBC           0 :         rb->stream_truncate(rb, txn, nrelations, relations, change);
                               2092                 :                :     else
 1907 akapila@postgresql.o     2093                 :CBC          22 :         rb->apply_truncate(rb, txn, nrelations, relations, change);
                               2094                 :             22 : }
                               2095                 :                : 
                               2096                 :                : /*
                               2097                 :                :  * Helper function for ReorderBufferProcessTXN for applying the message.
                               2098                 :                :  */
                               2099                 :                : static inline void
                               2100                 :             11 : ReorderBufferApplyMessage(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               2101                 :                :                           ReorderBufferChange *change, bool streaming)
                               2102                 :                : {
                               2103         [ +  + ]:             11 :     if (streaming)
                               2104                 :              3 :         rb->stream_message(rb, txn, change->lsn, true,
                               2105                 :              3 :                            change->data.msg.prefix,
                               2106                 :                :                            change->data.msg.message_size,
                               2107                 :              3 :                            change->data.msg.message);
                               2108                 :                :     else
                               2109                 :              8 :         rb->message(rb, txn, change->lsn, true,
                               2110                 :              8 :                     change->data.msg.prefix,
                               2111                 :                :                     change->data.msg.message_size,
                               2112                 :              8 :                     change->data.msg.message);
                               2113                 :             11 : }
                               2114                 :                : 
                               2115                 :                : /*
                               2116                 :                :  * Store the command id and snapshot at the end of the current stream so
                               2117                 :                :  * that we can reuse them when sending the next stream.
                               2118                 :                :  */
                               2119                 :                : static inline void
                               2120                 :            725 : ReorderBufferSaveTXNSnapshot(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               2121                 :                :                              Snapshot snapshot_now, CommandId command_id)
                               2122                 :                : {
                               2123                 :            725 :     txn->command_id = command_id;
                               2124                 :                : 
                               2125                 :                :     /* Avoid copying if it's already copied. */
                               2126         [ +  - ]:            725 :     if (snapshot_now->copied)
                               2127                 :            725 :         txn->snapshot_now = snapshot_now;
                               2128                 :                :     else
 1907 akapila@postgresql.o     2129                 :UBC           0 :         txn->snapshot_now = ReorderBufferCopySnap(rb, snapshot_now,
                               2130                 :                :                                                   txn, command_id);
 1907 akapila@postgresql.o     2131                 :CBC         725 : }
                               2132                 :                : 
                               2133                 :                : /*
                               2134                 :                :  * Mark the given transaction as streamed if it's a top-level transaction
                               2135                 :                :  * or has changes.
                               2136                 :                :  */
                               2137                 :                : static void
  258 msawada@postgresql.o     2138                 :           1022 : ReorderBufferMaybeMarkTXNStreamed(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               2139                 :                : {
                               2140                 :                :     /*
                               2141                 :                :      * The top-level transaction is always marked as streamed, even if it
                               2142                 :                :      * does not contain any changes (that is, when all the changes are in
                               2143                 :                :      * subtransactions).
                               2144                 :                :      *
                               2145                 :                :      * For subtransactions, we only mark them as streamed when there are
                               2146                 :                :      * changes in them.
                               2147                 :                :      *
                               2148                 :                :      * We do it this way because of aborts - we don't want to send aborts for
                               2149                 :                :      * XIDs the downstream is not aware of. And of course, it always knows
                               2150                 :                :      * about the top-level xact (we send the XID in all messages), but we
                               2151                 :                :      * never stream XIDs of empty subxacts.
                               2152                 :                :      */
                               2153   [ +  +  +  + ]:           1022 :     if (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0))
                               2154                 :            860 :         txn->txn_flags |= RBTXN_IS_STREAMED;
                               2155                 :           1022 : }
                               2156                 :                : 
                               2157                 :                : /*
                               2158                 :                :  * Helper function for ReorderBufferProcessTXN to handle the concurrent
                               2159                 :                :  * abort of the streaming transaction.  This resets the TXN such that it
                               2160                 :                :  * can be used to stream the remaining data of the transaction being processed.
                               2161                 :                :  * This can happen when a subtransaction is aborted and we still want to
                               2162                 :                :  * continue processing the data of the main or other subtransactions.
                               2163                 :                :  */
                               2164                 :                : static void
 1907 akapila@postgresql.o     2165                 :              8 : ReorderBufferResetTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               2166                 :                :                       Snapshot snapshot_now,
                               2167                 :                :                       CommandId command_id,
                               2168                 :                :                       XLogRecPtr last_lsn,
                               2169                 :                :                       ReorderBufferChange *specinsert)
                               2170                 :                : {
                               2171                 :                :     /* Discard the changes that we just streamed */
  258 msawada@postgresql.o     2172                 :              8 :     ReorderBufferTruncateTXN(rb, txn, rbtxn_is_prepared(txn));
                               2173                 :                : 
                               2174                 :                :     /* Free all resources allocated for toast reconstruction */
 1907 akapila@postgresql.o     2175                 :              8 :     ReorderBufferToastReset(rb, txn);
                               2176                 :                : 
                               2177                 :                :     /* Free the speculative insertion change if it is not NULL */
                               2178         [ -  + ]:              8 :     if (specinsert != NULL)
                               2179                 :                :     {
  230 heikki.linnakangas@i     2180                 :UBC           0 :         ReorderBufferFreeChange(rb, specinsert, true);
 1907 akapila@postgresql.o     2181                 :              0 :         specinsert = NULL;
                               2182                 :                :     }
                               2183                 :                : 
                               2184                 :                :     /*
                               2185                 :                :      * For the streaming case, stop the stream and remember the command ID and
                               2186                 :                :      * snapshot for the streaming run.
                               2187                 :                :      */
 1758 akapila@postgresql.o     2188         [ +  - ]:CBC           8 :     if (rbtxn_is_streamed(txn))
                               2189                 :                :     {
                               2190                 :              8 :         rb->stream_stop(rb, txn, last_lsn);
                               2191                 :              8 :         ReorderBufferSaveTXNSnapshot(rb, txn, snapshot_now, command_id);
                               2192                 :                :     }
                               2193                 :                : 
                               2194                 :                :     /* All changes must be deallocated */
  428 msawada@postgresql.o     2195         [ -  + ]:              8 :     Assert(txn->size == 0);
 1907 akapila@postgresql.o     2196                 :              8 : }
                               2197                 :                : 
                               2198                 :                : /*
                               2199                 :                :  * Helper function for ReorderBufferReplay and ReorderBufferStreamTXN.
                               2200                 :                :  *
                               2201                 :                :  * Send data of a transaction (and its subtransactions) to the
                               2202                 :                :  * output plugin. We iterate over the top and subtransactions (using a k-way
                               2203                 :                :  * merge) and replay the changes in lsn order.
                               2204                 :                :  *
                               2205                 :                :  * If streaming is true, the data is sent using the stream API.
                               2206                 :                :  *
                               2207                 :                :  * Note: "volatile" markers on some parameters are to avoid trouble with
                               2208                 :                :  * PG_TRY inside the function.
                               2209                 :                :  */
                               2210                 :                : static void
                               2211                 :           2182 : ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               2212                 :                :                         XLogRecPtr commit_lsn,
                               2213                 :                :                         volatile Snapshot snapshot_now,
                               2214                 :                :                         volatile CommandId command_id,
                               2215                 :                :                         bool streaming)
                               2216                 :                : {
                               2217                 :                :     bool        using_subtxn;
                               2218                 :           2182 :     MemoryContext ccxt = CurrentMemoryContext;
   46 alvherre@kurilemu.de     2219                 :GNC        2182 :     ResourceOwner cowner = CurrentResourceOwner;
 1907 akapila@postgresql.o     2220                 :CBC        2182 :     ReorderBufferIterTXNState *volatile iterstate = NULL;
                               2221                 :           2182 :     volatile XLogRecPtr prev_lsn = InvalidXLogRecPtr;
                               2222                 :           2182 :     ReorderBufferChange *volatile specinsert = NULL;
                               2223                 :           2182 :     volatile bool stream_started = false;
                               2224                 :           2182 :     ReorderBufferTXN *volatile curtxn = NULL;
                               2225                 :                : 
                               2226                 :                :     /* build data to be able to look up the CommandIds of catalog tuples */
 4257 rhaas@postgresql.org     2227                 :           2182 :     ReorderBufferBuildTupleCidHash(rb, txn);
                               2228                 :                : 
                               2229                 :                :     /* set up the initial snapshot */
                               2230                 :           2182 :     SetupHistoricSnapshot(snapshot_now, txn->tuplecid_hash);
                               2231                 :                : 
                               2232                 :                :     /*
                               2233                 :                :      * Decoding needs access to syscaches et al., which in turn use
                               2234                 :                :      * heavyweight locks and such. Thus we need to have enough state around to
                               2235                 :                :      * keep track of those.  The easiest way is to simply use a transaction
                               2236                 :                :      * internally.  That also allows us to easily enforce that nothing writes
                               2237                 :                :      * to the database by checking for xid assignments.
                               2238                 :                :      *
                               2239                 :                :      * When we're called via the SQL SRF there's already a transaction
                               2240                 :                :      * started, so start an explicit subtransaction there.
                               2241                 :                :      */
 3929 tgl@sss.pgh.pa.us        2242                 :           2182 :     using_subtxn = IsTransactionOrTransactionBlock();
                               2243                 :                : 
 4257 rhaas@postgresql.org     2244         [ +  + ]:           2182 :     PG_TRY();
                               2245                 :                :     {
                               2246                 :                :         ReorderBufferChange *change;
  993 akapila@postgresql.o     2247                 :           2182 :         int         changes_count = 0;  /* used to accumulate the number of
                               2248                 :                :                                          * changes */
                               2249                 :                : 
 4002 andres@anarazel.de       2250         [ +  + ]:           2182 :         if (using_subtxn)
 1907 akapila@postgresql.o     2251         [ +  + ]:            485 :             BeginInternalSubTransaction(streaming ? "stream" : "replay");
                               2252                 :                :         else
 4257 rhaas@postgresql.org     2253                 :           1697 :             StartTransactionCommand();
                               2254                 :                : 
                               2255                 :                :         /*
                               2256                 :                :          * We only need to send begin/begin-prepare for non-streamed
                               2257                 :                :          * transactions.
                               2258                 :                :          */
 1907 akapila@postgresql.o     2259         [ +  + ]:           2182 :         if (!streaming)
                               2260                 :                :         {
  258 msawada@postgresql.o     2261         [ +  + ]:           1457 :             if (rbtxn_is_prepared(txn))
 1758 akapila@postgresql.o     2262                 :             30 :                 rb->begin_prepare(rb, txn);
                               2263                 :                :             else
                               2264                 :           1427 :                 rb->begin(rb, txn);
                               2265                 :                :         }
                               2266                 :                : 
 2145                          2267                 :           2182 :         ReorderBufferIterTXNInit(rb, txn, &iterstate);
 3929 tgl@sss.pgh.pa.us        2268         [ +  + ]:         361218 :         while ((change = ReorderBufferIterTXNNext(rb, iterstate)) != NULL)
                               2269                 :                :         {
 4257 rhaas@postgresql.org     2270                 :         356865 :             Relation    relation = NULL;
                               2271                 :                :             Oid         reloid;
                               2272                 :                : 
 1162 akapila@postgresql.o     2273         [ -  + ]:         356865 :             CHECK_FOR_INTERRUPTS();
                               2274                 :                : 
                               2275                 :                :             /*
                               2276                 :                :              * We can't call the stream-start callback before processing the
                               2277                 :                :              * first change.
                               2278                 :                :              */
 1907                          2279         [ +  + ]:         356865 :             if (prev_lsn == InvalidXLogRecPtr)
                               2280                 :                :             {
                               2281         [ +  + ]:           2143 :                 if (streaming)
                               2282                 :                :                 {
                               2283                 :            687 :                     txn->origin_id = change->origin_id;
                               2284                 :            687 :                     rb->stream_start(rb, txn, change->lsn);
                               2285                 :            687 :                     stream_started = true;
                               2286                 :                :                 }
                               2287                 :                :             }
                               2288                 :                : 
                               2289                 :                :             /*
                               2290                 :                :              * Enforce correct ordering of changes, merged from multiple
                               2291                 :                :              * subtransactions. The changes may have the same LSN due to
                               2292                 :                :              * MULTI_INSERT xlog records.
                               2293                 :                :              */
                               2294   [ +  +  -  + ]:         356865 :             Assert(prev_lsn == InvalidXLogRecPtr || prev_lsn <= change->lsn);
                               2295                 :                : 
                               2296                 :         356865 :             prev_lsn = change->lsn;
                               2297                 :                : 
                               2298                 :                :             /*
                               2299                 :                :              * Set the current xid to detect concurrent aborts. This is
                               2300                 :                :              * required for the cases when we decode the changes before the
                               2301                 :                :              * COMMIT record is processed.
                               2302                 :                :              */
  258 msawada@postgresql.o     2303   [ +  +  +  + ]:         356865 :             if (streaming || rbtxn_is_prepared(change->txn))
                               2304                 :                :             {
 1907 akapila@postgresql.o     2305                 :         177874 :                 curtxn = change->txn;
                               2306                 :         177874 :                 SetupCheckXidLive(curtxn->xid);
                               2307                 :                :             }
                               2308                 :                : 
 4253 tgl@sss.pgh.pa.us        2309   [ +  +  +  -  :         356865 :             switch (change->action)
                                     +  +  +  +  +  
                                              -  - ]
                               2310                 :                :             {
 3826 andres@anarazel.de       2311                 :           1782 :                 case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
                               2312                 :                : 
                               2313                 :                :                     /*
                               2314                 :                :                      * Confirmation for speculative insertion arrived. Simply
                               2315                 :                :                      * use it as a normal record. It'll be cleaned up at the end
                               2316                 :                :                      * of INSERT processing.
                               2317                 :                :                      */
 2672 alvherre@alvh.no-ip.     2318         [ -  + ]:           1782 :                     if (specinsert == NULL)
 2672 alvherre@alvh.no-ip.     2319         [ #  # ]:UBC           0 :                         elog(ERROR, "invalid ordering of speculative insertion changes");
 3826 andres@anarazel.de       2320         [ -  + ]:CBC        1782 :                     Assert(specinsert->data.tp.oldtuple == NULL);
                               2321                 :           1782 :                     change = specinsert;
                               2322                 :           1782 :                     change->action = REORDER_BUFFER_CHANGE_INSERT;
                               2323                 :                : 
                               2324                 :                :                     /* intentionally fall through */
 4253 tgl@sss.pgh.pa.us        2325                 :         340620 :                 case REORDER_BUFFER_CHANGE_INSERT:
                               2326                 :                :                 case REORDER_BUFFER_CHANGE_UPDATE:
                               2327                 :                :                 case REORDER_BUFFER_CHANGE_DELETE:
 4257 rhaas@postgresql.org     2328         [ -  + ]:         340620 :                     Assert(snapshot_now);
                               2329                 :                : 
 1210                          2330                 :         340620 :                     reloid = RelidByRelfilenumber(change->data.tp.rlocator.spcOid,
                               2331                 :                :                                                   change->data.tp.rlocator.relNumber);
                               2332                 :                : 
                               2333                 :                :                     /*
                               2334                 :                :                      * Mapped catalog tuple without data, emitted while
                               2335                 :                :                      * catalog table was in the process of being rewritten. We
                               2336                 :                :                      * can fail to look up the relfilenumber, because the
                               2337                 :                :                      * relmapper has no "historic" view, in contrast to the
                               2338                 :                :                      * normal catalog during decoding. Thus repeated rewrites
                               2339                 :                :                      * can cause a lookup failure. That's OK because we do not
                               2340                 :                :                      * decode catalog changes anyway. Normally such tuples
                               2341                 :                :                      * would be skipped over below, but we can't identify
                               2342                 :                :                      * whether the table should be logically logged without
                               2343                 :                :                      * mapping the relfilenumber to the oid.
                               2344                 :                :                      */
 4257                          2345         [ +  + ]:         340612 :                     if (reloid == InvalidOid &&
 4253 tgl@sss.pgh.pa.us        2346         [ +  - ]:             83 :                         change->data.tp.newtuple == NULL &&
                               2347         [ +  - ]:             83 :                         change->data.tp.oldtuple == NULL)
 3826 andres@anarazel.de       2348                 :             83 :                         goto change_done;
 4257 rhaas@postgresql.org     2349         [ -  + ]:         340529 :                     else if (reloid == InvalidOid)
 1210 rhaas@postgresql.org     2350         [ #  # ]:UBC           0 :                         elog(ERROR, "could not map filenumber \"%s\" to relation OID",
                               2351                 :                :                              relpathperm(change->data.tp.rlocator,
                               2352                 :                :                                          MAIN_FORKNUM).str);
                               2353                 :                : 
 4257 rhaas@postgresql.org     2354                 :CBC      340529 :                     relation = RelationIdGetRelation(reloid);
                               2355                 :                : 
 2242 tgl@sss.pgh.pa.us        2356         [ -  + ]:         340529 :                     if (!RelationIsValid(relation))
 1210 rhaas@postgresql.org     2357         [ #  # ]:UBC           0 :                         elog(ERROR, "could not open relation with OID %u (for filenumber \"%s\")",
                               2358                 :                :                              reloid,
                               2359                 :                :                              relpathperm(change->data.tp.rlocator,
                               2360                 :                :                                          MAIN_FORKNUM).str);
                               2361                 :                : 
 3826 andres@anarazel.de       2362   [ +  -  +  -  :CBC      340529 :                     if (!RelationIsLogicallyLogged(relation))
                                     -  +  -  -  -  
                                        -  +  -  +  
                                                 + ]
                               2363                 :           4358 :                         goto change_done;
                               2364                 :                : 
                               2365                 :                :                     /*
                               2366                 :                :                      * Ignore temporary heaps created during DDL unless the
                               2367                 :                :                      * plugin has asked for them.
                               2368                 :                :                      */
 2778 peter_e@gmx.net          2369   [ +  +  +  + ]:         336171 :                     if (relation->rd_rel->relrewrite && !rb->output_rewrites)
                               2370                 :             26 :                         goto change_done;
                               2371                 :                : 
                               2372                 :                :                     /*
                               2373                 :                :                      * For now ignore sequence changes entirely. Most of the
                               2374                 :                :                      * time they don't log changes using records we
                               2375                 :                :                      * understand, so it doesn't make sense to handle the few
                               2376                 :                :                      * cases we do.
                               2377                 :                :                      */
 3826 andres@anarazel.de       2378         [ -  + ]:         336145 :                     if (relation->rd_rel->relkind == RELKIND_SEQUENCE)
 3826 andres@anarazel.de       2379                 :UBC           0 :                         goto change_done;
                               2380                 :                : 
                               2381                 :                :                     /* user-triggered change */
 3826 andres@anarazel.de       2382         [ +  + ]:CBC      336145 :                     if (!IsToastRelation(relation))
                               2383                 :                :                     {
                               2384                 :         334084 :                         ReorderBufferToastReplace(rb, txn, relation, change);
 1907 akapila@postgresql.o     2385                 :         334084 :                         ReorderBufferApplyChange(rb, txn, relation, change,
                               2386                 :                :                                                  streaming);
                               2387                 :                : 
                               2388                 :                :                         /*
                               2389                 :                :                          * Only clear reassembled toast chunks if we're sure
                               2390                 :                :                          * they're not required anymore. The creator of the
                               2391                 :                :                          * tuple tells us.
                               2392                 :                :                          */
 3826 andres@anarazel.de       2393         [ +  + ]:         334081 :                         if (change->data.tp.clear_toast_afterwards)
                               2394                 :         333860 :                             ReorderBufferToastReset(rb, txn);
                               2395                 :                :                     }
                               2396                 :                :                     /* we're not interested in toast deletions */
                               2397         [ +  + ]:           2061 :                     else if (change->action == REORDER_BUFFER_CHANGE_INSERT)
                               2398                 :                :                     {
                               2399                 :                :                         /*
                               2400                 :                :                          * Need to reassemble the full toasted Datum in
                               2401                 :                :                          * memory. To ensure the chunks don't get reused until
                               2402                 :                :                          * we're done, remove it from the list of this
                               2403                 :                :                          * transaction's changes. Otherwise it will get
                               2404                 :                :                          * freed/reused while restoring spooled data from
                               2405                 :                :                          * disk.
                               2406                 :                :                          */
 2526 tomas.vondra@postgre     2407         [ -  + ]:           1830 :                         Assert(change->data.tp.newtuple != NULL);
                               2408                 :                : 
                               2409                 :           1830 :                         dlist_delete(&change->node);
                               2410                 :           1830 :                         ReorderBufferToastAppendChunk(rb, txn, relation,
                               2411                 :                :                                                       change);
                               2412                 :                :                     }
                               2413                 :                : 
 3811 bruce@momjian.us         2414                 :            231 :             change_done:
                               2415                 :                : 
                               2416                 :                :                     /*
                               2417                 :                :                      * If speculative insertion was confirmed, the record
                               2418                 :                :                      * isn't needed anymore.
                               2419                 :                :                      */
 3826 andres@anarazel.de       2420         [ +  + ]:         340609 :                     if (specinsert != NULL)
                               2421                 :                :                     {
  230 heikki.linnakangas@i     2422                 :           1782 :                         ReorderBufferFreeChange(rb, specinsert, true);
 3826 andres@anarazel.de       2423                 :           1782 :                         specinsert = NULL;
                               2424                 :                :                     }
                               2425                 :                : 
 1907 akapila@postgresql.o     2426         [ +  + ]:         340609 :                     if (RelationIsValid(relation))
                               2427                 :                :                     {
 3826 andres@anarazel.de       2428                 :         340526 :                         RelationClose(relation);
                               2429                 :         340526 :                         relation = NULL;
                               2430                 :                :                     }
 4257 rhaas@postgresql.org     2431                 :         340609 :                     break;
                               2432                 :                : 
 3826 andres@anarazel.de       2433                 :           1782 :                 case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT:
                               2434                 :                : 
                               2435                 :                :                     /*
                               2436                 :                :                      * Speculative insertions are dealt with by delaying the
                               2437                 :                :                      * processing of the insert until the confirmation record
                               2438                 :                :                      * arrives. For that we simply unlink the record from the
                               2439                 :                :                      * chain, so it does not get freed/reused while restoring
                               2440                 :                :                      * spooled data from disk.
                               2441                 :                :                      *
                               2442                 :                :                      * This is safe in the face of concurrent catalog changes
                               2443                 :                :                      * because the relevant relation can't be changed between
                               2444                 :                :                      * speculative insertion and confirmation due to
                               2445                 :                :                      * CheckTableNotInUse() and locking.
                               2446                 :                :                      */
                               2447                 :                : 
                               2448                 :                :                     /* clear out a pending (and thus failed) speculation */
                               2449         [ -  + ]:           1782 :                     if (specinsert != NULL)
                               2450                 :                :                     {
  230 heikki.linnakangas@i     2451                 :UBC           0 :                         ReorderBufferFreeChange(rb, specinsert, true);
 3826 andres@anarazel.de       2452                 :              0 :                         specinsert = NULL;
                               2453                 :                :                     }
                               2454                 :                : 
                               2455                 :                :                     /* and memorize the pending insertion */
 3826 andres@anarazel.de       2456                 :CBC        1782 :                     dlist_delete(&change->node);
                               2457                 :           1782 :                     specinsert = change;
                               2458                 :           1782 :                     break;
                               2459                 :                : 
 1596 akapila@postgresql.o     2460                 :UBC           0 :                 case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
                               2461                 :                : 
                               2462                 :                :                     /*
                               2463                 :                :                      * Abort for speculative insertion arrived, so clean up the
                               2464                 :                :                      * specinsert tuple and toast hash.
                               2465                 :                :                      *
                               2466                 :                :                      * Note that we get the spec abort change for each toast
                               2467                 :                :                      * entry but we need to perform the cleanup only the first
                               2468                 :                :                      * time we get it for the main table.
                               2469                 :                :                      */
                               2470         [ #  # ]:              0 :                     if (specinsert != NULL)
                               2471                 :                :                     {
                               2472                 :                :                         /*
                               2473                 :                :                          * We must clean the toast hash before processing a
                               2474                 :                :                          * completely new tuple to avoid confusion about the
                               2475                 :                :                          * previous tuple's toast chunks.
                               2476                 :                :                          */
                               2477         [ #  # ]:              0 :                         Assert(change->data.tp.clear_toast_afterwards);
                               2478                 :              0 :                         ReorderBufferToastReset(rb, txn);
                               2479                 :                : 
                               2480                 :                :                         /* We don't need this record anymore. */
  230 heikki.linnakangas@i     2481                 :              0 :                         ReorderBufferFreeChange(rb, specinsert, true);
 1596 akapila@postgresql.o     2482                 :              0 :                         specinsert = NULL;
                               2483                 :                :                     }
                               2484                 :              0 :                     break;
                               2485                 :                : 
 2761 peter_e@gmx.net          2486                 :CBC          22 :                 case REORDER_BUFFER_CHANGE_TRUNCATE:
                               2487                 :                :                     {
                               2488                 :                :                         int         i;
 2742 tgl@sss.pgh.pa.us        2489                 :             22 :                         int         nrelids = change->data.truncate.nrelids;
                               2490                 :             22 :                         int         nrelations = 0;
                               2491                 :                :                         Relation   *relations;
                               2492                 :                : 
                               2493                 :             22 :                         relations = palloc0(nrelids * sizeof(Relation));
                               2494         [ +  + ]:             64 :                         for (i = 0; i < nrelids; i++)
                               2495                 :                :                         {
                               2496                 :             42 :                             Oid         relid = change->data.truncate.relids[i];
                               2497                 :                :                             Relation    rel;
                               2498                 :                : 
 1119 drowley@postgresql.o     2499                 :             42 :                             rel = RelationIdGetRelation(relid);
                               2500                 :                : 
                               2501         [ -  + ]:             42 :                             if (!RelationIsValid(rel))
 2742 tgl@sss.pgh.pa.us        2502         [ #  # ]:UBC           0 :                                 elog(ERROR, "could not open relation with OID %u", relid);
                               2503                 :                : 
 1119 drowley@postgresql.o     2504   [ +  -  +  -  :CBC          42 :                             if (!RelationIsLogicallyLogged(rel))
                                     -  +  -  -  -  
                                        -  +  -  -  
                                                 + ]
 2742 tgl@sss.pgh.pa.us        2505                 :UBC           0 :                                 continue;
                               2506                 :                : 
 1119 drowley@postgresql.o     2507                 :CBC          42 :                             relations[nrelations++] = rel;
                               2508                 :                :                         }
                               2509                 :                : 
                               2510                 :                :                         /* Apply the truncate. */
 1907 akapila@postgresql.o     2511                 :             22 :                         ReorderBufferApplyTruncate(rb, txn, nrelations,
                               2512                 :                :                                                    relations, change,
                               2513                 :                :                                                    streaming);
                               2514                 :                : 
 2742 tgl@sss.pgh.pa.us        2515         [ +  + ]:             64 :                         for (i = 0; i < nrelations; i++)
                               2516                 :             42 :                             RelationClose(relations[i]);
                               2517                 :                : 
                               2518                 :             22 :                         break;
                               2519                 :                :                     }
                               2520                 :                : 
 3492 simon@2ndQuadrant.co     2521                 :             11 :                 case REORDER_BUFFER_CHANGE_MESSAGE:
 1907 akapila@postgresql.o     2522                 :             11 :                     ReorderBufferApplyMessage(rb, txn, change, streaming);
 3492 simon@2ndQuadrant.co     2523                 :             11 :                     break;
                               2524                 :                : 
 1839 akapila@postgresql.o     2525                 :           2450 :                 case REORDER_BUFFER_CHANGE_INVALIDATION:
                               2526                 :                :                     /* Execute the invalidation messages locally */
 1264 alvherre@alvh.no-ip.     2527                 :           2450 :                     ReorderBufferExecuteInvalidations(change->data.inval.ninvalidations,
                               2528                 :                :                                                       change->data.inval.invalidations);
 1839 akapila@postgresql.o     2529                 :           2450 :                     break;
                               2530                 :                : 
 4257 rhaas@postgresql.org     2531                 :            708 :                 case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT:
                               2532                 :                :                     /* get rid of the old */
                               2533                 :            708 :                     TeardownHistoricSnapshot(false);
                               2534                 :                : 
                               2535         [ +  + ]:            708 :                     if (snapshot_now->copied)
                               2536                 :                :                     {
                               2537                 :            683 :                         ReorderBufferFreeSnap(rb, snapshot_now);
                               2538                 :            683 :                         snapshot_now =
 4253 tgl@sss.pgh.pa.us        2539                 :            683 :                             ReorderBufferCopySnap(rb, change->data.snapshot,
                               2540                 :                :                                                   txn, command_id);
                               2541                 :                :                     }
                               2542                 :                : 
                               2543                 :                :                     /*
                               2544                 :                :                      * Restored from disk, need to be careful not to double
                               2545                 :                :                      * free. We could introduce refcounting for that, but for
                               2546                 :                :                      * now this seems infrequent enough not to care.
                               2547                 :                :                      */
                               2548         [ -  + ]:             25 :                     else if (change->data.snapshot->copied)
                               2549                 :                :                     {
 4257 rhaas@postgresql.org     2550                 :UBC           0 :                         snapshot_now =
 4253 tgl@sss.pgh.pa.us        2551                 :              0 :                             ReorderBufferCopySnap(rb, change->data.snapshot,
                               2552                 :                :                                                   txn, command_id);
                               2553                 :                :                     }
                               2554                 :                :                     else
                               2555                 :                :                     {
 4253 tgl@sss.pgh.pa.us        2556                 :CBC          25 :                         snapshot_now = change->data.snapshot;
                               2557                 :                :                     }
                               2558                 :                : 
                               2559                 :                :                     /* and continue with the new one */
 4257 rhaas@postgresql.org     2560                 :            708 :                     SetupHistoricSnapshot(snapshot_now, txn->tuplecid_hash);
                               2561                 :            708 :                     break;
                               2562                 :                : 
                               2563                 :          11272 :                 case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
 4253 tgl@sss.pgh.pa.us        2564         [ -  + ]:          11272 :                     Assert(change->data.command_id != InvalidCommandId);
                               2565                 :                : 
                               2566         [ +  + ]:          11272 :                     if (command_id < change->data.command_id)
                               2567                 :                :                     {
                               2568                 :           2123 :                         command_id = change->data.command_id;
                               2569                 :                : 
 4257 rhaas@postgresql.org     2570         [ +  + ]:           2123 :                         if (!snapshot_now->copied)
                               2571                 :                :                         {
                               2572                 :                :                             /* we don't use the global one anymore */
                               2573                 :            678 :                             snapshot_now = ReorderBufferCopySnap(rb, snapshot_now,
                               2574                 :                :                                                                  txn, command_id);
                               2575                 :                :                         }
                               2576                 :                : 
                               2577                 :           2123 :                         snapshot_now->curcid = command_id;
                               2578                 :                : 
                               2579                 :           2123 :                         TeardownHistoricSnapshot(false);
                               2580                 :           2123 :                         SetupHistoricSnapshot(snapshot_now, txn->tuplecid_hash);
                               2581                 :                :                     }
                               2582                 :                : 
                               2583                 :          11272 :                     break;
                               2584                 :                : 
 4257 rhaas@postgresql.org     2585                 :UBC           0 :                 case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
                               2586         [ #  # ]:              0 :                     elog(ERROR, "tuplecid value in changequeue");
                               2587                 :                :                     break;
                               2588                 :                :             }
                               2589                 :                : 
                               2590                 :                :             /*
                               2591                 :                :              * It is possible that the data is not sent to downstream for a
                               2592                 :                :              * long time either because the output plugin filtered it or there
                               2593                 :                :              * is a DDL that generates a lot of data that is not processed by
                               2594                 :                :              * the plugin. So, in such cases, the downstream can time out. To
                               2595                 :                :              * avoid that we try to send a keepalive message if required.
                               2596                 :                :              * Trying to send a keepalive message after every change has some
                               2597                 :                :              * overhead, but testing showed there is no noticeable overhead if
                               2598                 :                :              * we do it after every ~100 changes.
                               2599                 :                :              */
                               2600                 :                : #define CHANGES_THRESHOLD 100
                               2601                 :                : 
  993 akapila@postgresql.o     2602         [ +  + ]:CBC      356854 :             if (++changes_count >= CHANGES_THRESHOLD)
                               2603                 :                :             {
   87 michael@paquier.xyz      2604                 :           3098 :                 rb->update_progress_txn(rb, txn, prev_lsn);
  993 akapila@postgresql.o     2605                 :           3098 :                 changes_count = 0;
                               2606                 :                :             }
                               2607                 :                :         }
                               2608                 :                : 
                               2609                 :                :         /* speculative insertion record must be freed by now */
 1596                          2610         [ -  + ]:           2171 :         Assert(!specinsert);
                               2611                 :                : 
                               2612                 :                :         /* clean up the iterator */
 4257 rhaas@postgresql.org     2613                 :           2171 :         ReorderBufferIterTXNFinish(rb, iterstate);
 3930 tgl@sss.pgh.pa.us        2614                 :           2171 :         iterstate = NULL;
                               2615                 :                : 
                               2616                 :                :         /*
                               2617                 :                :          * Update total transaction count and total bytes processed by the
                               2618                 :                :          * transaction and its subtransactions. Take care not to count the
                               2619                 :                :          * streamed transaction multiple times.
                               2620                 :                :          *
                               2621                 :                :          * Note that the statistics computation has to be done after
                               2622                 :                :          * ReorderBufferIterTXNFinish as it releases the serialized change
                               2623                 :                :          * which we have already accounted in ReorderBufferIterTXNNext.
                               2624                 :                :          */
 1656 akapila@postgresql.o     2625         [ +  + ]:           2171 :         if (!rbtxn_is_streamed(txn))
                               2626                 :           1522 :             rb->totalTxns++;
                               2627                 :                : 
 1639                          2628                 :           2171 :         rb->totalBytes += txn->total_size;
                               2629                 :                : 
                               2630                 :                :         /*
                               2631                 :                :          * Done with current changes, send the last message for this set of
                               2632                 :                :          * changes depending upon streaming mode.
                               2633                 :                :          */
 1907                          2634         [ +  + ]:           2171 :         if (streaming)
                               2635                 :                :         {
                               2636         [ +  + ]:            717 :             if (stream_started)
                               2637                 :                :             {
                               2638                 :            679 :                 rb->stream_stop(rb, txn, prev_lsn);
                               2639                 :            679 :                 stream_started = false;
                               2640                 :                :             }
                               2641                 :                :         }
                               2642                 :                :         else
                               2643                 :                :         {
                               2644                 :                :             /*
                               2645                 :                :              * Call either PREPARE (for two-phase transactions) or COMMIT (for
                               2646                 :                :              * regular ones).
                               2647                 :                :              */
  258 msawada@postgresql.o     2648         [ +  + ]:           1454 :             if (rbtxn_is_prepared(txn))
                               2649                 :                :             {
                               2650         [ -  + ]:             30 :                 Assert(!rbtxn_sent_prepare(txn));
 1758 akapila@postgresql.o     2651                 :             30 :                 rb->prepare(rb, txn, commit_lsn);
  258 msawada@postgresql.o     2652                 :             30 :                 txn->txn_flags |= RBTXN_SENT_PREPARE;
                               2653                 :                :             }
                               2654                 :                :             else
 1758 akapila@postgresql.o     2655                 :           1424 :                 rb->commit(rb, txn, commit_lsn);
                               2656                 :                :         }
                               2657                 :                : 
                               2658                 :                :         /* this is just a sanity check against bad output plugin behaviour */
 4257 rhaas@postgresql.org     2659         [ -  + ]:           2169 :         if (GetCurrentTransactionIdIfAny() != InvalidTransactionId)
 4199 tgl@sss.pgh.pa.us        2660         [ #  # ]:UBC           0 :             elog(ERROR, "output plugin used XID %u",
                               2661                 :                :                  GetCurrentTransactionId());
                               2662                 :                : 
                               2663                 :                :         /*
                               2664                 :                :          * Remember the command ID and snapshot for the next set of changes in
                               2665                 :                :          * streaming mode.
                               2666                 :                :          */
 1907 akapila@postgresql.o     2667         [ +  + ]:CBC        2169 :         if (streaming)
                               2668                 :            717 :             ReorderBufferSaveTXNSnapshot(rb, txn, snapshot_now, command_id);
                               2669         [ +  + ]:           1452 :         else if (snapshot_now->copied)
                               2670                 :            678 :             ReorderBufferFreeSnap(rb, snapshot_now);
                               2671                 :                : 
                               2672                 :                :         /* cleanup */
 4257 rhaas@postgresql.org     2673                 :           2169 :         TeardownHistoricSnapshot(false);
                               2674                 :                : 
                               2675                 :                :         /*
                               2676                 :                :          * Aborting the current (sub-)transaction as a whole has the right
                               2677                 :                :          * semantics. We want all locks acquired in here to be released, not
                               2678                 :                :          * reassigned to the parent and we do not want any database access
                               2679                 :                :          * to have persistent effects.
                               2680                 :                :          */
 4002 andres@anarazel.de       2681                 :           2169 :         AbortCurrentTransaction();
                               2682                 :                : 
                               2683                 :                :         /* make sure there's no cache pollution */
  134 msawada@postgresql.o     2684         [ -  + ]:           2169 :         if (rbtxn_distr_inval_overflowed(txn))
                               2685                 :                :         {
  134 msawada@postgresql.o     2686         [ #  # ]:UBC           0 :             Assert(txn->ninvalidations_distributed == 0);
                               2687                 :              0 :             InvalidateSystemCaches();
                               2688                 :                :         }
                               2689                 :                :         else
                               2690                 :                :         {
  134 msawada@postgresql.o     2691                 :CBC        2169 :             ReorderBufferExecuteInvalidations(txn->ninvalidations, txn->invalidations);
                               2692                 :           2169 :             ReorderBufferExecuteInvalidations(txn->ninvalidations_distributed,
                               2693                 :                :                                               txn->invalidations_distributed);
                               2694                 :                :         }
                               2695                 :                : 
 4002 andres@anarazel.de       2696         [ +  + ]:           2169 :         if (using_subtxn)
                               2697                 :                :         {
 4257 rhaas@postgresql.org     2698                 :            481 :             RollbackAndReleaseCurrentSubTransaction();
   46 alvherre@kurilemu.de     2699                 :GNC         481 :             MemoryContextSwitchTo(ccxt);
                               2700                 :            481 :             CurrentResourceOwner = cowner;
                               2701                 :                :         }
                               2702                 :                : 
                               2703                 :                :         /*
                               2704                 :                :          * We are here for one of four reasons: 1. Decoding an
                               2705                 :                :          * in-progress txn. 2. Decoding a prepared txn. 3. Decoding of a
                               2706                 :                :          * prepared txn that was (partially) streamed. 4. Decoding a committed
                               2707                 :                :          * txn.
                               2708                 :                :          *
                               2709                 :                :          * For 1, we allow truncation of txn data by removing the changes
                               2710                 :                :          * already streamed but still keeping other things like invalidations,
                               2711                 :                :          * snapshot, and tuplecids. For 2 and 3, we instruct
                               2712                 :                :          * ReorderBufferTruncateTXN to do more elaborate truncation of txn
                               2713                 :                :          * data as the entire transaction has been decoded except for commit.
                               2714                 :                :          * For 4, as the entire txn has been decoded, we can fully clean up
                               2715                 :                :          * the TXN reorder buffer.
                               2716                 :                :          */
  258 msawada@postgresql.o     2717   [ +  +  +  + ]:CBC        2169 :         if (streaming || rbtxn_is_prepared(txn))
                               2718                 :                :         {
                               2719         [ +  + ]:            747 :             if (streaming)
                               2720                 :            717 :                 ReorderBufferMaybeMarkTXNStreamed(rb, txn);
                               2721                 :                : 
                               2722                 :            747 :             ReorderBufferTruncateTXN(rb, txn, rbtxn_is_prepared(txn));
                               2723                 :                :             /* Reset CheckXidAlive */
 1907 akapila@postgresql.o     2724                 :            747 :             CheckXidAlive = InvalidTransactionId;
                               2725                 :                :         }
                               2726                 :                :         else
                               2727                 :           1422 :             ReorderBufferCleanupTXN(rb, txn);
                               2728                 :                :     }
 4257 rhaas@postgresql.org     2729                 :              9 :     PG_CATCH();
                               2730                 :                :     {
 1907 akapila@postgresql.o     2731                 :              9 :         MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
                               2732                 :              9 :         ErrorData  *errdata = CopyErrorData();
                               2733                 :                : 
                               2734                 :                :         /* TODO: Encapsulate cleanup from the PG_TRY and PG_CATCH blocks */
 4257 rhaas@postgresql.org     2735         [ +  - ]:              9 :         if (iterstate)
                               2736                 :              9 :             ReorderBufferIterTXNFinish(rb, iterstate);
                               2737                 :                : 
                               2738                 :              9 :         TeardownHistoricSnapshot(true);
                               2739                 :                : 
                               2740                 :                :         /*
                               2741                 :                :          * Force cache invalidation to happen outside of a valid transaction
                               2742                 :                :          * to prevent catalog access as we just caught an error.
                               2743                 :                :          */
 4002 andres@anarazel.de       2744                 :              9 :         AbortCurrentTransaction();
                               2745                 :                : 
                               2746                 :                :         /* make sure there's no cache pollution */
  134 msawada@postgresql.o     2747         [ -  + ]:              9 :         if (rbtxn_distr_inval_overflowed(txn))
                               2748                 :                :         {
  134 msawada@postgresql.o     2749         [ #  # ]:UBC           0 :             Assert(txn->ninvalidations_distributed == 0);
                               2750                 :              0 :             InvalidateSystemCaches();
                               2751                 :                :         }
                               2752                 :                :         else
                               2753                 :                :         {
  134 msawada@postgresql.o     2754                 :CBC           9 :             ReorderBufferExecuteInvalidations(txn->ninvalidations, txn->invalidations);
                               2755                 :              9 :             ReorderBufferExecuteInvalidations(txn->ninvalidations_distributed,
                               2756                 :                :                                               txn->invalidations_distributed);
                               2757                 :                :         }
                               2758                 :                : 
 4002 andres@anarazel.de       2759         [ +  + ]:              9 :         if (using_subtxn)
                               2760                 :                :         {
                               2761                 :              4 :             RollbackAndReleaseCurrentSubTransaction();
   46 alvherre@kurilemu.de     2762                 :GNC           4 :             MemoryContextSwitchTo(ccxt);
                               2763                 :              4 :             CurrentResourceOwner = cowner;
                               2764                 :                :         }
                               2765                 :                : 
                               2766                 :                :         /*
                               2767                 :                :          * The error code ERRCODE_TRANSACTION_ROLLBACK indicates a concurrent
                               2768                 :                :          * abort of the (sub)transaction we are streaming or preparing. We
                               2769                 :                :          * need to do the cleanup and return gracefully on this error, see
                               2770                 :                :          * SetupCheckXidLive.
                               2771                 :                :          *
                               2772                 :                :          * This error code can be thrown by one of the callbacks we call
                               2773                 :                :          * during decoding so we need to ensure that we return gracefully only
                               2774                 :                :          * when we are sending the data in streaming mode and the streaming is
                               2775                 :                :          * not finished yet or when we are sending the data out on a PREPARE
                               2776                 :                :          * during a two-phase commit.
                               2777                 :                :          */
 1636 akapila@postgresql.o     2778         [ +  + ]:CBC           9 :         if (errdata->sqlerrcode == ERRCODE_TRANSACTION_ROLLBACK &&
  258 msawada@postgresql.o     2779   [ -  +  -  - ]:              8 :             (stream_started || rbtxn_is_prepared(txn)))
                               2780                 :                :         {
                               2781                 :                :             /* curtxn must be set for streaming or prepared transactions */
 1636 akapila@postgresql.o     2782         [ -  + ]:              8 :             Assert(curtxn);
                               2783                 :                : 
                               2784                 :                :             /* Clean up the temporary error state. */
 1907                          2785                 :              8 :             FlushErrorState();
                               2786                 :              8 :             FreeErrorData(errdata);
                               2787                 :              8 :             errdata = NULL;
                               2788                 :                : 
                               2789                 :                :             /* Remember the transaction is aborted. */
  258 msawada@postgresql.o     2790         [ -  + ]:              8 :             Assert(!rbtxn_is_committed(curtxn));
                               2791                 :              8 :             curtxn->txn_flags |= RBTXN_IS_ABORTED;
                               2792                 :                : 
                               2793                 :                :             /* Mark the transaction as streamed if appropriate */
                               2794         [ +  - ]:              8 :             if (stream_started)
                               2795                 :              8 :                 ReorderBufferMaybeMarkTXNStreamed(rb, txn);
                               2796                 :                : 
                               2797                 :                :             /* Reset the TXN so that it is allowed to stream remaining data. */
 1907 akapila@postgresql.o     2798                 :              8 :             ReorderBufferResetTXN(rb, txn, snapshot_now,
                               2799                 :                :                                   command_id, prev_lsn,
                               2800                 :                :                                   specinsert);
                               2801                 :                :         }
                               2802                 :                :         else
                               2803                 :                :         {
                               2804                 :              1 :             ReorderBufferCleanupTXN(rb, txn);
                               2805                 :              1 :             MemoryContextSwitchTo(ecxt);
                               2806                 :              1 :             PG_RE_THROW();
                               2807                 :                :         }
                               2808                 :                :     }
                               2809         [ -  + ]:           2177 :     PG_END_TRY();
                               2810                 :           2177 : }
                               2811                 :                : 
                               2812                 :                : /*
                               2813                 :                :  * Perform the replay of a transaction and its non-aborted subtransactions.
                               2814                 :                :  *
                               2815                 :                :  * Subtransactions must already have been processed by
                               2816                 :                :  * ReorderBufferCommitChild(), even if previously assigned to the toplevel
                               2817                 :                :  * transaction with ReorderBufferAssignChild.
                               2818                 :                :  *
                               2819                 :                :  * This interface is called once a prepare or toplevel commit is read for both
                               2820                 :                :  * streamed as well as non-streamed transactions.
                               2821                 :                :  */
                               2822                 :                : static void
 1758                          2823                 :           1526 : ReorderBufferReplay(ReorderBufferTXN *txn,
                               2824                 :                :                     ReorderBuffer *rb, TransactionId xid,
                               2825                 :                :                     XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
                               2826                 :                :                     TimestampTz commit_time,
                               2827                 :                :                     RepOriginId origin_id, XLogRecPtr origin_lsn)
                               2828                 :                : {
                               2829                 :                :     Snapshot    snapshot_now;
 1907                          2830                 :           1526 :     CommandId   command_id = FirstCommandId;
                               2831                 :                : 
                               2832                 :           1526 :     txn->final_lsn = commit_lsn;
                               2833                 :           1526 :     txn->end_lsn = end_lsn;
   28 peter@eisentraut.org     2834                 :GNC        1526 :     txn->commit_time = commit_time;
 1907 akapila@postgresql.o     2835                 :CBC        1526 :     txn->origin_id = origin_id;
                               2836                 :           1526 :     txn->origin_lsn = origin_lsn;
                               2837                 :                : 
                               2838                 :                :     /*
                               2839                 :                :      * If the transaction was (partially) streamed, we need to commit it in a
                               2840                 :                :      * 'streamed' way. That is, we first stream the remaining part of the
                               2841                 :                :      * transaction, and then invoke the stream_commit callback.
                               2842                 :                :      *
                               2843                 :                :      * Called after everything (origin ID, LSN, ...) is stored in the
                               2844                 :                :      * transaction to avoid passing that information directly.
                               2845                 :                :      */
                               2846         [ +  + ]:           1526 :     if (rbtxn_is_streamed(txn))
                               2847                 :                :     {
                               2848                 :             66 :         ReorderBufferStreamCommit(rb, txn);
                               2849                 :             66 :         return;
                               2850                 :                :     }
                               2851                 :                : 
                               2852                 :                :     /*
                               2853                 :                :      * If this transaction has no snapshot, it didn't make any changes to the
                               2854                 :                :      * database, so there's nothing to decode.  Note that
                               2855                 :                :      * ReorderBufferCommitChild will have transferred any snapshots from
                               2856                 :                :      * subtransactions if there were any.
                               2857                 :                :      */
                               2858         [ +  + ]:           1460 :     if (txn->base_snapshot == NULL)
                               2859                 :                :     {
                               2860         [ -  + ]:              3 :         Assert(txn->ninvalidations == 0);
                               2861                 :                : 
                               2862                 :                :         /*
                               2863                 :                :          * Removing this txn before a commit might result in the computation
                               2864                 :                :          * of an incorrect restart_lsn. See SnapBuildProcessRunningXacts.
                               2865                 :                :          */
  258 msawada@postgresql.o     2866         [ +  - ]:              3 :         if (!rbtxn_is_prepared(txn))
 1758 akapila@postgresql.o     2867                 :              3 :             ReorderBufferCleanupTXN(rb, txn);
 1907                          2868                 :              3 :         return;
                               2869                 :                :     }
                               2870                 :                : 
                               2871                 :           1457 :     snapshot_now = txn->base_snapshot;
                               2872                 :                : 
                               2873                 :                :     /* Process and send the changes to output plugin. */
                               2874                 :           1457 :     ReorderBufferProcessTXN(rb, txn, commit_lsn, snapshot_now,
                               2875                 :                :                             command_id, false);
                               2876                 :                : }
                               2877                 :                : 
                               2878                 :                : /*
                               2879                 :                :  * Commit a transaction.
                               2880                 :                :  *
                               2881                 :                :  * See comments for ReorderBufferReplay().
                               2882                 :                :  */
                               2883                 :                : void
 1758                          2884                 :           1498 : ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid,
                               2885                 :                :                     XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
                               2886                 :                :                     TimestampTz commit_time,
                               2887                 :                :                     RepOriginId origin_id, XLogRecPtr origin_lsn)
                               2888                 :                : {
                               2889                 :                :     ReorderBufferTXN *txn;
                               2890                 :                : 
                               2891                 :           1498 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               2892                 :                :                                 false);
                               2893                 :                : 
                               2894                 :                :     /* unknown transaction, nothing to replay */
                               2895         [ +  + ]:           1498 :     if (txn == NULL)
                               2896                 :             17 :         return;
                               2897                 :                : 
                               2898                 :           1481 :     ReorderBufferReplay(txn, rb, xid, commit_lsn, end_lsn, commit_time,
                               2899                 :                :                         origin_id, origin_lsn);
                               2900                 :                : }
                               2901                 :                : 
                               2902                 :                : /*
                               2903                 :                :  * Record the prepare information for a transaction. Also, mark the transaction
                               2904                 :                :  * as a prepared transaction.
                               2905                 :                :  */
                               2906                 :                : bool
                               2907                 :            146 : ReorderBufferRememberPrepareInfo(ReorderBuffer *rb, TransactionId xid,
                               2908                 :                :                                  XLogRecPtr prepare_lsn, XLogRecPtr end_lsn,
                               2909                 :                :                                  TimestampTz prepare_time,
                               2910                 :                :                                  RepOriginId origin_id, XLogRecPtr origin_lsn)
                               2911                 :                : {
                               2912                 :                :     ReorderBufferTXN *txn;
                               2913                 :                : 
                               2914                 :            146 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr, false);
                               2915                 :                : 
                               2916                 :                :     /* unknown transaction, nothing to do */
                               2917         [ -  + ]:            146 :     if (txn == NULL)
 1758 akapila@postgresql.o     2918                 :UBC           0 :         return false;
                               2919                 :                : 
                               2920                 :                :     /*
                               2921                 :                :      * Remember the prepare information to be later used by commit prepared in
                               2922                 :                :      * case we skip doing prepare.
                               2923                 :                :      */
 1758 akapila@postgresql.o     2924                 :CBC         146 :     txn->final_lsn = prepare_lsn;
                               2925                 :            146 :     txn->end_lsn = end_lsn;
   28 peter@eisentraut.org     2926                 :GNC         146 :     txn->prepare_time = prepare_time;
 1758 akapila@postgresql.o     2927                 :CBC         146 :     txn->origin_id = origin_id;
                               2928                 :            146 :     txn->origin_lsn = origin_lsn;
                               2929                 :                : 
                               2930                 :                :     /* Mark this transaction as a prepared transaction */
  258 msawada@postgresql.o     2931         [ -  + ]:            146 :     Assert((txn->txn_flags & RBTXN_PREPARE_STATUS_MASK) == 0);
                               2932                 :            146 :     txn->txn_flags |= RBTXN_IS_PREPARED;
                               2933                 :                : 
 1758 akapila@postgresql.o     2934                 :            146 :     return true;
                               2935                 :                : }
                               2936                 :                : 
                               2937                 :                : /* Remember that we have skipped prepare */
                               2938                 :                : void
                               2939                 :            104 : ReorderBufferSkipPrepare(ReorderBuffer *rb, TransactionId xid)
                               2940                 :                : {
                               2941                 :                :     ReorderBufferTXN *txn;
                               2942                 :                : 
                               2943                 :            104 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr, false);
                               2944                 :                : 
                               2945                 :                :     /* unknown transaction, nothing to do */
                               2946         [ -  + ]:            104 :     if (txn == NULL)
 1758 akapila@postgresql.o     2947                 :UBC           0 :         return;
                               2948                 :                : 
                               2949                 :                :     /* txn must have been marked as a prepared transaction */
  258 msawada@postgresql.o     2950         [ -  + ]:CBC         104 :     Assert((txn->txn_flags & RBTXN_PREPARE_STATUS_MASK) == RBTXN_IS_PREPARED);
 1758 akapila@postgresql.o     2951                 :            104 :     txn->txn_flags |= RBTXN_SKIPPED_PREPARE;
                               2952                 :                : }
                               2953                 :                : 
                               2954                 :                : /*
                               2955                 :                :  * Prepare a two-phase transaction.
                               2956                 :                :  *
                               2957                 :                :  * See comments for ReorderBufferReplay().
                               2958                 :                :  */
                               2959                 :                : void
                               2960                 :             42 : ReorderBufferPrepare(ReorderBuffer *rb, TransactionId xid,
                               2961                 :                :                      char *gid)
                               2962                 :                : {
                               2963                 :                :     ReorderBufferTXN *txn;
                               2964                 :                : 
                               2965                 :             42 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               2966                 :                :                                 false);
                               2967                 :                : 
                               2968                 :                :     /* unknown transaction, nothing to replay */
                               2969         [ -  + ]:             42 :     if (txn == NULL)
 1758 akapila@postgresql.o     2970                 :UBC           0 :         return;
                               2971                 :                : 
                               2972                 :                :     /*
                               2973                 :                :      * txn must have been marked as a prepared transaction and must have
                               2974                 :                :      * neither been skipped nor sent a prepare. Also, the prepare info must
                               2975                 :                :      * have been updated in it by now.
                               2976                 :                :      */
  258 msawada@postgresql.o     2977         [ -  + ]:CBC          42 :     Assert((txn->txn_flags & RBTXN_PREPARE_STATUS_MASK) == RBTXN_IS_PREPARED);
 1758 akapila@postgresql.o     2978         [ -  + ]:             42 :     Assert(txn->final_lsn != InvalidXLogRecPtr);
                               2979                 :                : 
  258 msawada@postgresql.o     2980                 :             42 :     txn->gid = pstrdup(gid);
                               2981                 :                : 
 1758 akapila@postgresql.o     2982                 :             42 :     ReorderBufferReplay(txn, rb, xid, txn->final_lsn, txn->end_lsn,
   28 peter@eisentraut.org     2983                 :GNC          42 :                         txn->prepare_time, txn->origin_id, txn->origin_lsn);
                               2984                 :                : 
                               2985                 :                :     /*
                               2986                 :                :      * Send a prepare if not already done. This might occur if we have
                               2987                 :                :      * detected a concurrent abort while replaying the non-streaming
                               2988                 :                :      * transaction.
                               2989                 :                :      */
  258 msawada@postgresql.o     2990         [ -  + ]:CBC          42 :     if (!rbtxn_sent_prepare(txn))
                               2991                 :                :     {
 1671 akapila@postgresql.o     2992                 :UBC           0 :         rb->prepare(rb, txn, txn->final_lsn);
  258 msawada@postgresql.o     2993                 :              0 :         txn->txn_flags |= RBTXN_SENT_PREPARE;
                               2994                 :                :     }
                               2995                 :                : }
                               2996                 :                : 
                               2997                 :                : /*
                               2998                 :                :  * This is used to handle COMMIT/ROLLBACK PREPARED.
                               2999                 :                :  */
                               3000                 :                : void
 1758 akapila@postgresql.o     3001                 :CBC          43 : ReorderBufferFinishPrepared(ReorderBuffer *rb, TransactionId xid,
                               3002                 :                :                             XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
                               3003                 :                :                             XLogRecPtr two_phase_at,
                               3004                 :                :                             TimestampTz commit_time, RepOriginId origin_id,
                               3005                 :                :                             XLogRecPtr origin_lsn, char *gid, bool is_commit)
                               3006                 :                : {
                               3007                 :                :     ReorderBufferTXN *txn;
                               3008                 :                :     XLogRecPtr  prepare_end_lsn;
                               3009                 :                :     TimestampTz prepare_time;
                               3010                 :                : 
 1708                          3011                 :             43 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, commit_lsn, false);
                               3012                 :                : 
                               3013                 :                :     /* unknown transaction, nothing to do */
 1758                          3014         [ -  + ]:             43 :     if (txn == NULL)
 1758 akapila@postgresql.o     3015                 :UBC           0 :         return;
                               3016                 :                : 
                               3017                 :                :     /*
                               3018                 :                :      * By this time the txn has the prepare record information, remember it to
                               3019                 :                :      * be later used for rollback.
                               3020                 :                :      */
 1758 akapila@postgresql.o     3021                 :CBC          43 :     prepare_end_lsn = txn->end_lsn;
   28 peter@eisentraut.org     3022                 :GNC          43 :     prepare_time = txn->prepare_time;
                               3023                 :                : 
                               3024                 :                :     /* add the gid in the txn */
 1758 akapila@postgresql.o     3025                 :CBC          43 :     txn->gid = pstrdup(gid);
                               3026                 :                : 
                               3027                 :                :     /*
                               3028                 :                :      * It is possible that this transaction is not decoded at prepare time
                               3029                 :                :      * either because by that time we didn't have a consistent snapshot, or
                               3030                 :                :      * two_phase was not enabled, or it was decoded earlier but we have
                               3031                 :                :      * restarted. We only need to send the prepare if it was not decoded
                               3032                 :                :      * earlier. We don't need to decode the xact for aborts if it is not done
                               3033                 :                :      * already.
                               3034                 :                :      */
 1567                          3035   [ +  +  +  - ]:             43 :     if ((txn->final_lsn < two_phase_at) && is_commit)
                               3036                 :                :     {
                               3037                 :                :         /*
                               3038                 :                :          * txn must have been marked as a prepared transaction and skipped but
                               3039                 :                :          * not sent a prepare. Also, the prepare info must have been updated
                               3040                 :                :          * in txn even if we skip prepare.
                               3041                 :                :          */
  258 msawada@postgresql.o     3042         [ -  + ]:              3 :         Assert((txn->txn_flags & RBTXN_PREPARE_STATUS_MASK) ==
                               3043                 :                :                (RBTXN_IS_PREPARED | RBTXN_SKIPPED_PREPARE));
 1758 akapila@postgresql.o     3044         [ -  + ]:              3 :         Assert(txn->final_lsn != InvalidXLogRecPtr);
                               3045                 :                : 
                               3046                 :                :         /*
                               3047                 :                :          * By this time the txn has the prepare record information and it is
                               3048                 :                :          * important to use that so that downstream gets the accurate
                               3049                 :                :          * information. If instead, we have passed commit information here
                               3050                 :                :          * then downstream can behave as if it has already replayed commit
                               3051                 :                :          * prepared after the restart.
                               3052                 :                :          */
                               3053                 :              3 :         ReorderBufferReplay(txn, rb, xid, txn->final_lsn, txn->end_lsn,
   28 peter@eisentraut.org     3054                 :GNC           3 :                             txn->prepare_time, txn->origin_id, txn->origin_lsn);
                               3055                 :                :     }
                               3056                 :                : 
 1758 akapila@postgresql.o     3057                 :CBC          43 :     txn->final_lsn = commit_lsn;
                               3058                 :             43 :     txn->end_lsn = end_lsn;
   28 peter@eisentraut.org     3059                 :GNC          43 :     txn->commit_time = commit_time;
 1758 akapila@postgresql.o     3060                 :CBC          43 :     txn->origin_id = origin_id;
                               3061                 :             43 :     txn->origin_lsn = origin_lsn;
                               3062                 :                : 
                               3063         [ +  + ]:             43 :     if (is_commit)
                               3064                 :             32 :         rb->commit_prepared(rb, txn, commit_lsn);
                               3065                 :                :     else
                               3066                 :             11 :         rb->rollback_prepared(rb, txn, prepare_end_lsn, prepare_time);
                               3067                 :                : 
                               3068                 :                :     /* cleanup: make sure there's no cache pollution */
                               3069                 :             43 :     ReorderBufferExecuteInvalidations(txn->ninvalidations,
                               3070                 :                :                                       txn->invalidations);
                               3071                 :             43 :     ReorderBufferCleanupTXN(rb, txn);
                               3072                 :                : }
                               3073                 :                : 
                               3074                 :                : /*
                               3075                 :                :  * Abort a transaction that possibly has previous changes. Needs to be first
                               3076                 :                :  * called for subtransactions and then for the toplevel xid.
                               3077                 :                :  *
                               3078                 :                :  * NB: Transactions handled here have to have actively aborted (i.e. have
                               3079                 :                :  * produced an abort record). Implicitly aborted transactions are handled via
                               3080                 :                :  * ReorderBufferAbortOld(); transactions we're just not interested in, but
                               3081                 :                :  * which have committed are handled in ReorderBufferForget().
                               3082                 :                :  *
                               3083                 :                :  * This function purges this transaction and its contents from memory and
                               3084                 :                :  * disk.
                               3085                 :                :  */
                               3086                 :                : void
 1023                          3087                 :            164 : ReorderBufferAbort(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn,
                               3088                 :                :                    TimestampTz abort_time)
                               3089                 :                : {
                               3090                 :                :     ReorderBufferTXN *txn;
                               3091                 :                : 
 4257 rhaas@postgresql.org     3092                 :            164 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               3093                 :                :                                 false);
                               3094                 :                : 
                               3095                 :                :     /* unknown, nothing to remove */
                               3096         [ -  + ]:            164 :     if (txn == NULL)
 4257 rhaas@postgresql.org     3097                 :UBC           0 :         return;
                               3098                 :                : 
   28 peter@eisentraut.org     3099                 :GNC         164 :     txn->abort_time = abort_time;
                               3100                 :                : 
                               3101                 :                :     /* For streamed transactions notify the remote node about the abort. */
 1907 akapila@postgresql.o     3102         [ +  + ]:CBC         164 :     if (rbtxn_is_streamed(txn))
                               3103                 :                :     {
                               3104                 :             30 :         rb->stream_abort(rb, txn, lsn);
                               3105                 :                : 
                               3106                 :                :         /*
                               3107                 :                :          * We might have decoded changes for this transaction that could load
                               3108                 :                :          * the cache as per the current transaction's view (consider DDLs that
                               3109                 :                :          * happened in this transaction). We don't want the decoding of future
                               3110                 :                :          * transactions to use those cache entries so execute only the inval
                               3111                 :                :          * messages in this transaction.
                               3112                 :                :          */
                               3113         [ -  + ]:             30 :         if (txn->ninvalidations > 0)
 1907 akapila@postgresql.o     3114                 :UBC           0 :             ReorderBufferImmediateInvalidation(rb, txn->ninvalidations,
                               3115                 :                :                                                txn->invalidations);
                               3116                 :                :     }
                               3117                 :                : 
                               3118                 :                :     /* cosmetic... */
 4257 rhaas@postgresql.org     3119                 :CBC         164 :     txn->final_lsn = lsn;
                               3120                 :                : 
                               3121                 :                :     /* remove potential on-disk data, and deallocate */
                               3122                 :            164 :     ReorderBufferCleanupTXN(rb, txn);
                               3123                 :                : }
                               3124                 :                : 
                               3125                 :                : /*
                               3126                 :                :  * Abort all transactions that aren't actually running anymore because the
                               3127                 :                :  * server restarted.
                               3128                 :                :  *
                               3129                 :                :  * NB: These really have to be transactions that have aborted due to a server
                               3130                 :                :  * crash/immediate restart, as we don't deal with invalidations here.
                               3131                 :                :  */
                               3132                 :                : void
                               3133                 :           1433 : ReorderBufferAbortOld(ReorderBuffer *rb, TransactionId oldestRunningXid)
                               3134                 :                : {
                               3135                 :                :     dlist_mutable_iter it;
                               3136                 :                : 
                               3137                 :                :     /*
                               3138                 :                :      * Iterate through all (potential) toplevel TXNs and abort all that are
                               3139                 :                :      * older than what possibly can be running. Once we've found the first
                               3140                 :                :      * that is alive we stop; there might be some that acquired an xid earlier
                               3141                 :                :      * but started writing later, but it's unlikely and they will be cleaned
                               3142                 :                :      * up in a later call to this function.
                               3143                 :                :      */
                               3144   [ +  -  +  + ]:           1437 :     dlist_foreach_modify(it, &rb->toplevel_by_lsn)
                               3145                 :                :     {
                               3146                 :                :         ReorderBufferTXN *txn;
                               3147                 :                : 
                               3148                 :             60 :         txn = dlist_container(ReorderBufferTXN, node, it.cur);
                               3149                 :                : 
                               3150         [ +  + ]:             60 :         if (TransactionIdPrecedes(txn->xid, oldestRunningXid))
                               3151                 :                :         {
 3090 andres@anarazel.de       3152         [ -  + ]:              4 :             elog(DEBUG2, "aborting old transaction %u", txn->xid);
                               3153                 :                : 
                               3154                 :                :             /* Notify the remote node about the crash/immediate restart. */
 1025 akapila@postgresql.o     3155         [ -  + ]:              4 :             if (rbtxn_is_streamed(txn))
 1025 akapila@postgresql.o     3156                 :UBC           0 :                 rb->stream_abort(rb, txn, InvalidXLogRecPtr);
                               3157                 :                : 
                               3158                 :                :             /* remove potential on-disk data, and deallocate this tx */
 4257 rhaas@postgresql.org     3159                 :CBC           4 :             ReorderBufferCleanupTXN(rb, txn);
                               3160                 :                :         }
                               3161                 :                :         else
                               3162                 :             56 :             return;
                               3163                 :                :     }
                               3164                 :                : }
                               3165                 :                : 
                               3166                 :                : /*
                               3167                 :                :  * Forget the contents of a transaction if we aren't interested in its
                               3168                 :                :  * contents. Needs to be first called for subtransactions and then for the
                               3169                 :                :  * toplevel xid.
                               3170                 :                :  *
                               3171                 :                :  * This is significantly different to ReorderBufferAbort() because
                               3172                 :                :  * transactions that have committed need to be treated differently from aborted
                               3173                 :                :  * ones since they may have modified the catalog.
                               3174                 :                :  *
                               3175                 :                :  * Note that this is only allowed to be called at the moment a transaction
                               3176                 :                :  * commit has just been read, not earlier; otherwise later records referring
                               3177                 :                :  * to this xid might re-create the transaction incompletely.
                               3178                 :                :  */
                               3179                 :                : void
                               3180                 :           2593 : ReorderBufferForget(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn)
                               3181                 :                : {
                               3182                 :                :     ReorderBufferTXN *txn;
                               3183                 :                : 
                               3184                 :           2593 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               3185                 :                :                                 false);
                               3186                 :                : 
                               3187                 :                :     /* unknown, nothing to forget */
                               3188         [ +  + ]:           2593 :     if (txn == NULL)
                               3189                 :            561 :         return;
                               3190                 :                : 
                               3191                 :                :     /* this transaction mustn't be streamed */
 1055 akapila@postgresql.o     3192         [ -  + ]:           2032 :     Assert(!rbtxn_is_streamed(txn));
                               3193                 :                : 
                               3194                 :                :     /* cosmetic... */
 4257 rhaas@postgresql.org     3195                 :           2032 :     txn->final_lsn = lsn;
                               3196                 :                : 
                               3197                 :                :     /*
                               3198                 :                :      * Process only cache invalidation messages in this transaction if there
                               3199                 :                :      * are any. Even if we're not interested in the transaction's contents, it
                               3200                 :                :      * could have manipulated the catalog and we need to update the caches
                               3201                 :                :      * according to that.
                               3202                 :                :      */
                               3203   [ +  +  +  + ]:           2032 :     if (txn->base_snapshot != NULL && txn->ninvalidations > 0)
 3475 andres@anarazel.de       3204                 :            562 :         ReorderBufferImmediateInvalidation(rb, txn->ninvalidations,
                               3205                 :                :                                            txn->invalidations);
                               3206                 :                :     else
 4257 rhaas@postgresql.org     3207         [ -  + ]:           1470 :         Assert(txn->ninvalidations == 0);
                               3208                 :                : 
                               3209                 :                :     /* remove potential on-disk data, and deallocate */
                               3210                 :           2032 :     ReorderBufferCleanupTXN(rb, txn);
                               3211                 :                : }
                               3212                 :                : 
                               3213                 :                : /*
                               3214                 :                :  * Invalidate cache for those transactions that need to be skipped just in case
                               3215                 :                :  * catalogs were manipulated as part of the transaction.
                               3216                 :                :  *
                               3217                 :                :  * Note that this is a special-purpose function for prepared transactions where
                               3218                 :                :  * we don't want to clean up the TXN even when we decide to skip it. See
                               3219                 :                :  * DecodePrepare.
                               3220                 :                :  */
                               3221                 :                : void
 1758 akapila@postgresql.o     3222                 :            101 : ReorderBufferInvalidate(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn)
                               3223                 :                : {
                               3224                 :                :     ReorderBufferTXN *txn;
                               3225                 :                : 
                               3226                 :            101 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               3227                 :                :                                 false);
                               3228                 :                : 
                               3229                 :                :     /* unknown, nothing to do */
                               3230         [ -  + ]:            101 :     if (txn == NULL)
 1758 akapila@postgresql.o     3231                 :UBC           0 :         return;
                               3232                 :                : 
                               3233                 :                :     /*
                               3234                 :                :      * Process cache invalidation messages if there are any. Even if we're not
                               3235                 :                :      * interested in the transaction's contents, it could have manipulated the
                               3236                 :                :      * catalog and we need to update the caches according to that.
                               3237                 :                :      */
 1758 akapila@postgresql.o     3238   [ +  -  +  + ]:CBC         101 :     if (txn->base_snapshot != NULL && txn->ninvalidations > 0)
                               3239                 :             29 :         ReorderBufferImmediateInvalidation(rb, txn->ninvalidations,
                               3240                 :                :                                            txn->invalidations);
                               3241                 :                :     else
                               3242         [ -  + ]:             72 :         Assert(txn->ninvalidations == 0);
                               3243                 :                : }
                               3244                 :                : 
                               3245                 :                : 
                               3246                 :                : /*
                               3247                 :                :  * Execute invalidations happening outside the context of a decoded
                               3248                 :                :  * transaction. That currently happens either for xid-less commits
                               3249                 :                :  * (cf. RecordTransactionCommit()) or for invalidations in uninteresting
                               3250                 :                :  * transactions (via ReorderBufferForget()).
                               3251                 :                :  */
                               3252                 :                : void
 3475 andres@anarazel.de       3253                 :            600 : ReorderBufferImmediateInvalidation(ReorderBuffer *rb, uint32 ninvalidations,
                               3254                 :                :                                    SharedInvalidationMessage *invalidations)
                               3255                 :                : {
                               3256                 :            600 :     bool        use_subtxn = IsTransactionOrTransactionBlock();
   46 alvherre@kurilemu.de     3257                 :GNC         600 :     MemoryContext ccxt = CurrentMemoryContext;
                               3258                 :            600 :     ResourceOwner cowner = CurrentResourceOwner;
                               3259                 :                :     int         i;
                               3260                 :                : 
 3475 andres@anarazel.de       3261         [ +  + ]:CBC         600 :     if (use_subtxn)
                               3262                 :            431 :         BeginInternalSubTransaction("replay");
                               3263                 :                : 
                               3264                 :                :     /*
                               3265                 :                :      * Force invalidations to happen outside of a valid transaction - that way
                               3266                 :                :      * entries will just be marked as invalid without accessing the catalog.
                               3267                 :                :      * That's advantageous because we don't need to set up the full state
                               3268                 :                :      * necessary for catalog access.
                               3269                 :                :      */
                               3270         [ +  + ]:            600 :     if (use_subtxn)
                               3271                 :            431 :         AbortCurrentTransaction();
                               3272                 :                : 
                               3273         [ +  + ]:          25274 :     for (i = 0; i < ninvalidations; i++)
                               3274                 :          24674 :         LocalExecuteInvalidationMessage(&invalidations[i]);
                               3275                 :                : 
                               3276         [ +  + ]:            600 :     if (use_subtxn)
                               3277                 :                :     {
                               3278                 :            431 :         RollbackAndReleaseCurrentSubTransaction();
   46 alvherre@kurilemu.de     3279                 :GNC         431 :         MemoryContextSwitchTo(ccxt);
                               3280                 :            431 :         CurrentResourceOwner = cowner;
                               3281                 :                :     }
 3475 andres@anarazel.de       3282                 :CBC         600 : }
                               3283                 :                : 
                               3284                 :                : /*
                               3285                 :                :  * Tell reorderbuffer about an xid seen in the WAL stream. Has to be called at
                               3286                 :                :  * least once for every xid in XLogRecord->xl_xid (other places in records
                               3287                 :                :  * may, but do not have to be passed through here).
                               3288                 :                :  *
                               3289                 :                :  * Reorderbuffer keeps some data structures about transactions in LSN order,
                               3290                 :                :  * for efficiency. To do that it has to know about when transactions are seen
                               3291                 :                :  * first in the WAL. As many types of records are not actually interesting for
                               3292                 :                :  * logical decoding, they do not necessarily pass through here.
                               3293                 :                :  */
                               3294                 :                : void
 3524                          3295                 :        1999994 : ReorderBufferProcessXid(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn)
                               3296                 :                : {
                               3297                 :                :     /* many records won't have an xid assigned, centralize check here */
                               3298         [ +  + ]:        1999994 :     if (xid != InvalidTransactionId)
                               3299                 :        1997932 :         ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
 4257 rhaas@postgresql.org     3300                 :        1999994 : }
                               3301                 :                : 
                               3302                 :                : /*
                               3303                 :                :  * Add a new snapshot to this transaction that may only be used after lsn 'lsn'
                               3304                 :                :  * because the previous snapshot doesn't describe the catalog correctly for
                               3305                 :                :  * following rows.
                               3306                 :                :  */
                               3307                 :                : void
                               3308                 :           1283 : ReorderBufferAddSnapshot(ReorderBuffer *rb, TransactionId xid,
                               3309                 :                :                          XLogRecPtr lsn, Snapshot snap)
                               3310                 :                : {
  230 heikki.linnakangas@i     3311                 :           1283 :     ReorderBufferChange *change = ReorderBufferAllocChange(rb);
                               3312                 :                : 
 4253 tgl@sss.pgh.pa.us        3313                 :           1283 :     change->data.snapshot = snap;
                               3314                 :           1283 :     change->action = REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT;
                               3315                 :                : 
 1907 akapila@postgresql.o     3316                 :           1283 :     ReorderBufferQueueChange(rb, xid, lsn, change, false);
 4257 rhaas@postgresql.org     3317                 :           1283 : }
                               3318                 :                : 
                               3319                 :                : /*
                               3320                 :                :  * Set up the transaction's base snapshot.
                               3321                 :                :  *
                               3322                 :                :  * If we know that xid is a subtransaction, set the base snapshot on the
                               3323                 :                :  * top-level transaction instead.
                               3324                 :                :  */
                               3325                 :                : void
                               3326                 :           3286 : ReorderBufferSetBaseSnapshot(ReorderBuffer *rb, TransactionId xid,
                               3327                 :                :                              XLogRecPtr lsn, Snapshot snap)
                               3328                 :                : {
                               3329                 :                :     ReorderBufferTXN *txn;
                               3330                 :                :     bool        is_new;
                               3331                 :                : 
 1096 peter@eisentraut.org     3332         [ -  + ]:           3286 :     Assert(snap != NULL);
                               3333                 :                : 
                               3334                 :                :     /*
                               3335                 :                :      * Fetch the transaction to operate on.  If we know it's a subtransaction,
                               3336                 :                :      * operate on its top-level transaction instead.
                               3337                 :                :      */
 4257 rhaas@postgresql.org     3338                 :           3286 :     txn = ReorderBufferTXNByXid(rb, xid, true, &is_new, lsn, true);
 2118 alvherre@alvh.no-ip.     3339         [ +  + ]:           3286 :     if (rbtxn_is_known_subxact(txn))
 2681                          3340                 :            106 :         txn = ReorderBufferTXNByXid(rb, txn->toplevel_xid, false,
                               3341                 :                :                                     NULL, InvalidXLogRecPtr, false);
 4257 rhaas@postgresql.org     3342         [ -  + ]:           3286 :     Assert(txn->base_snapshot == NULL);
                               3343                 :                : 
                               3344                 :           3286 :     txn->base_snapshot = snap;
                               3345                 :           3286 :     txn->base_snapshot_lsn = lsn;
 2681 alvherre@alvh.no-ip.     3346                 :           3286 :     dlist_push_tail(&rb->txns_by_base_snapshot_lsn, &txn->base_snapshot_node);
                               3347                 :                : 
                               3348                 :           3286 :     AssertTXNLsnOrder(rb);
 4257 rhaas@postgresql.org     3349                 :           3286 : }
                               3350                 :                : 
                               3351                 :                : /*
                               3352                 :                :  * Access the catalog with this CommandId at this point in the changestream.
                               3353                 :                :  *
                               3354                 :                :  * May only be called for command ids > 1
                               3355                 :                :  */
                               3356                 :                : void
                               3357                 :          24452 : ReorderBufferAddNewCommandId(ReorderBuffer *rb, TransactionId xid,
                               3358                 :                :                              XLogRecPtr lsn, CommandId cid)
                               3359                 :                : {
  230 heikki.linnakangas@i     3360                 :          24452 :     ReorderBufferChange *change = ReorderBufferAllocChange(rb);
                               3361                 :                : 
 4253 tgl@sss.pgh.pa.us        3362                 :          24452 :     change->data.command_id = cid;
                               3363                 :          24452 :     change->action = REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID;
                               3364                 :                : 
 1907 akapila@postgresql.o     3365                 :          24452 :     ReorderBufferQueueChange(rb, xid, lsn, change, false);
 4257 rhaas@postgresql.org     3366                 :          24452 : }
                               3367                 :                : 
                               3368                 :                : /*
                               3369                 :                :  * Update memory counters to account for the new or removed change.
                               3370                 :                :  *
                               3371                 :                :  * We update two counters - in the reorder buffer, and in the transaction
                               3372                 :                :  * containing the change. The reorder buffer counter allows us to quickly
                               3373                 :                :  * decide if we reached the memory limit, the transaction counter allows
                               3374                 :                :  * us to quickly pick the largest transaction for eviction.
                               3375                 :                :  *
                               3376                 :                :  * At least one of txn and change must be non-NULL. We update the memory
                               3377                 :                :  * counter of txn if it's non-NULL, otherwise that of change->txn.
                               3378                 :                :  *
                               3379                 :                :  * When streaming is enabled, we need to update the toplevel transaction
                               3380                 :                :  * counters instead - we don't really care about subtransactions as we
                               3381                 :                :  * can't stream them individually anyway, and we only pick toplevel
                               3382                 :                :  * transactions for eviction. So only toplevel transactions matter.
                               3383                 :                :  */
                               3384                 :                : static void
 2173 akapila@postgresql.o     3385                 :        1746806 : ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb,
                               3386                 :                :                                 ReorderBufferChange *change,
                               3387                 :                :                                 ReorderBufferTXN *txn,
                               3388                 :                :                                 bool addition, Size sz)
                               3389                 :                : {
                               3390                 :                :     ReorderBufferTXN *toptxn;
                               3391                 :                : 
  573 msawada@postgresql.o     3392   [ +  +  -  + ]:        1746806 :     Assert(txn || change);
                               3393                 :                : 
                               3394                 :                :     /*
                               3395                 :                :      * Ignore tuple CID changes, because those are not evicted when reaching
                               3396                 :                :      * the memory limit. We don't count them, because doing so might easily
                               3397                 :                :      * trigger a pointless attempt to spill.
                               3398                 :                :      */
                               3399   [ +  +  +  + ]:        1746806 :     if (change && change->action == REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID)
 2173 akapila@postgresql.o     3400                 :          24334 :         return;
                               3401                 :                : 
  573 msawada@postgresql.o     3402         [ +  + ]:        1722472 :     if (sz == 0)
                               3403                 :            972 :         return;
                               3404                 :                : 
                               3405         [ +  + ]:        1721500 :     if (txn == NULL)
                               3406                 :        1713917 :         txn = change->txn;
                               3407         [ -  + ]:        1721500 :     Assert(txn != NULL);
                               3408                 :                : 
                               3409                 :                :     /*
                               3410                 :                :      * Update the total size in the top-level transaction as well. This is
                               3411                 :                :      * later used to compute the decoding stats.
                               3412                 :                :      */
  956 akapila@postgresql.o     3413         [ +  + ]:        1721500 :     toptxn = rbtxn_get_toptxn(txn);
                               3414                 :                : 
 2173                          3415         [ +  + ]:        1721500 :     if (addition)
                               3416                 :                :     {
  565 msawada@postgresql.o     3417                 :        1540437 :         Size        oldsize = txn->size;
                               3418                 :                : 
 1907 akapila@postgresql.o     3419                 :        1540437 :         txn->size += sz;
 2173                          3420                 :        1540437 :         rb->size += sz;
                               3421                 :                : 
                               3422                 :                :         /* Update the total size in the top transaction. */
 1639                          3423                 :        1540437 :         toptxn->total_size += sz;
                               3424                 :                : 
                               3425                 :                :         /* Update the max-heap */
  565 msawada@postgresql.o     3426         [ +  + ]:        1540437 :         if (oldsize != 0)
                               3427                 :        1532789 :             pairingheap_remove(rb->txn_heap, &txn->txn_node);
                               3428                 :        1540437 :         pairingheap_add(rb->txn_heap, &txn->txn_node);
                               3429                 :                :     }
                               3430                 :                :     else
                               3431                 :                :     {
 1907 akapila@postgresql.o     3432   [ +  -  -  + ]:         181063 :         Assert((rb->size >= sz) && (txn->size >= sz));
                               3433                 :         181063 :         txn->size -= sz;
 2173                          3434                 :         181063 :         rb->size -= sz;
                               3435                 :                : 
                               3436                 :                :         /* Update the total size in the top transaction. */
 1639                          3437                 :         181063 :         toptxn->total_size -= sz;
                               3438                 :                : 
                               3439                 :                :         /* Update the max-heap */
  565 msawada@postgresql.o     3440                 :         181063 :         pairingheap_remove(rb->txn_heap, &txn->txn_node);
                               3441         [ +  + ]:         181063 :         if (txn->size != 0)
                               3442                 :         173452 :             pairingheap_add(rb->txn_heap, &txn->txn_node);
                               3443                 :                :     }
                               3444                 :                : 
 1907 akapila@postgresql.o     3445         [ -  + ]:        1721500 :     Assert(txn->size <= rb->size);
                               3446                 :                : }
                               3447                 :                : 
                               3448                 :                : /*
                               3449                 :                :  * Add new (relfilelocator, tid) -> (cmin, cmax) mappings.
                               3450                 :                :  *
                               3451                 :                :  * We do not include this change type in memory accounting, because we
                               3452                 :                :  * keep CIDs in a separate list and do not evict them when reaching
                               3453                 :                :  * the memory limit.
                               3454                 :                :  */
                               3455                 :                : void
 4257 rhaas@postgresql.org     3456                 :          24452 : ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid,
                               3457                 :                :                              XLogRecPtr lsn, RelFileLocator locator,
                               3458                 :                :                              ItemPointerData tid, CommandId cmin,
                               3459                 :                :                              CommandId cmax, CommandId combocid)
                               3460                 :                : {
  230 heikki.linnakangas@i     3461                 :          24452 :     ReorderBufferChange *change = ReorderBufferAllocChange(rb);
                               3462                 :                :     ReorderBufferTXN *txn;
                               3463                 :                : 
 4257 rhaas@postgresql.org     3464                 :          24452 :     txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
                               3465                 :                : 
 1210                          3466                 :          24452 :     change->data.tuplecid.locator = locator;
 4253 tgl@sss.pgh.pa.us        3467                 :          24452 :     change->data.tuplecid.tid = tid;
                               3468                 :          24452 :     change->data.tuplecid.cmin = cmin;
                               3469                 :          24452 :     change->data.tuplecid.cmax = cmax;
                               3470                 :          24452 :     change->data.tuplecid.combocid = combocid;
 4257 rhaas@postgresql.org     3471                 :          24452 :     change->lsn = lsn;
 2173 akapila@postgresql.o     3472                 :          24452 :     change->txn = txn;
 4253 tgl@sss.pgh.pa.us        3473                 :          24452 :     change->action = REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID;
                               3474                 :                : 
 4257 rhaas@postgresql.org     3475                 :          24452 :     dlist_push_tail(&txn->tuplecids, &change->node);
                               3476                 :          24452 :     txn->ntuplecids++;
                               3477                 :          24452 : }
                               3478                 :                : 
                               3479                 :                : /*
                               3480                 :                :  * Add new invalidation messages to the reorder buffer queue.
                               3481                 :                :  */
                               3482                 :                : static void
  134 msawada@postgresql.o     3483                 :           5242 : ReorderBufferQueueInvalidations(ReorderBuffer *rb, TransactionId xid,
                               3484                 :                :                                 XLogRecPtr lsn, Size nmsgs,
                               3485                 :                :                                 SharedInvalidationMessage *msgs)
                               3486                 :                : {
                               3487                 :                :     ReorderBufferChange *change;
                               3488                 :                : 
                               3489                 :           5242 :     change = ReorderBufferAllocChange(rb);
                               3490                 :           5242 :     change->action = REORDER_BUFFER_CHANGE_INVALIDATION;
                               3491                 :           5242 :     change->data.inval.ninvalidations = nmsgs;
                               3492                 :           5242 :     change->data.inval.invalidations = (SharedInvalidationMessage *)
                               3493                 :           5242 :         palloc(sizeof(SharedInvalidationMessage) * nmsgs);
                               3494                 :           5242 :     memcpy(change->data.inval.invalidations, msgs,
                               3495                 :                :            sizeof(SharedInvalidationMessage) * nmsgs);
                               3496                 :                : 
                               3497                 :           5242 :     ReorderBufferQueueChange(rb, xid, lsn, change, false);
                               3498                 :           5242 : }
                               3499                 :                : 
                               3500                 :                : /*
                               3501                 :                :  * A helper function for ReorderBufferAddInvalidations() and
                               3502                 :                :  * ReorderBufferAddDistributedInvalidations() to accumulate the invalidation
                               3503                 :                :  * messages into *invals_out.
                               3504                 :                :  */
                               3505                 :                : static void
                               3506                 :           5242 : ReorderBufferAccumulateInvalidations(SharedInvalidationMessage **invals_out,
                               3507                 :                :                                      uint32 *ninvals_out,
                               3508                 :                :                                      SharedInvalidationMessage *msgs_new,
                               3509                 :                :                                      Size nmsgs_new)
                               3510                 :                : {
                               3511         [ +  + ]:           5242 :     if (*ninvals_out == 0)
                               3512                 :                :     {
                               3513                 :           1296 :         *ninvals_out = nmsgs_new;
                               3514                 :           1296 :         *invals_out = (SharedInvalidationMessage *)
                               3515                 :           1296 :             palloc(sizeof(SharedInvalidationMessage) * nmsgs_new);
                               3516                 :           1296 :         memcpy(*invals_out, msgs_new, sizeof(SharedInvalidationMessage) * nmsgs_new);
                               3517                 :                :     }
                               3518                 :                :     else
                               3519                 :                :     {
                               3520                 :                :         /* Enlarge the array of inval messages */
                               3521                 :           3946 :         *invals_out = (SharedInvalidationMessage *)
                               3522                 :           3946 :             repalloc(*invals_out, sizeof(SharedInvalidationMessage) *
                               3523                 :           3946 :                      (*ninvals_out + nmsgs_new));
                               3524                 :           3946 :         memcpy(*invals_out + *ninvals_out, msgs_new,
                               3525                 :                :                nmsgs_new * sizeof(SharedInvalidationMessage));
                               3526                 :           3946 :         *ninvals_out += nmsgs_new;
                               3527                 :                :     }
                               3528                 :           5242 : }
                               3529                 :                : 
                               3530                 :                : /*
                               3531                 :                :  * Accumulate the invalidations for executing them later.
                               3532                 :                :  *
                               3533                 :                :  * This needs to be called for each XLOG_XACT_INVALIDATIONS message and
                               3534                 :                :  * accumulates all the invalidation messages in the toplevel transaction, if
                               3535                 :                :  * available, otherwise in the current transaction, as well as in the form of
                               3536                 :                :  * a change in the reorder buffer.  We need to record them as a change so
                               3537                 :                :  * that we can execute only the required invalidations instead of executing
                               3538                 :                :  * all the invalidations on each CommandId increment.  We also need to
                               3539                 :                :  * accumulate these in the txn buffer because in some cases where we skip
                               3540                 :                :  * processing the transaction (see ReorderBufferForget), we need to execute
                               3541                 :                :  * all the invalidations together.
                               3542                 :                :  */
                               3543                 :                : void
 4257 rhaas@postgresql.org     3544                 :           5214 : ReorderBufferAddInvalidations(ReorderBuffer *rb, TransactionId xid,
                               3545                 :                :                               XLogRecPtr lsn, Size nmsgs,
                               3546                 :                :                               SharedInvalidationMessage *msgs)
                               3547                 :                : {
                               3548                 :                :     ReorderBufferTXN *txn;
                               3549                 :                :     MemoryContext oldcontext;
                               3550                 :                : 
                               3551                 :           5214 :     txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
                               3552                 :                : 
 1839 akapila@postgresql.o     3553                 :           5214 :     oldcontext = MemoryContextSwitchTo(rb->context);
                               3554                 :                : 
                               3555                 :                :     /*
                               3556                 :                :      * Collect all the invalidations under the top transaction, if available,
                               3557                 :                :      * so that we can execute them all together.  See comments atop this
                               3558                 :                :      * function.
                               3559                 :                :      */
  956                          3560         [ +  + ]:           5214 :     txn = rbtxn_get_toptxn(txn);
                               3561                 :                : 
 4257 rhaas@postgresql.org     3562         [ -  + ]:           5214 :     Assert(nmsgs > 0);
                               3563                 :                : 
  134 msawada@postgresql.o     3564                 :           5214 :     ReorderBufferAccumulateInvalidations(&txn->invalidations,
                               3565                 :                :                                          &txn->ninvalidations,
                               3566                 :                :                                          msgs, nmsgs);
                               3567                 :                : 
                               3568                 :           5214 :     ReorderBufferQueueInvalidations(rb, xid, lsn, nmsgs, msgs);
                               3569                 :                : 
                               3570                 :           5214 :     MemoryContextSwitchTo(oldcontext);
                               3571                 :           5214 : }
                               3572                 :                : 
                               3573                 :                : /*
                               3574                 :                :  * Accumulate the invalidations distributed by other committed transactions
                               3575                 :                :  * for executing them later.
                               3576                 :                :  *
                               3577                 :                :  * This function is similar to ReorderBufferAddInvalidations() but stores
                               3578                 :                :  * the given inval messages in txn->invalidations_distributed, subject to
                               3579                 :                :  * an overflow check.
                               3580                 :                :  *
                               3581                 :                :  * This needs to be called by committed transactions to distribute their
                               3582                 :                :  * inval messages to in-progress transactions.
                               3583                 :                :  */
                               3584                 :                : void
                               3585                 :             28 : ReorderBufferAddDistributedInvalidations(ReorderBuffer *rb, TransactionId xid,
                               3586                 :                :                                          XLogRecPtr lsn, Size nmsgs,
                               3587                 :                :                                          SharedInvalidationMessage *msgs)
                               3588                 :                : {
                               3589                 :                :     ReorderBufferTXN *txn;
                               3590                 :                :     MemoryContext oldcontext;
                               3591                 :                : 
                               3592                 :             28 :     txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
                               3593                 :                : 
                               3594                 :             28 :     oldcontext = MemoryContextSwitchTo(rb->context);
                               3595                 :                : 
                               3596                 :                :     /*
                               3597                 :                :      * Collect all the invalidations under the top transaction, if available,
                               3598                 :                :      * so that we can execute them all together.  See comments atop
                               3599                 :                :      * ReorderBufferAddInvalidations.
                               3600                 :                :      */
                               3601         [ -  + ]:             28 :     txn = rbtxn_get_toptxn(txn);
                               3602                 :                : 
                               3603         [ -  + ]:             28 :     Assert(nmsgs > 0);
                               3604                 :                : 
                               3605         [ +  - ]:             28 :     if (!rbtxn_distr_inval_overflowed(txn))
                               3606                 :                :     {
                               3607                 :                :         /*
                               3608                 :                :          * Check that the transaction has enough space for storing distributed
                               3609                 :                :          * invalidation messages.
                               3610                 :                :          */
                               3611         [ -  + ]:             28 :         if (txn->ninvalidations_distributed + nmsgs >= MAX_DISTR_INVAL_MSG_PER_TXN)
                               3612                 :                :         {
                               3613                 :                :             /*
                               3614                 :                :              * Mark the distributed invalidations as overflowed and free up the
                               3615                 :                :              * messages accumulated so far.
                               3616                 :                :              */
  134 msawada@postgresql.o     3617                 :UBC           0 :             txn->txn_flags |= RBTXN_DISTR_INVAL_OVERFLOWED;
                               3618                 :                : 
                               3619         [ #  # ]:              0 :             if (txn->invalidations_distributed)
                               3620                 :                :             {
                               3621                 :              0 :                 pfree(txn->invalidations_distributed);
                               3622                 :              0 :                 txn->invalidations_distributed = NULL;
                               3623                 :              0 :                 txn->ninvalidations_distributed = 0;
                               3624                 :                :             }
                               3625                 :                :         }
                               3626                 :                :         else
  134 msawada@postgresql.o     3627                 :CBC          28 :             ReorderBufferAccumulateInvalidations(&txn->invalidations_distributed,
                               3628                 :                :                                                  &txn->ninvalidations_distributed,
                               3629                 :                :                                                  msgs, nmsgs);
                               3630                 :                :     }
                               3631                 :                : 
                               3632                 :                :     /* Queue the invalidation messages into the transaction */
                               3633                 :             28 :     ReorderBufferQueueInvalidations(rb, xid, lsn, nmsgs, msgs);
                               3634                 :                : 
 1839 akapila@postgresql.o     3635                 :             28 :     MemoryContextSwitchTo(oldcontext);
 4257 rhaas@postgresql.org     3636                 :             28 : }
                               3637                 :                : 
                               3638                 :                : /*
                               3639                 :                :  * Apply all invalidations we know. Possibly we only need parts at this point
                               3640                 :                :  * in the changestream but we don't know which those are.
                               3641                 :                :  */
                               3642                 :                : static void
 1839 akapila@postgresql.o     3643                 :           6849 : ReorderBufferExecuteInvalidations(uint32 nmsgs, SharedInvalidationMessage *msgs)
                               3644                 :                : {
                               3645                 :                :     int         i;
                               3646                 :                : 
                               3647         [ +  + ]:          49887 :     for (i = 0; i < nmsgs; i++)
                               3648                 :          43038 :         LocalExecuteInvalidationMessage(&msgs[i]);
 4257 rhaas@postgresql.org     3649                 :           6849 : }
                               3650                 :                : 
                               3651                 :                : /*
                               3652                 :                :  * Mark a transaction as containing catalog changes
                               3653                 :                :  */
                               3654                 :                : void
                               3655                 :          29699 : ReorderBufferXidSetCatalogChanges(ReorderBuffer *rb, TransactionId xid,
                               3656                 :                :                                   XLogRecPtr lsn)
                               3657                 :                : {
                               3658                 :                :     ReorderBufferTXN *txn;
                               3659                 :                : 
                               3660                 :          29699 :     txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
                               3661                 :                : 
 1174 akapila@postgresql.o     3662         [ +  + ]:          29699 :     if (!rbtxn_has_catalog_changes(txn))
                               3663                 :                :     {
                               3664                 :           1306 :         txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
 1091 drowley@postgresql.o     3665                 :           1306 :         dclist_push_tail(&rb->catchange_txns, &txn->catchange_node);
                               3666                 :                :     }
                               3667                 :                : 
                               3668                 :                :     /*
                               3669                 :                :      * Mark the top-level transaction as having catalog changes too if one of
                               3670                 :                :      * its children has them, so that ReorderBufferBuildTupleCidHash can
                               3671                 :                :      * conveniently check just the top-level transaction and decide whether to
                               3672                 :                :      * build the hash table or not.
                               3673                 :                :      */
  956 akapila@postgresql.o     3674         [ +  + ]:          29699 :     if (rbtxn_is_subtxn(txn))
                               3675                 :                :     {
                               3676         [ +  - ]:            896 :         ReorderBufferTXN *toptxn = rbtxn_get_toptxn(txn);
                               3677                 :                : 
                               3678         [ +  + ]:            896 :         if (!rbtxn_has_catalog_changes(toptxn))
                               3679                 :                :         {
                               3680                 :             20 :             toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
                               3681                 :             20 :             dclist_push_tail(&rb->catchange_txns, &toptxn->catchange_node);
                               3682                 :                :         }
                               3683                 :                :     }
 1174                          3684                 :          29699 : }
                               3685                 :                : 
                               3686                 :                : /*
                               3687                 :                :  * Return palloc'ed array of the transactions that have changed catalogs.
                               3688                 :                :  * The returned array is sorted in xidComparator order.
                               3689                 :                :  *
                               3690                 :                :  * The caller must free the returned array when done with it.
                               3691                 :                :  */
                               3692                 :                : TransactionId *
                               3693                 :            291 : ReorderBufferGetCatalogChangesXacts(ReorderBuffer *rb)
                               3694                 :                : {
                               3695                 :                :     dlist_iter  iter;
                               3696                 :            291 :     TransactionId *xids = NULL;
                               3697                 :            291 :     size_t      xcnt = 0;
                               3698                 :                : 
                               3699                 :                :     /* Quick return if the list is empty */
 1091 drowley@postgresql.o     3700         [ +  + ]:            291 :     if (dclist_count(&rb->catchange_txns) == 0)
 1174 akapila@postgresql.o     3701                 :            283 :         return NULL;
                               3702                 :                : 
                               3703                 :                :     /* Initialize XID array */
 1091 drowley@postgresql.o     3704                 :              8 :     xids = (TransactionId *) palloc(sizeof(TransactionId) *
                               3705                 :              8 :                                     dclist_count(&rb->catchange_txns));
                               3706   [ +  -  +  + ]:             19 :     dclist_foreach(iter, &rb->catchange_txns)
                               3707                 :                :     {
                               3708                 :             11 :         ReorderBufferTXN *txn = dclist_container(ReorderBufferTXN,
                               3709                 :                :                                                  catchange_node,
                               3710                 :                :                                                  iter.cur);
                               3711                 :                : 
 1174 akapila@postgresql.o     3712         [ -  + ]:             11 :         Assert(rbtxn_has_catalog_changes(txn));
                               3713                 :                : 
                               3714                 :             11 :         xids[xcnt++] = txn->xid;
                               3715                 :                :     }
                               3716                 :                : 
                               3717                 :              8 :     qsort(xids, xcnt, sizeof(TransactionId), xidComparator);
                               3718                 :                : 
 1091 drowley@postgresql.o     3719         [ -  + ]:              8 :     Assert(xcnt == dclist_count(&rb->catchange_txns));
 1174 akapila@postgresql.o     3720                 :              8 :     return xids;
                               3721                 :                : }
                               3722                 :                : 
                               3723                 :                : /*
                               3724                 :                :  * Query whether a transaction is already *known* to contain catalog
                               3725                 :                :  * changes. This can be wrong until directly before the commit!
                               3726                 :                :  */
                               3727                 :                : bool
 4257 rhaas@postgresql.org     3728                 :           4389 : ReorderBufferXidHasCatalogChanges(ReorderBuffer *rb, TransactionId xid)
                               3729                 :                : {
                               3730                 :                :     ReorderBufferTXN *txn;
                               3731                 :                : 
                               3732                 :           4389 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               3733                 :                :                                 false);
                               3734         [ +  + ]:           4389 :     if (txn == NULL)
                               3735                 :            662 :         return false;
                               3736                 :                : 
 2118 alvherre@alvh.no-ip.     3737                 :           3727 :     return rbtxn_has_catalog_changes(txn);
                               3738                 :                : }
                               3739                 :                : 
                               3740                 :                : /*
                               3741                 :                :  * ReorderBufferXidHasBaseSnapshot
                               3742                 :                :  *      Have we already set the base snapshot for the given txn/subtxn?
                               3743                 :                :  */
                               3744                 :                : bool
 4257 rhaas@postgresql.org     3745                 :        1357123 : ReorderBufferXidHasBaseSnapshot(ReorderBuffer *rb, TransactionId xid)
                               3746                 :                : {
                               3747                 :                :     ReorderBufferTXN *txn;
                               3748                 :                : 
 2681 alvherre@alvh.no-ip.     3749                 :        1357123 :     txn = ReorderBufferTXNByXid(rb, xid, false,
                               3750                 :                :                                 NULL, InvalidXLogRecPtr, false);
                               3751                 :                : 
                               3752                 :                :     /* transaction isn't known yet, ergo no snapshot */
 4257 rhaas@postgresql.org     3753         [ +  + ]:        1357123 :     if (txn == NULL)
                               3754                 :              3 :         return false;
                               3755                 :                : 
                               3756                 :                :     /* a known subtxn? operate on top-level txn instead */
 2118 alvherre@alvh.no-ip.     3757         [ +  + ]:        1357120 :     if (rbtxn_is_known_subxact(txn))
 2681                          3758                 :         382026 :         txn = ReorderBufferTXNByXid(rb, txn->toplevel_xid, false,
                               3759                 :                :                                     NULL, InvalidXLogRecPtr, false);
                               3760                 :                : 
 4257 rhaas@postgresql.org     3761                 :        1357120 :     return txn->base_snapshot != NULL;
                               3762                 :                : }
                               3763                 :                : 
                               3764                 :                : 
                               3765                 :                : /*
                               3766                 :                :  * ---------------------------------------
                               3767                 :                :  * Disk serialization support
                               3768                 :                :  * ---------------------------------------
                               3769                 :                :  */
                               3770                 :                : 
                               3771                 :                : /*
                               3772                 :                :  * Ensure the IO buffer is >= sz.
                               3773                 :                :  */
                               3774                 :                : static void
                               3775                 :        2599235 : ReorderBufferSerializeReserve(ReorderBuffer *rb, Size sz)
                               3776                 :                : {
                               3777         [ +  + ]:        2599235 :     if (!rb->outbufsize)
                               3778                 :                :     {
                               3779                 :             46 :         rb->outbuf = MemoryContextAlloc(rb->context, sz);
                               3780                 :             46 :         rb->outbufsize = sz;
                               3781                 :                :     }
                               3782         [ +  + ]:        2599189 :     else if (rb->outbufsize < sz)
                               3783                 :                :     {
                               3784                 :            264 :         rb->outbuf = repalloc(rb->outbuf, sz);
                               3785                 :            264 :         rb->outbufsize = sz;
                               3786                 :                :     }
                               3787                 :        2599235 : }
                               3788                 :                : 
                               3789                 :                : 
                               3790                 :                : /* Compare two transactions by size */
                               3791                 :                : static int
  565 msawada@postgresql.o     3792                 :         287597 : ReorderBufferTXNSizeCompare(const pairingheap_node *a, const pairingheap_node *b, void *arg)
                               3793                 :                : {
                               3794                 :         287597 :     const ReorderBufferTXN *ta = pairingheap_const_container(ReorderBufferTXN, txn_node, a);
                               3795                 :         287597 :     const ReorderBufferTXN *tb = pairingheap_const_container(ReorderBufferTXN, txn_node, b);
                               3796                 :                : 
  573                          3797         [ +  + ]:         287597 :     if (ta->size < tb->size)
                               3798                 :         200672 :         return -1;
                               3799         [ +  + ]:          86925 :     if (ta->size > tb->size)
                               3800                 :          86093 :         return 1;
                               3801                 :            832 :     return 0;
                               3802                 :                : }
                               3803                 :                : 
                               3804                 :                : /*
                               3805                 :                :  * Find the largest transaction (toplevel or subxact) to evict (spill to disk).
                               3806                 :                :  */
                               3807                 :                : static ReorderBufferTXN *
                               3808                 :           3323 : ReorderBufferLargestTXN(ReorderBuffer *rb)
                               3809                 :                : {
                               3810                 :                :     ReorderBufferTXN *largest;
                               3811                 :                : 
                               3812                 :                :     /* Get the largest transaction from the max-heap */
  565                          3813                 :           3323 :     largest = pairingheap_container(ReorderBufferTXN, txn_node,
                               3814                 :                :                                     pairingheap_first(rb->txn_heap));
                               3815                 :                : 
 2173 akapila@postgresql.o     3816         [ -  + ]:           3323 :     Assert(largest);
                               3817         [ -  + ]:           3323 :     Assert(largest->size > 0);
                               3818         [ -  + ]:           3323 :     Assert(largest->size <= rb->size);
                               3819                 :                : 
                               3820                 :           3323 :     return largest;
                               3821                 :                : }
                               3822                 :                : 
                               3823                 :                : /*
                               3824                 :                :  * Find the largest streamable (and non-aborted) toplevel transaction to evict
                               3825                 :                :  * (by streaming).
                               3826                 :                :  *
                               3827                 :                :  * This can be seen as an optimized version of ReorderBufferLargestTXN, which
                               3828                 :                :  * should give us the same transaction (because we don't update the memory
                               3829                 :                :  * accounting for subtransactions when streaming, so their size is always 0).
                               3830                 :                :  * But we can simply iterate over the limited number of toplevel transactions
                               3831                 :                :  * that have a base snapshot. There is no point in selecting a transaction
                               3832                 :                :  * without a base snapshot because we don't decode such transactions.  Also,
                               3833                 :                :  * we do not select a transaction that doesn't have any streamable change.
                               3834                 :                :  *
                               3835                 :                :  * Note that we skip transactions that contain incomplete changes. There is
                               3836                 :                :  * scope for optimization here: we could select the largest transaction even
                               3837                 :                :  * if it has incomplete changes.  But that would make the code and design
                               3838                 :                :  * quite complex, which might not be worth the benefit.  If we plan to
                               3839                 :                :  * stream the transactions that contain incomplete changes then we need to
                               3840                 :                :  * find a way to partially stream/truncate the transaction changes in-memory
                               3841                 :                :  * and build a mechanism to partially truncate the spilled files.
                               3842                 :                :  * Additionally, whenever we partially stream the transaction we need to
                               3843                 :                :  * maintain the last streamed lsn and next time we need to restore from that
                               3844                 :                :  * segment and the offset in WAL.  As we stream the changes from the top
                               3845                 :                :  * transaction and restore them subtransaction wise, we need to even remember
                               3846                 :                :  * the subxact from where we streamed the last change.
                               3847                 :                :  */
                               3848                 :                : static ReorderBufferTXN *
 1055                          3849                 :            828 : ReorderBufferLargestStreamableTopTXN(ReorderBuffer *rb)
                               3850                 :                : {
                               3851                 :                :     dlist_iter  iter;
 1907                          3852                 :            828 :     Size        largest_size = 0;
                               3853                 :            828 :     ReorderBufferTXN *largest = NULL;
                               3854                 :                : 
                               3855                 :                :     /* Find the largest top-level transaction having a base snapshot. */
 1642                          3856   [ +  -  +  + ]:           1768 :     dlist_foreach(iter, &rb->txns_by_base_snapshot_lsn)
                               3857                 :                :     {
                               3858                 :                :         ReorderBufferTXN *txn;
                               3859                 :                : 
                               3860                 :            940 :         txn = dlist_container(ReorderBufferTXN, base_snapshot_node, iter.cur);
                               3861                 :                : 
                               3862                 :                :         /* must not be a subtxn */
                               3863         [ -  + ]:            940 :         Assert(!rbtxn_is_known_subxact(txn));
                               3864                 :                :         /* base_snapshot must be set */
                               3865         [ -  + ]:            940 :         Assert(txn->base_snapshot != NULL);
                               3866                 :                : 
                               3867                 :                :         /* Don't consider these kinds of transactions for eviction. */
  258 msawada@postgresql.o     3868         [ +  + ]:            940 :         if (rbtxn_has_partial_change(txn) ||
                               3869         [ +  + ]:            793 :             !rbtxn_has_streamable_change(txn) ||
                               3870         [ -  + ]:            763 :             rbtxn_is_aborted(txn))
                               3871                 :            177 :             continue;
                               3872                 :                : 
                               3873                 :                :         /* Find the largest of the eviction candidates. */
 1642 akapila@postgresql.o     3874   [ +  +  +  - ]:            763 :         if ((largest == NULL || txn->total_size > largest_size) &&
  258 msawada@postgresql.o     3875         [ +  + ]:            763 :             (txn->total_size > 0))
                               3876                 :                :         {
 1907 akapila@postgresql.o     3877                 :            717 :             largest = txn;
                               3878                 :            717 :             largest_size = txn->total_size;
                               3879                 :                :         }
                               3880                 :                :     }
                               3881                 :                : 
                               3882                 :            828 :     return largest;
                               3883                 :                : }
                               3884                 :                : 
                               3885                 :                : /*
                               3886                 :                :  * Check whether the logical_decoding_work_mem limit was reached, and if so,
                               3887                 :                :  * evict the largest (sub)transaction, one at a time, by spilling its changes to
                               3888                 :                :  * disk or streaming them to the output plugin until we are under the limit.
                               3889                 :                :  *
                               3890                 :                :  * If debug_logical_replication_streaming is set to "immediate", stream or
                               3891                 :                :  * serialize the changes immediately.
                               3892                 :                :  *
                               3893                 :                :  * XXX At this point we select transactions until we are under the memory
                               3894                 :                :  * limit, but we might also adopt a more elaborate eviction strategy - for example
                               3895                 :                :  * evicting enough transactions to free a certain fraction (e.g. 50%) of the memory
                               3896                 :                :  * limit.
                               3897                 :                :  */
                               3898                 :                : static void
 2173                          3899                 :        1366970 : ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
                               3900                 :                : {
                               3901                 :                :     ReorderBufferTXN *txn;
   20 msawada@postgresql.o     3902                 :GNC     1366970 :     bool        update_stats = true;
                               3903                 :                : 
                               3904         [ +  + ]:        1366970 :     if (rb->size >= logical_decoding_work_mem * (Size) 1024)
                               3905                 :                :     {
                               3906                 :                :         /*
                               3907                 :                :          * Update the statistics as the memory usage has reached the limit. We
                               3908                 :                :          * report the statistics update later in this function since we can
                               3909                 :                :          * update the slot statistics all together while streaming or
                               3910                 :                :          * serializing transactions in most cases.
                               3911                 :                :          */
                               3912                 :           3012 :         rb->memExceededCount += 1;
                               3913                 :                :     }
                               3914         [ +  + ]:        1363958 :     else if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_BUFFERED)
                               3915                 :                :     {
                               3916                 :                :         /*
                               3917                 :                :          * Bail out if debug_logical_replication_streaming is buffered and we
                               3918                 :                :          * haven't exceeded the memory limit.
                               3919                 :                :          */
 2173 akapila@postgresql.o     3920                 :CBC     1362993 :         return;
                               3921                 :                :     }
                               3922                 :                : 
                               3923                 :                :     /*
                               3924                 :                :      * If debug_logical_replication_streaming is immediate, loop until there's
                               3925                 :                :      * no change. Otherwise, loop until we are under the memory limit. One
                               3926                 :                :      * might think that just by evicting the largest (sub)transaction we will
                               3927                 :                :      * come under the memory limit based on the assumption that the selected
                               3928                 :                :      * transaction is at least as large as the most recent change (which
                               3929                 :                :      * caused us to go over the memory limit). However, that is not true
                               3930                 :                :      * because a user can reduce the logical_decoding_work_mem to a smaller
                               3931                 :                :      * value before the most recent change.
                               3932                 :                :      */
  270 tgl@sss.pgh.pa.us        3933         [ +  + ]:           7951 :     while (rb->size >= logical_decoding_work_mem * (Size) 1024 ||
  791 peter@eisentraut.org     3934         [ +  + ]:           4939 :            (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE &&
 1037 akapila@postgresql.o     3935         [ +  + ]:           1927 :             rb->size > 0))
                               3936                 :                :     {
                               3937                 :                :         /*
                               3938                 :                :          * Pick the largest non-aborted transaction and evict it from memory
                               3939                 :                :          * by streaming, if possible.  Otherwise, spill to disk.
                               3940                 :                :          */
 1907                          3941   [ +  +  +  + ]:           4802 :         if (ReorderBufferCanStartStreaming(rb) &&
 1055                          3942                 :            828 :             (txn = ReorderBufferLargestStreamableTopTXN(rb)) != NULL)
                               3943                 :                :         {
                               3944                 :                :             /* we know there has to be one, because the size is not zero */
  956                          3945   [ +  -  -  + ]:            651 :             Assert(txn && rbtxn_is_toptxn(txn));
 1907                          3946         [ -  + ]:            651 :             Assert(txn->total_size > 0);
                               3947         [ -  + ]:            651 :             Assert(rb->size >= txn->total_size);
                               3948                 :                : 
                               3949                 :                :             /* skip the transaction if aborted */
  258 msawada@postgresql.o     3950         [ -  + ]:            651 :             if (ReorderBufferCheckAndTruncateAbortedTXN(rb, txn))
  258 msawada@postgresql.o     3951                 :UBC           0 :                 continue;
                               3952                 :                : 
 1907 akapila@postgresql.o     3953                 :CBC         651 :             ReorderBufferStreamTXN(rb, txn);
                               3954                 :                :         }
                               3955                 :                :         else
                               3956                 :                :         {
                               3957                 :                :             /*
                               3958                 :                :              * Pick the largest transaction (or subtransaction) and evict it
                               3959                 :                :              * from memory by serializing it to disk.
                               3960                 :                :              */
                               3961                 :           3323 :             txn = ReorderBufferLargestTXN(rb);
                               3962                 :                : 
                               3963                 :                :             /* we know there has to be one, because the size is not zero */
                               3964         [ -  + ]:           3323 :             Assert(txn);
                               3965         [ -  + ]:           3323 :             Assert(txn->size > 0);
                               3966         [ -  + ]:           3323 :             Assert(rb->size >= txn->size);
                               3967                 :                : 
                               3968                 :                :             /* skip the transaction if aborted */
  258 msawada@postgresql.o     3969         [ +  + ]:           3323 :             if (ReorderBufferCheckAndTruncateAbortedTXN(rb, txn))
                               3970                 :              9 :                 continue;
                               3971                 :                : 
 1907 akapila@postgresql.o     3972                 :           3314 :             ReorderBufferSerializeTXN(rb, txn);
                               3973                 :                :         }
                               3974                 :                : 
                               3975                 :                :         /*
                               3976                 :                :          * After eviction, the transaction should have no entries in memory,
                               3977                 :                :          * and should use 0 bytes for changes.
                               3978                 :                :          */
 1966                          3979         [ -  + ]:           3965 :         Assert(txn->size == 0);
                               3980         [ -  + ]:           3965 :         Assert(txn->nentries_mem == 0);
                               3981                 :                : 
                               3982                 :                :         /*
                               3983                 :                :          * We've reported the memExceededCount update while streaming or
                               3984                 :                :          * serializing the transaction.
                               3985                 :                :          */
   20 msawada@postgresql.o     3986                 :GNC        3965 :         update_stats = false;
                               3987                 :                :     }
                               3988                 :                : 
                               3989         [ +  + ]:           3977 :     if (update_stats)
                               3990                 :             12 :         UpdateDecodingStats((LogicalDecodingContext *) rb->private_data);
                               3991                 :                : 
                               3992                 :                :     /* We must be under the memory limit now. */
  270 tgl@sss.pgh.pa.us        3993         [ -  + ]:CBC        3977 :     Assert(rb->size < logical_decoding_work_mem * (Size) 1024);
                               3994                 :                : }
                               3995                 :                : 
                               3996                 :                : /*
                               3997                 :                :  * Spill data of a large transaction (and its subtransactions) to disk.
                               3998                 :                :  */
                               3999                 :                : static void
 4257 rhaas@postgresql.org     4000                 :           3574 : ReorderBufferSerializeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               4001                 :                : {
                               4002                 :                :     dlist_iter  subtxn_i;
                               4003                 :                :     dlist_mutable_iter change_i;
                               4004                 :           3574 :     int         fd = -1;
                               4005                 :           3574 :     XLogSegNo   curOpenSegNo = 0;
                               4006                 :           3574 :     Size        spilled = 0;
 1846 akapila@postgresql.o     4007                 :           3574 :     Size        size = txn->size;
                               4008                 :                : 
 4199 tgl@sss.pgh.pa.us        4009         [ -  + ]:           3574 :     elog(DEBUG2, "spill %u changes in XID %u to disk",
                               4010                 :                :          (uint32) txn->nentries_mem, txn->xid);
                               4011                 :                : 
                               4012                 :                :     /* do the same to all child TXs */
 4257 rhaas@postgresql.org     4013   [ +  -  +  + ]:           3795 :     dlist_foreach(subtxn_i, &txn->subtxns)
                               4014                 :                :     {
                               4015                 :                :         ReorderBufferTXN *subtxn;
                               4016                 :                : 
                               4017                 :            221 :         subtxn = dlist_container(ReorderBufferTXN, node, subtxn_i.cur);
                               4018                 :            221 :         ReorderBufferSerializeTXN(rb, subtxn);
                               4019                 :                :     }
                               4020                 :                : 
                               4021                 :                :     /* serialize changestream */
                               4022   [ +  -  +  + ]:        1138605 :     dlist_foreach_modify(change_i, &txn->changes)
                               4023                 :                :     {
                               4024                 :                :         ReorderBufferChange *change;
                               4025                 :                : 
                               4026                 :        1135031 :         change = dlist_container(ReorderBufferChange, node, change_i.cur);
                               4027                 :                : 
                               4028                 :                :         /*
                               4029                 :                :          * store in the segment to which it belongs by start lsn; don't split
                               4030                 :                :          * over multiple segments, though
                               4031                 :                :          */
 2961 andres@anarazel.de       4032         [ +  + ]:        1135031 :         if (fd == -1 ||
                               4033         [ +  + ]:        1131665 :             !XLByteInSeg(change->lsn, curOpenSegNo, wal_segment_size))
                               4034                 :                :         {
                               4035                 :                :             char        path[MAXPGPATH];
                               4036                 :                : 
 4257 rhaas@postgresql.org     4037         [ +  + ]:           3384 :             if (fd != -1)
                               4038                 :             18 :                 CloseTransientFile(fd);
                               4039                 :                : 
 2961 andres@anarazel.de       4040                 :           3384 :             XLByteToSeg(change->lsn, curOpenSegNo, wal_segment_size);
                               4041                 :                : 
                               4042                 :                :             /*
                               4043                 :                :              * No need to care about TLIs here, only used during a single run,
                               4044                 :                :              * so each LSN only maps to a specific WAL record.
                               4045                 :                :              */
 2793 alvherre@alvh.no-ip.     4046                 :           3384 :             ReorderBufferSerializedPath(path, MyReplicationSlot, txn->xid,
                               4047                 :                :                                         curOpenSegNo);
                               4048                 :                : 
                               4049                 :                :             /* open segment, create it if necessary */
 4257 rhaas@postgresql.org     4050                 :           3384 :             fd = OpenTransientFile(path,
                               4051                 :                :                                    O_CREAT | O_WRONLY | O_APPEND | PG_BINARY);
                               4052                 :                : 
                               4053         [ -  + ]:           3384 :             if (fd < 0)
 4257 rhaas@postgresql.org     4054         [ #  # ]:UBC           0 :                 ereport(ERROR,
                               4055                 :                :                         (errcode_for_file_access(),
                               4056                 :                :                          errmsg("could not open file \"%s\": %m", path)));
                               4057                 :                :         }
                               4058                 :                : 
 4257 rhaas@postgresql.org     4059                 :CBC     1135031 :         ReorderBufferSerializeChange(rb, txn, fd, change);
                               4060                 :        1135031 :         dlist_delete(&change->node);
  230 heikki.linnakangas@i     4061                 :        1135031 :         ReorderBufferFreeChange(rb, change, false);
                               4062                 :                : 
 4257 rhaas@postgresql.org     4063                 :        1135031 :         spilled++;
                               4064                 :                :     }
                               4065                 :                : 
                               4066                 :                :     /* Update the memory counter */
  573 msawada@postgresql.o     4067                 :           3574 :     ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, size);
                               4068                 :                : 
                               4069                 :                :     /* update the statistics iff we have spilled anything */
 1846 akapila@postgresql.o     4070         [ +  + ]:           3574 :     if (spilled)
                               4071                 :                :     {
                               4072                 :           3366 :         rb->spillCount += 1;
                               4073                 :           3366 :         rb->spillBytes += size;
                               4074                 :                : 
                               4075                 :                :         /* don't consider already serialized transactions */
                               4076   [ +  +  +  - ]:           3366 :         rb->spillTxns += (rbtxn_is_serialized(txn) || rbtxn_is_serialized_clear(txn)) ? 0 : 1;
                               4077                 :                : 
                               4078                 :                :         /* update the decoding stats */
 1636                          4079                 :           3366 :         UpdateDecodingStats((LogicalDecodingContext *) rb->private_data);
                               4080                 :                :     }
                               4081                 :                : 
 4257 rhaas@postgresql.org     4082         [ -  + ]:           3574 :     Assert(spilled == txn->nentries_mem);
                               4083         [ -  + ]:           3574 :     Assert(dlist_is_empty(&txn->changes));
                               4084                 :           3574 :     txn->nentries_mem = 0;
 2118 alvherre@alvh.no-ip.     4085                 :           3574 :     txn->txn_flags |= RBTXN_IS_SERIALIZED;
                               4086                 :                : 
 4257 rhaas@postgresql.org     4087         [ +  + ]:           3574 :     if (fd != -1)
                               4088                 :           3366 :         CloseTransientFile(fd);
                               4089                 :           3574 : }
                               4090                 :                : 
                               4091                 :                : /*
                               4092                 :                :  * Serialize individual change to disk.
                               4093                 :                :  */
                               4094                 :                : static void
                               4095                 :        1135031 : ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               4096                 :                :                              int fd, ReorderBufferChange *change)
                               4097                 :                : {
                               4098                 :                :     ReorderBufferDiskChange *ondisk;
                               4099                 :        1135031 :     Size        sz = sizeof(ReorderBufferDiskChange);
                               4100                 :                : 
                               4101                 :        1135031 :     ReorderBufferSerializeReserve(rb, sz);
                               4102                 :                : 
                               4103                 :        1135031 :     ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4104                 :        1135031 :     memcpy(&ondisk->change, change, sizeof(ReorderBufferChange));
                               4105                 :                : 
 4253 tgl@sss.pgh.pa.us        4106   [ +  +  +  +  :        1135031 :     switch (change->action)
                                           +  +  - ]
                               4107                 :                :     {
                               4108                 :                :             /* fall through these, they're all similar enough */
                               4109                 :        1117543 :         case REORDER_BUFFER_CHANGE_INSERT:
                               4110                 :                :         case REORDER_BUFFER_CHANGE_UPDATE:
                               4111                 :                :         case REORDER_BUFFER_CHANGE_DELETE:
                               4112                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT:
                               4113                 :                :             {
                               4114                 :                :                 char       *data;
                               4115                 :                :                 HeapTuple   oldtup,
                               4116                 :                :                             newtup;
 4257 rhaas@postgresql.org     4117                 :        1117543 :                 Size        oldlen = 0;
                               4118                 :        1117543 :                 Size        newlen = 0;
                               4119                 :                : 
 4253 tgl@sss.pgh.pa.us        4120                 :        1117543 :                 oldtup = change->data.tp.oldtuple;
                               4121                 :        1117543 :                 newtup = change->data.tp.newtuple;
                               4122                 :                : 
                               4123         [ +  + ]:        1117543 :                 if (oldtup)
                               4124                 :                :                 {
 3524 andres@anarazel.de       4125                 :          97001 :                     sz += sizeof(HeapTupleData);
  638 msawada@postgresql.o     4126                 :          97001 :                     oldlen = oldtup->t_len;
 3524 andres@anarazel.de       4127                 :          97001 :                     sz += oldlen;
                               4128                 :                :                 }
                               4129                 :                : 
 4253 tgl@sss.pgh.pa.us        4130         [ +  + ]:        1117543 :                 if (newtup)
                               4131                 :                :                 {
 3524 andres@anarazel.de       4132                 :         966827 :                     sz += sizeof(HeapTupleData);
  638 msawada@postgresql.o     4133                 :         966827 :                     newlen = newtup->t_len;
 3524 andres@anarazel.de       4134                 :         966827 :                     sz += newlen;
                               4135                 :                :                 }
                               4136                 :                : 
                               4137                 :                :                 /* make sure we have enough space */
 4257 rhaas@postgresql.org     4138                 :        1117543 :                 ReorderBufferSerializeReserve(rb, sz);
                               4139                 :                : 
                               4140                 :        1117543 :                 data = ((char *) rb->outbuf) + sizeof(ReorderBufferDiskChange);
                               4141                 :                :                 /* might have been reallocated above */
                               4142                 :        1117543 :                 ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4143                 :                : 
                               4144         [ +  + ]:        1117543 :                 if (oldlen)
                               4145                 :                :                 {
  638 msawada@postgresql.o     4146                 :          97001 :                     memcpy(data, oldtup, sizeof(HeapTupleData));
 3524 andres@anarazel.de       4147                 :          97001 :                     data += sizeof(HeapTupleData);
                               4148                 :                : 
  638 msawada@postgresql.o     4149                 :          97001 :                     memcpy(data, oldtup->t_data, oldlen);
 4257 rhaas@postgresql.org     4150                 :          97001 :                     data += oldlen;
                               4151                 :                :                 }
                               4152                 :                : 
                               4153         [ +  + ]:        1117543 :                 if (newlen)
                               4154                 :                :                 {
  638 msawada@postgresql.o     4155                 :         966827 :                     memcpy(data, newtup, sizeof(HeapTupleData));
 3524 andres@anarazel.de       4156                 :         966827 :                     data += sizeof(HeapTupleData);
                               4157                 :                : 
  638 msawada@postgresql.o     4158                 :         966827 :                     memcpy(data, newtup->t_data, newlen);
 3522 andres@anarazel.de       4159                 :         966827 :                     data += newlen;
                               4160                 :                :                 }
 3492 simon@2ndQuadrant.co     4161                 :        1117543 :                 break;
                               4162                 :                :             }
                               4163                 :             13 :         case REORDER_BUFFER_CHANGE_MESSAGE:
                               4164                 :                :             {
                               4165                 :                :                 char       *data;
                               4166                 :             13 :                 Size        prefix_size = strlen(change->data.msg.prefix) + 1;
                               4167                 :                : 
                               4168                 :             13 :                 sz += prefix_size + change->data.msg.message_size +
                               4169                 :                :                     sizeof(Size) + sizeof(Size);
                               4170                 :             13 :                 ReorderBufferSerializeReserve(rb, sz);
                               4171                 :                : 
                               4172                 :             13 :                 data = ((char *) rb->outbuf) + sizeof(ReorderBufferDiskChange);
                               4173                 :                : 
                               4174                 :                :                 /* might have been reallocated above */
 3317 rhaas@postgresql.org     4175                 :             13 :                 ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4176                 :                : 
                               4177                 :                :                 /* write the prefix including the size */
 3492 simon@2ndQuadrant.co     4178                 :             13 :                 memcpy(data, &prefix_size, sizeof(Size));
                               4179                 :             13 :                 data += sizeof(Size);
                               4180                 :             13 :                 memcpy(data, change->data.msg.prefix,
                               4181                 :                :                        prefix_size);
                               4182                 :             13 :                 data += prefix_size;
                               4183                 :                : 
                               4184                 :                :                 /* write the message including the size */
                               4185                 :             13 :                 memcpy(data, &change->data.msg.message_size, sizeof(Size));
                               4186                 :             13 :                 data += sizeof(Size);
                               4187                 :             13 :                 memcpy(data, change->data.msg.message,
                               4188                 :                :                        change->data.msg.message_size);
                               4189                 :             13 :                 data += change->data.msg.message_size;
                               4190                 :                : 
 1839 akapila@postgresql.o     4191                 :             13 :                 break;
                               4192                 :                :             }
                               4193                 :            154 :         case REORDER_BUFFER_CHANGE_INVALIDATION:
                               4194                 :                :             {
                               4195                 :                :                 char       *data;
                               4196                 :            154 :                 Size        inval_size = sizeof(SharedInvalidationMessage) *
  893 tgl@sss.pgh.pa.us        4197                 :            154 :                     change->data.inval.ninvalidations;
                               4198                 :                : 
 1839 akapila@postgresql.o     4199                 :            154 :                 sz += inval_size;
                               4200                 :                : 
                               4201                 :            154 :                 ReorderBufferSerializeReserve(rb, sz);
                               4202                 :            154 :                 data = ((char *) rb->outbuf) + sizeof(ReorderBufferDiskChange);
                               4203                 :                : 
                               4204                 :                :                 /* might have been reallocated above */
                               4205                 :            154 :                 ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4206                 :            154 :                 memcpy(data, change->data.inval.invalidations, inval_size);
                               4207                 :            154 :                 data += inval_size;
                               4208                 :                : 
 4257 rhaas@postgresql.org     4209                 :            154 :                 break;
                               4210                 :                :             }
                               4211                 :              8 :         case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT:
                               4212                 :                :             {
                               4213                 :                :                 Snapshot    snap;
                               4214                 :                :                 char       *data;
                               4215                 :                : 
 4253 tgl@sss.pgh.pa.us        4216                 :              8 :                 snap = change->data.snapshot;
                               4217                 :                : 
 4257 rhaas@postgresql.org     4218                 :              8 :                 sz += sizeof(SnapshotData) +
 4253 tgl@sss.pgh.pa.us        4219                 :              8 :                     sizeof(TransactionId) * snap->xcnt +
 2111 alvherre@alvh.no-ip.     4220                 :              8 :                     sizeof(TransactionId) * snap->subxcnt;
                               4221                 :                : 
                               4222                 :                :                 /* make sure we have enough space */
 4257 rhaas@postgresql.org     4223                 :              8 :                 ReorderBufferSerializeReserve(rb, sz);
                               4224                 :              8 :                 data = ((char *) rb->outbuf) + sizeof(ReorderBufferDiskChange);
                               4225                 :                :                 /* might have been reallocated above */
                               4226                 :              8 :                 ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4227                 :                : 
 4253 tgl@sss.pgh.pa.us        4228                 :              8 :                 memcpy(data, snap, sizeof(SnapshotData));
 4257 rhaas@postgresql.org     4229                 :              8 :                 data += sizeof(SnapshotData);
                               4230                 :                : 
 4253 tgl@sss.pgh.pa.us        4231         [ +  - ]:              8 :                 if (snap->xcnt)
                               4232                 :                :                 {
                               4233                 :              8 :                     memcpy(data, snap->xip,
 4190 rhaas@postgresql.org     4234                 :              8 :                            sizeof(TransactionId) * snap->xcnt);
                               4235                 :              8 :                     data += sizeof(TransactionId) * snap->xcnt;
                               4236                 :                :                 }
                               4237                 :                : 
 4253 tgl@sss.pgh.pa.us        4238         [ -  + ]:              8 :                 if (snap->subxcnt)
                               4239                 :                :                 {
 4253 tgl@sss.pgh.pa.us        4240                 :UBC           0 :                     memcpy(data, snap->subxip,
 4190 rhaas@postgresql.org     4241                 :              0 :                            sizeof(TransactionId) * snap->subxcnt);
                               4242                 :              0 :                     data += sizeof(TransactionId) * snap->subxcnt;
                               4243                 :                :                 }
 4257 rhaas@postgresql.org     4244                 :CBC           8 :                 break;
                               4245                 :                :             }
 2761 peter_e@gmx.net          4246                 :              2 :         case REORDER_BUFFER_CHANGE_TRUNCATE:
                               4247                 :                :             {
                               4248                 :                :                 Size        size;
                               4249                 :                :                 char       *data;
                               4250                 :                : 
                               4251                 :                :                 /* account for the OIDs of truncated relations */
 2612 tomas.vondra@postgre     4252                 :              2 :                 size = sizeof(Oid) * change->data.truncate.nrelids;
                               4253                 :              2 :                 sz += size;
                               4254                 :                : 
                               4255                 :                :                 /* make sure we have enough space */
                               4256                 :              2 :                 ReorderBufferSerializeReserve(rb, sz);
                               4257                 :                : 
                               4258                 :              2 :                 data = ((char *) rb->outbuf) + sizeof(ReorderBufferDiskChange);
                               4259                 :                :                 /* might have been reallocated above */
                               4260                 :              2 :                 ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4261                 :                : 
                               4262                 :              2 :                 memcpy(data, change->data.truncate.relids, size);
                               4263                 :              2 :                 data += size;
                               4264                 :                : 
                               4265                 :              2 :                 break;
                               4266                 :                :             }
 3826 andres@anarazel.de       4267                 :          17311 :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
                               4268                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
                               4269                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
                               4270                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
                               4271                 :                :             /* ReorderBufferChange contains everything important */
 4257 rhaas@postgresql.org     4272                 :          17311 :             break;
                               4273                 :                :     }
                               4274                 :                : 
                               4275                 :        1135031 :     ondisk->size = sz;
                               4276                 :                : 
 2641 michael@paquier.xyz      4277                 :        1135031 :     errno = 0;
 3146 rhaas@postgresql.org     4278                 :        1135031 :     pgstat_report_wait_start(WAIT_EVENT_REORDER_BUFFER_WRITE);
 4257                          4279         [ -  + ]:        1135031 :     if (write(fd, rb->outbuf, ondisk->size) != ondisk->size)
                               4280                 :                :     {
 3349 tgl@sss.pgh.pa.us        4281                 :UBC           0 :         int         save_errno = errno;
                               4282                 :                : 
 4257 rhaas@postgresql.org     4283                 :              0 :         CloseTransientFile(fd);
                               4284                 :                : 
                               4285                 :                :         /* if write didn't set errno, assume problem is no disk space */
 2682 michael@paquier.xyz      4286         [ #  # ]:              0 :         errno = save_errno ? save_errno : ENOSPC;
 4257 rhaas@postgresql.org     4287         [ #  # ]:              0 :         ereport(ERROR,
                               4288                 :                :                 (errcode_for_file_access(),
                               4289                 :                :                  errmsg("could not write to data file for XID %u: %m",
                               4290                 :                :                         txn->xid)));
                               4291                 :                :     }
 3146 rhaas@postgresql.org     4292                 :CBC     1135031 :     pgstat_report_wait_end();
                               4293                 :                : 
                               4294                 :                :     /*
                               4295                 :                :      * Keep the transaction's final_lsn up to date with each change we send to
                               4296                 :                :      * disk, so that ReorderBufferRestoreCleanup works correctly.  (We used to
                               4297                 :                :      * only do this on commit and abort records, but that doesn't work if a
                               4298                 :                :      * system crash leaves a transaction without its abort record.)
                               4299                 :                :      *
                               4300                 :                :      * Make sure not to move it backwards.
                               4301                 :                :      */
 2111 alvherre@alvh.no-ip.     4302         [ +  + ]:        1135031 :     if (txn->final_lsn < change->lsn)
                               4303                 :        1130548 :         txn->final_lsn = change->lsn;
                               4304                 :                : 
 4253 tgl@sss.pgh.pa.us        4305         [ -  + ]:        1135031 :     Assert(ondisk->change.action == change->action);
 4257 rhaas@postgresql.org     4306                 :        1135031 : }
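
The write path above clears errno, issues a single write(), and treats a short write that left errno unset as an out-of-space condition. A minimal standalone sketch of that pattern follows. The helper name write_exact is hypothetical and not part of reorderbuffer.c, and it reports failure through a return code where the original raises an error with ereport().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * write exactly len bytes with one write() call.  Returns 0 on success,
 * -1 on failure with errno set.
 */
static int
write_exact(int fd, const void *buf, size_t len)
{
    ssize_t     written;

    assert(buf != NULL || len == 0);

    /* Clear errno so a short write that leaves it untouched is detectable. */
    errno = 0;
    written = write(fd, buf, len);

    if (written < 0 || (size_t) written != len)
    {
        /* If write() didn't set errno, assume the problem is no disk space. */
        if (errno == 0)
            errno = ENOSPC;
        return -1;
    }

    return 0;
}
```

A caller would check the return value and report errno; the sketch does not retry after a partial write, which matches the single-call approach in the listing.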
                               4307                 :                : 
                               4308                 :                : /* Returns true if the output plugin supports streaming, false otherwise. */
                               4309                 :                : static inline bool
 1907 akapila@postgresql.o     4310                 :        1688318 : ReorderBufferCanStream(ReorderBuffer *rb)
                               4311                 :                : {
                               4312                 :        1688318 :     LogicalDecodingContext *ctx = rb->private_data;
                               4313                 :                : 
                               4314                 :        1688318 :     return ctx->streaming;
                               4315                 :                : }
                               4316                 :                : 
                               4317                 :                : /* Returns true if streaming can be started now, false otherwise. */
                               4318                 :                : static inline bool
                               4319                 :         321348 : ReorderBufferCanStartStreaming(ReorderBuffer *rb)
                               4320                 :                : {
                               4321                 :         321348 :     LogicalDecodingContext *ctx = rb->private_data;
                               4322                 :         321348 :     SnapBuild  *builder = ctx->snapshot_builder;
                               4323                 :                : 
                               4324                 :                :     /* We can't start streaming unless a consistent state is reached. */
 1789                          4325         [ -  + ]:         321348 :     if (SnapBuildCurrentState(builder) < SNAPBUILD_CONSISTENT)
 1789 akapila@postgresql.o     4326                 :UBC           0 :         return false;
                               4327                 :                : 
                               4328                 :                :     /*
                               4329                 :                :      * Even if streaming is enabled, we can't start streaming immediately
                               4330                 :                :      * when we have already decoded this transaction and are now just
                               4331                 :                :      * restarting.
                               4332                 :                :      */
 1907 akapila@postgresql.o     4333         [ +  + ]:CBC      321348 :     if (ReorderBufferCanStream(rb) &&
 1055                          4334         [ +  + ]:         319090 :         !SnapBuildXactNeedsSkip(builder, ctx->reader->ReadRecPtr))
 1907                          4335                 :         177166 :         return true;
                               4336                 :                : 
                               4337                 :         144182 :     return false;
                               4338                 :                : }
                               4339                 :                : 
                               4340                 :                : /*
                               4341                 :                :  * Send data of a large transaction (and its subtransactions) to the
                               4342                 :                :  * output plugin, but using the stream API.
                               4343                 :                :  */
                               4344                 :                : static void
                               4345                 :            725 : ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               4346                 :                : {
                               4347                 :                :     Snapshot    snapshot_now;
                               4348                 :                :     CommandId   command_id;
                               4349                 :                :     Size        stream_bytes;
                               4350                 :                :     bool        txn_is_streamed;
                               4351                 :                : 
                               4352                 :                :     /* We can never reach here for a subtransaction. */
  956                          4353         [ -  + ]:            725 :     Assert(rbtxn_is_toptxn(txn));
                               4354                 :                : 
                               4355                 :                :     /*
                               4356                 :                :      * We can't make any assumptions about base snapshot here, similar to what
                               4357                 :                :      * ReorderBufferCommit() does. That relies on base_snapshot getting
                               4358                 :                :      * transferred from subxact in ReorderBufferCommitChild(), but that was
                               4359                 :                :      * not yet called as the transaction is in-progress.
                               4360                 :                :      *
                               4361                 :                :      * So just walk the subxacts and use the same logic here. But we only need
                               4362                 :                :      * to do that once, when the transaction is streamed for the first time.
                               4363                 :                :      * After that we need to reuse the snapshot from the previous run.
                               4364                 :                :      *
                               4365                 :                :      * Unlike DecodeCommit which adds xids of all the subtransactions in
                               4366                 :                :      * snapshot's xip array via SnapBuildCommitTxn, we can't do that here but
                               4367                 :                :      * we do add them to subxip array instead via ReorderBufferCopySnap. This
                               4368                 :                :      * allows the catalog changes made in subtransactions decoded till now to
                               4369                 :                :      * be visible.
                               4370                 :                :      */
 1907                          4371         [ +  + ]:            725 :     if (txn->snapshot_now == NULL)
                               4372                 :                :     {
                               4373                 :                :         dlist_iter  subxact_i;
                               4374                 :                : 
                               4375                 :                :         /* make sure this transaction is streamed for the first time */
                               4376         [ -  + ]:             72 :         Assert(!rbtxn_is_streamed(txn));
                               4377                 :                : 
                               4378                 :                :         /* at the beginning we should have invalid command ID */
                               4379         [ -  + ]:             72 :         Assert(txn->command_id == InvalidCommandId);
                               4380                 :                : 
                               4381   [ +  -  +  + ]:             76 :         dlist_foreach(subxact_i, &txn->subtxns)
                               4382                 :                :         {
                               4383                 :                :             ReorderBufferTXN *subtxn;
                               4384                 :                : 
                               4385                 :              4 :             subtxn = dlist_container(ReorderBufferTXN, node, subxact_i.cur);
                               4386                 :              4 :             ReorderBufferTransferSnapToParent(txn, subtxn);
                               4387                 :                :         }
                               4388                 :                : 
                               4389                 :                :         /*
                               4390                 :                :          * If this transaction has no snapshot, it didn't make any changes to
                               4391                 :                :          * the database till now, so there's nothing to decode.
                               4392                 :                :          */
                               4393         [ -  + ]:             72 :         if (txn->base_snapshot == NULL)
                               4394                 :                :         {
 1907 akapila@postgresql.o     4395         [ #  # ]:UBC           0 :             Assert(txn->ninvalidations == 0);
                               4396                 :              0 :             return;
                               4397                 :                :         }
                               4398                 :                : 
 1907 akapila@postgresql.o     4399                 :CBC          72 :         command_id = FirstCommandId;
                               4400                 :             72 :         snapshot_now = ReorderBufferCopySnap(rb, txn->base_snapshot,
                               4401                 :                :                                              txn, command_id);
                               4402                 :                :     }
                               4403                 :                :     else
                               4404                 :                :     {
                               4405                 :                :         /* the transaction must have been already streamed */
                               4406         [ -  + ]:            653 :         Assert(rbtxn_is_streamed(txn));
                               4407                 :                : 
                               4408                 :                :         /*
                               4409                 :                :          * We already have a snapshot from the previous streaming run. We
                               4410                 :                :          * assume new subxacts can't move the LSN backwards, and so can't beat
                               4411                 :                :          * the LSN condition in the previous branch (so no need to walk
                               4412                 :                :          * through subxacts again). In fact, we must not do that as we may be
                               4413                 :                :          * using snapshot half-way through the subxact.
                               4414                 :                :          */
                               4415                 :            653 :         command_id = txn->command_id;
                               4416                 :                : 
                               4417                 :                :         /*
                               4418                 :                :          * We can't use txn->snapshot_now directly because after the last
                               4419                 :                :          * streaming run, we might have got some new sub-transactions. So we
                               4420                 :                :          * need to add them to the snapshot.
                               4421                 :                :          */
                               4422                 :            653 :         snapshot_now = ReorderBufferCopySnap(rb, txn->snapshot_now,
                               4423                 :                :                                              txn, command_id);
                               4424                 :                : 
                               4425                 :                :         /* Free the previously copied snapshot. */
                               4426         [ -  + ]:            653 :         Assert(txn->snapshot_now->copied);
                               4427                 :            653 :         ReorderBufferFreeSnap(rb, txn->snapshot_now);
                               4428                 :            653 :         txn->snapshot_now = NULL;
                               4429                 :                :     }
                               4430                 :                : 
                               4431                 :                :     /*
                               4432                 :                :      * Remember this information to be used later to update stats. We can't
                               4433                 :                :      * update the stats here as an error while processing the changes would
                               4434                 :                :      * lead to the accumulation of stats even though we haven't streamed all
                               4435                 :                :      * the changes.
                               4436                 :                :      */
 1825                          4437                 :            725 :     txn_is_streamed = rbtxn_is_streamed(txn);
                               4438                 :            725 :     stream_bytes = txn->total_size;
                               4439                 :                : 
                               4440                 :                :     /* Process and send the changes to output plugin. */
 1907                          4441                 :            725 :     ReorderBufferProcessTXN(rb, txn, InvalidXLogRecPtr, snapshot_now,
                               4442                 :                :                             command_id, true);
                               4443                 :                : 
 1825                          4444                 :            725 :     rb->streamCount += 1;
                               4445                 :            725 :     rb->streamBytes += stream_bytes;
                               4446                 :                : 
                               4447                 :                :     /* Don't count a transaction that was already streamed. */
                               4448                 :            725 :     rb->streamTxns += (txn_is_streamed) ? 0 : 1;
                               4449                 :                : 
                               4450                 :                :     /* update the decoding stats */
 1636                          4451                 :            725 :     UpdateDecodingStats((LogicalDecodingContext *) rb->private_data);
                               4452                 :                : 
 1907                          4453         [ -  + ]:            725 :     Assert(dlist_is_empty(&txn->changes));
                               4454         [ -  + ]:            725 :     Assert(txn->nentries == 0);
                               4455         [ -  + ]:            725 :     Assert(txn->nentries_mem == 0);
                               4456                 :                : }
                               4457                 :                : 
                               4458                 :                : /*
                               4459                 :                :  * Size of a change in memory.
                               4460                 :                :  */
                               4461                 :                : static Size
 2173                          4462                 :        1968840 : ReorderBufferChangeSize(ReorderBufferChange *change)
                               4463                 :                : {
                               4464                 :        1968840 :     Size        sz = sizeof(ReorderBufferChange);
                               4465                 :                : 
                               4466   [ +  +  +  +  :        1968840 :     switch (change->action)
                                           +  +  - ]
                               4467                 :                :     {
                               4468                 :                :             /* fall through these, they're all similar enough */
                               4469                 :        1860479 :         case REORDER_BUFFER_CHANGE_INSERT:
                               4470                 :                :         case REORDER_BUFFER_CHANGE_UPDATE:
                               4471                 :                :         case REORDER_BUFFER_CHANGE_DELETE:
                               4472                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT:
                               4473                 :                :             {
                               4474                 :                :                 HeapTuple   oldtup,
                               4475                 :                :                             newtup;
                               4476                 :        1860479 :                 Size        oldlen = 0;
                               4477                 :        1860479 :                 Size        newlen = 0;
                               4478                 :                : 
                               4479                 :        1860479 :                 oldtup = change->data.tp.oldtuple;
                               4480                 :        1860479 :                 newtup = change->data.tp.newtuple;
                               4481                 :                : 
                               4482         [ +  + ]:        1860479 :                 if (oldtup)
                               4483                 :                :                 {
                               4484                 :         195363 :                     sz += sizeof(HeapTupleData);
  638 msawada@postgresql.o     4485                 :         195363 :                     oldlen = oldtup->t_len;
 2173 akapila@postgresql.o     4486                 :         195363 :                     sz += oldlen;
                               4487                 :                :                 }
                               4488                 :                : 
                               4489         [ +  + ]:        1860479 :                 if (newtup)
                               4490                 :                :                 {
                               4491                 :        1582052 :                     sz += sizeof(HeapTupleData);
  638 msawada@postgresql.o     4492                 :        1582052 :                     newlen = newtup->t_len;
 2173 akapila@postgresql.o     4493                 :        1582052 :                     sz += newlen;
                               4494                 :                :                 }
                               4495                 :                : 
                               4496                 :        1860479 :                 break;
                               4497                 :                :             }
                               4498                 :             67 :         case REORDER_BUFFER_CHANGE_MESSAGE:
                               4499                 :                :             {
                               4500                 :             67 :                 Size        prefix_size = strlen(change->data.msg.prefix) + 1;
                               4501                 :                : 
                               4502                 :             67 :                 sz += prefix_size + change->data.msg.message_size +
                               4503                 :                :                     sizeof(Size) + sizeof(Size);
                               4504                 :                : 
                               4505                 :             67 :                 break;
                               4506                 :                :             }
 1839                          4507                 :          10271 :         case REORDER_BUFFER_CHANGE_INVALIDATION:
                               4508                 :                :             {
                               4509                 :          10271 :                 sz += sizeof(SharedInvalidationMessage) *
                               4510                 :          10271 :                     change->data.inval.ninvalidations;
                               4511                 :          10271 :                 break;
                               4512                 :                :             }
 2173                          4513                 :           2554 :         case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT:
                               4514                 :                :             {
                               4515                 :                :                 Snapshot    snap;
                               4516                 :                : 
                               4517                 :           2554 :                 snap = change->data.snapshot;
                               4518                 :                : 
                               4519                 :           2554 :                 sz += sizeof(SnapshotData) +
                               4520                 :           2554 :                     sizeof(TransactionId) * snap->xcnt +
                               4521                 :           2554 :                     sizeof(TransactionId) * snap->subxcnt;
                               4522                 :                : 
                               4523                 :           2554 :                 break;
                               4524                 :                :             }
                               4525                 :             79 :         case REORDER_BUFFER_CHANGE_TRUNCATE:
                               4526                 :                :             {
                               4527                 :             79 :                 sz += sizeof(Oid) * change->data.truncate.nrelids;
                               4528                 :                : 
                               4529                 :             79 :                 break;
                               4530                 :                :             }
                               4531                 :          95390 :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
                               4532                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
                               4533                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
                               4534                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
                               4535                 :                :             /* ReorderBufferChange contains everything important */
                               4536                 :          95390 :             break;
                               4537                 :                :     }
                               4538                 :                : 
                               4539                 :        1968840 :     return sz;
                               4540                 :                : }
                               4541                 :                : 
                               4542                 :                : 
                               4543                 :                : /*
                               4544                 :                :  * Restore a number of changes spilled to disk back into memory.
                               4545                 :                :  */
                               4546                 :                : static Size
 4257 rhaas@postgresql.org     4547                 :            102 : ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               4548                 :                :                             TXNEntryFile *file, XLogSegNo *segno)
                               4549                 :                : {
                               4550                 :            102 :     Size        restored = 0;
                               4551                 :                :     XLogSegNo   last_segno;
                               4552                 :                :     dlist_mutable_iter cleanup_iter;
 2145 akapila@postgresql.o     4553                 :            102 :     File       *fd = &file->vfd;
                               4554                 :                : 
 4257 rhaas@postgresql.org     4555         [ -  + ]:            102 :     Assert(txn->first_lsn != InvalidXLogRecPtr);
                               4556         [ -  + ]:            102 :     Assert(txn->final_lsn != InvalidXLogRecPtr);
                               4557                 :                : 
                               4558                 :                :     /* free current entries, so we have memory for more */
                               4559   [ +  -  +  + ]:         169662 :     dlist_foreach_modify(cleanup_iter, &txn->changes)
                               4560                 :                :     {
                               4561                 :         169560 :         ReorderBufferChange *cleanup =
  893 tgl@sss.pgh.pa.us        4562                 :         169560 :             dlist_container(ReorderBufferChange, node, cleanup_iter.cur);
                               4563                 :                : 
 4257 rhaas@postgresql.org     4564                 :         169560 :         dlist_delete(&cleanup->node);
  230 heikki.linnakangas@i     4565                 :         169560 :         ReorderBufferFreeChange(rb, cleanup, true);
                               4566                 :                :     }
 4257 rhaas@postgresql.org     4567                 :            102 :     txn->nentries_mem = 0;
                               4568         [ -  + ]:            102 :     Assert(dlist_is_empty(&txn->changes));
                               4569                 :                : 
 2961 andres@anarazel.de       4570                 :            102 :     XLByteToSeg(txn->final_lsn, last_segno, wal_segment_size);
                               4571                 :                : 
 4257 rhaas@postgresql.org     4572   [ +  +  +  + ]:         173365 :     while (restored < max_changes_in_memory && *segno <= last_segno)
                               4573                 :                :     {
                               4574                 :                :         int         readBytes;
                               4575                 :                :         ReorderBufferDiskChange *ondisk;
                               4576                 :                : 
 1103 akapila@postgresql.o     4577         [ -  + ]:         173263 :         CHECK_FOR_INTERRUPTS();
                               4578                 :                : 
 4257 rhaas@postgresql.org     4579         [ +  + ]:         173263 :         if (*fd == -1)
                               4580                 :                :         {
                               4581                 :                :             char        path[MAXPGPATH];
                               4582                 :                : 
                               4583                 :                :             /* first time in */
                               4584         [ +  + ]:             42 :             if (*segno == 0)
 2961 andres@anarazel.de       4585                 :             39 :                 XLByteToSeg(txn->first_lsn, *segno, wal_segment_size);
                               4586                 :                : 
 4257 rhaas@postgresql.org     4587   [ -  +  -  - ]:             42 :             Assert(*segno != 0 || dlist_is_empty(&txn->changes));
                               4588                 :                : 
                               4589                 :                :             /*
                               4590                 :                :              * No need to care about TLIs here, only used during a single run,
                               4591                 :                :              * so each LSN only maps to a specific WAL record.
                               4592                 :                :              */
 2793 alvherre@alvh.no-ip.     4593                 :             42 :             ReorderBufferSerializedPath(path, MyReplicationSlot, txn->xid,
                               4594                 :                :                                         *segno);
                               4595                 :                : 
 2145 akapila@postgresql.o     4596                 :             42 :             *fd = PathNameOpenFile(path, O_RDONLY | PG_BINARY);
                               4597                 :                : 
                               4598                 :                :             /* No harm in resetting the offset even in case of failure */
                               4599                 :             42 :             file->curOffset = 0;
                               4600                 :                : 
 4257 rhaas@postgresql.org     4601   [ -  +  -  - ]:             42 :             if (*fd < 0 && errno == ENOENT)
                               4602                 :                :             {
 4257 rhaas@postgresql.org     4603                 :LBC         (1) :                 *fd = -1;
                               4604                 :            (1) :                 (*segno)++;
                               4605                 :            (1) :                 continue;
                               4606                 :                :             }
 4257 rhaas@postgresql.org     4607         [ -  + ]:CBC          42 :             else if (*fd < 0)
 4257 rhaas@postgresql.org     4608         [ #  # ]:UBC           0 :                 ereport(ERROR,
                               4609                 :                :                         (errcode_for_file_access(),
                               4610                 :                :                          errmsg("could not open file \"%s\": %m",
                               4611                 :                :                                 path)));
                               4612                 :                :         }
                               4613                 :                : 
                               4614                 :                :         /*
                               4615                 :                :          * Read the statically sized part of a change which has information
                               4616                 :                :          * about the total size. If we couldn't read a record, we're at the
                               4617                 :                :          * end of this file.
                               4618                 :                :          */
 4190 rhaas@postgresql.org     4619                 :CBC      173263 :         ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange));
 2145 akapila@postgresql.o     4620                 :         173263 :         readBytes = FileRead(file->vfd, rb->outbuf,
                               4621                 :                :                              sizeof(ReorderBufferDiskChange),
                               4622                 :                :                              file->curOffset, WAIT_EVENT_REORDER_BUFFER_READ);
                               4623                 :                : 
                               4624                 :                :         /* eof */
 4257 rhaas@postgresql.org     4625         [ +  + ]:         173263 :         if (readBytes == 0)
                               4626                 :                :         {
 2145 akapila@postgresql.o     4627                 :             42 :             FileClose(*fd);
 4257 rhaas@postgresql.org     4628                 :             42 :             *fd = -1;
                               4629                 :             42 :             (*segno)++;
                               4630                 :             42 :             continue;
                               4631                 :                :         }
                               4632         [ -  + ]:         173221 :         else if (readBytes < 0)
 4257 rhaas@postgresql.org     4633         [ #  # ]:UBC           0 :             ereport(ERROR,
                               4634                 :                :                     (errcode_for_file_access(),
                               4635                 :                :                      errmsg("could not read from reorderbuffer spill file: %m")));
 4257 rhaas@postgresql.org     4636         [ -  + ]:CBC      173221 :         else if (readBytes != sizeof(ReorderBufferDiskChange))
 4257 rhaas@postgresql.org     4637         [ #  # ]:UBC           0 :             ereport(ERROR,
                               4638                 :                :                     (errcode_for_file_access(),
                               4639                 :                :                      errmsg("could not read from reorderbuffer spill file: read %d instead of %u bytes",
                               4640                 :                :                             readBytes,
                               4641                 :                :                             (uint32) sizeof(ReorderBufferDiskChange))));
                               4642                 :                : 
 2145 akapila@postgresql.o     4643                 :CBC      173221 :         file->curOffset += readBytes;
                               4644                 :                : 
 4257 rhaas@postgresql.org     4645                 :         173221 :         ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4646                 :                : 
                               4647                 :         173221 :         ReorderBufferSerializeReserve(rb,
 3051 tgl@sss.pgh.pa.us        4648                 :         173221 :                                       sizeof(ReorderBufferDiskChange) + ondisk->size);
 4257 rhaas@postgresql.org     4649                 :         173221 :         ondisk = (ReorderBufferDiskChange *) rb->outbuf;
                               4650                 :                : 
 2145 akapila@postgresql.o     4651                 :         346442 :         readBytes = FileRead(file->vfd,
                               4652                 :         173221 :                              rb->outbuf + sizeof(ReorderBufferDiskChange),
                               4653                 :         173221 :                              ondisk->size - sizeof(ReorderBufferDiskChange),
                               4654                 :                :                              file->curOffset,
                               4655                 :                :                              WAIT_EVENT_REORDER_BUFFER_READ);
                               4656                 :                : 
 4257 rhaas@postgresql.org     4657         [ -  + ]:         173221 :         if (readBytes < 0)
 4257 rhaas@postgresql.org     4658         [ #  # ]:UBC           0 :             ereport(ERROR,
                               4659                 :                :                     (errcode_for_file_access(),
                               4660                 :                :                      errmsg("could not read from reorderbuffer spill file: %m")));
 4257 rhaas@postgresql.org     4661         [ -  + ]:CBC      173221 :         else if (readBytes != ondisk->size - sizeof(ReorderBufferDiskChange))
 4257 rhaas@postgresql.org     4662         [ #  # ]:UBC           0 :             ereport(ERROR,
                               4663                 :                :                     (errcode_for_file_access(),
                               4664                 :                :                      errmsg("could not read from reorderbuffer spill file: read %d instead of %u bytes",
                               4665                 :                :                             readBytes,
                               4666                 :                :                             (uint32) (ondisk->size - sizeof(ReorderBufferDiskChange)))));
                               4667                 :                : 
 2145 akapila@postgresql.o     4668                 :CBC      173221 :         file->curOffset += readBytes;
                               4669                 :                : 
                               4670                 :                :         /*
                               4671                 :                :          * ok, read a full change from disk, now restore it into proper
                               4672                 :                :          * in-memory format
                               4673                 :                :          */
 4257 rhaas@postgresql.org     4674                 :         173221 :         ReorderBufferRestoreChange(rb, txn, rb->outbuf);
                               4675                 :         173221 :         restored++;
                               4676                 :                :     }
                               4677                 :                : 
                               4678                 :            102 :     return restored;
                               4679                 :                : }
                               4680                 :                : 
                               4681                 :                : /*
                               4682                 :                :  * Convert change from its on-disk format to in-memory format and queue it onto
                               4683                 :                :  * the TXN's ->changes list.
                               4684                 :                :  *
                               4685                 :                :  * Note: although "data" is declared char*, at entry it points to a
                               4686                 :                :  * maxalign'd buffer, making it safe in most of this function to assume
                               4687                 :                :  * that the pointed-to data is suitably aligned for direct access.
                               4688                 :                :  */
                               4689                 :                : static void
                               4690                 :         173221 : ReorderBufferRestoreChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               4691                 :                :                            char *data)
                               4692                 :                : {
                               4693                 :                :     ReorderBufferDiskChange *ondisk;
                               4694                 :                :     ReorderBufferChange *change;
                               4695                 :                : 
                               4696                 :         173221 :     ondisk = (ReorderBufferDiskChange *) data;
                               4697                 :                : 
  230 heikki.linnakangas@i     4698                 :         173221 :     change = ReorderBufferAllocChange(rb);
                               4699                 :                : 
                               4700                 :                :     /* copy static part */
 4257 rhaas@postgresql.org     4701                 :         173221 :     memcpy(change, &ondisk->change, sizeof(ReorderBufferChange));
                               4702                 :                : 
                               4703                 :         173221 :     data += sizeof(ReorderBufferDiskChange);
                               4704                 :                : 
                               4705                 :                :     /* restore individual stuff */
 4253 tgl@sss.pgh.pa.us        4706   [ +  +  +  +  :         173221 :     switch (change->action)
                                           -  +  - ]
                               4707                 :                :     {
                               4708                 :                :             /* fall through these, they're all similar enough */
                               4709                 :         171292 :         case REORDER_BUFFER_CHANGE_INSERT:
                               4710                 :                :         case REORDER_BUFFER_CHANGE_UPDATE:
                               4711                 :                :         case REORDER_BUFFER_CHANGE_DELETE:
                               4712                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT:
 3524 andres@anarazel.de       4713         [ +  + ]:         171292 :             if (change->data.tp.oldtuple)
                               4714                 :                :             {
 3484 tgl@sss.pgh.pa.us        4715                 :           5006 :                 uint32      tuplelen = ((HeapTuple) data)->t_len;
                               4716                 :                : 
 3524 andres@anarazel.de       4717                 :           5006 :                 change->data.tp.oldtuple =
  230 heikki.linnakangas@i     4718                 :           5006 :                     ReorderBufferAllocTupleBuf(rb, tuplelen - SizeofHeapTupleHeader);
                               4719                 :                : 
                               4720                 :                :                 /* restore the HeapTupleData header */
  638 msawada@postgresql.o     4721                 :           5006 :                 memcpy(change->data.tp.oldtuple, data,
                               4722                 :                :                        sizeof(HeapTupleData));
 3524 andres@anarazel.de       4723                 :           5006 :                 data += sizeof(HeapTupleData);
                               4724                 :                : 
                               4725                 :                :                 /* reset t_data pointer into the new tuplebuf */
  638 msawada@postgresql.o     4726                 :           5006 :                 change->data.tp.oldtuple->t_data =
                               4727                 :           5006 :                     (HeapTupleHeader) ((char *) change->data.tp.oldtuple + HEAPTUPLESIZE);
                               4728                 :                : 
                               4729                 :                :                 /* restore tuple data itself */
                               4730                 :           5006 :                 memcpy(change->data.tp.oldtuple->t_data, data, tuplelen);
 3524 andres@anarazel.de       4731                 :           5006 :                 data += tuplelen;
                               4732                 :                :             }
                               4733                 :                : 
                               4734         [ +  + ]:         171292 :             if (change->data.tp.newtuple)
                               4735                 :                :             {
                               4736                 :                :                 /* here, data might not be suitably aligned! */
                               4737                 :                :                 uint32      tuplelen;
                               4738                 :                : 
 3484 tgl@sss.pgh.pa.us        4739                 :         161071 :                 memcpy(&tuplelen, data + offsetof(HeapTupleData, t_len),
                               4740                 :                :                        sizeof(uint32));
                               4741                 :                : 
 3524 andres@anarazel.de       4742                 :         161071 :                 change->data.tp.newtuple =
  230 heikki.linnakangas@i     4743                 :         161071 :                     ReorderBufferAllocTupleBuf(rb, tuplelen - SizeofHeapTupleHeader);
                               4744                 :                : 
                               4745                 :                :                 /* restore the HeapTupleData header */
  638 msawada@postgresql.o     4746                 :         161071 :                 memcpy(change->data.tp.newtuple, data,
                               4747                 :                :                        sizeof(HeapTupleData));
 3524 andres@anarazel.de       4748                 :         161071 :                 data += sizeof(HeapTupleData);
                               4749                 :                : 
                               4750                 :                :                 /* reset t_data pointer into the new tuplebuf */
  638 msawada@postgresql.o     4751                 :         161071 :                 change->data.tp.newtuple->t_data =
                               4752                 :         161071 :                     (HeapTupleHeader) ((char *) change->data.tp.newtuple + HEAPTUPLESIZE);
                               4753                 :                : 
                               4754                 :                :                 /* restore tuple data itself */
                               4755                 :         161071 :                 memcpy(change->data.tp.newtuple->t_data, data, tuplelen);
 3524 andres@anarazel.de       4756                 :         161071 :                 data += tuplelen;
                               4757                 :                :             }
                               4758                 :                : 
 4257 rhaas@postgresql.org     4759                 :         171292 :             break;
 3492 simon@2ndQuadrant.co     4760                 :              1 :         case REORDER_BUFFER_CHANGE_MESSAGE:
                               4761                 :                :             {
                               4762                 :                :                 Size        prefix_size;
                               4763                 :                : 
                               4764                 :                :                 /* read prefix */
                               4765                 :              1 :                 memcpy(&prefix_size, data, sizeof(Size));
                               4766                 :              1 :                 data += sizeof(Size);
                               4767                 :              1 :                 change->data.msg.prefix = MemoryContextAlloc(rb->context,
                               4768                 :                :                                                              prefix_size);
                               4769                 :              1 :                 memcpy(change->data.msg.prefix, data, prefix_size);
 3428 rhaas@postgresql.org     4770         [ -  + ]:              1 :                 Assert(change->data.msg.prefix[prefix_size - 1] == '\0');
 3492 simon@2ndQuadrant.co     4771                 :              1 :                 data += prefix_size;
                               4772                 :                : 
                               4773                 :                :                 /* read the message */
                               4774                 :              1 :                 memcpy(&change->data.msg.message_size, data, sizeof(Size));
                               4775                 :              1 :                 data += sizeof(Size);
                               4776                 :              1 :                 change->data.msg.message = MemoryContextAlloc(rb->context,
                               4777                 :                :                                                               change->data.msg.message_size);
                               4778                 :              1 :                 memcpy(change->data.msg.message, data,
                               4779                 :                :                        change->data.msg.message_size);
                               4780                 :              1 :                 data += change->data.msg.message_size;
                               4781                 :                : 
 1839 akapila@postgresql.o     4782                 :              1 :                 break;
                               4783                 :                :             }
                               4784                 :             23 :         case REORDER_BUFFER_CHANGE_INVALIDATION:
                               4785                 :                :             {
                               4786                 :             23 :                 Size        inval_size = sizeof(SharedInvalidationMessage) *
  893 tgl@sss.pgh.pa.us        4787                 :             23 :                     change->data.inval.ninvalidations;
                               4788                 :                : 
 1839 akapila@postgresql.o     4789                 :             23 :                 change->data.inval.invalidations =
                               4790                 :             23 :                     MemoryContextAlloc(rb->context, inval_size);
                               4791                 :                : 
                               4792                 :                :                 /* read the invalidation messages */
                               4793                 :             23 :                 memcpy(change->data.inval.invalidations, data, inval_size);
                               4794                 :                : 
 3492 simon@2ndQuadrant.co     4795                 :             23 :                 break;
                               4796                 :                :             }
 4257 rhaas@postgresql.org     4797                 :              2 :         case REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT:
                               4798                 :                :             {
                               4799                 :                :                 Snapshot    oldsnap;
                               4800                 :                :                 Snapshot    newsnap;
                               4801                 :                :                 Size        size;
                               4802                 :                : 
 4253 tgl@sss.pgh.pa.us        4803                 :              2 :                 oldsnap = (Snapshot) data;
                               4804                 :                : 
                               4805                 :              2 :                 size = sizeof(SnapshotData) +
                               4806                 :              2 :                     sizeof(TransactionId) * oldsnap->xcnt +
                               4807                 :              2 :                     sizeof(TransactionId) * (oldsnap->subxcnt + 0);
                               4808                 :                : 
                               4809                 :              2 :                 change->data.snapshot = MemoryContextAllocZero(rb->context, size);
                               4810                 :                : 
                               4811                 :              2 :                 newsnap = change->data.snapshot;
                               4812                 :                : 
                               4813                 :              2 :                 memcpy(newsnap, data, size);
                               4814                 :              2 :                 newsnap->xip = (TransactionId *)
                               4815                 :                :                     (((char *) newsnap) + sizeof(SnapshotData));
                               4816                 :              2 :                 newsnap->subxip = newsnap->xip + newsnap->xcnt;
                               4817                 :              2 :                 newsnap->copied = true;
 4257 rhaas@postgresql.org     4818                 :              2 :                 break;
                               4819                 :                :             }
                               4820                 :                :             /* the relation OIDs follow the base struct, copy them into a new array */
 2761 peter_e@gmx.net          4821                 :UBC           0 :         case REORDER_BUFFER_CHANGE_TRUNCATE:
                               4822                 :                :             {
                               4823                 :                :                 Oid        *relids;
                               4824                 :                : 
  230 heikki.linnakangas@i     4825                 :              0 :                 relids = ReorderBufferAllocRelids(rb, change->data.truncate.nrelids);
 2612 tomas.vondra@postgre     4826                 :              0 :                 memcpy(relids, data, change->data.truncate.nrelids * sizeof(Oid));
                               4827                 :              0 :                 change->data.truncate.relids = relids;
                               4828                 :                : 
                               4829                 :              0 :                 break;
                               4830                 :                :             }
 3826 andres@anarazel.de       4831                 :CBC        1903 :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
                               4832                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
                               4833                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
                               4834                 :                :         case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
 4257 rhaas@postgresql.org     4835                 :           1903 :             break;
                               4836                 :                :     }
                               4837                 :                : 
                               4838                 :         173221 :     dlist_push_tail(&txn->changes, &change->node);
                               4839                 :         173221 :     txn->nentries_mem++;
                               4840                 :                : 
                               4841                 :                :     /*
                               4842                 :                :      * Update memory accounting for the restored change.  We need to do this
                               4843                 :                :      * although we don't check the memory limit when restoring the changes in
                               4844                 :                :      * this branch (we only do that when initially queueing the changes after
                               4845                 :                :      * decoding), because we will release the changes later, and that will
                               4846                 :                :      * update the accounting too (subtracting the size from the counters). And
                               4847                 :                :      * we don't want to underflow there.
                               4848                 :                :      */
  573 msawada@postgresql.o     4849                 :         173221 :     ReorderBufferChangeMemoryUpdate(rb, change, NULL, true,
                               4850                 :                :                                     ReorderBufferChangeSize(change));
 4257 rhaas@postgresql.org     4851                 :         173221 : }
                               4852                 :                : 
                               4853                 :                : /*
                               4854                 :                :  * Remove all on-disk data stored for the passed-in transaction.
                               4855                 :                :  */
                               4856                 :                : static void
                               4857                 :            252 : ReorderBufferRestoreCleanup(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               4858                 :                : {
                               4859                 :                :     XLogSegNo   first;
                               4860                 :                :     XLogSegNo   cur;
                               4861                 :                :     XLogSegNo   last;
                               4862                 :                : 
                               4863         [ -  + ]:            252 :     Assert(txn->first_lsn != InvalidXLogRecPtr);
                               4864         [ -  + ]:            252 :     Assert(txn->final_lsn != InvalidXLogRecPtr);
                               4865                 :                : 
 2961 andres@anarazel.de       4866                 :            252 :     XLByteToSeg(txn->first_lsn, first, wal_segment_size);
                               4867                 :            252 :     XLByteToSeg(txn->final_lsn, last, wal_segment_size);
                               4868                 :                : 
                               4869                 :                :     /* iterate over all possible filenames, and delete them */
 4257 rhaas@postgresql.org     4870         [ +  + ]:            522 :     for (cur = first; cur <= last; cur++)
                               4871                 :                :     {
                               4872                 :                :         char        path[MAXPGPATH];
                               4873                 :                : 
 2793 alvherre@alvh.no-ip.     4874                 :            270 :         ReorderBufferSerializedPath(path, MyReplicationSlot, txn->xid, cur);
 4257 rhaas@postgresql.org     4875   [ -  +  -  - ]:            270 :         if (unlink(path) != 0 && errno != ENOENT)
 4257 rhaas@postgresql.org     4876         [ #  # ]:UBC           0 :             ereport(ERROR,
                               4877                 :                :                     (errcode_for_file_access(),
                               4878                 :                :                      errmsg("could not remove file \"%s\": %m", path)));
                               4879                 :                :     }
 4257 rhaas@postgresql.org     4880                 :CBC         252 : }
                               4881                 :                : 
                               4882                 :                : /*
                               4883                 :                :  * Remove any leftover serialized reorder buffers from a slot directory after a
                               4884                 :                :  * prior crash or decoding session exit.
                               4885                 :                :  */
                               4886                 :                : static void
 2793 alvherre@alvh.no-ip.     4887                 :           1994 : ReorderBufferCleanupSerializedTXNs(const char *slotname)
                               4888                 :                : {
                               4889                 :                :     DIR        *spill_dir;
                               4890                 :                :     struct dirent *spill_de;
                               4891                 :                :     struct stat statbuf;
                               4892                 :                :     char        path[MAXPGPATH * 2 + sizeof(PG_REPLSLOT_DIR)];
                               4893                 :                : 
  424 michael@paquier.xyz      4894                 :           1994 :     sprintf(path, "%s/%s", PG_REPLSLOT_DIR, slotname);
                               4895                 :                : 
                               4896                 :                :     /* we're only handling directories here, skip if it's not ours */
 2793 alvherre@alvh.no-ip.     4897   [ +  -  -  + ]:           1994 :     if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
 2793 alvherre@alvh.no-ip.     4898                 :UBC           0 :         return;
                               4899                 :                : 
 2793 alvherre@alvh.no-ip.     4900                 :CBC        1994 :     spill_dir = AllocateDir(path);
                               4901         [ +  + ]:           9970 :     while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
                               4902                 :                :     {
                               4903                 :                :         /* only look at names that can be ours */
                               4904         [ -  + ]:           5982 :         if (strncmp(spill_de->d_name, "xid", 3) == 0)
                               4905                 :                :         {
 2793 alvherre@alvh.no-ip.     4906                 :UBC           0 :             snprintf(path, sizeof(path),
                               4907                 :                :                      "%s/%s/%s", PG_REPLSLOT_DIR, slotname,
                               4908                 :              0 :                      spill_de->d_name);
                               4909                 :                : 
                               4910         [ #  # ]:              0 :             if (unlink(path) != 0)
                               4911         [ #  # ]:              0 :                 ereport(ERROR,
                               4912                 :                :                         (errcode_for_file_access(),
                               4913                 :                :                          errmsg("could not remove file \"%s\" during removal of %s/%s/xid*: %m",
                               4914                 :                :                                 path, PG_REPLSLOT_DIR, slotname)));
                               4915                 :                :         }
                               4916                 :                :     }
 2793 alvherre@alvh.no-ip.     4917                 :CBC        1994 :     FreeDir(spill_dir);
                               4918                 :                : }
                               4919                 :                : 
                               4920                 :                : /*
                               4921                 :                :  * Given a replication slot, transaction ID and segment number, fill in the
                               4922                 :                :  * corresponding spill file into 'path', which is a caller-owned buffer of size
                               4923                 :                :  * at least MAXPGPATH.
                               4924                 :                :  */
                               4925                 :                : static void
                               4926                 :           3696 : ReorderBufferSerializedPath(char *path, ReplicationSlot *slot, TransactionId xid,
                               4927                 :                :                             XLogSegNo segno)
                               4928                 :                : {
                               4929                 :                :     XLogRecPtr  recptr;
                               4930                 :                : 
 2668                          4931                 :           3696 :     XLogSegNoOffsetToRecPtr(segno, 0, wal_segment_size, recptr);
                               4932                 :                : 
  424 michael@paquier.xyz      4933                 :           3696 :     snprintf(path, MAXPGPATH, "%s/%s/xid-%u-lsn-%X-%X.spill",
                               4934                 :                :              PG_REPLSLOT_DIR,
 2742 tgl@sss.pgh.pa.us        4935                 :           3696 :              NameStr(MyReplicationSlot->data.name),
 1708 peter@eisentraut.org     4936                 :           3696 :              xid, LSN_FORMAT_ARGS(recptr));
 2793 alvherre@alvh.no-ip.     4937                 :           3696 : }
                               4938                 :                : 
                               4939                 :                : /*
                               4940                 :                :  * Delete all data spilled to disk after we've restarted/crashed. It will be
                               4941                 :                :  * recreated when the respective slots are reused.
                               4942                 :                :  */
                               4943                 :                : void
 4257 rhaas@postgresql.org     4944                 :            907 : StartupReorderBuffer(void)
                               4945                 :                : {
                               4946                 :                :     DIR        *logical_dir;
                               4947                 :                :     struct dirent *logical_de;
                               4948                 :                : 
  424 michael@paquier.xyz      4949                 :            907 :     logical_dir = AllocateDir(PG_REPLSLOT_DIR);
                               4950         [ +  + ]:           2819 :     while ((logical_de = ReadDir(logical_dir, PG_REPLSLOT_DIR)) != NULL)
                               4951                 :                :     {
 4257 rhaas@postgresql.org     4952         [ +  + ]:           1912 :         if (strcmp(logical_de->d_name, ".") == 0 ||
                               4953         [ +  + ]:           1005 :             strcmp(logical_de->d_name, "..") == 0)
                               4954                 :           1814 :             continue;
                               4955                 :                : 
                               4956                 :                :         /* if it cannot be a slot, skip the directory */
   97 akapila@postgresql.o     4957         [ -  + ]:GNC          98 :         if (!ReplicationSlotValidateName(logical_de->d_name, true, DEBUG2))
 4257 rhaas@postgresql.org     4958                 :UBC           0 :             continue;
                               4959                 :                : 
                               4960                 :                :         /*
                               4961                 :                :          * ok, has to be a surviving logical slot, iterate and delete
                               4962                 :                :          * everything starting with xid-*
                               4963                 :                :          */
 2793 alvherre@alvh.no-ip.     4964                 :CBC          98 :         ReorderBufferCleanupSerializedTXNs(logical_de->d_name);
                               4965                 :                :     }
 4257 rhaas@postgresql.org     4966                 :            907 :     FreeDir(logical_dir);
                               4967                 :            907 : }
                               4968                 :                : 
                               4969                 :                : /* ---------------------------------------
                               4970                 :                :  * toast reassembly support
                               4971                 :                :  * ---------------------------------------
                               4972                 :                :  */
                               4973                 :                : 
                               4974                 :                : /*
                               4975                 :                :  * Initialize per tuple toast reconstruction support.
                               4976                 :                :  */
                               4977                 :                : static void
                               4978                 :             35 : ReorderBufferToastInitHash(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               4979                 :                : {
                               4980                 :                :     HASHCTL     hash_ctl;
                               4981                 :                : 
                               4982         [ -  + ]:             35 :     Assert(txn->toast_hash == NULL);
                               4983                 :                : 
                               4984                 :             35 :     hash_ctl.keysize = sizeof(Oid);
                               4985                 :             35 :     hash_ctl.entrysize = sizeof(ReorderBufferToastEnt);
                               4986                 :             35 :     hash_ctl.hcxt = rb->context;
                               4987                 :             35 :     txn->toast_hash = hash_create("ReorderBufferToastHash", 5, &hash_ctl,
                               4988                 :                :                                   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
                               4989                 :             35 : }
                               4990                 :                : 
                               4991                 :                : /*
                               4992                 :                :  * Per toast-chunk handling for toast reconstruction
                               4993                 :                :  *
                               4994                 :                :  * Appends a toast chunk so we can reconstruct it when the tuple "owning" the
                               4995                 :                :  * toasted Datum comes along.
                               4996                 :                :  */
                               4997                 :                : static void
                               4998                 :           1830 : ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               4999                 :                :                               Relation relation, ReorderBufferChange *change)
                               5000                 :                : {
                               5001                 :                :     ReorderBufferToastEnt *ent;
                               5002                 :                :     HeapTuple   newtup;
                               5003                 :                :     bool        found;
                               5004                 :                :     int32       chunksize;
                               5005                 :                :     bool        isnull;
                               5006                 :                :     Pointer     chunk;
                               5007                 :           1830 :     TupleDesc   desc = RelationGetDescr(relation);
                               5008                 :                :     Oid         chunk_id;
                               5009                 :                :     int32       chunk_seq;
                               5010                 :                : 
                               5011         [ +  + ]:           1830 :     if (txn->toast_hash == NULL)
                               5012                 :             35 :         ReorderBufferToastInitHash(rb, txn);
                               5013                 :                : 
                               5014         [ -  + ]:           1830 :     Assert(IsToastRelation(relation));
                               5015                 :                : 
 4253 tgl@sss.pgh.pa.us        5016                 :           1830 :     newtup = change->data.tp.newtuple;
  638 msawada@postgresql.o     5017                 :           1830 :     chunk_id = DatumGetObjectId(fastgetattr(newtup, 1, desc, &isnull));
 4257 rhaas@postgresql.org     5018         [ -  + ]:           1830 :     Assert(!isnull);
  638 msawada@postgresql.o     5019                 :           1830 :     chunk_seq = DatumGetInt32(fastgetattr(newtup, 2, desc, &isnull));
 4257 rhaas@postgresql.org     5020         [ -  + ]:           1830 :     Assert(!isnull);
                               5021                 :                : 
                               5022                 :                :     ent = (ReorderBufferToastEnt *)
  995 peter@eisentraut.org     5023                 :           1830 :         hash_search(txn->toast_hash, &chunk_id, HASH_ENTER, &found);
                               5024                 :                : 
 4257 rhaas@postgresql.org     5025         [ +  + ]:           1830 :     if (!found)
                               5026                 :                :     {
                               5027         [ -  + ]:             49 :         Assert(ent->chunk_id == chunk_id);
                               5028                 :             49 :         ent->num_chunks = 0;
                               5029                 :             49 :         ent->last_chunk_seq = 0;
                               5030                 :             49 :         ent->size = 0;
                               5031                 :             49 :         ent->reconstructed = NULL;
                               5032                 :             49 :         dlist_init(&ent->chunks);
                               5033                 :                : 
                               5034         [ -  + ]:             49 :         if (chunk_seq != 0)
 4257 rhaas@postgresql.org     5035         [ #  # ]:UBC           0 :             elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
                               5036                 :                :                  chunk_seq, chunk_id);
                               5037                 :                :     }
 4257 rhaas@postgresql.org     5038   [ +  -  -  + ]:CBC        1781 :     else if (found && chunk_seq != ent->last_chunk_seq + 1)
 4257 rhaas@postgresql.org     5039         [ #  # ]:UBC           0 :         elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
                               5040                 :                :              chunk_seq, chunk_id, ent->last_chunk_seq + 1);
                               5041                 :                : 
  638 msawada@postgresql.o     5042                 :CBC        1830 :     chunk = DatumGetPointer(fastgetattr(newtup, 3, desc, &isnull));
 4257 rhaas@postgresql.org     5043         [ -  + ]:           1830 :     Assert(!isnull);
                               5044                 :                : 
                               5045                 :                :     /* calculate size so we can allocate the right size at once later */
                               5046         [ +  - ]:           1830 :     if (!VARATT_IS_EXTENDED(chunk))
                               5047                 :           1830 :         chunksize = VARSIZE(chunk) - VARHDRSZ;
 4257 rhaas@postgresql.org     5048         [ #  # ]:UBC           0 :     else if (VARATT_IS_SHORT(chunk))
                               5049                 :                :         /* could happen due to heap_form_tuple doing its thing */
                               5050                 :              0 :         chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT;
                               5051                 :                :     else
                               5052         [ #  # ]:              0 :         elog(ERROR, "unexpected type of toast chunk");
                               5053                 :                : 
 4257 rhaas@postgresql.org     5054                 :CBC        1830 :     ent->size += chunksize;
                               5055                 :           1830 :     ent->last_chunk_seq = chunk_seq;
                               5056                 :           1830 :     ent->num_chunks++;
                               5057                 :           1830 :     dlist_push_tail(&ent->chunks, &change->node);
                               5058                 :           1830 : }
                               5059                 :                : 
                               5060                 :                : /*
                               5061                 :                :  * Rejigger change->newtuple to point to in-memory toast tuples instead of
                               5062                 :                :  * on-disk toast tuples that may no longer exist (think DROP TABLE or VACUUM).
                               5063                 :                :  *
                               5064                 :                :  * We cannot replace unchanged toast tuples though, so those will still point
                               5065                 :                :  * to on-disk toast data.
                               5066                 :                :  *
                               5067                 :                :  * While updating the existing change with detoasted tuple data, we need to
                               5068                 :                :  * update the memory accounting info, because the change size will differ.
                               5069                 :                :  * Otherwise the accounting may get out of sync, triggering serialization
                               5070                 :                :  * at unexpected times.
                               5071                 :                :  *
                               5072                 :                :  * We simply subtract size of the change before rejiggering the tuple, and
                               5073                 :                :  * then add the new size. This makes it look like the change was removed
                               5074                 :                :  * and then added back, except it only tweaks the accounting info.
                               5075                 :                :  *
                               5076                 :                :  * In particular it can't trigger serialization, which would be pointless
                               5077                 :                :  * anyway as it happens during commit processing right before handing
                               5078                 :                :  * the change to the output plugin.
                               5079                 :                :  */
                               5080                 :                : static void
                               5081                 :         334084 : ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
                               5082                 :                :                           Relation relation, ReorderBufferChange *change)
                               5083                 :                : {
                               5084                 :                :     TupleDesc   desc;
                               5085                 :                :     int         natt;
                               5086                 :                :     Datum      *attrs;
                               5087                 :                :     bool       *isnull;
                               5088                 :                :     bool       *free;
                               5089                 :                :     HeapTuple   tmphtup;
                               5090                 :                :     Relation    toast_rel;
                               5091                 :                :     TupleDesc   toast_desc;
                               5092                 :                :     MemoryContext oldcontext;
                               5093                 :                :     HeapTuple   newtup;
                               5094                 :                :     Size        old_size;
                               5095                 :                : 
                               5096                 :                :     /* no toast tuples changed */
                               5097         [ +  + ]:         334084 :     if (txn->toast_hash == NULL)
                               5098                 :         333838 :         return;
                               5099                 :                : 
                               5100                 :                :     /*
                               5101                 :                :      * We're going to modify the size of the change. So, to make sure the
                               5102                 :                :      * accounting is correct we record the current change size and then after
                               5103                 :                :      * re-computing the change we'll subtract the recorded size and then
                               5104                 :                :      * re-add the new change size at the end. We don't immediately subtract
                               5105                 :                :      * the old size because if there is any error before we add the new size,
                               5106                 :                :      * we will release the changes and that will update the accounting info
                               5107                 :                :      * (subtracting the size from the counters). And we don't want to
                               5108                 :                :      * underflow there.
                               5109                 :                :      */
 1506 akapila@postgresql.o     5110                 :            246 :     old_size = ReorderBufferChangeSize(change);
                               5111                 :                : 
 4257 rhaas@postgresql.org     5112                 :            246 :     oldcontext = MemoryContextSwitchTo(rb->context);
                               5113                 :                : 
                               5114                 :                :     /* we should only have toast tuples in an INSERT or UPDATE */
 4253 tgl@sss.pgh.pa.us        5115         [ -  + ]:            246 :     Assert(change->data.tp.newtuple);
                               5116                 :                : 
 4257 rhaas@postgresql.org     5117                 :            246 :     desc = RelationGetDescr(relation);
                               5118                 :                : 
                               5119                 :            246 :     toast_rel = RelationIdGetRelation(relation->rd_rel->reltoastrelid);
 2242 tgl@sss.pgh.pa.us        5120         [ -  + ]:            246 :     if (!RelationIsValid(toast_rel))
 1497 akapila@postgresql.o     5121         [ #  # ]:UBC           0 :         elog(ERROR, "could not open toast relation with OID %u (base relation \"%s\")",
                               5122                 :                :              relation->rd_rel->reltoastrelid, RelationGetRelationName(relation));
                               5123                 :                : 
 4257 rhaas@postgresql.org     5124                 :CBC         246 :     toast_desc = RelationGetDescr(toast_rel);
                               5125                 :                : 
                               5126                 :                :     /* should we allocate from stack instead? */
                               5127                 :            246 :     attrs = palloc0(sizeof(Datum) * desc->natts);
                               5128                 :            246 :     isnull = palloc0(sizeof(bool) * desc->natts);
                               5129                 :            246 :     free = palloc0(sizeof(bool) * desc->natts);
                               5130                 :                : 
 4253 tgl@sss.pgh.pa.us        5131                 :            246 :     newtup = change->data.tp.newtuple;
                               5132                 :                : 
  638 msawada@postgresql.o     5133                 :            246 :     heap_deform_tuple(newtup, desc, attrs, isnull);
                               5134                 :                : 
 4257 rhaas@postgresql.org     5135         [ +  + ]:            757 :     for (natt = 0; natt < desc->natts; natt++)
                               5136                 :                :     {
    6 drowley@postgresql.o     5137                 :GNC         511 :         CompactAttribute *attr = TupleDescCompactAttr(desc, natt);
                               5138                 :                :         ReorderBufferToastEnt *ent;
                               5139                 :                :         struct varlena *varlena;
                               5140                 :                : 
                               5141                 :                :         /* va_rawsize is the size of the original datum -- including header */
                               5142                 :                :         struct varatt_external toast_pointer;
                               5143                 :                :         struct varatt_indirect redirect_pointer;
 4257 rhaas@postgresql.org     5144                 :CBC         511 :         struct varlena *new_datum = NULL;
                               5145                 :                :         struct varlena *reconstructed;
                               5146                 :                :         dlist_iter  it;
                               5147                 :            511 :         Size        data_done = 0;
                               5148                 :                : 
                               5149         [ -  + ]:            511 :         if (attr->attisdropped)
 4257 rhaas@postgresql.org     5150                 :GBC         463 :             continue;
                               5151                 :                : 
                               5152                 :                :         /* not a varlena datatype */
 4257 rhaas@postgresql.org     5153         [ +  + ]:CBC         511 :         if (attr->attlen != -1)
                               5154                 :            241 :             continue;
                               5155                 :                : 
                               5156                 :                :         /* no data */
                               5157         [ +  + ]:            270 :         if (isnull[natt])
                               5158                 :             12 :             continue;
                               5159                 :                : 
                               5160                 :                :         /* ok, we know we have a toast datum */
                               5161                 :            258 :         varlena = (struct varlena *) DatumGetPointer(attrs[natt]);
                               5162                 :                : 
                               5163                 :                :         /* no need to do anything if the tuple isn't external */
                               5164         [ +  + ]:            258 :         if (!VARATT_IS_EXTERNAL(varlena))
                               5165                 :            202 :             continue;
                               5166                 :                : 
                               5167   [ -  +  -  +  :             56 :         VARATT_EXTERNAL_GET_POINTER(toast_pointer, varlena);
                                     +  -  -  +  -  
                                                 + ]
                               5168                 :                : 
                               5169                 :                :         /*
                               5170                 :                :          * Check whether the toast tuple changed, replace if so.
                               5171                 :                :          */
                               5172                 :                :         ent = (ReorderBufferToastEnt *)
                               5173                 :             56 :             hash_search(txn->toast_hash,
                               5174                 :                :                         &toast_pointer.va_valueid,
                               5175                 :                :                         HASH_FIND,
                               5176                 :                :                         NULL);
                               5177         [ +  + ]:             56 :         if (ent == NULL)
                               5178                 :              8 :             continue;
                               5179                 :                : 
                               5180                 :                :         new_datum =
                               5181                 :             48 :             (struct varlena *) palloc0(INDIRECT_POINTER_SIZE);
                               5182                 :                : 
                               5183                 :             48 :         free[natt] = true;
                               5184                 :                : 
                               5185                 :             48 :         reconstructed = palloc0(toast_pointer.va_rawsize);
                               5186                 :                : 
                               5187                 :             48 :         ent->reconstructed = reconstructed;
                               5188                 :                : 
                               5189                 :                :         /* stitch toast tuple back together from its parts */
                               5190   [ +  -  +  + ]:           1827 :         dlist_foreach(it, &ent->chunks)
                               5191                 :                :         {
                               5192                 :                :             bool        cisnull;
                               5193                 :                :             ReorderBufferChange *cchange;
                               5194                 :                :             HeapTuple   ctup;
                               5195                 :                :             Pointer     chunk;
                               5196                 :                : 
 4253 tgl@sss.pgh.pa.us        5197                 :           1779 :             cchange = dlist_container(ReorderBufferChange, node, it.cur);
                               5198                 :           1779 :             ctup = cchange->data.tp.newtuple;
  638 msawada@postgresql.o     5199                 :           1779 :             chunk = DatumGetPointer(fastgetattr(ctup, 3, toast_desc, &cisnull));
                               5200                 :                : 
  789 michael@paquier.xyz      5201         [ -  + ]:           1779 :             Assert(!cisnull);
 4257 rhaas@postgresql.org     5202         [ -  + ]:           1779 :             Assert(!VARATT_IS_EXTERNAL(chunk));
                               5203         [ -  + ]:           1779 :             Assert(!VARATT_IS_SHORT(chunk));
                               5204                 :                : 
                               5205                 :           1779 :             memcpy(VARDATA(reconstructed) + data_done,
                               5206                 :           1779 :                    VARDATA(chunk),
                               5207                 :           1779 :                    VARSIZE(chunk) - VARHDRSZ);
                               5208                 :           1779 :             data_done += VARSIZE(chunk) - VARHDRSZ;
                               5209                 :                :         }
 1684                          5210         [ -  + ]:             48 :         Assert(data_done == VARATT_EXTERNAL_GET_EXTSIZE(toast_pointer));
                               5211                 :                : 
                               5212                 :                :         /* make sure it's marked as compressed or not */
 4257                          5213         [ +  + ]:             48 :         if (VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer))
                               5214                 :              5 :             SET_VARSIZE_COMPRESSED(reconstructed, data_done + VARHDRSZ);
                               5215                 :                :         else
                               5216                 :             43 :             SET_VARSIZE(reconstructed, data_done + VARHDRSZ);
                               5217                 :                : 
                               5218                 :             48 :         memset(&redirect_pointer, 0, sizeof(redirect_pointer));
                               5219                 :             48 :         redirect_pointer.pointer = reconstructed;
                               5220                 :                : 
                               5221                 :             48 :         SET_VARTAG_EXTERNAL(new_datum, VARTAG_INDIRECT);
                               5222                 :             48 :         memcpy(VARDATA_EXTERNAL(new_datum), &redirect_pointer,
                               5223                 :                :                sizeof(redirect_pointer));
                               5224                 :                : 
                               5225                 :             48 :         attrs[natt] = PointerGetDatum(new_datum);
                               5226                 :                :     }
                               5227                 :                : 
                               5228                 :                :     /*
                               5229                 :                :      * Build tuple in separate memory & copy tuple back into the tuplebuf
                               5230                 :                :      * passed to the output plugin. We can't directly heap_fill_tuple() into
                               5231                 :                :      * the tuplebuf because attrs[] will point back into the current content.
                               5232                 :                :      */
 4253 tgl@sss.pgh.pa.us        5233                 :            246 :     tmphtup = heap_form_tuple(desc, attrs, isnull);
  638 msawada@postgresql.o     5234         [ -  + ]:            246 :     Assert(newtup->t_len <= MaxHeapTupleSize);
                               5235         [ -  + ]:            246 :     Assert(newtup->t_data == (HeapTupleHeader) ((char *) newtup + HEAPTUPLESIZE));
                               5236                 :                : 
                               5237                 :            246 :     memcpy(newtup->t_data, tmphtup->t_data, tmphtup->t_len);
                               5238                 :            246 :     newtup->t_len = tmphtup->t_len;
                               5239                 :                : 
                               5240                 :                :     /*
                               5241                 :                :      * free resources we won't further need, more persistent stuff will be
                               5242                 :                :      * free'd in ReorderBufferToastReset().
                               5243                 :                :      */
 4257 rhaas@postgresql.org     5244                 :            246 :     RelationClose(toast_rel);
 4253 tgl@sss.pgh.pa.us        5245                 :            246 :     pfree(tmphtup);
 4257 rhaas@postgresql.org     5246         [ +  + ]:            757 :     for (natt = 0; natt < desc->natts; natt++)
                               5247                 :                :     {
                               5248         [ +  + ]:            511 :         if (free[natt])
                               5249                 :             48 :             pfree(DatumGetPointer(attrs[natt]));
                               5250                 :                :     }
                               5251                 :            246 :     pfree(attrs);
                               5252                 :            246 :     pfree(free);
                               5253                 :            246 :     pfree(isnull);
                               5254                 :                : 
                               5255                 :            246 :     MemoryContextSwitchTo(oldcontext);
                               5256                 :                : 
                               5257                 :                :     /* subtract the old change size */
  573 msawada@postgresql.o     5258                 :            246 :     ReorderBufferChangeMemoryUpdate(rb, change, NULL, false, old_size);
                               5259                 :                :     /* now add the change back, with the correct size */
                               5260                 :            246 :     ReorderBufferChangeMemoryUpdate(rb, change, NULL, true,
                               5261                 :                :                                     ReorderBufferChangeSize(change));
                               5262                 :                : }
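
The chunk-stitching loop in ReorderBufferToastReplace() above can be illustrated in isolation. The following is a minimal sketch in plain C, not the PostgreSQL code: `Chunk` and `stitch_chunks` are hypothetical names introduced here, and an array of chunks stands in for the dlist of ReorderBufferChange entries. It shows only the pattern of appending each chunk payload at a running offset and checking the total afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one TOAST chunk: a payload and its length. */
typedef struct Chunk
{
    const char *data;
    size_t      len;
} Chunk;

/*
 * Copy nchunks chunks, in order, into dst (capacity dst_size bytes).
 * Returns the number of bytes written, mirroring how data_done is
 * advanced in the loop above.
 */
size_t
stitch_chunks(char *dst, size_t dst_size, const Chunk *chunks, size_t nchunks)
{
    size_t      done = 0;

    for (size_t i = 0; i < nchunks; i++)
    {
        /* each chunk must fit into the space that is still free */
        assert(chunks[i].len <= dst_size - done);

        memcpy(dst + done, chunks[i].data, chunks[i].len);
        done += chunks[i].len;
    }

    return done;
}
```

A caller would compare the returned byte count against the expected external size, much as the Assert on data_done does after the loop in the original function.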
                               5263                 :                : 
                               5264                 :                : /*
                               5265                 :                :  * Free all resources allocated for toast reconstruction.
                               5266                 :                :  */
                               5267                 :                : static void
 4257 rhaas@postgresql.org     5268                 :         337782 : ReorderBufferToastReset(ReorderBuffer *rb, ReorderBufferTXN *txn)
                               5269                 :                : {
                               5270                 :                :     HASH_SEQ_STATUS hstat;
                               5271                 :                :     ReorderBufferToastEnt *ent;
                               5272                 :                : 
                               5273         [ +  + ]:         337782 :     if (txn->toast_hash == NULL)
                               5274                 :         337747 :         return;
                               5275                 :                : 
                               5276                 :                :     /* sequentially walk over the hash and free everything */
                               5277                 :             35 :     hash_seq_init(&hstat, txn->toast_hash);
                               5278         [ +  + ]:             84 :     while ((ent = (ReorderBufferToastEnt *) hash_seq_search(&hstat)) != NULL)
                               5279                 :                :     {
                               5280                 :                :         dlist_mutable_iter it;
                               5281                 :                : 
                               5282         [ +  + ]:             49 :         if (ent->reconstructed != NULL)
                               5283                 :             48 :             pfree(ent->reconstructed);
                               5284                 :                : 
                               5285   [ +  -  +  + ]:           1879 :         dlist_foreach_modify(it, &ent->chunks)
                               5286                 :                :         {
                               5287                 :           1830 :             ReorderBufferChange *change =
  893 tgl@sss.pgh.pa.us        5288                 :           1830 :                 dlist_container(ReorderBufferChange, node, it.cur);
                               5289                 :                : 
 4257 rhaas@postgresql.org     5290                 :           1830 :             dlist_delete(&change->node);
  230 heikki.linnakangas@i     5291                 :           1830 :             ReorderBufferFreeChange(rb, change, true);
                               5292                 :                :         }
                               5293                 :                :     }
                               5294                 :                : 
 4257 rhaas@postgresql.org     5295                 :             35 :     hash_destroy(txn->toast_hash);
                               5296                 :             35 :     txn->toast_hash = NULL;
                               5297                 :                : }
                               5298                 :                : 
                               5299                 :                : 
                               5300                 :                : /* ---------------------------------------
                               5301                 :                :  * Visibility support for logical decoding
                               5302                 :                :  *
                               5303                 :                :  *
                               5304                 :                :  * Lookup actual cmin/cmax values when using decoding snapshot. We can't
                               5305                 :                :  * always rely on stored cmin/cmax values because of two scenarios:
                               5306                 :                :  *
                               5307                 :                :  * * A tuple got changed multiple times during a single transaction and thus
                               5308                 :                :  *   has got a combo CID. Combo CIDs are only valid for the duration of a
                               5309                 :                :  *   single transaction.
                               5310                 :                :  * * A tuple with a cmin but no cmax (and thus no combo CID) got
                               5311                 :                :  *   deleted/updated in a transaction other than the one which created it
                               5312                 :                :  *   which we are looking at right now. As only one of cmin, cmax or combo CID
                               5313                 :                :  *   is actually stored in the heap we don't have access to the value we
                               5314                 :                :  *   need anymore.
                               5315                 :                :  *
                               5316                 :                :  * To resolve those problems we have a per-transaction hash of (cmin,
                               5317                 :                :  * cmax) tuples keyed by (relfilelocator, ctid) which contains the actual
                               5318                 :                :  * (cmin, cmax) values. That also takes care of combo CIDs by simply
                               5319                 :                :  * not caring about them at all. As we have the real cmin/cmax values
                               5320                 :                :  * combo CIDs aren't interesting.
                               5321                 :                :  *
                               5322                 :                :  * As we only care about catalog tuples here the overhead of this
                               5323                 :                :  * hashtable should be acceptable.
                               5324                 :                :  *
                               5325                 :                :  * Heap rewrites complicate this a bit, check rewriteheap.c for
                               5326                 :                :  * details.
                               5327                 :                :  * -------------------------------------------------------------------------
                               5328                 :                :  */
                               5329                 :                : 
                               5330                 :                : /* struct for sorting mapping files by LSN efficiently */
                               5331                 :                : typedef struct RewriteMappingFile
                               5332                 :                : {
                               5333                 :                :     XLogRecPtr  lsn;
                               5334                 :                :     char        fname[MAXPGPATH];
                               5335                 :                : } RewriteMappingFile;
                               5336                 :                : 
                               5337                 :                : #ifdef NOT_USED
                               5338                 :                : static void
                               5339                 :                : DisplayMapping(HTAB *tuplecid_data)
                               5340                 :                : {
                               5341                 :                :     HASH_SEQ_STATUS hstat;
                               5342                 :                :     ReorderBufferTupleCidEnt *ent;
                               5343                 :                : 
                               5344                 :                :     hash_seq_init(&hstat, tuplecid_data);
                               5345                 :                :     while ((ent = (ReorderBufferTupleCidEnt *) hash_seq_search(&hstat)) != NULL)
                               5346                 :                :     {
                               5347                 :                :         elog(DEBUG3, "mapping: node: %u/%u/%u tid: %u/%u cmin: %u, cmax: %u",
                               5348                 :                :              ent->key.rlocator.dbOid,
                               5349                 :                :              ent->key.rlocator.spcOid,
                               5350                 :                :              ent->key.rlocator.relNumber,
                               5351                 :                :              ItemPointerGetBlockNumber(&ent->key.tid),
                               5352                 :                :              ItemPointerGetOffsetNumber(&ent->key.tid),
                               5353                 :                :              ent->cmin,
                               5354                 :                :              ent->cmax
                               5355                 :                :             );
                               5356                 :                :     }
                               5357                 :                : }
                               5358                 :                : #endif
                               5359                 :                : 
                               5360                 :                : /*
                               5361                 :                :  * Apply a single mapping file to tuplecid_data.
                               5362                 :                :  *
                               5363                 :                :  * The mapping file has to have been verified to be a) committed b) for our
                               5364                 :                :  * transaction c) applied in LSN order.
                               5365                 :                :  */
                               5366                 :                : static void
                               5367                 :             27 : ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
                               5368                 :                : {
                               5369                 :                :     char        path[MAXPGPATH];
                               5370                 :                :     int         fd;
                               5371                 :                :     int         readBytes;
                               5372                 :                :     LogicalRewriteMappingData map;
                               5373                 :                : 
  424 michael@paquier.xyz      5374                 :             27 :     sprintf(path, "%s/%s", PG_LOGICAL_MAPPINGS_DIR, fname);
 2957 peter_e@gmx.net          5375                 :             27 :     fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
 4257 rhaas@postgresql.org     5376         [ +  - ]:             27 :     if (fd < 0)
 4257 rhaas@postgresql.org     5377         [ #  # ]:UBC           0 :         ereport(ERROR,
                               5378                 :                :                 (errcode_for_file_access(),
                               5379                 :                :                  errmsg("could not open file \"%s\": %m", path)));
                               5380                 :                : 
                               5381                 :                :     while (true)
 4257 rhaas@postgresql.org     5382                 :CBC         209 :     {
                               5383                 :                :         ReorderBufferTupleCidKey key;
                               5384                 :                :         ReorderBufferTupleCidEnt *ent;
                               5385                 :                :         ReorderBufferTupleCidEnt *new_ent;
                               5386                 :                :         bool        found;
                               5387                 :                : 
                               5388                 :                :         /* be careful about padding */
                               5389                 :            236 :         memset(&key, 0, sizeof(ReorderBufferTupleCidKey));
                               5390                 :                : 
                               5391                 :                :         /* read all mappings till the end of the file */
 3146                          5392                 :            236 :         pgstat_report_wait_start(WAIT_EVENT_REORDER_LOGICAL_MAPPING_READ);
 4257                          5393                 :            236 :         readBytes = read(fd, &map, sizeof(LogicalRewriteMappingData));
 3146                          5394                 :            236 :         pgstat_report_wait_end();
                               5395                 :                : 
 4257                          5396         [ -  + ]:            236 :         if (readBytes < 0)
 4257 rhaas@postgresql.org     5397         [ #  # ]:UBC           0 :             ereport(ERROR,
                               5398                 :                :                     (errcode_for_file_access(),
                               5399                 :                :                      errmsg("could not read file \"%s\": %m",
                               5400                 :                :                             path)));
 4193 bruce@momjian.us         5401         [ +  + ]:CBC         236 :         else if (readBytes == 0)    /* EOF */
 4257 rhaas@postgresql.org     5402                 :             27 :             break;
                               5403         [ -  + ]:            209 :         else if (readBytes != sizeof(LogicalRewriteMappingData))
 4257 rhaas@postgresql.org     5404         [ #  # ]:UBC           0 :             ereport(ERROR,
                               5405                 :                :                     (errcode_for_file_access(),
                               5406                 :                :                      errmsg("could not read from file \"%s\": read %d instead of %d bytes",
                               5407                 :                :                             path, readBytes,
                               5408                 :                :                             (int32) sizeof(LogicalRewriteMappingData))));
                               5409                 :                : 
 1210 rhaas@postgresql.org     5410                 :CBC         209 :         key.rlocator = map.old_locator;
 4257                          5411                 :            209 :         ItemPointerCopy(&map.old_tid,
                               5412                 :                :                         &key.tid);
                               5413                 :                : 
                               5414                 :                : 
                               5415                 :                :         ent = (ReorderBufferTupleCidEnt *)
  995 peter@eisentraut.org     5416                 :            209 :             hash_search(tuplecid_data, &key, HASH_FIND, NULL);
                               5417                 :                : 
                               5418                 :                :         /* no existing mapping, no need to update */
 4257 rhaas@postgresql.org     5419         [ -  + ]:            209 :         if (!ent)
 4257 rhaas@postgresql.org     5420                 :UBC           0 :             continue;
                               5421                 :                : 
 1210 rhaas@postgresql.org     5422                 :CBC         209 :         key.rlocator = map.new_locator;
 4257                          5423                 :            209 :         ItemPointerCopy(&map.new_tid,
                               5424                 :                :                         &key.tid);
                               5425                 :                : 
                               5426                 :                :         new_ent = (ReorderBufferTupleCidEnt *)
  995 peter@eisentraut.org     5427                 :            209 :             hash_search(tuplecid_data, &key, HASH_ENTER, &found);
                               5428                 :                : 
 4257 rhaas@postgresql.org     5429         [ +  + ]:            209 :         if (found)
                               5430                 :                :         {
                               5431                 :                :             /*
                               5432                 :                :              * Make sure the existing mapping makes sense. We sometimes update
                               5433                 :                :              * old records that did not yet have a cmax (e.g. pg_class' own
                               5434                 :                :              * entry while rewriting it) during rewrites, so allow that.
                               5435                 :                :              */
                               5436   [ +  -  -  + ]:              6 :             Assert(ent->cmin == InvalidCommandId || ent->cmin == new_ent->cmin);
                               5437   [ -  +  -  - ]:              6 :             Assert(ent->cmax == InvalidCommandId || ent->cmax == new_ent->cmax);
                               5438                 :                :         }
                               5439                 :                :         else
                               5440                 :                :         {
                               5441                 :                :             /* update mapping */
                               5442                 :            203 :             new_ent->cmin = ent->cmin;
                               5443                 :            203 :             new_ent->cmax = ent->cmax;
                               5444                 :            203 :             new_ent->combocid = ent->combocid;
                               5445                 :                :         }
                               5446                 :                :     }
                               5447                 :                : 
 2306 peter@eisentraut.org     5448         [ -  + ]:             27 :     if (CloseTransientFile(fd) != 0)
 2425 michael@paquier.xyz      5449         [ #  # ]:UBC           0 :         ereport(ERROR,
                               5450                 :                :                 (errcode_for_file_access(),
                               5451                 :                :                  errmsg("could not close file \"%s\": %m", path)));
 4257 rhaas@postgresql.org     5452                 :CBC          27 : }
                               5453                 :                : 
                               5454                 :                : 
                               5455                 :                : /*
                               5456                 :                :  * Check whether the TransactionId 'xid' is in the pre-sorted array 'xip'.
                               5457                 :                :  */
                               5458                 :                : static bool
                               5459                 :            348 : TransactionIdInArray(TransactionId xid, TransactionId *xip, Size num)
                               5460                 :                : {
                               5461                 :            348 :     return bsearch(&xid, xip, num,
                               5462                 :            348 :                    sizeof(TransactionId), xidComparator) != NULL;
                               5463                 :                : }
                               5464                 :                : 
                               5465                 :                : /*
                               5466                 :                :  * list_sort() comparator for sorting RewriteMappingFiles in LSN order.
                               5467                 :                :  */
                               5468                 :                : static int
 2296 tgl@sss.pgh.pa.us        5469                 :             33 : file_sort_by_lsn(const ListCell *a_p, const ListCell *b_p)
                               5470                 :                : {
                               5471                 :             33 :     RewriteMappingFile *a = (RewriteMappingFile *) lfirst(a_p);
                               5472                 :             33 :     RewriteMappingFile *b = (RewriteMappingFile *) lfirst(b_p);
                               5473                 :                : 
  620 nathan@postgresql.or     5474                 :             33 :     return pg_cmp_u64(a->lsn, b->lsn);
                               5475                 :                : }
                               5476                 :                : 
                               5477                 :                : /*
                               5478                 :                :  * Apply any existing logical remapping files if there are any targeted at our
                               5479                 :                :  * transaction for relid.
                               5480                 :                :  */
                               5481                 :                : static void
 4257 rhaas@postgresql.org     5482                 :             11 : UpdateLogicalMappings(HTAB *tuplecid_data, Oid relid, Snapshot snapshot)
                               5483                 :                : {
                               5484                 :                :     DIR        *mapping_dir;
                               5485                 :                :     struct dirent *mapping_de;
                               5486                 :             11 :     List       *files = NIL;
                               5487                 :                :     ListCell   *file;
                               5488         [ +  - ]:             11 :     Oid         dboid = IsSharedRelation(relid) ? InvalidOid : MyDatabaseId;
                               5489                 :                : 
  424 michael@paquier.xyz      5490                 :             11 :     mapping_dir = AllocateDir(PG_LOGICAL_MAPPINGS_DIR);
                               5491         [ +  + ]:            573 :     while ((mapping_de = ReadDir(mapping_dir, PG_LOGICAL_MAPPINGS_DIR)) != NULL)
                               5492                 :                :     {
                               5493                 :                :         Oid         f_dboid;
                               5494                 :                :         Oid         f_relid;
                               5495                 :                :         TransactionId f_mapped_xid;
                               5496                 :                :         TransactionId f_create_xid;
                               5497                 :                :         XLogRecPtr  f_lsn;
                               5498                 :                :         uint32      f_hi,
                               5499                 :                :                     f_lo;
                               5500                 :                :         RewriteMappingFile *f;
                               5501                 :                : 
 4257 rhaas@postgresql.org     5502         [ +  + ]:            562 :         if (strcmp(mapping_de->d_name, ".") == 0 ||
                               5503         [ +  + ]:            551 :             strcmp(mapping_de->d_name, "..") == 0)
                               5504                 :            535 :             continue;
                               5505                 :                : 
                               5506                 :                :         /* Ignore files that aren't ours */
                               5507         [ -  + ]:            540 :         if (strncmp(mapping_de->d_name, "map-", 4) != 0)
 4257 rhaas@postgresql.org     5508                 :UBC           0 :             continue;
                               5509                 :                : 
 4257 rhaas@postgresql.org     5510         [ -  + ]:CBC         540 :         if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
                               5511                 :                :                    &f_dboid, &f_relid, &f_hi, &f_lo,
                               5512                 :                :                    &f_mapped_xid, &f_create_xid) != 6)
 4199 tgl@sss.pgh.pa.us        5513         [ #  # ]:UBC           0 :             elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
                               5514                 :                : 
 4257 rhaas@postgresql.org     5515                 :CBC         540 :         f_lsn = ((uint64) f_hi) << 32 | f_lo;
                               5516                 :                : 
                               5517                 :                :         /* mapping for another database */
                               5518         [ -  + ]:            540 :         if (f_dboid != dboid)
 4257 rhaas@postgresql.org     5519                 :UBC           0 :             continue;
                               5520                 :                : 
                               5521                 :                :         /* mapping for another relation */
 4257 rhaas@postgresql.org     5522         [ +  + ]:CBC         540 :         if (f_relid != relid)
                               5523                 :             60 :             continue;
                               5524                 :                : 
                               5525                 :                :         /* did the creating transaction abort? */
                               5526         [ +  + ]:            480 :         if (!TransactionIdDidCommit(f_create_xid))
                               5527                 :            132 :             continue;
                               5528                 :                : 
                               5529                 :                :         /* not for our transaction */
                               5530         [ +  + ]:            348 :         if (!TransactionIdInArray(f_mapped_xid, snapshot->subxip, snapshot->subxcnt))
                               5531                 :            321 :             continue;
                               5532                 :                : 
                               5533                 :                :         /* ok, relevant, queue for apply */
                               5534                 :             27 :         f = palloc(sizeof(RewriteMappingFile));
                               5535                 :             27 :         f->lsn = f_lsn;
                               5536                 :             27 :         strcpy(f->fname, mapping_de->d_name);
                               5537                 :             27 :         files = lappend(files, f);
                               5538                 :                :     }
                               5539                 :             11 :     FreeDir(mapping_dir);
                               5540                 :                : 
                               5541                 :                :     /* sort files so we apply them in LSN order */
 2296 tgl@sss.pgh.pa.us        5542                 :             11 :     list_sort(files, file_sort_by_lsn);
                               5543                 :                : 
                               5544   [ +  +  +  +  :             38 :     foreach(file, files)
                                              +  + ]
                               5545                 :                :     {
                               5546                 :             27 :         RewriteMappingFile *f = (RewriteMappingFile *) lfirst(file);
                               5547                 :                : 
 4199                          5548         [ -  + ]:             27 :         elog(DEBUG1, "applying mapping: \"%s\" in %u", f->fname,
                               5549                 :                :              snapshot->subxip[0]);
 4257 rhaas@postgresql.org     5550                 :             27 :         ApplyLogicalMappingFile(tuplecid_data, relid, f->fname);
                               5551                 :             27 :         pfree(f);
                               5552                 :                :     }
                               5553                 :             11 : }
                               5554                 :                : 
                               5555                 :                : /*
                               5556                 :                :  * Lookup cmin/cmax of a tuple, during logical decoding where we can't rely on
                               5557                 :                :  * combo CIDs.
                               5558                 :                :  */
                               5559                 :                : bool
                               5560                 :            775 : ResolveCminCmaxDuringDecoding(HTAB *tuplecid_data,
                               5561                 :                :                               Snapshot snapshot,
                               5562                 :                :                               HeapTuple htup, Buffer buffer,
                               5563                 :                :                               CommandId *cmin, CommandId *cmax)
                               5564                 :                : {
                               5565                 :                :     ReorderBufferTupleCidKey key;
                               5566                 :                :     ReorderBufferTupleCidEnt *ent;
                               5567                 :                :     ForkNumber  forkno;
                               5568                 :                :     BlockNumber blockno;
 4193 bruce@momjian.us         5569                 :            775 :     bool        updated_mapping = false;
                               5570                 :                : 
                               5571                 :                :     /*
                               5572                 :                :      * Return unresolved if tuplecid_data is not valid.  That's because when
                               5573                 :                :      * streaming in-progress transactions we may run into tuples with the CID
                               5574                 :                :      * before actually decoding them.  Think e.g. about INSERT followed by
                               5575                 :                :      * TRUNCATE, where the TRUNCATE may not be decoded yet when applying the
                               5576                 :                :      * INSERT.  So in such cases, we assume the CID is from the future
                               5577                 :                :      * command.
                               5578                 :                :      */
 1907 akapila@postgresql.o     5579         [ +  + ]:            775 :     if (tuplecid_data == NULL)
                               5580                 :             11 :         return false;
                               5581                 :                : 
                               5582                 :                :     /* be careful about padding */
 4257 rhaas@postgresql.org     5583                 :            764 :     memset(&key, 0, sizeof(key));
                               5584                 :                : 
                               5585         [ -  + ]:            764 :     Assert(!BufferIsLocal(buffer));
                               5586                 :                : 
                               5587                 :                :     /*
                               5588                 :                :      * get relfilelocator from the buffer, no convenient way to access it
                               5589                 :                :      * other than that.
                               5590                 :                :      */
 1210                          5591                 :            764 :     BufferGetTag(buffer, &key.rlocator, &forkno, &blockno);
                               5592                 :                : 
                               5593                 :                :     /* tuples can only be in the main fork */
 4257                          5594         [ -  + ]:            764 :     Assert(forkno == MAIN_FORKNUM);
                               5595         [ -  + ]:            764 :     Assert(blockno == ItemPointerGetBlockNumber(&htup->t_self));
                               5596                 :                : 
                               5597                 :            764 :     ItemPointerCopy(&htup->t_self,
                               5598                 :                :                     &key.tid);
                               5599                 :                : 
                               5600                 :            775 : restart:
                               5601                 :                :     ent = (ReorderBufferTupleCidEnt *)
  995 peter@eisentraut.org     5602                 :            775 :         hash_search(tuplecid_data, &key, HASH_FIND, NULL);
                               5603                 :                : 
                               5604                 :                :     /*
                               5605                 :                :      * failed to find a mapping, check whether the table was rewritten and
                               5606                 :                :      * apply mapping if so, but only do that once - there can be no new
                               5607                 :                :      * mappings while we are in here since we have to hold a lock on the
                               5608                 :                :      * relation.
                               5609                 :                :      */
 4257 rhaas@postgresql.org     5610   [ +  +  +  + ]:            775 :     if (ent == NULL && !updated_mapping)
                               5611                 :                :     {
                               5612                 :             11 :         UpdateLogicalMappings(tuplecid_data, htup->t_tableOid, snapshot);
                               5613                 :                :         /* now check but don't update for a mapping again */
                               5614                 :             11 :         updated_mapping = true;
                               5615                 :             11 :         goto restart;
                               5616                 :                :     }
                               5617         [ +  + ]:            764 :     else if (ent == NULL)
                               5618                 :              5 :         return false;
                               5619                 :                : 
                               5620         [ +  - ]:            759 :     if (cmin)
                               5621                 :            759 :         *cmin = ent->cmin;
                               5622         [ +  - ]:            759 :     if (cmax)
                               5623                 :            759 :         *cmax = ent->cmax;
                               5624                 :            759 :     return true;
                               5625                 :                : }
                               5626                 :                : 
                               5627                 :                : /*
                               5628                 :                :  * Count invalidation messages of specified transaction.
                               5629                 :                :  *
                               5630                 :                :  * Returns number of messages, and msgs is set to the pointer of the linked
                               5631                 :                :  * list for the messages.
                               5632                 :                :  */
                               5633                 :                : uint32
  201 akapila@postgresql.o     5634                 :             32 : ReorderBufferGetInvalidations(ReorderBuffer *rb, TransactionId xid,
                               5635                 :                :                               SharedInvalidationMessage **msgs)
                               5636                 :                : {
                               5637                 :                :     ReorderBufferTXN *txn;
                               5638                 :                : 
                               5639                 :             32 :     txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                               5640                 :                :                                 false);
                               5641                 :                : 
                               5642         [ -  + ]:             32 :     if (txn == NULL)
  201 akapila@postgresql.o     5643                 :UBC           0 :         return 0;
                               5644                 :                : 
  201 akapila@postgresql.o     5645                 :CBC          32 :     *msgs = txn->invalidations;
                               5646                 :                : 
                               5647                 :             32 :     return txn->ninvalidations;
                               5648                 :                : }
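
The listing above builds each mapping file's LSN from two 32-bit halves, sorts the queued files by LSN, and tests transaction-ID membership with a binary search over a pre-sorted array. The following is a minimal standalone sketch of those three steps, not the PostgreSQL implementation. The names `MappingFile`, `make_lsn`, `cmp_by_lsn`, `cmp_xid`, and `xid_in_sorted_array` are hypothetical stand-ins for illustration, and the sketch assumes transaction IDs are compared as plain unsigned 32-bit integers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for RewriteMappingFile: an LSN plus a file name. */
typedef struct MappingFile
{
    uint64_t    lsn;
    const char *fname;
} MappingFile;

/* Combine the high and low 32-bit halves into one 64-bit LSN. */
static uint64_t
make_lsn(uint32_t hi, uint32_t lo)
{
    return ((uint64_t) hi << 32) | lo;
}

/*
 * qsort() comparator ordering files by ascending LSN.  Comparing with
 * < and > avoids the truncation that subtracting two 64-bit values and
 * returning the result as int would cause.
 */
static int
cmp_by_lsn(const void *a, const void *b)
{
    const MappingFile *fa = a;
    const MappingFile *fb = b;

    return (fa->lsn > fb->lsn) - (fa->lsn < fb->lsn);
}

/* Comparator for 32-bit IDs, assumed here to be plain numeric order. */
static int
cmp_xid(const void *a, const void *b)
{
    uint32_t    xa = *(const uint32_t *) a;
    uint32_t    xb = *(const uint32_t *) b;

    return (xa > xb) - (xa < xb);
}

/* Return nonzero if 'xid' occurs in the sorted array 'xip' of length 'n'. */
static int
xid_in_sorted_array(uint32_t xid, const uint32_t *xip, size_t n)
{
    return bsearch(&xid, xip, n, sizeof(uint32_t), cmp_xid) != NULL;
}
```

A caller would fill an array of `MappingFile` entries, run `qsort(files, n, sizeof(files[0]), cmp_by_lsn)`, and then process the entries in order. The binary search only gives correct answers when the array is already sorted with the same comparator.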
        

Generated by: LCOV version 2.4-beta