This post continues my previous post. Here I explain how the main transaction works from a code-flow perspective. Sub-transactions, MVCC and other related details will be covered in subsequent posts. Please read my previous post, Basic of Transaction, first to get a better understanding.
Internally, each transaction is represented by a unique number assigned in increasing order, except when the counter overflows and wraps around. The whole transaction flow is tracked using various states, as explained below:
Command Execution:
Each command execution usually has three steps:
- StartTransactionCommand
- Do the operation specific to the command
- CommitTransactionCommand
As mentioned in my earlier post, if a transaction is not started explicitly, one is started internally for any command execution. So there are two ways of starting a transaction. Both cases, and the state transitions involved, are explained below.
Meaning of the notation used below:
X ====action====>Y
The current state is X; in this state it performs the operation "action" and moves to state "Y". If no action is mentioned, it performs no operation in the current state and moves directly to state "Y".
Three main actions appear in the transitions below. Their purposes are:
- StartTransaction: Allocates resources such as memory, initializes the transaction-specific GUC variables, initializes transaction properties such as read-only or read-write, creates a new resource owner, etc.
- CommitTransaction: Executes any pending triggers and handles all ON COMMIT actions, if any. Generates a COMMIT WAL record, inserts it into the WAL buffer and updates the commit timestamp (CommitTs) data. Then, depending on whether synchronous or asynchronous commit is configured, it either waits for a response from the standby node or directly records the transaction status in CLOG, respectively. Finally, it releases everything that StartTransaction set up.
- AbortTransaction: Almost the same as CommitTransaction, except that it writes an ABORT WAL record and marks the transaction as ABORTED.
The state transitions in the following sections are the states as seen from the client-query perspective. In addition to these, there are a few states from the server perspective:
- TRANS_DEFAULT: Idle; this is the default state.
- TRANS_START: The state in which transaction initialization happens (StartTransaction).
- TRANS_INPROGRESS: The transaction has been started.
- TRANS_COMMIT: Transaction commit is in progress (CommitTransaction).
- TRANS_ABORT: Transaction abort is in progress (AbortTransaction).
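To make the server-side states concrete, here is a minimal toy model in C. It is only a sketch: the names ToyTransState, ToyTransaction and the toy_* functions are hypothetical and are not PostgreSQL code, the "resources" field is a stand-in for the memory and resource owner mentioned above, and the model assumes that both commit and abort return straight to the idle state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy copy of the server-side states listed above (not the real definitions). */
typedef enum {
    TRANS_DEFAULT,
    TRANS_START,
    TRANS_INPROGRESS,
    TRANS_COMMIT,
    TRANS_ABORT
} ToyTransState;

typedef struct {
    ToyTransState state;
    int resources;              /* stand-in for memory, resource owner, ... */
} ToyTransaction;

/* Models StartTransaction: only legal when the session is idle. */
static bool toy_start_transaction(ToyTransaction *t)
{
    assert(t != NULL);
    if (t->state != TRANS_DEFAULT)
        return false;
    t->state = TRANS_START;         /* initialization in progress */
    t->resources = 1;               /* pretend to allocate resources */
    t->state = TRANS_INPROGRESS;    /* ready to execute commands */
    return true;
}

/* Models CommitTransaction: only legal while a transaction is running. */
static bool toy_commit_transaction(ToyTransaction *t)
{
    assert(t != NULL);
    if (t->state != TRANS_INPROGRESS)
        return false;
    t->state = TRANS_COMMIT;        /* commit in progress */
    t->resources = 0;               /* release what the start allocated */
    t->state = TRANS_DEFAULT;       /* back to idle */
    return true;
}

/* Models AbortTransaction; this toy folds cleanup into the same step. */
static bool toy_abort_transaction(ToyTransaction *t)
{
    assert(t != NULL);
    if (t->state != TRANS_INPROGRESS)
        return false;
    t->state = TRANS_ABORT;         /* abort in progress */
    t->resources = 0;
    t->state = TRANS_DEFAULT;
    return true;
}
```

Starting from TRANS_DEFAULT, a call to toy_start_transaction leaves the toy in TRANS_INPROGRESS, and a second start attempt is rejected until a commit or abort brings it back to idle.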
Case-1: Explicit Start of Transaction:
Execution of START TRANSACTION command:
- StartTransactionCommand (TBLOCK_DEFAULT)====StartTransaction====>TBLOCK_STARTED
- Processing this command: BeginTransactionBlock(TBLOCK_STARTED) =====> TBLOCK_BEGIN.
- CommitTransactionCommand(TBLOCK_BEGIN) ====> TBLOCK_INPROGRESS
Execution of a normal command:
- StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
- Processing this command: Up to this point no transaction ID has been assigned, because it was not yet known whether any command requiring a transaction ID would be executed. So the function AssignTransactionId is now called to assign a new transaction ID. The next transaction ID to hand out is kept in the shared variable ShmemVariableCache->nextXid, so the current value of this variable is taken as the ID of the current transaction. The value of ShmemVariableCache->nextXid is then incremented (taking care of the overflow case).
The status of each transaction also needs to be made durable (durability is one of the transaction properties), for which the server maintains:
- A commit log entry for each transaction (called CLOG, stored in CLOG pages)
- A commit timestamp for each transaction (called CommitTs, stored in separate pages)
If the current XID is going to be stored in a new page (either because it is the first transaction in the system or because the existing page is full), the whole content of the new page must first be reset to zero. This is done for every kind of page used to store this information.
Finally, it does the following:
- Each session maintains MyPgXact, which holds its transaction information in memory. All other sessions use it to take various decisions, so the new transaction ID is assigned to MyPgXact.
- Stores the new transaction ID in each tuple being created (more on this in a coming post on MVCC).
- Stores the current command ID in each tuple being created.
- Continues with normal command operation.
- CommitTransactionCommand(TBLOCK_INPROGRESS): Performs a command counter increment (CommandCounterIncrement), i.e. it increments the command ID so that, when multiple commands run in the same transaction, the next command can see the changes made by the previous one. No state transition.
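The transaction ID assignment described in the processing step above can be sketched as a small toy in C. This is not the PostgreSQL implementation: ToyVariableCache stands in for ShmemVariableCache, and the toy_* names are hypothetical. The sketch assumes that IDs 0 to 2 are reserved, so the counter restarts at 3 after overflow, and that one status page holds 32768 transactions (an 8 kB page with 2 status bits per transaction); neither number comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

#define TOY_FIRST_NORMAL_XID  ((ToyXid) 3)   /* assumption: 0, 1, 2 are reserved */
#define TOY_XACTS_PER_PAGE    32768u         /* assumption: 8 kB page, 2 bits each */

/* Stand-in for the shared counter ShmemVariableCache->nextXid. */
typedef struct {
    ToyXid next_xid;
} ToyVariableCache;

/* True when xid is the first one stored on its status page,
 * i.e. the page has to be zeroed before the status is written. */
static bool toy_needs_new_page(ToyXid xid)
{
    return xid == TOY_FIRST_NORMAL_XID || (xid % TOY_XACTS_PER_PAGE) == 0;
}

/* Hands out the current counter value and advances the counter,
 * skipping the reserved IDs when the 32-bit counter wraps around. */
static ToyXid toy_assign_xid(ToyVariableCache *cache, bool *zero_new_page)
{
    assert(cache != NULL && zero_new_page != NULL);

    ToyXid xid = cache->next_xid;
    *zero_new_page = toy_needs_new_page(xid);

    cache->next_xid++;                          /* unsigned, wraps to 0 */
    if (cache->next_xid < TOY_FIRST_NORMAL_XID)
        cache->next_xid = TOY_FIRST_NORMAL_XID; /* overflow case */

    return xid;
}
```

In the real server this step runs under a lock, because the counter is shared by all sessions; the toy leaves locking out to keep the flow visible.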
Execution of another command:
- StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
- Processing this command: A transaction ID has already been assigned, so there is nothing to do here. Just continue with command execution.
- CommitTransactionCommand(TBLOCK_INPROGRESS): Performs a command counter increment (CommandCounterIncrement), i.e. it increments the command ID so that, when multiple commands run in the same transaction, the next command can see the changes made by the previous one. No state transition.
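Why the command counter increment matters can be shown with a tiny toy in C. All names here (ToyXactState, ToyTuple, the toy_* functions) are hypothetical, and the visibility rule is a simplifying assumption for rows created by the same transaction: a row is visible only if the command ID stamped on it is lower than the current command ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyCommandId;

/* Per-transaction state: only the current command ID matters here. */
typedef struct {
    ToyCommandId current_cid;
} ToyXactState;

/* A row stamped with the command ID that created it. */
typedef struct {
    ToyCommandId cmin;
    int value;
} ToyTuple;

/* Creates a row and stamps it with the current command ID. */
static ToyTuple toy_insert(const ToyXactState *x, int value)
{
    assert(x != NULL);
    ToyTuple t = { x->current_cid, value };
    return t;
}

/* Models CommandCounterIncrement at the end of each command. */
static void toy_command_counter_increment(ToyXactState *x)
{
    assert(x != NULL);
    x->current_cid++;
}

/* Assumed rule: a row made by this transaction is visible only to
 * commands that started after the command that created it. */
static bool toy_visible_in_own_xact(const ToyXactState *x, const ToyTuple *t)
{
    assert(x != NULL && t != NULL);
    return t->cmin < x->current_cid;
}
```

A row inserted by command 0 is not visible to command 0 itself, but becomes visible once the counter has moved on to command 1.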
Executing COMMIT/END command:
- StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
- Processing this command: EndTransactionBlock(TBLOCK_INPROGRESS) ====> TBLOCK_END
- CommitTransactionCommand(TBLOCK_END)====CommitTransaction====> TBLOCK_DEFAULT
Executing ROLLBACK command:
- StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
- Processing this command: UserAbortTransactionBlock(TBLOCK_INPROGRESS) ====> TBLOCK_ABORT_PENDING
- CommitTransactionCommand(TBLOCK_ABORT_PENDING)====AbortTransaction====> TBLOCK_DEFAULT
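The Case-1 transitions can be collected into one small state machine. The C sketch below is a toy that mirrors only the transitions listed in this section; ToySession and the toy_* functions are hypothetical names, the commit and abort counters stand in for the real CommitTransaction and AbortTransaction work, and states not discussed in this post are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Client-visible block states used in this post (toy subset). */
typedef enum {
    TBLOCK_DEFAULT,
    TBLOCK_STARTED,
    TBLOCK_BEGIN,
    TBLOCK_INPROGRESS,
    TBLOCK_END,
    TBLOCK_ABORT_PENDING
} ToyBlockState;

typedef struct {
    ToyBlockState state;
    int commits;    /* how often CommitTransaction would have run */
    int aborts;     /* how often AbortTransaction would have run */
} ToySession;

/* Step 1 of every command. */
static void toy_start_transaction_command(ToySession *s)
{
    assert(s != NULL);
    if (s->state == TBLOCK_DEFAULT)
        s->state = TBLOCK_STARTED;      /* StartTransaction runs here */
    /* TBLOCK_INPROGRESS: nothing to do */
}

/* Step 2 for START TRANSACTION. */
static void toy_begin_transaction_block(ToySession *s)
{
    assert(s != NULL);
    if (s->state == TBLOCK_STARTED)
        s->state = TBLOCK_BEGIN;
}

/* Step 2 for COMMIT/END. */
static void toy_end_transaction_block(ToySession *s)
{
    assert(s != NULL);
    if (s->state == TBLOCK_INPROGRESS)
        s->state = TBLOCK_END;
}

/* Step 2 for ROLLBACK. */
static void toy_user_abort_transaction_block(ToySession *s)
{
    assert(s != NULL);
    if (s->state == TBLOCK_INPROGRESS)
        s->state = TBLOCK_ABORT_PENDING;
}

/* Step 3 of every command. */
static void toy_commit_transaction_command(ToySession *s)
{
    assert(s != NULL);
    switch (s->state) {
    case TBLOCK_BEGIN:
        s->state = TBLOCK_INPROGRESS;
        break;
    case TBLOCK_INPROGRESS:
        /* CommandCounterIncrement would run here; no state change */
        break;
    case TBLOCK_STARTED:
    case TBLOCK_END:
        s->commits++;                   /* CommitTransaction */
        s->state = TBLOCK_DEFAULT;
        break;
    case TBLOCK_ABORT_PENDING:
        s->aborts++;                    /* AbortTransaction */
        s->state = TBLOCK_DEFAULT;
        break;
    case TBLOCK_DEFAULT:
        break;
    }
}
```

Each client command maps to three calls: toy_start_transaction_command, then the command-specific function (if any), then toy_commit_transaction_command. Running START TRANSACTION, one normal command and COMMIT through the toy ends in TBLOCK_DEFAULT with one commit counted.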
Case-2: Implicit Start of Transaction:
A transaction is started automatically if no transaction block start command was executed. This transaction is valid only for the current command, and it is committed or aborted automatically once the command succeeds or fails, respectively.
Command Execution - Success:
- StartTransactionCommand (TBLOCK_DEFAULT)====StartTransaction====>TBLOCK_STARTED
- Do the actual operation
- CommitTransactionCommand(TBLOCK_STARTED)====CommitTransaction====> TBLOCK_DEFAULT
Command Execution - Fail:
- StartTransactionCommand (TBLOCK_DEFAULT)====StartTransaction====>TBLOCK_STARTED
- Do the actual operation
- AbortCurrentTransaction(TBLOCK_STARTED) ====AbortTransaction====> TBLOCK_DEFAULT
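The implicit case can be sketched the same way. In this toy C model the operation reports failure through a boolean return value, which is a simplifying assumption: it only stands in for whatever error path leads the server to call AbortCurrentTransaction. The AUTO_ prefix and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Only the two block states that the implicit case passes through. */
typedef enum {
    AUTO_TBLOCK_DEFAULT,
    AUTO_TBLOCK_STARTED
} ToyAutoState;

typedef struct {
    ToyAutoState state;
    int commits;
    int aborts;
} ToyAutoSession;

/* The command-specific work; returns false to signal failure. */
typedef bool (*ToyOperation)(void);

static bool toy_op_succeeds(void) { return true; }
static bool toy_op_fails(void)    { return false; }

/* Runs one command inside its own implicit transaction. */
static bool toy_run_command(ToyAutoSession *s, ToyOperation op)
{
    assert(s != NULL && op != NULL);
    assert(s->state == AUTO_TBLOCK_DEFAULT);

    s->state = AUTO_TBLOCK_STARTED;     /* StartTransactionCommand */

    bool ok = op();                     /* do the actual operation */
    if (ok)
        s->commits++;                   /* CommitTransactionCommand */
    else
        s->aborts++;                    /* AbortCurrentTransaction */

    s->state = AUTO_TBLOCK_DEFAULT;     /* both paths end idle */
    return ok;
}
```

Whether the operation succeeds or fails, the session is back in the default state afterwards, so the next command starts a fresh implicit transaction.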
In subsequent posts, I will cover more details about transactions and their implementation.
Any comments, queries or suggestions are welcome.