Saturday, April 30, 2016

PostgreSQL Transaction Internal Code Flow

This post is in continuation of my previous post. Here I would explain main transaction working from code flow perspective. Sub-transaction, MVCC and other related details will be covered in subsequent posts. Request to see my previous post Basic of Transaction in order to get better understanding. 
Internally each transaction is represented as unique number in increasing order with an exception in-case of overflow. Also whole of transaction flow is tracked using various state as explained below:

Command Execution:

Usually each command execution has 3 steps as:
  • StartTransactionCommand
  • Do the operation specific to command
  • CommitTransactionCommmand
As mentioned in my earlier post, if transaction is not started explicitly then it will be started internally for any command execution. So this forms two way of startign the transaction. Below it explain both of the case and corresponding state transition involved. 

Meaning  of below nomenclature
                    X ====action====>Y
Current state is X, in this it does operation "action" and gets transferred to state "Y". In-case there is no action mentioned, means in current state it does not do any operation, it directly moves to state "Y".
There are mainly 3 actions mentioned below, the purpose of each actions are as below:
  • StartTransaction: Assign resources in terms of memory, initializes the guc variable specific, initialize the transaction properties like read-only transaction or read-write, create a new resource owner etc.
  • CommitTransaction: Undo all initialization done by StartTransaction. Execute any pending triggers, handle all on commit action if any. Generate a COMMIT WAL record and insert same in WAL buffer, Update commit TS data. Then depending on synchronous commit or asynchronous commit configured, wait for response from standby node or directly flush the transaction to CLOG respectively.
  • AbortTransaction: Almost similar to CommitTransaction except it writes an ABORT WAL, marks the transaction as ABORTED.

State transition mentioned in subsequent section are states from client queries perspective. In addition to these state, there are few states from server perspective also, these are:
  • TRANS_DEFAULT: Idle, its default state.
  • TRANS_START: This is the state in which transaction initialization happens (StartTransaction)
  • TRANS_INPROGRESS: Means transaction has been started.
  • TRANS_COMMIT: This state shows transaction commit is in progress (CommitTransaction)
  • TRANS_ABORT:This state shows transaction abort is in progress (AbortTransaction)

Case-1: Explicit Start of Transaction:


Execution of START TRANSACTION command:

  • StartTransactionCommand (TBLOCK_DEFAULT)====StartTransaction====>TBLOCK_STARTED
  • Processing this command:  BeginTransactionBlock(TBLOCK_STARTED) =====> TBLOCK_BEGIN
  • CommitTransactionCommmand(TBLOCK_BEGIN) ====> TBLOCK_INPROGRESS


Execution of a normal command:

  • StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
  • Processing this command: Till this point no transaction ID has been assigned, as was not sure if any command requiring transaction id going to be executed. So now call the function AssignTransactionId to assign a new transaction id. Current number of transaction is maintained in a shared variable ShmemVariableCache->nextXid; So the current value of this variable is taken as transaction id for current transaction. Then value of ShmemVariableCache->nextXid is incremented (taking care of overflow case). 
        Also each transaction information needs to be made durable (its one of the properties), for which it maintains:
  1. Commit log for each transaction (Called CLOG stored in clog page)
  2. Each transaction commit timestamp (call CommitTs stores in separate page)
        If current XID is going to be stored in a new page (either because its first transaction in the system or existing page is full), then it needs reset whole content of new page with zero. This should be done for all pages used for storing these information.

        Finally it should:
  1. Also each session maintains MyPgXact, which maintains transaction information in memory. This is used by all other session for taking various decision. So assign this transaction to MyPgXact.
  2. The new transaction id is stored in the each tuple being created (more on this in coming post related to MVCC).
  3. Stores the current command id in each tuple being created.
  4. Then continue with normal command operation.

  • CommitTransactionCommmand(TBLOCK_INPROGRESS): Does command counter increment (CommandCounterIncrement) i.e. increments the command id so that if multiple commands running in same transaction then next command can see the operation done by previous command. No state transition.


Executing again a command:

  • StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
  • Processing this command: Since already transaction assigned nothing to do. Just continue with command execution.
  • CommitTransactionCommmand(TBLOCK_INPROGRESS): Does command counter increment (CommandCounterIncrement) i.e. increments the command id so that if multiple commands running in same transaction then next command can see the operation done by previous command. No state transition.


Executing COMMIT/END command:

  • StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
  • Processing this command: EndTransactionBlock(TBLOCK_INPROGRESS) ====> TBLOCK_END
  • CommitTransactionCommmand(TBLOCK_END)====CommitTransaction====> TBLOCK_DEFAULT


Executing ROLLBACK command:

  • StartTransactionCommand (TBLOCK_INPROGRESS): Nothing to do.
  • Processing this command: UserAbortTransactionBlock(TBLOCK_INPROGRESS) ====> TBLOCK_ABORT_PENDING
  • CommitTransactionCommmand(TBLOCK_ABORT_PENDING)====AbortTransaction====> TBLOCK_DEFAULT

Case-2: Implicit Start of Transaction:

Transaction is started automatically in-case a transaction block start command was not executed. This transaction is valid only for the current command and it gets committed or aborted automatically once command gets executed successfully or gets fail respectively


Command Execution- Success:

  • StartTransactionCommand (TBLOCK_DEFAULT)====StartTransaction====>TBLOCK_STARTED
  • Do the actual operation
  • CommitTransactionCommmand(TBLOCK_STARTED)====CommitTransaction====> TBLOCK_DEFAULT


Command Execution - Fail:

  • StartTransactionCommand (TBLOCK_DEFAULT)====StartTransaction====>TBLOCK_STARTED
  • Do the actual operation
  • AbortCurrentTransaction(TBLOCK_STARTED) ====AbortTransaction====> TBLOCK_DEFAULT

In my subsequent post, I would cover some more details about transaction and their implementation details.

Any comment/query/suggestion welcome.

16 comments:

  1. Thanks for the post. I've been thinking about looking into transactions for postgres in order to add two things:
    1) Support for implicit transactions that don't auto commit/rollback
    2) Support for keeping transaction running even if a command fails

    Perhaps you could give some pointers here?

    -- john

    ReplyDelete
    Replies
    1. @John:
      1) implicit transactions do not issue COMMIT/ROLLBACK, those are explicit transactions;
      2) I don't see any point in having a transaction to conitnue even if a command (within the transaction) fails, this is totally against the concept of transaction.

      Delete
    2. @John,
      1. If I understand correctly, if you don't want to auto-commit/rollback means you may want to execute few more queries inside the same transaction block. So in that case better to use explicit transaction which does not do auto-commit/rollback. Another case may be that you want to execute only one query with-in a transaction and want it to be committed or rollback later then you can use prepare transaction (already available in PG. I shall cover this in my future post).
      2. Why do you want transaction to be running even if a command fails. As Luca Ferrari pointer it will be against transaction core properties.

      If you can share exact use-cases behind these requirement, I shall be in better position to help you.

      Regards,
      Rajeev

      Delete
    3. Hi guys,

      Thanks for following up!
      Regarding #2 If I have began a transaction and issued a bunch of update statements and then I issue a statement where I misspelled a table or perhapps I wrote "selct" why on earth would I want the entire transaction to rollback then? When all I want is to see the error message, correct my statement and continue until I decide things should be rolled back or committed.

      prod=> begin;
      BEGIN
      prod=> select * from x;
      ERROR: relation "x" does not exist
      LINE 1: select * from x;
      ^
      prod=> create table x(y integer);
      ERROR: current transaction is aborted, commands ignored until end of transaction block

      Why, ohh why?
      *BUT* I just discovered that setting autocommit to off makes it behave just like I want regarding both the implicit begin + ignore errors!!

      prod=> \set autocommit off;
      prod=> select * from x;
      ERROR: relation "x" does not exist
      LINE 1: select * from x;
      ^
      prod=> create table x(y integer);
      CREATE TABLE
      prod=> select * from y;
      ERROR: relation "y" does not exist
      LINE 1: select * from y;
      ^
      prod=> create table y(x integer);
      CREATE TABLE
      prod=> commit;
      WARNING: there is no transaction in progress
      COMMIT
      prod=> select * from x;
      y
      ---
      (0 rows)

      prod=> select * from y;
      x
      ---
      (0 rows)


      I've used PGAdmin-III and there you have an option to disable AutoCommit but it still won't let you correct an error that "\set autocommit off" did.
      Then one could probably argue if that is a bug or feature of psql to allow this?

      Anyway, "\set autocommit off;" seems to make it behave exactly like how Oracle Psql does this is what I want.


      -- john

      Delete
  2. Follow up,

    It does not work like I want.

    psql (9.4.7)
    Type "help" for help.

    prod=> \set autocommit off;
    prod=> select * from x;
    ERROR: relation "x" does not exist
    LINE 1: select * from x;
    ^
    prod=> create table x(y int);
    CREATE TABLE
    prod=> select * from z;
    ERROR: relation "z" does not exist
    LINE 1: select * from z;
    ^
    prod=> rollback;
    WARNING: there is no transaction in progress
    ROLLBACK
    prod=> select * from x;
    y
    ---
    (0 rows)

    prod=> drop table x;
    DROP TABLE
    prod=> commit;
    WARNING: there is no transaction in progress
    COMMIT
    prod=> \q


    In statement above I wouldn't expect the first implicit transaction to autocommit when I've set autocommit to off. I guess that it auto commits things when it gets the error with "select * from z;"... Holy Cow Batman!

    Comments?

    -- john

    ReplyDelete
  3. Thanks for sharing with us this important Content. I feel strongly about it and really enjoyed learning more about this topic. Feel free to visit my website; 토토사이트

    ReplyDelete
  4. Amazing article. Your blog helped me to improve myself in many ways thanks for sharing this kind of wonderful informative blogs in live. I have bookmarked more article from this website. Such a nice blog you are providing. 야동

    Also feel free to visit may web page check this link 일본야동

    ReplyDelete
  5. Thanks for the marvelous posting! I certainly enjoyed reading it, you’re a great author. I will ensure that I bookmark your blog and may come back from now on. I want to encourage you to continue your great writing, have a nice day! 국산야동

    Also feel free to visit may web page check this link 야설

    ReplyDelete
  6. I’m impressed, I must say. Rarely do I come across a blog that’s both equally educative and interesting, and let me tell you, you’ve hit the nail on the head 한국야동

    Also feel free to visit may web page check this link 야설

    ReplyDelete
  7. Heya i'm for the first time here. I found this board and I find It really useful & it helped
    me out a lot. I hope to offer one thing back and help others like you helped
    me. 토토사이트


    ReplyDelete
  8. webgirls.pl When it comes to combating candidiasis, victims frequently have their work reduce on their behalf. The reason being candida albicans can certainly grow to be long-term and on-going. Bearing that in mind, in the following paragraphs, we will present a variety of among the best proven candida treatment and elimination suggestions around.

    ReplyDelete
  9. https://gameeffect.xyz Lots of people have liked the game of baseball for years. There are actually supporters around the world, from dedicated little-leaguers to pass away-hard spectators. This information has suggestions to show how pleasant baseball is really.

    ReplyDelete
  10. https://gamezoom.xyz Getting a exercise routine companion can considerably enhance your muscle tissue-constructing final results. Your spouse can be quite a beneficial supply of enthusiasm for adhering to your regular workout program, and driving anyone to increase your efforts when you workout. Developing a trustworthy spouse to work through with will also help keep you harmless as you will invariably possess a spotter.

    ReplyDelete
  11. 仮想 通貨 カジノ The thing is them on magazines and also on Tv set, men and women who appear like their arms and legs will explode his or her muscle tissues are so big! There is absolutely no will need that you can consider your body to that levels in the event you don't wish to, as being the easy strategies in the following paragraphs will help you to develop muscle within a wholesome way.

    ReplyDelete
  12. I have read through your post twice or three times, as your points coincide with my own. Your readers will benefit greatly from this outstanding information. Also Check: BSc 3rd Year Result

    ReplyDelete