Wednesday, March 25, 2015

Index Scan Optimization for ">" condition

In PostgreSQL 9.5, we can see improved performance  for Index Scan on ">" condition.

In order to explain this optimization, consider the below schema:
create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

Query as:
                select count(*) from tbl2 where id2>'a' and id3>990000;

As per design prior to this patch, Above query used following steps to retrieve index tuples:

  • Find the scan start position by searching first position in BTree as per the first key condition i.e. as per id2>'a'
  • Then it fetches each tuples from position found in step-1.
  • For each tuple, it matches all scan key condition, in our example it matches both scan key condition.
  • If condition match, it returns the tuple otherwise scan stops.


Now problem is here that already first scan key condition is matched to find the scan start position (Step-1), so it is obvious that any further tuple also will match the first scan key condition (as records are sorted).

So comparison on first scan key condition again in step-3 seems to be redundant.

So we have made the changes in BTree scan algorithm to avoid the redundant check i.e. remove the first key comparison for each tuple as it is guaranteed to be always true.

Performance result summary:



I would like to thanks Simon Riggs for verifying and committing this patch. Simon Riggs also confirmed improvement of 5% in both short and long index, on the least beneficial data-type and considered to be very positive win overall. 

2 comments:

  1. It's late finding this act. At least, it's a thing to be familiar with that there are such events exist. I agree with your Blog and I will be back to inspect it more in the future so please keep up your act 야설
    Also visit my web site 야설

    ReplyDelete