Smart Scan In Exadata

Smart scan is one of the major contributors which boost the database performance and reduce data transfer traffic over the network from storage tier to compute node. But 2 question raises here:

 (1)  How Exadata machine choose to use smart scan for a sql statement ?

(2)  How it helps to transfer only necessary block from storage tier to compute node

 

In the traditional database system the storage system was not aware with the database technology so whenever the database client request came for retrieving a specific block or blocks of data it simply revert back the data block which includes necessary and unnecessary both and then the use of predicates referencing in the query and filtration work done at the database’s instance memory due to which  the data transfer traffic increases over the network however in Exadata , the Exadata storage server software has self-intelligence and also they are aware with the database technology so whenever the database client request came then all the work of using predicates and filtration done at the storage tier due to which only the required data block has been sent to the compute/database server which also reduces the data transfer load over the network and result in high response time.

 

Facts Need To Remember About Smart Scan

 

Smart scan is a run time decision decided by exadata strorage software however we still need to remember some facts related to smart scan feature.

 

(1) Smart scan also works well with encrypted and compressed data set.

(2) Optimizer cannot decide whether a particular query will use the feature of smart scan or not however smart scan indirectly influences by the result of query optimization

(3) Smart scan work for full table table,index fast full scan and parallel queries

(4) Any query which is using a parallelism is automatically a potential candidate for using smart scan

(5) It’s not always mandatory that the query should use parallelism to use smart scan feature however if a query will go for a full table scan, then that query is also a potential candidate for smart scan.

(6)  Smart scan must be enabled within the database using cell_offload_processing parameter which must be set to TRUE

(7) The default value of cell_offload_processing parameter is TRUE

(8) If a join operator is using bloom filter then that query is also a good candidate for using smart scan feature

 


  Conditions Prevents To Use Smart Scan

 

(1) If cell_offload_processing parameter is set to FALSE or any query which is using a param_hint of cell_offload_processing to false then that query will not use smart scan feature

(2) If the number of columns referencing in the query is more than 255 then the query will not use smart scan

(3) If the query is referencing a virtual column, smart will not work

(4) If the table referencing into the query is IOT table, smart scan will not work

(5)  If the table is a clustered table, smart scan will not work

(6)  If the command is creating using NOSORT clause then smart scan will not work

(7) If the column referencing in a query is a LONG or LOB column then smart scan will not work

(8) The Sql in question must not be quarantined

(9) Some sql function or operator also doesn’t support smart scan feature as well and to check whether a particular sql function or operator uses the smart scan feature for that we can use v$sqlfn_metadata view where OFFLODABLE column display the value YES or NO where YES indicate the use of smart scan and NO indicate that the smart scan will not be use by the sql function or operator

  

What Is SQL Quarantine

 When a cellsrv process (on an exadata storage server cell) terminates and, if the thread that caused the death (or rather encountered the death) was performing smart scan, we determine the sql step that the thread was performing and make a note of it in a file on the cell. When the cellsrv restarts, the plan step is considered quarantined. When there are 3 quarantines, smart scan for a database gets quarantined. When smart scan gets quarantined, cellsrv will not perform any predicate evaluation/filtering. It will issue parallel IOs, and return the blocks as is.

 

Parameter Related To Smart Scan

Majorly there are 2 parameter which comes into the role when we talk about the smart scan feature and both of them serves a different purpose.

 (1)  Cell_offload_processing parameter is one of the mandatory parameter which must be set to TRUE if we want to use the feature of smart scan to boost database performance and reduce data transfer traffic over the network, The by default value of cell_offload_processing is TRUE.

 (2)  Cell_offload_plan parameter decides whether the STORAGE predicate clause will be display in the execution plan of the sql statement or not. The value of this parameter can be of 3 types:

 

AUTO:  This is the default setting of cell_offload_plan parameter which instruct to display storage clause predicate in the sql statement execution plan

 

ALWAYS: It instructs to display STORAGE clause in the execution plan whether the query is getting benefit from smart scan or not.

 

NEVER: It instructs not to display Storage predicate in query execution plan

 

             Is SQL Query Actually Using Smart Scan

 

If a query execution plan is displaying STORAGE predicate clause, then we would consider that the query is using smart scan however it’s still not guarantee that the query is actually using the feature of smart scan or there may be a situation when we get less benefit from smart scan.

 (1)  Smart scan always uses the direct path read which means that the block must be in current state and during the query execution if Exadata machine or software feel that the block state is not current then it simply transfers the request to the traditional buffer cache and in that case, we’ll not get much benefit from smart scan

 (2)  If the row is chained or migrated then the Exadata machine need additional block to read to resolve indirect reference due to which we’ll not get much benefit from smart scan

 (3)  If the current usage of the Exadata storage tier is significantly high then the Exadata machine will transfer extra block to database server to balance CPU usage in that case we’ll not get much benefit from smart scan

 (4)  Since the smart scan is related to storage server I/O, so if the block is already resided inside database buffer cache then in that case we’ll not get much benefit from smart scan.

 (5)  If a statement is affected by storage server quarantine, then smart scan may be disabled for that statement on the storage server that contain the quarantine.

 

Validate If Query Is Actually Getting Benefit from Smart Scan


We can enable the tracing and validate the trace file to check if sql statement is  using smart scan or not OR checking statistics is another way through which we can validate whether sql statement is getting benefit from smart scan or not.


set pagesize 999

set lines 190

col sql_text format a70 trunc

col child format 99999

col execs format 9,999

col avg_etime format 99,999.99

col "IO_SAVED_%" format 999.99

col avg_px format 999

col offload for a7

select sql_id, child_number child, plan_hash_value plan_hash, executions execs,

(elapsed_time/1000000)/decode(nvl(executions,0),0,1,executions)/

decode(px_servers_executions,0,1,px_servers_executions/decode(nvl(executions,0),0,1,executions)) avg_etime,

px_servers_executions/decode(nvl(executions,0),0,1,executions) avg_px,

decode(IO_CELL_OFFLOAD_ELIGIBLE_BYTES,0,'No','Yes') Offload,

decode(IO_CELL_OFFLOAD_ELIGIBLE_BYTES,0,0,100*(IO_CELL_OFFLOAD_ELIGIBLE_BYTES-IO_INTERCONNECT_BYTES)

/decode(IO_CELL_OFFLOAD_ELIGIBLE_BYTES,0,1,IO_CELL_OFFLOAD_ELIGIBLE_BYTES)) "IO_SAVED_%",

sql_text

from v$sql s

where upper(sql_text) like upper(nvl(q'[&sql_text]',sql_text))

and sql_text not like 'BEGIN :sql_text := %'

and sql_text not like '%IO_CELL_OFFLOAD_ELIGIBLE_BYTES%'

and sql_text not like '/* SQL Analyze(%'

and sql_id like nvl('&sql_id',sql_id)

order by 1, 2, 3

/

 

NOTE: IO_CELL_OFFLOAD_ELIGIBLE_BYTES in V$SQL to indicate whether a statement can be offloaded or not

 

Query 1

SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT

------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------

6avfua5g1gkh2      0 2703984749      1        .35      0 No             .00

 

Query 2

SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT

------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------

abgqtq8zt90sb      0 3570092908      1        .79      0 Yes          99.86 select * from test_objects where object_id='54'

 

NOTE: We can see in above output that query 2 is using smart scan and it saves approx. 99% of unnecessary I/O to transfer from storage tier to compute node.

                    

Query To Check Statistics

 

select b.name, a.value from v$mystat a, v$statname b

  where a.STATISTIC# = b.STATISTIC# and

        (b.name = 'cell session smart scan efficiency' or

         b.name = 'cell physical IO bytes saved by storage index' or

         b.name = 'cell physical IO bytes eligible for predicate offload' or

         b.name = 'cell physical IO interconnect bytes returned by smart scan'

         or

         b.name = 'cell IO uncompressed bytes'

         or

         b.name like '%cell blocks processed%');

 

--------------------------------------

 select name, value from v$mystat s, v$statname n

 where n.statistic#=s.statistic#

 and name like 'cell scans';




Post a Comment

Previous Post Next Post