Friday, 29 April 2016

Hash-partitioning in a data warehouse and some side effects

In our data warehouse we struggled for quite a while to find a suitable solution for some daily OLTP-like loads, which back in the day relied on an Advanced Queuing solution, leading to a massive amount of row-by-row data processing and a very high number of IOPS.
Since we are loading data throughout the day, it became necessary to switch to some kind of bulk load mechanism. Our developers came up with the idea of hash-partitioned tables, which enable partition-wise joins when the tables share the same partitioning key. Now the merges run pretty fast, since the statements use the parallel options Oracle (11.2.0.4) provides:
  -parallel DML
  -parallel query
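
Such a load might be sketched as follows (the statement below is purely illustrative: STG_ORDERS, FACT_ORDERS and the AMOUNT column are made-up names, not our actual schema); note that parallel DML has to be enabled at the session level first:

```sql
-- Hypothetical sketch of a parallel bulk merge; table and column names are invented.
alter session enable parallel dml;

merge /*+ parallel(t 8) */ into fact_orders t
using (select /*+ parallel(s 8) */ customer_id, order_id, amount
         from stg_orders s) src
on (t.order_id = src.order_id)
when matched then
  update set t.amount = src.amount
when not matched then
  insert (customer_id, order_id, amount)
  values (src.customer_id, src.order_id, src.amount);

commit;
```

When both tables are hash partitioned on the join key with the same number of partitions, the optimizer can perform a full partition-wise join here, processing the tables partition pair by partition pair.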
On the flip side, this takes up a lot of database resources in terms of CPU and I/O, since the statements still end up doing full table scans, just split at the partition level.
One of the tables stores core information, so as time went by more and more processes were built around that table, using hash partitioning as well.

The point the developers made was: you have to have the same number of partitions on the tables to achieve better performance.
I was interested in whether it is possible to use fewer partitions, or even no partitioning at all, for the newly created tables without much impact on performance. I created some tests to demonstrate the results:

I created some tables, all of them using order_id as the partitioning key, with a varying number of partitions:

create table HASH_TEST
(
  customer_id        NUMBER,
  order_id           NUMBER
)
partition by hash (ORDER_ID) 
partitions 4;


create table HASH_TEST_BIG
(
  customer_id       NUMBER,
  order_id          NUMBER
)
partition by hash (ORDER_ID)
partitions 2;


create table HASH_TEST_BIG_SAME
(
  customer_id       NUMBER,
  order_id          NUMBER
)
partition by hash (ORDER_ID)
partitions 4;

insert into hash_test select customer_id,order_id from ods_acc where rownum<=100000;
insert into hash_test_big select customer_id,order_id from ods_acc where rownum<=10000000;
insert into hash_test_big_same select customer_id,order_id from ods_acc where rownum<=10000000;
commit;

begin
  dbms_stats.gather_table_stats(USER,'HASH_TEST');
  dbms_stats.gather_table_stats(USER,'HASH_TEST_BIG');
  dbms_stats.gather_table_stats(USER,'HASH_TEST_BIG_SAME');
end;
/  
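
As a quick sanity check, the partition counts can be verified from the data dictionary (assuming the tables live in the current schema):

```sql
select table_name, count(*) as partition_cnt
  from user_tab_partitions
 where table_name like 'HASH_TEST%'
 group by table_name
 order by table_name;
```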

Now let's run selects to see whether the number of partitions really makes a difference:

--# of partitions same
select s.customer_id from hash_test s
join hash_test_big_same b
on s.order_id=b.order_id
where b.order_id=19422274;

Execution Plan
----------------------------------------------------------
Plan hash value: 261338677

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                    |   116 |  2088 |  1517   (1)| 00:00:01 |       |       |
|*  1 |  HASH JOIN               |                    |   116 |  2088 |  1517   (1)| 00:00:01 |       |       |
|   2 |   PART JOIN FILTER CREATE| :BF0000            |     3 |    36 |    46   (0)| 00:00:01 |       |       |
|   3 |    PARTITION HASH SINGLE |                    |     3 |    36 |    46   (0)| 00:00:01 |     1 |     1 |
|*  4 |     TABLE ACCESS FULL    | HASH_TEST          |     3 |    36 |    46   (0)| 00:00:01 |     1 |     1 |
|   5 |   PARTITION HASH SINGLE  |                    |    40 |   240 |  1471   (1)| 00:00:01 |KEY(AP)|KEY(AP)|
|*  6 |    TABLE ACCESS FULL     | HASH_TEST_BIG_SAME |    40 |   240 |  1471   (1)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("S"."ORDER_ID"="B"."ORDER_ID")
   4 - filter("S"."ORDER_ID"=19422274)
   6 - filter("B"."ORDER_ID"=19422274)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2094  consistent gets
          0  physical reads
          0  redo size
       4273  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        188  rows processed

--# of partitions not the same: 4 for the small table, 2 for the big table
select s.customer_id from hash_test s
join hash_test_big b
on s.order_id=b.order_id
where b.order_id=19422274;
  
Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
Plan hash value: 3648893118

----------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |               |   110 |  1980 |  3670   (1)| 00:00:02 |       |       |
|*  1 |  HASH JOIN               |               |   110 |  1980 |  3670   (1)| 00:00:02 |       |       |
|   2 |   PART JOIN FILTER CREATE| :BF0000       |     3 |    36 |    46   (0)| 00:00:01 |       |       |
|   3 |    PARTITION HASH SINGLE |               |     3 |    36 |    46   (0)| 00:00:01 |     1 |     1 |
|*  4 |     TABLE ACCESS FULL    | HASH_TEST     |     3 |    36 |    46   (0)| 00:00:01 |     1 |     1 |
|   5 |   PARTITION HASH SINGLE  |               |    37 |   222 |  3623   (1)| 00:00:02 |KEY(AP)|KEY(AP)|
|*  6 |    TABLE ACCESS FULL     | HASH_TEST_BIG |    37 |   222 |  3623   (1)| 00:00:02 |     1 |     1 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("S"."ORDER_ID"="B"."ORDER_ID")
   4 - filter("S"."ORDER_ID"=19422274)
   6 - filter("B"."ORDER_ID"=19422274)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       4822  consistent gets
          0  physical reads
          0  redo size
       4273  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        188  rows processed

In the first case both tables had 4 partitions, while in the second case the larger table had only 2. Looking at the cost and the number of consistent gets shows a very significant difference, so it seems that if the number of partitions is the same, joining the two tables consumes fewer resources.

My point: I am fine with using the same number of partitions for every table involved in this process, but could we just lower the number of partitions, thereby reducing table fragmentation? Following that thinking, here is another test to show how much influence the number of partitions has on performance:

I created 2 tables with a higher amount of partitions:

create table HASH_TEST_LOT
(
  customer_id        NUMBER,
  order_id NUMBER
)
partition by hash (ORDER_ID) 
partitions 16;

create table HASH_TEST_BIG_SAME_LOT
(
  customer_id       NUMBER,
  order_id NUMBER
)
partition by hash (ORDER_ID)
partitions 32;

insert into hash_test_lot select customer_id,order_id from ods_acc where rownum<=100000;
insert into hash_test_big_same_lot select customer_id,order_id from ods_acc where rownum<=10000000;
commit;

begin
  dbms_stats.gather_table_stats(USER,'HASH_TEST_LOT');
  dbms_stats.gather_table_stats(USER,'HASH_TEST_BIG_SAME_LOT');
end;
/

Let's run the same select as before:

select s.customer_id from hash_test_lot s
join hash_test_big_same_lot b
on s.order_id=b.order_id
where b.order_id=19422274;

188 rows selected.

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
Plan hash value: 2985387547

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                        |    89 |  3471 |  1833  (18)| 00:00:01 |       |       |
|*  1 |  HASH JOIN               |                        |    89 |  3471 |  1833  (18)| 00:00:01 |       |       |
|   2 |   PART JOIN FILTER CREATE| :BF0000                |     4 |   104 |    45   (0)| 00:00:01 |       |       |
|   3 |    PARTITION HASH SINGLE |                        |     4 |   104 |    45   (0)| 00:00:01 |     1 |     1 |
|*  4 |     TABLE ACCESS FULL    | HASH_TEST_LOT          |     4 |   104 |    45   (0)| 00:00:01 |     1 |     1 |
|   5 |   PARTITION HASH SINGLE  |                        |    22 |   286 |  1788  (19)| 00:00:01 |KEY(AP)|KEY(AP)|
|*  6 |    TABLE ACCESS FULL     | HASH_TEST_BIG_SAME_LOT |    22 |   286 |  1788  (19)| 00:00:01 |     1 |     1 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("S"."ORDER_ID"="B"."ORDER_ID")
   4 - filter("S"."ORDER_ID"=19422274)
   6 - filter("B"."ORDER_ID"=19422274)

Note
-----
   - dynamic sampling used for this statement (level=4)


Statistics
----------------------------------------------------------
         72  recursive calls
          2  db block gets
       2432  consistent gets
          0  physical reads
        264  redo size
       4273  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        188  rows processed

As you can see, the number of consistent gets is slightly higher, but I wouldn't consider it really significant (it's a big box), so in my opinion we could go for a smaller number of hash partitions in these tables, keeping the number of partitions equal.

As I mentioned previously, another option I wanted to examine is leaving the small table with no partitioning at all:

create table hash_no_part
(
  customer_id        NUMBER,
  order_id           NUMBER
)
;

insert into hash_no_part select customer_id,order_id from ods_acc where rownum<=100000;

begin
  dbms_stats.gather_table_stats(USER,'HASH_NO_PART');
end;
/

Let's run the same select as before, now using the unpartitioned table:

select s.customer_id from hash_no_part s
join hash_test_big_same b
on s.order_id=b.order_id
where b.order_id=19422274;

188 rows selected.

Elapsed: 00:00:00.08

Execution Plan
----------------------------------------------------------
Plan hash value: 2379155543

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                    |     1 |    32 |  1474   (1)| 00:00:01 |       |       |
|*  1 |  HASH JOIN               |                    |     1 |    32 |  1474   (1)| 00:00:01 |       |       |
|   2 |   PART JOIN FILTER CREATE| :BF0000            |     1 |    26 |     3   (0)| 00:00:01 |       |       |
|*  3 |    TABLE ACCESS FULL     | HASH_NO_PART       |     1 |    26 |     3   (0)| 00:00:01 |       |       |
|   4 |   PARTITION HASH SINGLE  |                    |    40 |   240 |  1471   (1)| 00:00:01 |KEY(AP)|KEY(AP)|
|*  5 |    TABLE ACCESS FULL     | HASH_TEST_BIG_SAME |    40 |   240 |  1471   (1)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("S"."ORDER_ID"="B"."ORDER_ID")
   3 - filter("S"."ORDER_ID"=19422274)
   5 - filter("B"."ORDER_ID"=19422274)


Statistics
----------------------------------------------------------
         33  recursive calls
          1  db block gets
       2118  consistent gets
          0  physical reads
        132  redo size
       4273  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)

        188  rows processed


Pretty much the same as with partitioning, so there is a possibility to use a simple table here.
The last point I was interested in was having a global index on the large hash-partitioned table's partition key (order_id), which might eliminate the full scans on the hash partitions:


create index glob_index on HASH_TEST_BIG_SAME_LOT(order_id) global;  

select s.customer_id from hash_test_lot s
join hash_test_big_same_lot b
on s.order_id=b.order_id
where b.order_id=19422274;

188 rows selected.


Elapsed: 00:00:00.14


Execution Plan
----------------------------------------------------------
Plan hash value: 3234924659

--------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |               |   188 |  7332 |    48   (0)| 00:00:01 |       |       |
|*  1 |  HASH JOIN             |               |   188 |  7332 |    48   (0)| 00:00:01 |       |       |
|   2 |   PARTITION HASH SINGLE|               |     4 |   104 |    45   (0)| 00:00:01 |     1 |     1 |
|*  3 |    TABLE ACCESS FULL   | HASH_TEST_LOT |     4 |   104 |    45   (0)| 00:00:01 |     1 |     1 |
|*  4 |   INDEX RANGE SCAN     | GLOB_INDEX    |    47 |   611 |     3   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("S"."ORDER_ID"="B"."ORDER_ID")
   3 - filter("S"."ORDER_ID"=19422274)
   4 - access("B"."ORDER_ID"=19422274)

Note
-----
   - dynamic sampling used for this statement (level=4)


Statistics
----------------------------------------------------------
         13  recursive calls
          0  db block gets
        226  consistent gets
          2  physical reads
          0  redo size
       4273  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)

        188  rows processed

This looks very promising, but you also have to keep in mind that the global index may cause some noticeable performance degradation when inserting into the table; that is a topic for another post. Another issue that was pointed out is contention on the global index when inserting into several partitions at the same time, so the loading strategy would have to be redesigned as well.
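
One alternative worth testing, assuming the equality predicate on the partition key stays in place, is a local index instead of the global one; it is equipartitioned with the table, so concurrent loads into different partitions do not contend on a single index structure:

```sql
-- Sketch: a local index is partitioned the same way as the table, so
-- partition pruning on ORDER_ID narrows the scan to one index partition,
-- and concurrent inserts into different partitions touch different index segments.
create index loc_index on HASH_TEST_BIG_SAME_LOT (order_id) local;
```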
