This article shows how to use the new PIVOT
and UNPIVOT
operators in 11g release 2.
PIVOT
For starting, we need some data
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | create table sales ( customer_id varchar2(10), product_id varchar2(10), month number, amount number ); insert into sales values('Adam','Product 1',1,134); insert into sales values('Adam','Product 2',2,250); insert into sales values('Adam','Product 3',3,190); insert into sales values('Jones','Product 1',1,200); insert into sales values('Jones','Product 3',2,126); insert into sales values('Kanes','Product 2',1,240); insert into sales values('Kanes','Product 2',2,210); insert into sales values('Kanes','Product 3',3,250); commit; |
As all we know, relational tables are tabular - that is, they are presented in a column-value pair. Now the simple select query is below:
1 2 | SELECT a.customer_id, a.product_id, a.month, a.amount FROM sales a; |
And this is the output:
1 2 3 4 5 6 7 8 9 10 | CUSTOMER_ID PRODUCT_ID MONTH AMOUNT ----------- ---------- ------ --------- Adam Product 1 1 134 Adam Product 2 2 250 Adam Product 3 3 190 Jones Product 1 1 200 Jones Product 3 2 126 Kanes Product 2 1 240 Kanes Product 2 2 210 Kanes Product 3 3 250 |
Suppose if we want to transpose the query output to row-value instead of column-value, it is called Pivoting (Rows to Columns Conversion) in SQL. The output will be transposed as
1 2 3 4 5 6 7 8 9 10 | CUSTOMER_ID MONTH PRODUCT_1 PRODUCT_2 PRODUCT_3 ----------- ------ ------------ ------------ ------------ Adam 1 134 Adam 2 250 Adam 3 190 Jones 1 200 Jones 2 126 Kanes 1 240 Kanes 2 210 Kanes 3 250 |
Prior to Oracle Database 11g, you would do that via some sort of a DECODE function for each value and write each distinct value as a separate column (in this article we do not mention about DECODE function). The technique is quite nonintuitive however.
Fortunately, you now have a great new feature called PIVOT for presenting any query in the crosstab format using a new operator, appropriately named PIVOT. Here is how you write the query:
1 2 3 4 5 6 7 | SELECT * FROM sales PIVOT (SUM (amount) FOR product_id IN ('Product 1' AS Product_1, 'Product 2' AS Product_2, 'Product 3' AS Product_3)) ORDER BY customer_id, month; |
With the SELECT statement above, we notice that we have some NULL or empty values. To replace NULL values with another value like zero value we can use COALESCE function like this:
1 2 3 4 5 6 7 8 9 10 11 | SELECT customer_id, month, COALESCE (Product_1, 0) Product_1, COALESCE (Product_2, 0) Product_2, COALESCE (Product_3, 0) Product_3 FROM sales PIVOT (SUM (amount) FOR product_id IN ('Product 1' AS Product_1, 'Product 2' AS Product_2, 'Product 3' AS Product_3)) ORDER BY customer_id, month; |
Next, suppose that we have n products, how can we list these n products not product by product? In Oracle 11g, we can do this by using PIVOT plus XML clause:
1 2 3 4 5 | SELECT * FROM sales PIVOT XML (SUM (amount) FOR product_id IN (SELECT DISTINCT product_id FROM sales)) ORDER BY customer_id, month; |
And here is the result:
1 2 3 4 5 6 7 8 9 10 | CUSTOMER_ID MONTH PRODUCT_ID_XML ----------- ------ -------------- Adam 1 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)">134</column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)"></column></item></PivotSet> Adam 2 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)">250</column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)"></column></item></PivotSet> Adam 3 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)">190</column></item></PivotSet> Jones 1 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)">200</column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)"></column></item></PivotSet> Jones 2 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)">126</column></item></PivotSet> Kanes 1 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)">240</column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)"></column></item></PivotSet> Kanes 2 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)">210</column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)"></column></item></PivotSet> Kanes 3 <PivotSet><item><column name = "PRODUCT_ID">Product 1</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 2</column><column name = "SUM(AMOUNT)"></column></item><item><column name = "PRODUCT_ID">Product 3</column><column name = "SUM(AMOUNT)">250</column></item></PivotSet> |
We see that the PRODUCT_ID_XML column returns XML data type and it is very hard to see the result.
Now let's get the first XML value to see.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | <PivotSet> <item> <column name = "PRODUCT_ID">Product 1</column> <column name = "SUM(AMOUNT)">134</column> </item> <item> <column name = "PRODUCT_ID">Product 2</column> <column name = "SUM(AMOUNT)"/> </item> <item> <column name = "PRODUCT_ID">Product 3</column> <column name = "SUM(AMOUNT)"/> </item> </PivotSet> |
With XML data type, we can use EXTRACTVALUE function with XPath structure to get the value that we want. We can do like this:
1 2 3 4 5 6 7 8 9 10 | SELECT customer_id, month, EXTRACTVALUE (product_id_xml, '/PivotSet/item[1]/column[2]') Product_1, EXTRACTVALUE (product_id_xml, '/PivotSet/item[2]/column[2]') Product_2, EXTRACTVALUE (product_id_xml, '/PivotSet/item[3]/column[2]') Product_3 FROM (SELECT * FROM sales PIVOT XML (SUM (amount) FOR product_id IN (SELECT DISTINCT product_id FROM sales)) ORDER BY customer_id, month); |
And here is what we want to see:
1 2 3 4 5 6 7 8 9 10 | CUSTOMER_ID MONTH PRODUCT_1 PRODUCT_2 PRODUCT_3 ----------- ------ --------- --------- --------- Adam 1 134 Adam 2 250 Adam 3 190 Jones 1 200 Jones 2 126 Kanes 1 240 Kanes 2 210 Kanes 3 250 |
UNPIVOT
For starting, we need some data
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | create table sales_month ( customer_id varchar2(10), product_id varchar2(10), january number, february number, march number ); insert into sales_month values('Kanes','Product 1',0,0,1); insert into sales_month values('Jones','Product 1',0,1,1); insert into sales_month values('Kanes','Product 3',0,1,0); insert into sales_month values('Kanes','Product 2',1,0,0); insert into sales_month values('Adam','Product 2',0,1,0); insert into sales_month values('Adam','Product 1',1,0,0); insert into sales_month values('Adam','Product 3',0,0,1); commit; |
The UNPIVOT operator converts column-based data into separate rows. See the simple query before going to converting column-based data.
1 | select * from sales_month; |
And the result is:
1 2 3 4 5 6 7 8 9 | CUSTOMER_ID PRODUCT_ID JANUARY FEBRUARY MARCH ----------- ---------- -------- --------- ------ Kanes Product 1 0 0 1 Jones Product 1 0 1 1 Kanes Product 3 0 1 0 Kanes Product 2 1 0 0 Adam Product 2 0 1 0 Adam Product 1 1 0 0 Adam Product 3 0 0 1 |
Prior to 11g, we can use the DECODE function and a pivot table with the correct number of rowsby using the CONNECT BY clause in a query from dual to generate the correct number of rows for the unpivot operation. But in this article we do not mention about this.
So we will use UNPIVOT function to convert column-based data into separate rows:
1 2 3 4 5 | SELECT * FROM sales_month UNPIVOT (yes FOR month IN ("JANUARY", "FEBRUARY", "MARCH")) ORDER BY customer_id, product_id; |
Here is the result:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | CUSTOMER_ID PRODUCT_ID MONTH YES ----------- ---------- -------- ---- Adam Product 1 JANUARY 1 Adam Product 1 FEBRUARY 0 Adam Product 1 MARCH 0 Adam Product 2 FEBRUARY 1 Adam Product 2 JANUARY 0 Adam Product 2 MARCH 0 Adam Product 3 FEBRUARY 0 Adam Product 3 JANUARY 0 Adam Product 3 MARCH 1 Jones Product 1 FEBRUARY 1 Jones Product 1 JANUARY 0 Jones Product 1 MARCH 1 Kanes Product 1 MARCH 1 Kanes Product 1 JANUARY 0 Kanes Product 1 FEBRUARY 0 Kanes Product 2 FEBRUARY 0 Kanes Product 2 JANUARY 1 Kanes Product 2 MARCH 0 Kanes Product 3 MARCH 0 Kanes Product 3 FEBRUARY 1 Kanes Product 3 JANUARY 0 |
With the result above, we want to sort the month column in the order. How can we do that?
We use ORDER BY clause and DECODE function:
1 2 3 4 5 6 7 | SELECT * FROM sales_month UNPIVOT (yes FOR month IN ("JANUARY", "FEBRUARY", "MARCH")) ORDER BY customer_id, product_id, DECODE (month, 'JANUARY', 1, 'FEBRUARY', 2, 'MARCH', 3); |
And here:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | CUSTOMER_ID PRODUCT_ID MONTH YES ----------- ---------- -------- ---- Adam Product 1 JANUARY 1 Adam Product 1 FEBRUARY 0 Adam Product 1 MARCH 0 Adam Product 2 JANUARY 0 Adam Product 2 FEBRUARY 1 Adam Product 2 MARCH 0 Adam Product 3 JANUARY 0 Adam Product 3 FEBRUARY 0 Adam Product 3 MARCH 1 Jones Product 1 JANUARY 0 Jones Product 1 FEBRUARY 1 Jones Product 1 MARCH 1 Kanes Product 1 JANUARY 0 Kanes Product 1 FEBRUARY 0 Kanes Product 1 MARCH 1 Kanes Product 2 JANUARY 1 Kanes Product 2 FEBRUARY 0 Kanes Product 2 MARCH 0 Kanes Product 3 JANUARY 0 Kanes Product 3 FEBRUARY 1 Kanes Product 3 MARCH 0 |
0 comments:
Post a Comment