Analytical Functions

Q1. What is Rollup?

Answer. ROLLUP produces group subtotals from right to left and a grand total. If “n” is the number of columns listed in the ROLLUP, then there will be n+1 levels of subtotals.

SELECT fact1_id, fact2_id, 
       SUM(sales_value) AS sales_value
  FROM dim_tab
 GROUP BY ROLLUP(fact1_id, fact2_id)
 ORDER BY fact1_id, fact2_id;

FACT1_ID	FACT2_ID	SALES_VALUE
1	1	4363.55
1	2	4794.76
1	3	4718.25
1	4	5387.45
1	5	5027.34
1		24291.35
2	1	5652.84
2	2	4583.02
2	3	5555.77
2	4	5936.67
2	5	4508.74
2		26237.04
		50528.39

Q2. What is Cube?

Answer. Cube generates subtotals like ROLLUP. It also generates subtotals for all combinations of the dimensions specified. If “n” is the number of columns listed in the CUBE, there will be 2n subtotal combinations.

SELECT fact1_id, fact2_id, 
       SUM(sales_value) AS sales_value
  FROM dimension_tab
 GROUP BY CUBE(fact1_id, fact2_id)
 ORDER BY fact1_id, fact2_id;

FACT1_ID	FACT2_ID	SALES_VALUE
1	1	4363.55
1	2	4794.76
1	3	4718.25
1	4	5387.45
1	5	5027.34
1		24291.35
2	1	5652.84
2	2	4583.02
2	3	5555.77
2	4	5936.67
2	5	4508.74
2		26237.04
	1	10016.39
	2	9377.78
	3	10274.02
	4	11324.12
	5	9536.08
		50528.39

Q3. What is Rank?

Answer. Let’s assume we want to assign a sequential order, or rank, to people within a department based on salary, we might use the RANK function like.

SELECT empno, deptno, sal, 
       RANK() OVER 
         (PARTITION BY deptno ORDER BY sal) "rank"
  FROM emp;

14 rows selected.

EMPNO	DEPTNO	SAL	rank
7934	10	1300	1
7782	10	2450	2
7839	10	5000	3
7369	20	800	1
7876	20	1100	2
7566	20	2975	3
7788	20	3000	4
7902	20	3000	4
7900	30	950	1
7654	30	1250	2
7521	30	1250	2
7844	30	1500	4
7499	30	1600	5
7698	30	2850	6

Q4. Explain DENSE_RANK.

Answer. The DENSE_RANK function acts like the RANK function except that it assigns consecutive ranks.

SELECT empno, deptno, sal, 
       DENSE_RANK() OVER 
         (PARTITION BY deptno ORDER BY sal) "rank"
  FROM emp;

EMPNO	DEPTNO	SAL	rank
7934	10	1300	1
7782	10	2450	2
7839	10	5000	3
7369	20	800	1
7876	20	1100	2
7566	20	2975	3
7788	20	3000	4
7902	20	3000	4
7900	30	950	1
7654	30	1250	2
7521	30	1250	2
7844	30	1500	3
7499	30	1600	4
7698	30	2850	5

Q5. What is row_number?

Answer. The ROW_NUMBER analytic function is like the ROWNUM pseudo column in that it assigns a unique number for each row returned, but like all analytic functions its action can be limited to a specific window of data in the result set and based on the order of data in that window. In this simple example using a window of the whole result set it functions the same as the ROWNUM pseudo column.

SELECT val
  FROM (SELECT val, 
               ROW_NUMBER() OVER (ORDER BY val DESC) 
                 AS val_row_number 
          FROM rownum_order_test)
 WHERE val_row_number <= 5;

VAL

Q6. Explain FIRST and LAST.

Answer. The FIRST and LAST functions can be used to return the first or last value from an ordered sequence. Say we want to display the salary of each employee, along with the lowest and highest within their department we may use something like.

SELECT empno, deptno, sal,
       MIN(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) 
         OVER (PARTITION BY deptno) "Lowest",
       MAX(sal) KEEP (DENSE_RANK LAST ORDER BY sal) 
         OVER (PARTITION BY deptno) "Highest"
  FROM emp
 ORDER BY deptno, sal;

EMPNO	DEPTNO	SAL	Lowest	Highest
7934	10	1300	1300	5000
7782	10	2450	1300	5000
7839	10	5000	1300	5000
7369	20	800	800	3000
7876	20	1100	800	3000
7566	20	2975	800	3000
7788	20	3000	800	3000
7902	20	3000	800	3000
7900	30	950	950	2850
7654	30	1250	950	2850
7521	30	1250	950	2850
7844	30	1500	950	2850
7499	30	1600	950	2850
7698	30	2850	950	2850

Q7. What is LEAD & LAG?

Answer. Both LEAD & LAG functions have the same usage, as shown below.
LEAD (value_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause)
LAG (value_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause)

value_expression	Can be a column or a built-in function, except for other analytic functions.
Offset	The number of rows preceding / following the current row, from which the data is to be retrieved. The default value is 1.
Default	The value returned if the offset is outside the scope of the window. The default value is NULL.

Looking at the EMP table, we query the data in salary (SAL) order.

SELECT empno, ename, job, sal
  FROM emp
 ORDER BY sal;

EMPNO	ENAME	JOB	SAL
7369	SMITH	CLERK	800
7900	JAMES	CLERK	950
7876	ADAMS	CLERK	1100
7521	WARD	SALESMAN	1250
7654	MARTIN	SALESMAN	1250
7934	MILLER	CLERK	1300
7844	TURNER	SALESMAN	1500
7499	ALLEN	SALESMAN	1600
7782	CLARK	MANAGER	2450
7698	BLAKE	MANAGER	2850
7566	JONES	MANAGER	2975
7788	SCOTT	ANALYST	3000
7902	FORD	ANALYST	3000
7839	KING	PRESIDENT	5000

LAG
The LAG function is used to access data from a previous row. The following query returns the salary from the previous row to calculate the difference between the salary of the current row and that of the previous row. Notice that the ORDER BY of the LAG function is used to order the data by salary.

SELECT empno, ename, job, sal, 
       LAG(sal, 1, 0) OVER 
         (ORDER BY sal) AS sal_prev,
       sal - LAG(sal, 1, 0) OVER 
         (ORDER BY sal) AS sal_diff
  FROM emp;

EMPNO	ENAME	JOB	SAL	SAL_PREV	SAL_DIFF
7369	SMITH	CLERK	800	0	800
7900	JAMES	CLERK	950	800	150
7876	ADAMS	CLERK	1100	950	150
7521	WARD	SALESMAN	1250	1100	150
7654	MARTIN	SALESMAN	1250	1250	0
7934	MILLER	CLERK	1300	1250	50
7844	TURNER	SALESMAN	1500	1300	200
7499	ALLEN	SALESMAN	1600	1500	100
7782	CLARK	MANAGER	2450	1600	850
7698	BLAKE	MANAGER	2850	2450	400
7566	JONES	MANAGER	2975	2850	125
7788	SCOTT	ANALYST	3000	2975	25
7902	FORD	ANALYST	3000	3000	0
7839	KING	PRESIDENT	5000	3000	2000

LEAD
The LEAD function is used to return data from the next row. The following query returns the salary from the next row to calulate the difference between the salary of the current row and the following row.

SELECT empno, ename, job, sal, 
       LEAD(sal, 1, 0) OVER 
         (ORDER BY sal) AS sal_next,
       LEAD(sal, 1, 0) OVER 
         (ORDER BY sal) - sal AS sal_diff
   FROM emp;

EMPNO	ENAME	JOB	SAL	SAL_NEXT	SAL_DIFF
7369	SMITH	CLERK	800	950	150
7900	JAMES	CLERK	950	1100	150
7876	ADAMS	CLERK	1100	1250	150
7521	WARD	SALESMAN	1250	1250	0
7654	MARTIN	SALESMAN	1250	1300	50
7934	MILLER	CLERK	1300	1500	200
7844	TURNER	SALESMAN	1500	1600	100
7499	ALLEN	SALESMAN	1600	2450	850
7782	CLARK	MANAGER	2450	2850	400
7698	BLAKE	MANAGER	2850	2975	125
7566	JONES	MANAGER	2975	3000	25
7788	SCOTT	ANALYST	3000	3000	0
7902	FORD	ANALYST	3000	5000	2000
7839	KING	PRESIDENT	5000	0	-5000

Q8. Explain LISTAGG.

Answer. The LISTAGG allows us to order the elements in the concatenated list.

COLUMN employees FORMAT A50

SELECT deptno, 
       LISTAGG(ename, ',') WITHIN GROUP 
         (ORDER BY ename) AS employees
  FROM emp
 GROUP BY deptno;

DEPTNO	EMPLOYEES
10	CLARK,KING,MILLER
20	ADAMS,FORD,JONES,SCOTT,SMITH
30	ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

In case of any discrepancies or doubts do contact me.
Best of Luck Friends.

Analytical Functions

Published by Girish Kirtikumar Mehta.

One thought on “Analytical Functions”

Leave a reply to Kiran Cancel reply

Share this:

Related

Published by Girish Kirtikumar Mehta.

One thought on “Analytical Functions”

Leave a reply to Kiran Cancel reply