Analytical Functions

Q1. What is Rollup?

Answer. ROLLUP produces group subtotals from right to left and a grand total. If “n” is the number of columns listed in the ROLLUP, then there will be n+1 levels of subtotals.

SELECT fact1_id, fact2_id, 
       SUM(sales_value) AS sales_value
  FROM dim_tab
 GROUP BY ROLLUP(fact1_id, fact2_id)
 ORDER BY fact1_id, fact2_id;
FACT1_IDFACT2_IDSALES_VALUE
114363.55
124794.76
134718.25
145387.45
155027.34
1 24291.35
215652.84
224583.02
235555.77
245936.67
254508.74
2 26237.04
          50528.39

Q2. What is Cube?

Answer. Cube generates subtotals like ROLLUP. It also generates subtotals for all combinations of the dimensions specified. If “n” is the number of columns listed in the CUBE, there will be 2n subtotal combinations.

SELECT fact1_id, fact2_id, 
       SUM(sales_value) AS sales_value
  FROM dimension_tab
 GROUP BY CUBE(fact1_id, fact2_id)
 ORDER BY fact1_id, fact2_id;
FACT1_IDFACT2_IDSALES_VALUE
114363.55
124794.76
134718.25
145387.45
155027.34
124291.35
215652.84
224583.02
235555.77
245936.67
254508.74
226237.04
        110016.39
29377.78
310274.02
411324.12
59536.08
50528.39

Q3. What is Rank?

Answer. Let’s assume we want to assign a sequential order, or rank, to people within a department based on salary, we might use the RANK function like.

SELECT empno, deptno, sal, 
       RANK() OVER 
         (PARTITION BY deptno ORDER BY sal) "rank"
  FROM emp;

14 rows selected.
EMPNODEPTNOSALrank
79341013001
77821024502
78391050003
7369208001
78762011002
75662029753
77882030004
79022030004
7900309501
765430 12502
75213012502
78443015004
74993016005
76983028506

Q4. Explain DENSE_RANK.

Answer. The DENSE_RANK function acts like the RANK function except that it assigns consecutive ranks.

SELECT empno, deptno, sal, 
       DENSE_RANK() OVER 
         (PARTITION BY deptno ORDER BY sal) "rank"
  FROM emp;
EMPNODEPTNOSALrank
 79341013001
77821024502
78391050003
7369208001
78762011002
75662029753
77882030004
79022030004
790030 9501
765430 12502
75213012502
78443015003
74993016004
76983028505

Q5. What is row_number?

Answer. The ROW_NUMBER analytic function is like the ROWNUM pseudo column in that it assigns a unique number for each row returned, but like all analytic functions its action can be limited to a specific window of data in the result set and based on the order of data in that window. In this simple example using a window of the whole result set it functions the same as the ROWNUM pseudo column.

SELECT val
  FROM (SELECT val, 
               ROW_NUMBER() OVER (ORDER BY val DESC) 
                 AS val_row_number 
          FROM rownum_order_test)
 WHERE val_row_number <= 5; 
VAL
10
10
9
9
8

Q6. Explain FIRST and LAST.

Answer. The FIRST and LAST functions can be used to return the first or last value from an ordered sequence. Say we want to display the salary of each employee, along with the lowest and highest within their department we may use something like.

SELECT empno, deptno, sal,
       MIN(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) 
         OVER (PARTITION BY deptno) "Lowest",
       MAX(sal) KEEP (DENSE_RANK LAST ORDER BY sal) 
         OVER (PARTITION BY deptno) "Highest"
  FROM emp
 ORDER BY deptno, sal; 
EMPNODEPTNOSALLowestHighest
793410130013005000
778210245013005000
783910500013005000
7369208008003000
78762011008003000
75662029758003000
77882030008003000
79022030008003000
7900309509502850
76543012509502850
75213012509502850
78443015009502850
74993016009502850
76983028509502850

Q7. What is LEAD & LAG?

Answer. Both LEAD & LAG functions have the same usage, as shown below.
LEAD (value_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause)
LAG (value_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause)

value_expressionCan be a column or a built-in function, except for other analytic functions.
OffsetThe number of rows preceding / following the current row, from which the data is to be retrieved. The default value is 1.
DefaultThe value returned if the offset is outside the scope of the window. The default value is NULL.

Looking at the EMP table, we query the data in salary (SAL) order.

SELECT empno, ename, job, sal
  FROM emp
 ORDER BY sal;
EMPNOENAMEJOBSAL
7369SMITHCLERK800
7900JAMESCLERK950
7876ADAMSCLERK1100
7521WARDSALESMAN1250
7654MARTINSALESMAN1250
7934MILLERCLERK1300
7844TURNERSALESMAN1500
7499ALLENSALESMAN1600
7782CLARKMANAGER2450
7698BLAKEMANAGER2850
7566JONESMANAGER2975
7788SCOTTANALYST3000
7902FORDANALYST3000
7839KINGPRESIDENT5000

LAG
The LAG function is used to access data from a previous row. The following query returns the salary from the previous row to calculate the difference between the salary of the current row and that of the previous row. Notice that the ORDER BY of the LAG function is used to order the data by salary.

SELECT empno, ename, job, sal, 
       LAG(sal, 1, 0) OVER 
         (ORDER BY sal) AS sal_prev,
       sal - LAG(sal, 1, 0) OVER 
         (ORDER BY sal) AS sal_diff
  FROM emp;
EMPNOENAMEJOBSALSAL_PREVSAL_DIFF
7369SMITHCLERK8000800
7900JAMESCLERK950800150
7876ADAMSCLERK1100950150
7521WARDSALESMAN12501100150
7654MARTINSALESMAN125012500
7934MILLERCLERK1300125050
7844TURNERSALESMAN15001300200
7499ALLENSALESMAN16001500100
7782CLARKMANAGER24501600850
7698BLAKEMANAGER28502450400
7566JONESMANAGER29752850125
7788SCOTTANALYST3000297525
7902FORDANALYST300030000
7839KINGPRESIDENT500030002000

LEAD
The LEAD function is used to return data from the next row. The following query returns the salary from the next row to calulate the difference between the salary of the current row and the following row.

SELECT empno, ename, job, sal, 
       LEAD(sal, 1, 0) OVER 
         (ORDER BY sal) AS sal_next,
       LEAD(sal, 1, 0) OVER 
         (ORDER BY sal) - sal AS sal_diff
   FROM emp;
EMPNOENAMEJOBSALSAL_NEXTSAL_DIFF
7369SMITHCLERK800950150
7900JAMESCLERK9501100150
7876ADAMSCLERK11001250150
7521WARDSALESMAN125012500
7654MARTINSALESMAN1250130050
7934MILLERCLERK13001500200
7844TURNERSALESMAN15001600100
7499ALLENSALESMAN16002450850
7782CLARKMANAGER24502850400
7698BLAKEMANAGER28502975125
7566JONESMANAGER2975300025
7788SCOTTANALYST300030000
7902FORDANALYST300050002000
7839KINGPRESIDENT50000-5000

Q8. Explain LISTAGG.

Answer. The LISTAGG allows us to order the elements in the concatenated list.

COLUMN employees FORMAT A50

SELECT deptno, 
       LISTAGG(ename, ',') WITHIN GROUP 
         (ORDER BY ename) AS employees
  FROM emp
 GROUP BY deptno;
DEPTNOEMPLOYEES
10CLARK,KING,MILLER
20ADAMS,FORD,JONES,SCOTT,SMITH
30ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

In case of any discrepancies or doubts do contact me.
Best of Luck Friends.

Published by Girish Kirtikumar Mehta.

Software engineer with 25+ years of experience in software development. Have exposure to domains like Core Banking System, Islamic Banking, Mutual Funds, Credit Cards, Insurance, Digital Gifts, Accounting, Construction, etc. Developed applications using C, various versions of Delphi (3, 5, 7, 2005, 2007, Xe2, Xe8 and 12.2), Oracle SQL & PL/SQL, MS SQL & T-SQL, MySQL, MS Access.

One thought on “Analytical Functions

Leave a reply to Kiran Cancel reply