LAG()
The LAG window function that returns the value from a specified offset location relative to the current row. In datasets where data is logged by a predefined sequence like time or dates, it can be used to compare current value to previous data value. Put simply, it returns the lagged value of a column from a table.
The lag function takes 3 parameters:
- return_value: This is the column we wish to return the lagged value
- lag_value: An integer to specify the lag magnitude. i.e. 1 for lag by 1 row, 2 for lag by 2 rows
- default_value: A place-holder for lagged values that cannot be found. i.e. the first lagged value will always be null.
The syntax looks like this:
LAG(return_column, lag_value, default_value )
LAG() by Example
Let's query the database to return a list of 10 customer_ids and their credit limit. Within the query, we will add a few LAG functions.
SELECT
customer_id,
credit_limit,
LAG(credit_limit, 1) OVER() as credit_lag_1,
LAG(credit_limit, 2) OVER() as credit_lag_2
FROM customers
WHERE credit_limit IS NOT NULL
LIMIT 10;
customer_id | credit_limit | credit_lag_1 | credit_lag_2 -------------+--------------+--------------+-------------- 1 | 5000.00 | | 2 | 3000.00 | 5000.00 | 3 | 4000.00 | 3000.00 | 5000.00 4 | 2500.00 | 4000.00 | 3000.00 6 | 6000.00 | 2500.00 | 4000.00 7 | 2000.00 | 6000.00 | 2500.00 8 | 3500.00 | 2000.00 | 6000.00 9 | 4500.00 | 3500.00 | 2000.00 10 | 2800.00 | 4500.00 | 3500.00 12 | 3200.00 | 2800.00 | 4500.00 (10 rows)
Notice that both credit_lag_1 and credit_lag_2 are taking values from credit and offseting them by the lag value. This operation can be useful in computing something like inter-day differences in performances or even intra-day. As long as the data is ordered in the correct form.