Pandas Groupby Aggregation Multiple Compute Functions
Often you may need to aggregate data such that you compute multiple functions against all columns of data. In this example, I demonstrate how to aggregate data with pandas groupby using multiple compute methods.
import numpy as np
import pandas as pd
from datetime import datetime
To begin, let's read a file with some financial data at the daily level for multiple stocks.
sample_df = pd.read_csv('full_data.csv')
sample_df.head()
date | Ticker | 1. open | 2. high | 3. low | 4. close | 5. volume | |
---|---|---|---|---|---|---|---|
0 | 2019-07-03 | AAPL | 203.603278 | 203.653537 | 203.560323 | 203.606995 | 53458.197115 |
1 | 2019-07-03 | AXP | 125.587369 | 125.616468 | 125.558919 | 125.587030 | 14894.287805 |
2 | 2019-07-03 | BA | 353.382682 | 353.453835 | 353.301135 | 353.377761 | 32912.900474 |
3 | 2019-07-03 | CAT | 135.110770 | 135.152305 | 135.065153 | 135.105175 | 10665.087379 |
4 | 2019-07-03 | CSCO | 56.063977 | 56.083697 | 56.045637 | 56.065251 | 61533.024038 |
Below, we perform multiple aggregations on all price and volume columns to return the mean and var by day.
sample_df.groupby(['date', 'Ticker'], as_index=False).agg([np.mean, np.var]).head()
1. open | 2. high | 3. low | 4. close | 5. volume | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|
mean | var | mean | var | mean | var | mean | var | mean | var | ||
date | Ticker | ||||||||||
2019-07-03 | AAPL | 203.603278 | 0.095705 | 203.653537 | 0.091135 | 203.560323 | 0.104315 | 203.606995 | 0.099121 | 53458.197115 | 1.364074e+10 |
AXP | 125.587369 | 0.073431 | 125.616468 | 0.072509 | 125.558919 | 0.075973 | 125.587030 | 0.074976 | 14894.287805 | 1.688696e+08 | |
BA | 353.382682 | 0.272969 | 353.453835 | 0.273478 | 353.301135 | 0.272263 | 353.377761 | 0.272168 | 32912.900474 | 2.697468e+10 | |
CAT | 135.110770 | 0.169655 | 135.152305 | 0.173295 | 135.065153 | 0.155551 | 135.105175 | 0.161341 | 10665.087379 | 7.142522e+07 | |
CSCO | 56.063977 | 0.091510 | 56.083697 | 0.089896 | 56.045637 | 0.094455 | 56.065251 | 0.092519 | 61533.024038 | 2.410866e+10 |