Pandas Groupby with Different Aggregation Functions

Like SQL, pandas provides a useful aggregation method in the form of GROUP BY. In this example, I demonstrate the use of pandas groupby with multiple aggregation functions.

import pandas as pd
from sklearn import datasets

Loading the iris dataset from the sklearn datasets

iris = datasets.load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)

decode_species = {0: 'Setosa', 1:'Versicolor', 2:'Virginica' }
iris_df['Species'] = [decode_species.get(specie) for specie in iris.target]
iris_df.head()

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)	Species
0	5.1	3.5	1.4	0.2	Setosa
1	4.9	3.0	1.4	0.2	Setosa
2	4.7	3.2	1.3	0.2	Setosa
3	4.6	3.1	1.5	0.2	Setosa
4	5.0	3.6	1.4	0.2	Setosa

To perform multiple aggregations, we pass a dictionary of the column we wish to aggregate and a list of aggregation types, as many as needed - both standard functions like below and custom functions.

iris_df.groupby(['Species'], as_index=False).agg( { 'sepal length (cm)': ['count','sum'],
        'sepal width (cm)': 'max',
        'petal length (cm)': ['min', 'max'] })

	Species	sepal length (cm)		sepal width (cm)	petal length (cm)
		count	sum	max	min	max
0	Setosa	50	250.3	4.4	1.0	1.9
1	Versicolor	50	296.8	3.4	3.0	5.1
2	Virginica	50	329.4	3.8	4.5	6.9