Pandas Basics

Pandas provides several very useful basic functions that help you get started with data analysis. In this notebook, we look at the basic functionality that is available.

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
iris_data = pd.DataFrame(iris.data, columns=iris.feature_names)

# Map the integer class codes in iris.target to readable species names
decode_species = {0: 'Setosa', 1: 'Versicolor', 2: 'Virginica'}
iris_data['Species'] = [decode_species.get(code) for code in iris.target]

head()

The head method returns the first n rows of the data. For example, head(10) returns the first 10 rows. By default, head() returns the first five rows.

iris_data.head(3)
OUTPUT
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  Species
0                5.1               3.5                1.4               0.2   Setosa
1                4.9               3.0                1.4               0.2   Setosa
2                4.7               3.2                1.3               0.2   Setosa
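As a minimal sketch of how the n argument behaves, the example below uses a small made-up DataFrame (the name df and its values are assumptions for illustration, not part of the iris data). It also shows that a negative n returns every row except the last |n| rows.

```python
import pandas as pd

# Hypothetical ten-row frame standing in for iris_data (values are made up).
df = pd.DataFrame({"x": range(10), "y": list("abcdefghij")})

first_default = df.head()        # no argument: first five rows
first_three = df.head(3)         # first three rows
all_but_last_two = df.head(-2)   # negative n: every row except the last two

print(len(first_default), len(first_three), len(all_but_last_two))
```

Because head returns a new DataFrame, the result can be assigned or chained like any other DataFrame.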

tail()

The tail method returns the last n rows of the data. For example, tail(10) returns the last 10 rows. By default, tail() returns the last five rows.

iris_data.tail(3)
OUTPUT
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)    Species
147                6.5               3.0                5.2               2.0  Virginica
148                6.2               3.4                5.4               2.3  Virginica
149                5.9               3.0                5.1               1.8  Virginica
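Note that the rows returned by tail keep their original index labels (147 to 149 in the output above) rather than being renumbered from zero. The sketch below illustrates this on a small made-up DataFrame (df is a hypothetical name), and also shows that a negative n returns every row except the first |n| rows.

```python
import pandas as pd

# Hypothetical ten-row frame with the default 0..9 index (values are made up).
df = pd.DataFrame({"x": range(10)})

last_three = df.tail(3)
print(list(last_three.index))    # original index labels are preserved

all_but_first_two = df.tail(-2)  # negative n: every row except the first two
print(len(all_but_first_two))
```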

info()

The info method prints general information about the columns in the data, such as the number of observations, the data type of each column, and the number of non-null values. It also provides details on memory usage.

iris_data.info()
OUTPUT
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
sepal length (cm)    150 non-null float64
sepal width (cm)     150 non-null float64
petal length (cm)    150 non-null float64
petal width (cm)     150 non-null float64
Species              150 non-null object
dtypes: float64(4), object(1)
memory usage: 5.9+ KB
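Two details of info are easy to miss: it prints its summary and returns None, and the + in the memory usage line marks a lower bound, because the strings held in object columns are not measured by default. The sketch below, on a small made-up DataFrame (df, buffer, and summary are hypothetical names), captures the summary as a string through the buf parameter and requests a full measurement with memory_usage="deep".

```python
import io

import pandas as pd

# Hypothetical frame with one missing value and one text column (values are made up).
df = pd.DataFrame({
    "length": [5.1, 4.9, None],
    "species": ["Setosa", "Setosa", "Virginica"],
})

result = df.info()               # prints the summary; the return value is None
print(result is None)

# Write the summary into a text buffer instead of printing it,
# and measure the text column's memory fully.
buffer = io.StringIO()
df.info(buf=buffer, memory_usage="deep")
summary = buffer.getvalue()
print(summary)
```

Capturing the text this way can be handy when the summary needs to go into a log or a report rather than the notebook output.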

dtypes

The dtypes attribute (short for data types) returns the data type of every column. This is very useful when analyzing data because a column's data type determines what operations can be performed on it.

iris_data.dtypes
OUTPUT
sepal length (cm)    float64
sepal width (cm)     float64
petal length (cm)    float64
petal width (cm)     float64
Species               object
dtype: object
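To illustrate how data types shape the available operations, the sketch below uses a small made-up DataFrame (df and its column names are assumptions for illustration). It selects only the numeric columns before computing a mean, then converts the text column to the category dtype, which is one possible choice for a column with a few repeated labels.

```python
import pandas as pd

# Hypothetical frame with two numeric columns and one text column (values are made up).
df = pd.DataFrame({
    "length": [5.1, 4.9, 4.7],
    "width": [3.5, 3.0, 3.2],
    "species": ["Setosa", "Setosa", "Virginica"],
})

# Keep only the numeric columns, on which a mean is well defined.
numeric = df.select_dtypes(include="number")
print(list(numeric.columns))
print(numeric.mean())

# Change the dtype of the text column and inspect the result.
df["species"] = df["species"].astype("category")
print(df.dtypes)
```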