Autocorrelation (ACF) Plots
Values in a time series often depend on earlier (lagged) values of the same series. That is, the current value is influenced by the values that came before it. For example, if a doctor spends more time than planned with the first patient, the time available for the second patient is affected, and the delay carries forward to later patients.
The autocorrelation function (ACF) measures the linear correlation between a time series and lagged copies of itself, giving one value per lag. It helps identify the lags at which past values are significantly correlated with the current value.
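To make the idea of correlation at a lag concrete, here is a minimal sketch that computes it by hand with NumPy. The helper name `autocorr` and the synthetic series are illustrative assumptions, not part of the original notebook. The formula is the standard sample autocorrelation estimator: deviations from the overall mean, normalized by the lag-0 sum of squares.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a non-negative lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if lag < 0 or lag >= len(x):
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    d = x - x.mean()  # deviations from the overall mean
    # Sum of products between the series and its lagged copy,
    # normalized by the lag-0 sum of squares so that lag 0 gives 1.
    return np.sum(d[lag:] * d[:len(x) - lag]) / np.sum(d * d)

# Synthetic series that repeats every 4 steps (illustrative data only)
t = np.arange(40)
series = np.sin(2 * np.pi * t / 4)

# Autocorrelation at lags 0 through 5
acf_values = [autocorr(series, k) for k in range(6)]
```

For this series the value at lag 4 is strongly positive and the value at lag 2 is strongly negative, which reflects the 4-step cycle: values one full period apart move together, and values half a period apart move in opposite directions.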
Reading sample search data
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
search_data = pd.read_csv('data/google_search.csv', parse_dates=['Week'])
search_data.head()
| Week | google_search |
|---|---|
| 2022-04-03 | 48 |
| 2022-04-10 | 4 |
| 2022-04-17 | 67 |
| 2022-04-24 | 56 |
| 2022-05-01 | 60 |
fig = plt.figure(figsize=(9,4))
plt.plot(search_data.Week, search_data.google_search)
plt.title('Weekly Google Search: Premier League')
plt.ylabel('Number of Searches')
plt.tight_layout()
Rendering ACF Plot
plt.rc("figure", figsize=(9,4))
plot_acf(search_data.google_search, lags=20, auto_ylims=True)
plt.tight_layout()
plt.savefig('time_series_acf_plot.png')
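To read off numerically which lags stand out, one rough option is to compare each autocorrelation value against the approximate 95% white-noise bound of ±1.96/√n. The sketch below assumes that simple bound; the helper name `significant_lags` and the synthetic series are hypothetical, and the shaded band drawn by `plot_acf` is computed differently by default (Bartlett's formula), so the two may not agree exactly.

```python
import numpy as np

def significant_lags(x, max_lag=20, z=1.96):
    """Return lags whose sample autocorrelation exceeds z / sqrt(n) in magnitude.

    Hypothetical helper; uses the simple white-noise approximation for the bound.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = np.sum(d * d)
    bound = z / np.sqrt(n)  # approximate 95% bound under white noise
    lags = []
    for k in range(1, max_lag + 1):
        r = np.sum(d[k:] * d[:n - k]) / denom  # sample autocorrelation at lag k
        if abs(r) > bound:
            lags.append(k)
    return lags, bound

# Synthetic series that repeats every 4 steps (illustrative data only)
t = np.arange(40)
series = np.sin(2 * np.pi * t / 4)

lags, bound = significant_lags(series, max_lag=8)
```

With the notebook's data, the same function could be called as `significant_lags(search_data.google_search, max_lag=20)`, assuming the column contains no missing values.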