Pandas Read Methods
Pandas provides read methods for delimited text files, databases, and other specific data formats. Below are the most common ones, each shown with a short example.
import pandas as pd
1. read_csv()
The most common way to read data with pandas is the read_csv method, which parses a comma-delimited CSV or text file into a DataFrame. The example assumes the file sits in the working directory; if it does not, pass the full path.
csv_file = pd.read_csv('./sample_file.csv')
csv_file.head()
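Real files are not always cleanly comma-delimited, so here is a minimal sketch of a few read_csv arguments I reach for often. The data is made up and read from an in-memory buffer instead of a file, and the column names (date, ticker, close) are assumptions for illustration only.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a file on disk.
raw = io.StringIO(
    "date;ticker;close\n"
    "2021-01-04;AAA;10.5\n"
    "2021-01-05;AAA;11.0\n"
    "2021-01-04;BBB;20.25\n"
)

prices = pd.read_csv(
    raw,
    sep=";",                               # this sample uses semicolons, not commas
    usecols=["date", "ticker", "close"],   # load only the columns we need
    parse_dates=["date"],                  # convert the date column to datetime64
)
print(prices.dtypes)
```

Passing a path such as './sample_file.csv' instead of the buffer works the same way.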
2. read_table()
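read_table provides the same functionality as read_csv; the only practical difference is that its default delimiter is a tab rather than a comma. In the example below I pass the tab delimiter explicitly, although it is already the default.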
table_file = pd.read_table('./tab_file.txt', delimiter='\t')
table_file.head()
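To see that the two methods really are interchangeable, the sketch below reads the same tab-separated text with both and compares the results. The sample text is invented for the demonstration.

```python
import io
import pandas as pd

# Made-up tab-separated sample.
tab_text = "name\tqty\nwidget\t3\ngadget\t5\n"

# read_table splits on tabs by default.
via_table = pd.read_table(io.StringIO(tab_text))

# read_csv gives the same result once sep is set to a tab.
via_csv = pd.read_csv(io.StringIO(tab_text), sep="\t")

print(via_table.equals(via_csv))
```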
3. read_html()
Pandas can also read tables from an HTML page. read_html looks for the table tags on the page and returns a list of DataFrames, one per table, so when a page contains several tables you select the one you want by index.
html_data = pd.read_html('https://money.cnn.com/data/us_markets/', header=0)
html_data[0].head()
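Since a live page can change or disappear, here is a small offline sketch of the same idea. It parses an invented HTML snippet with two tables, and it assumes an HTML parser that read_html can use (lxml, or BeautifulSoup with html5lib) is installed.

```python
import io
import pandas as pd

# Invented HTML with two tables, standing in for a downloaded page.
html = """
<table>
  <tr><th>market</th><th>level</th></tr>
  <tr><td>Alpha</td><td>100</td></tr>
  <tr><td>Beta</td><td>200</td></tr>
</table>
<table>
  <tr><th>symbol</th><th>change</th></tr>
  <tr><td>AAA</td><td>1.5</td></tr>
</table>
"""

# One DataFrame per <table>; header=0 uses the first row as column names.
tables = pd.read_html(io.StringIO(html), header=0)
print(len(tables))
print(tables[0])

# match keeps only the tables whose text matches the given string or regex.
matched = pd.read_html(io.StringIO(html), match="symbol", header=0)
print(matched[0])
```

The match argument is handy when the position of a table on the page is not stable enough to rely on an index.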
4. read_sql()
If you have an established database connection, you can read query results directly into a DataFrame with the read_sql method. To demonstrate, I use a local connection to my SQLite database of stock data.
import sqlite3
conn = sqlite3.connect('data.db')
query = "SELECT * FROM stockdata LIMIT 10"
stock_data = pd.read_sql(query, con=conn)
stock_data.head()
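If you do not have a database file at hand, the same call can be tried against an in-memory SQLite database. This is only a sketch: the stockdata columns (ticker, close) and the rows are assumptions I made up, not the schema of the database above. It also shows the params argument, which passes values to the query separately instead of formatting them into the SQL string.

```python
import sqlite3
import pandas as pd

# Throwaway in-memory database with a hypothetical stockdata table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stockdata (ticker TEXT, close REAL)")
conn.executemany(
    "INSERT INTO stockdata VALUES (?, ?)",
    [("AAA", 10.5), ("BBB", 20.25), ("AAA", 11.0)],
)
conn.commit()

# The ? placeholder is filled from params by the sqlite3 driver.
query = "SELECT * FROM stockdata WHERE ticker = ? ORDER BY close"
aaa = pd.read_sql(query, con=conn, params=("AAA",))
conn.close()

print(aaa)
```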
5. read_excel()
The last common read method is read_excel. An Excel file can hold multiple sheets and can contain formulas and other advanced features, so you usually need to specify some arguments, such as which sheet to read.
excel_file = pd.read_excel('./sample_excel.xlsx', sheet_name='Sheet1')
excel_file.head()
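Because a workbook can hold several sheets, it helps to know that sheet_name=None loads all of them at once. The sketch below builds a small two-sheet workbook in memory and reads it back; the sheet names and values are invented, and it assumes the openpyxl package is installed for .xlsx support.

```python
import io
import pandas as pd

# Build a small two-sheet workbook in memory (hypothetical data).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"ticker": ["AAA"], "close": [10.5]}).to_excel(
        writer, sheet_name="Prices", index=False
    )
    pd.DataFrame({"ticker": ["AAA"], "volume": [1000]}).to_excel(
        writer, sheet_name="Volumes", index=False
    )

# Rewind the buffer, then read every sheet into a dict keyed by sheet name.
buffer.seek(0)
sheets = pd.read_excel(buffer, sheet_name=None)

print(list(sheets))
print(sheets["Prices"])
```

Passing a single name or a zero-based position to sheet_name returns just that sheet as a DataFrame.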