Data Science II - Introduction to Pandas
February 6, 2020 · 4 mins read
Pandas is an open source library built on top of NumPy that allows us to analyse and clean data before further steps are performed on it.
Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
1. Pandas features
2. How to install Pandas?
3. What are we going to learn?
4. Series in Pandas
5. DataFrames in Pandas
6. DataFrame Operations
7. Working on missing data in DataFrames
8. Applying GroupBy (Aggregate) operations in Pandas
9. Merging, concatenating and joining DataFrames in Pandas
Pandas is the next library you need to know if you are starting out as a data scientist. It is built on top of NumPy, which is why we studied NumPy first.
Pandas features
- The Pandas library has built-in visualization, which we are going to discuss in the next few parts.
- It can work with a wide variety of data sources and can help us to clean them up.
How to install Pandas?
Installing Pandas is quite similar to installing NumPy, as we did in the last part. With your virtual environment activated, use the following command:
pip install pandas
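Once the installation finishes, a quick way to check that it worked is to import the library and print its version. This is only a minimal sketch; the alias pd is a widely used convention rather than a requirement, and the variable name installed_version is made up for this example.

```python
import pandas as pd

# If this import succeeds, Pandas is available in the active environment.
installed_version = pd.__version__
print(installed_version)
```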
What are we going to learn?
We are going to learn various methods of the Pandas library which will help us to clean and analyse data. Some important terms we will be seeing in this post are:
- Series
- DataFrames
- Missing Data
- GroupBy
- Merging, Joining and Concatenating …
Series in Pandas
Recap
In this notebook, we discussed:
- How to create Pandas series?
- How to create Pandas series with custom indexes?
- Creating series using Python dictionaries.
- How to select elements from pandas series?
- How to apply arithmetic operations in Pandas series?
Note that in some of these operations, Pandas converts values of int type to float.
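The points in this recap can be sketched in a few lines. The values and labels below are made up for illustration and are not taken from the notebook. The last step shows one common case of the int to float conversion: when two series do not share all their index labels, the unmatched positions become NaN, and NaN can only be stored in a float series.

```python
import pandas as pd

# Create a series from a list; Pandas assigns the default index 0, 1, 2.
marks = pd.Series([10, 20, 30])

# Create a series with custom index labels.
labelled = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Create a series from a Python dictionary; the keys become the index.
from_dict = pd.Series({"a": 1, "b": 2, "d": 4})

# Select an element by its index label.
first = labelled["a"]

# Arithmetic aligns on index labels. "c" and "d" each appear in only
# one series, so the result holds NaN at those labels.
total = labelled + from_dict

print(total)
print(total.dtype)  # float64, although both inputs held integers
```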
DataFrames in Pandas
- How to create Pandas DataFrames?
- How to select column series from DataFrame?
- How to add new data into the Pandas DataFrame?
- How to remove series from Pandas DataFrames?
- How to select rows from Pandas DataFrames?
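As a rough illustration of the questions above, here is a minimal sketch that creates a DataFrame, selects a column, adds and removes a column, and selects rows. The column names, row labels and numbers are invented for this example.

```python
import pandas as pd

# Create a DataFrame from a dictionary of columns, with custom row labels.
df = pd.DataFrame(
    {"maths": [80, 65, 90], "physics": [70, 85, 60]},
    index=["asha", "ben", "chen"],
)

# Select a single column; the result is a Pandas series.
maths = df["maths"]

# Add a new column computed from existing ones.
df["total"] = df["maths"] + df["physics"]

# Remove a column. drop returns a new DataFrame and leaves df unchanged.
without_physics = df.drop("physics", axis=1)

# Select rows by label with loc, or by integer position with iloc.
ben_row = df.loc["ben"]
first_row = df.iloc[0]

print(df)
```

Because drop returns a copy by default, df still contains the physics column after the call; assigning the result back to df is one way to make the removal stick.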
DataFrame Operations
In this notebook we learned about the different methods used in Pandas to select, manipulate and operate on Pandas DataFrames. Note that the normal Python and and or keywords don't work when combining conditions, because they don't have the capability to compare the boolean values in a series.
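A small sketch of conditional selection makes this concrete. Pandas provides the element-wise operators & and | for combining boolean series, and each condition needs its own parentheses. The data below is invented for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [22, 35, 47, 51], "city": ["Pune", "Delhi", "Pune", "Goa"]}
)

# A comparison on a column gives a boolean series, one value per row.
is_over_30 = df["age"] > 30

# Combine conditions element by element with & (and) and | (or).
over_30_in_pune = df[(df["age"] > 30) & (df["city"] == "Pune")]
under_30_or_goa = df[(df["age"] < 30) | (df["city"] == "Goa")]

# The plain Python keyword asks for a single True/False value for the
# whole series, which Pandas refuses to guess, so it raises ValueError.
try:
    df[(df["age"] > 30) and (df["city"] == "Pune")]
    keyword_failed = False
except ValueError:
    keyword_failed = True

print(over_30_in_pune)
print(under_30_or_goa)
```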
Working on missing data in DataFrames
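As a rough preview of this topic, here is a minimal sketch of three common approaches: counting missing values, dropping them, and filling them in. The DataFrame is invented for this example, and the methods shown (isnull, dropna, fillna) are only a small selection, not necessarily the exact set used in the notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan],
        "b": [4.0, np.nan, np.nan],
        "c": [7.0, 8.0, 9.0],
    }
)

# Count the missing values in each column.
missing_per_column = df.isnull().sum()

# Drop every row that contains at least one missing value.
complete_rows = df.dropna()

# Drop every column that contains at least one missing value.
complete_columns = df.dropna(axis=1)

# Replace missing values with a fixed value...
filled_zero = df.fillna(0)

# ...or with the mean of each column.
filled_mean = df.fillna(df.mean())

print(missing_per_column)
print(filled_mean)
```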
Pandas provides a lot of methods that can help us with cleaning and removing missing data from DataFrames. Let’s head over to the Jupyter notebook and learn more about how to handle missing data in a DataFrame.
Applying GroupBy (Aggregate) operations in Pandas
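As a rough preview of grouping, here is a minimal sketch that splits rows by the values in one column and applies aggregate functions to each group. The sales table and its column names are invented for this example.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "company": ["A", "B", "A", "B", "C"],
        "amount": [100, 200, 300, 400, 500],
    }
)

# Group the rows by company and pick the column to aggregate.
by_company = sales.groupby("company")["amount"]

# Apply a single aggregate function to every group.
totals = by_company.sum()
averages = by_company.mean()

# Apply several aggregate functions at once; each becomes a column.
summary = by_company.agg(["min", "max", "count"])

print(totals)
print(summary)
```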
Group by operations allow us to apply aggregate functions to groups of rows. Let’s jump into the Jupyter notebook and learn how we can apply group by techniques to a Pandas DataFrame.
Merging, concatenating and joining DataFrames in Pandas
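The three ways of combining DataFrames differ mainly in what they line up on. A minimal sketch, using two small invented tables: concatenation stacks DataFrames along an axis, merging matches rows on a shared column, and joining matches rows on the index.

```python
import pandas as pd

left = pd.DataFrame({"key": ["k0", "k1", "k2"], "a": [1, 2, 3]})
right = pd.DataFrame({"key": ["k0", "k1", "k3"], "b": [10, 20, 30]})

# Concatenating: stack DataFrames on top of each other (axis 0 by default).
stacked = pd.concat([left, left], ignore_index=True)

# Merging: match rows on a shared column. An inner merge keeps only
# the keys present in both DataFrames.
merged = pd.merge(left, right, on="key", how="inner")

# Joining: match rows on the index. An outer join keeps every key and
# fills the gaps with NaN.
joined = left.set_index("key").join(right.set_index("key"), how="outer")

print(merged)
print(joined)
```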
Let’s jump to the Jupyter notebook to learn more about this. That’s it for this part of the post. I will keep adding more operations and methods to this post if I find something interesting.