Setup!
Github source code & data:
Installing Jupyter Notebook:
Installing Pandas library:
Check out the first video I did on Pandas:
Check out the videos I did on Matplotlib:
Detailed video description! (timeline can be found in comments)
We start by cleaning our data. Tasks during this section include:
- Drop NaN values from DataFrame
- Removing rows based on a condition
- Change the type of columns (to_numeric, to_datetime, astype)
Once we have cleaned up our data a bit, we move the data exploration section. In this section we explore 5 high level business questions related to our data:
- What was the best month for sales? How much was earned that month?
- What city sold the most product?
- What time should we display advertisemens to maximize the likelihood of customer’s buying product?
- What products are most often sold together?
- What product sold the most? Why do you think it sold the most?
To answer these questions we walk through many different pandas & matplotlib methods. They include:
- Concatenating multiple csvs together to create a new DataFrame (pd.concat)
- Adding columns
- Parsing cells as strings to make new columns (.str)
- Using the .apply() method
- Using groupby to perform aggregate analysis
- Plotting bar charts and lines graphs to visualize our results
- Labeling our graphs
If you enjoy this video, make sure to leave it a like and subscribe to not miss any future similar tutorials :).
---------------------------------------------
Follow me on social media!
Instagram |
Twitter |
---------------------------------------------
0 Comments