Pandas for Data Science
A brief summary of the topics covered in this course is given below. This is a 30-hour course, and we suggest completing it in 3 weeks. If you already know Python well, you can skip the first 3 modules, which will save you one week.
Module 1: Basics of Python
- Introduction
- Getting Started with Python
- Introduction to Jupyter Notebook
- Data Types in Python
- Arithmetic Operations
- String Operations
- Data Structures in Python
  - Introduction
  - Lists
  - Tuples
  - Sets
  - Dictionaries
- Assignment and Practice
Module 2: Control Structures and Functions in Python
- Introduction
- Decision Making
- Loops and Iterations
- Comprehensions
- Functions in Python
- Map, Filter and Reduce Functions
- Practice Exercise: Map, Filter and Reduce
- OOP in Python
  - Introduction
  - Classes and Objects
  - Methods
  - Class Inheritance and Overriding
- Decorator Function
- Assignment and Practice
Module 3: Python for Data Science
- Introduction to NumPy, Matplotlib and Pandas
- NumPy
  - Introduction to NumPy
  - Basics of NumPy
  - Operations Over 1-D Arrays
  - Practice Exercise I
  - Multidimensional Arrays
  - Creating NumPy Arrays
  - Mathematical Operations on NumPy
  - Mathematical Operations on NumPy II
  - Computation Times in NumPy vs Python Lists
- Assignment and Practice
Module 4: Pandas
- Introduction to Pandas
- Basics of Pandas
- Pandas – Rows and Columns
- Series
- DataFrame
- Pandas functions
- Reading files
- Index and Reindexing
- Sorting
- Slicing Dataset
- Groupby and Aggregate Functions
- Merging DataFrames
- Pivot Tables
- Window functions
- Date functions
- Timedelta functions
- Categorical data
- Visualization
- IO Tools
- Statistical functions
- Working with text data
- Iterations
- Panel
- Assignment and Practice
Module 5: Data Visualization in Python I
- Introduction to Data Visualization with Matplotlib
- Introduction to Matplotlib
- The Necessity of Data Visualization
- Visualization – Some Examples
- Facts and Dimensions
- Bar Graph
- Scatter Plot
- Line Graph and Histogram
- Box Plot
- Subplots
- Choosing Plot Types
- Assignment and Practice
Module 6: Data Cleaning I
- Introduction
- Case Study Overview
- Visualization
- Data Handling and Cleaning
- Data Visualization with Seaborn
- Styling Options
- Sanity Checks
- Histograms
- Assignment and Practice
Module 7: Data Cleaning II
- Introduction
- Distribution Plots
- Outliers Analysis with Boxplots
- Pie Chart and Bar Chart
- Scatter Plots
- Pair Plots
- Revisiting Bar Graphs and Box Plots
- Heatmaps
- Line Charts
- Stacked Bar Charts
- Assignment and Practice
Module 8: Data Visualization in Python II
- Plotly
- Bokeh
- Geoplotlib
Module 9: Project & Resources
- Resources for practice
- A Final Data Cleaning and Analysis Project