Python Machine Learning Notebooks (Website)
Dr. Tirthajyoti Sarkar, Fremont, CA (Please feel free to add me on LinkedIn here)
- Python 3.5+
- NumPy (`pip install numpy`)
- Pandas (`pip install pandas`)
- Scikit-learn (`pip install scikit-learn`)
- SciPy (`pip install scipy`)
- Statsmodels (`pip install statsmodels`)
- Matplotlib (`pip install matplotlib`)
- Seaborn (`pip install seaborn`)
- SymPy (`pip install sympy`)
- Flask (`pip install flask`)
- WTForms (`pip install wtforms`)
- TensorFlow (`pip install tensorflow`)
- Keras (`pip install keras`)
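As a quick sanity check after installing, the short sketch below reports which of the listed packages cannot be found in the current environment. The helper name `missing_packages` is hypothetical (it is not part of this repository), and the import names are assumptions based on how these packages are normally imported, for example `sklearn` for scikit-learn.

```python
import importlib.util

# Import names for the packages listed above.
# Assumption: scikit-learn is imported as "sklearn"; the rest use their pip names.
REQUIRED = [
    "numpy", "pandas", "sklearn", "scipy", "statsmodels", "matplotlib",
    "seaborn", "sympy", "flask", "wtforms", "tensorflow", "keras",
]


def missing_packages(names):
    """Return the names that cannot be located in the current environment.

    find_spec only looks the module up; it does not import it,
    so this stays fast even for heavy packages.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

Any name printed as not installed can be added with the matching `pip install` command from the list.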
You can start with this article that I wrote in Heartbeat magazine (on Medium platform):
Jupyter notebooks covering a wide range of functions and operations on the topics of NumPy, Pandas, Seaborn, Matplotlib, etc.
- Detailed Numpy operations
- Detailed Pandas operations
- Numpy and Pandas quick basics
- Matplotlib and Seaborn quick basics
- Advanced Pandas operations
- How to read various data sources
- PDF reading and table processing demo
- How fast are Numpy operations compared to pure Python code? (Read my article on Medium related to this topic)
- Fast reading from Numpy using .npy file format (Read my article on Medium on this topic)
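To give a flavor of the last two topics, here is a minimal sketch (not taken from the notebooks themselves) that times a sum of squares in pure Python against the vectorized NumPy equivalent, then round-trips an array through the `.npy` format. The array size, the function names, and the file name `array.npy` are illustrative assumptions, and the measured timings will vary by machine.

```python
import os
import tempfile
import timeit

import numpy as np

n = 100_000  # illustrative size, chosen arbitrarily
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)


def python_sum_of_squares(values):
    """Sum of squares with a pure-Python generator expression."""
    return sum(v * v for v in values)


def numpy_sum_of_squares(values):
    """Same computation as a single vectorized dot product."""
    return int(np.dot(values, values))


# Time each version over 10 runs; absolute numbers depend on the machine.
t_py = timeit.timeit(lambda: python_sum_of_squares(py_list), number=10)
t_np = timeit.timeit(lambda: numpy_sum_of_squares(np_array), number=10)
print(f"pure Python: {t_py:.4f} s, NumPy: {t_np:.4f} s")

# Round trip through the binary .npy format, which stores dtype and shape
# alongside the raw data so no text parsing is needed on load.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array.npy")
    np.save(path, np_array)
    loaded = np.load(path)

print("round trip preserved the data:", np.array_equal(loaded, np_array))
```

Both functions return the same value; only the time taken differs. The `.npy` round trip keeps the dtype as well as the values.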
Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms
- Simple linear regression with t-statistic generation
- Multiple ways to perform linear regression in Python and their speed comparison (check the article I wrote on freeCodeCamp)
- Polynomial regression using scikit-learn pipeline feature (check the article I wrote on Towards Data Science)
- Decision trees and Random Forest regression (showing how the Random Forest works as a robust/regularized meta-estimator rejecting overfitting)
- Detailed visual analytics and goodness-of-fit diagnostic tests for a linear regression problem
- Robust linear regression using `HuberRegressor` from Scikit-learn
- Logistic regression/classification (Here is the Notebook)
- k-nearest neighbor classification (Here is the Notebook)
- Decision trees and Random Forest Classification (Here is the Notebook)
- Support vector machine classification (Here is the Notebook) (check the article I wrote in Towards Data Science on SVM and sorting algorithm)
- Naive Bayes classification (Here is the Noteb


