#365daysMLChallenge – Episode 2: Nucleus

by Heydarov

In the first 7 days, I practiced general dataset manipulations:

  • Importing the Dataset from various sources
  • Describing the Dataset
  • Dealing with the Missing Values
  • Dealing with the Outliers
  • Visualizing the Dataset
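
As a rough illustration of the first four steps, here is a minimal sketch in pandas. The toy data, the column names (age, income), the median fill and the 1.5 * IQR rule are all my own assumptions for this example and are not taken from the notebooks; they are just one common way to do each step.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for a real CSV file.
csv_text = """age,income
25,30000
32,42000
,38000
41,51000
29,
38,900000
"""

# Importing: read_csv accepts a path, a URL, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Describing: summary statistics for the numeric columns.
summary = df.describe()

# Missing values: fill each gap with the column median (one option among many).
df_filled = df.fillna(df.median())

# Outliers: keep rows whose income lies within 1.5 * IQR of the quartiles.
q1 = df_filled["income"].quantile(0.25)
q3 = df_filled["income"].quantile(0.75)
iqr = q3 - q1
in_range = df_filled["income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df_clean = df_filled[in_range]

print(summary)
print(df_clean)
```

For the visualization step, something like df_clean["income"].plot.hist() should work once matplotlib is installed; I left it out here to keep the sketch dependency-free.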

All the Python notebooks can be found on my GitHub via the link below:

https://github.com/hheydaroff/365_Days_ML_Challenge

Episode 2: The Nucleus.

In the next 14 days, I will explore two core Python libraries:

Pandas (Day 8 – Day 14)

Pandas is the main data manipulation and data analysis package for Python. Many of its functions are heavily used in the pre-processing step, so it is worth learning in more depth.

To make the learning process more measurable, I will follow a few challenges available online. I will start with the Pandas Challenge provided by Guilherme at the link below:
https://github.com/guipsamora/pandas_exercises

Once I am done with it, I will continue with the links below:

https://github.com/jvns/pandas-cookbook/

http://kanoki.org/2017/07/16/pandas-in-a-nutshell/

https://nbviewer.jupyter.org/github/rasbt/python_reference/blob/master/tutorials/things_in_pandas.ipynb

https://github.com/ajcr/100-pandas-puzzles/

NumPy (Day 15 – Day 21)

NumPy is the core package for numerical computing in Python. Since AI/ML is heavily math-based, it makes sense to learn the package in depth from the beginning.
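
To give a feel for why this matters, here is a small sketch of the kind of math that shows up everywhere in ML: a linear model prediction and its mean squared error, computed on whole arrays without a Python loop. The numbers and the names X, w, b, y_true are made up for this illustration.

```python
import numpy as np

# Hypothetical feature matrix (3 samples, 2 features), weights and bias.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
w = np.array([0.5, -1.0])
b = 2.0

# Linear model prediction for all samples at once via matrix-vector product.
y_pred = X @ w + b

# Mean squared error against hypothetical targets.
y_true = np.array([0.0, 0.0, 0.0])
mse = np.mean((y_pred - y_true) ** 2)

print(y_pred)
print(mse)
```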

To learn NumPy, I will use the NumPy 100 challenge provided by Nicolas at the link below:


https://github.com/rougier/numpy-100/

Once I am done with the 100 questions there, I will continue with the other sources below:

https://www.machinelearningplus.com/python/101-numpy-exercises-python/

http://www.scipy-lectures.org/intro/numpy/index.html


The Show Must Go On…