Pandas

Machine Learning Example with Pandas

Source: https://www.w3resource.com/python-exercises/pandas/pandas-machine-learning-integration.php

Structure of data.csv:

ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1

Column Description:

ID: A unique identifier for each record (integer).
Name: The name of the individual (string).
Age: Age of the individual (numerical, may have missing values).
Gender: Gender of the individual (categorical: Male/Female).
Salary: The individual's salary (numerical, may have missing values).
Target: The target variable for binary classification (binary: 0 or 1).
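As a rough illustration of how these columns might be prepared for a classifier, here is a minimal sketch. It is not taken from the source exercise: the median imputation, the 0/1 encoding of Gender, and the names `csv_text`, `X`, and `y` are all assumptions made for this example.

```python
import io

import pandas as pd

# The six sample rows of data.csv, with a comma-separated header.
csv_text = """ID,Name,Age,Gender,Salary,Target
1,Sara,25,Female,50000,0
2,Ophrah,30,Male,60000,1
3,Torben,22,Male,70000,0
4,Masaharu,35,Male,80000,1
5,Kaya,NaN,Female,55000,0
6,Abaddon,29,Male,NaN,1
"""

# read_csv treats the literal string "NaN" as a missing value by default.
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: fill missing numeric values with the column median.
for col in ["Age", "Salary"]:
    df[col] = df[col].fillna(df[col].median())

# Assumption: encode the two Gender categories as 0/1.
df["Gender"] = df["Gender"].map({"Female": 0, "Male": 1})

# ID and Name identify a record rather than describe it, so they are
# left out of the feature matrix.
X = df[["Age", "Gender", "Salary"]]
y = df["Target"]
```

With only six rows this is purely a demonstration of the steps; `X` and `y` could then be passed to any estimator that accepts a numeric feature table and a binary target.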


Predicting Life Expectancy

Intro

The focus here is on EDA (Exploratory Data Analysis) and investigating the best choice for the \(\lambda\) hyperparameter for LASSO and Ridge Regression.
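For reference, the standard textbook forms of the two penalized objectives are given here. The notation (\(y_i\), \(x_i\), \(\beta\), \(n\), \(p\)) is assumed for this sketch rather than taken from the post, and the intercept is omitted for brevity:

\[
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \; \sum_{i=1}^{n} \left( y_i - x_i^{\top} \beta \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2
\]

\[
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \; \sum_{i=1}^{n} \left( y_i - x_i^{\top} \beta \right)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\]

In both cases \(\lambda \ge 0\) sets the strength of the penalty: \(\lambda = 0\) recovers ordinary least squares, and larger values shrink the coefficients more strongly, with LASSO able to set some of them exactly to zero.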

We will be working with the Life Expectancy CSV data obtained from the WHO.

Peeking at Data

We begin by viewing the columns of the Life Expectancy DataFrame:

  import seaborn as sns
  import pandas as pd
  import matplotlib.pyplot as plt

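  # Show floats with two decimal places in printed output
  pd.options.display.float_format = '{:.2f}'.format
  # Load the life expectancy data and list its column names
  le_df = pd.read_csv("life_expectancy.csv")
  le_df.columns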
Index(['Country', 'Year', 'Status', 'Life expectancy ', 'Adult Mortality',
       'infant deaths', 'Alcohol', 'percentage expenditure', 'Hepatitis B',
       'Measles ', ' BMI ', 'under-five deaths ', 'Polio', 'Total expenditure',
       'Diphtheria ', ' HIV/AIDS', 'GDP', 'Population',
       ' thinness  1-19 years', ' thinness 5-9 years',
       'Income composition of resources', 'Schooling'],
      dtype='object')
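Several of these column names carry stray whitespace (for example 'Life expectancy ', ' BMI ', and ' thinness  1-19 years'), which makes them easy to mistype later. One possible cleanup, sketched here on an empty frame with a handful of the names above, is to strip and collapse the whitespace; `demo_df` and `cleaned` are hypothetical names, and this step is a suggestion rather than something the post is known to do.

```python
import pandas as pd

# A few of the column names exactly as they appear in the output above.
raw_columns = [
    "Country",
    "Life expectancy ",
    " BMI ",
    " thinness  1-19 years",
    " HIV/AIDS",
]
demo_df = pd.DataFrame(columns=raw_columns)

# Strip leading/trailing whitespace, then collapse internal runs of
# whitespace into a single space.
demo_df.columns = (
    demo_df.columns.str.strip().str.replace(r"\s+", " ", regex=True)
)
cleaned = list(demo_df.columns)
```

The same two-step expression could be applied to `le_df.columns`; note that any later code would then need to use the cleaned names.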

We can then view the range of our life expectancy values with a box plot:
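A minimal sketch of such a box plot, assuming seaborn's `boxplot` is used on the 'Life expectancy ' column (its name includes the trailing space shown in the column listing). The values in `demo_df` are placeholders for illustration only, not figures from the WHO data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder values for illustration only; NOT taken from the dataset.
demo_df = pd.DataFrame(
    {"Life expectancy ": [52.0, 61.5, 68.3, 71.0, 74.2, 79.9, 83.1]}
)

# Horizontal box plot of the life expectancy column.
fig, ax = plt.subplots(figsize=(6, 2))
sns.boxplot(x=demo_df["Life expectancy "], ax=ax)
ax.set_xlabel("Life expectancy (years)")
fig.tight_layout()

# Numeric companion to the plot: min, quartiles, and max.
summary = demo_df["Life expectancy "].describe()
```

With the real data, `demo_df` would be replaced by `le_df`; `describe()` gives the same range information as the plot in numeric form.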
