A beginner's machine learning project (Prediction of a person's shoe size using Linear Regression in the online market)
The problem described in this post is a good starting point for beginners in machine learning.
Title: Prediction of a person's shoe size using Linear Regression in the online market.
Description of problem: When buying shoes online from a shop such as Nike or Adidas, finding the right shoe size is one of the major problems. Every customer wants shoes that fit perfectly. In the online market, a company's sizing can change with its manufacturing process, so a customer or dealer can run into trouble when a new or updated shoe comes onto the market. Such a problem can be approached using AI methods. A customer enters their age into the online market software, and the system outputs a predicted shoe size for that customer.
Description of the dataset: The dataset was made by recording the real-life shoe sizes of 23 people. It has two columns: one is age and the other is shoe size.
Steps for solving the problem: Linear Regression is one of the most basic algorithms in AI (machine learning) for problems with a single numeric output.
(a) Prepare the dataset as a NumPy array
(b) Train a model on the dataset using the Linear Regression algorithm
(c) Test the trained model using a new integer (age)
Input (Age): 55
Output (Shoe size): 7
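To make the idea behind Linear Regression concrete, here is a minimal sketch of what the algorithm fits: a straight line, shoe_size ≈ slope * age + intercept, chosen by least squares. It uses the 23 samples listed later in this post and NumPy's `np.polyfit` instead of scikit-learn. The helper name `predict_size` is hypothetical, and the line is fitted on all 23 samples rather than on the 70% training split used in the solution, so its numbers will not match the model below exactly.

```python
import numpy as np

# The 23 samples from this post, flattened to 1-D arrays.
age = np.array([14, 16, 12, 13, 14, 18, 22, 30, 25, 40, 45, 60,
                33, 16, 90, 40, 17, 41, 31, 44, 33, 26, 80], dtype=float)
shoe_size = np.array([5, 6, 5, 5, 5, 6, 7, 7, 6, 6, 7, 8,
                      6, 5, 5, 8, 6, 5, 8, 5, 8, 7, 6], dtype=float)

# Least-squares fit of a degree-1 polynomial (a straight line).
# polyfit returns the coefficients from highest degree to lowest.
slope, intercept = np.polyfit(age, shoe_size, deg=1)


def predict_size(a):
    """Evaluate the fitted line at age a (hypothetical helper, not from the post)."""
    return slope * a + intercept


print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"raw prediction for age 55: {predict_size(55):.2f}")
```

The raw prediction is a decimal number; the solution below rounds it to the nearest whole shoe size.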
Solution for the prediction of shoe size using Python
Importing required packages
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
The dataset in table format
age=[[14], [16], [12],[13], [14], [18], [22], [30], [25], [40], [45], [60], [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size=[[5], [6], [5], [5], [5] ,[6] ,[7], [7], [6], [6], [7], [8], [6], [5], [5], [8], [6],[5], [8], [5], [8], [7], [6]]
age_array=np.array(age)
ss_array=np.array(shoe_size)
dataset = pd.DataFrame({'Age of Person ': age_array[:, 0], 'Shoe Size': ss_array[:, 0]})
print(dataset)
    Age of Person   Shoe Size
0               14          5
1               16          6
2               12          5
3               13          5
4               14          5
5               18          6
6               22          7
7               30          7
8               25          6
9               40          6
10              45          7
11              60          8
12              33          6
13              16          5
14              90          5
15              40          8
16              17          6
17              41          5
18              31          8
19              44          5
20              33          8
21              26          7
22              80          6
Visualization of data
plt.title('Age vs Shoe size')
plt.xlabel('Age')
plt.ylabel('Shoe size')
plt.plot(age_array, ss_array, 'o')
plt.show()
Training the model
x_train,x_test,y_train,y_test=train_test_split(age,shoe_size,test_size=0.3,train_size=0.7,random_state=2)
lor=LinearRegression()
lor.fit(x_train,y_train)
lor.predict(x_test)
array([[6.16472167],
       [5.7573916 ],
       [6.0146527 ],
       [5.928899  ],
       [7.3867119 ],
       [5.82170687],
       [5.73595317]])
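The predictions above are easier to judge when compared with the true shoe sizes held out in y_test. The following is a minimal sketch, assuming the same data and split settings as in this post. The use of `mean_absolute_error` and the variable names `model`, `y_pred`, and `mae` are additions for illustration and are not part of the original solution.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Same data as in the post (one feature column, one target column).
age = [[14], [16], [12], [13], [14], [18], [22], [30], [25], [40], [45], [60],
       [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size = [[5], [6], [5], [5], [5], [6], [7], [7], [6], [6], [7], [8],
             [6], [5], [5], [8], [6], [5], [8], [5], [8], [7], [6]]

# Same split settings as the post: 30% of the samples are held out for testing.
x_train, x_test, y_train, y_test = train_test_split(
    age, shoe_size, test_size=0.3, random_state=2)

model = LinearRegression()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)

# Average absolute gap between predicted and true shoe size on the held-out rows.
mae = mean_absolute_error(y_test, y_pred)

# The fitted line: shoe_size = coef * age + intercept.
print("slope:", model.coef_[0][0])
print("intercept:", model.intercept_[0])
print("mean absolute error on the test set:", mae)
```

With only 23 samples, the test set holds just 7 rows, so the error estimate is rough and can change noticeably with a different random_state.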
Predicting new data from the trained model
age_in=int(input('Age: '))
age_in_ar=[[age_in]]
shoe_size_pr=lor.predict(age_in_ar)
print(f'Predicted shoe size for a {age_in}-year-old person: {round(shoe_size_pr[0][0])}')
Output:
Age: 55
Predicted shoe size for a 55-year-old person: 7
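The call int(input('Age: ')) stops with an error if the user types something that is not a whole number. One possible way to guard against that is sketched below. The helper name `parse_age` and the bounds 1 and 120 are assumptions made for illustration, not part of the original solution.

```python
def parse_age(text, min_age=1, max_age=120):
    """Convert user input to an integer age (hypothetical helper).

    Raises ValueError with a readable message when the input is not a
    whole number or falls outside the assumed range.
    """
    try:
        value = int(text.strip())
    except ValueError:
        raise ValueError(f"age must be a whole number, got {text!r}") from None
    if not min_age <= value <= max_age:
        raise ValueError(
            f"age must be between {min_age} and {max_age}, got {value}")
    return value


print(parse_age(" 55 "))
```

In the prediction step, age_in = parse_age(input('Age: ')) could replace the int(...) call, with the ValueError caught wherever the program should ask again.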
Full Code
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
age=[[14], [16], [12],[13], [14], [18], [22], [30], [25], [40], [45], [60], [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size=[[5], [6], [5], [5], [5] ,[6] ,[7], [7], [6], [6], [7], [8], [6], [5], [5], [8], [6],[5], [8], [5], [8], [7], [6]]
age_array=np.array(age)
ss_array=np.array(shoe_size)
dataset = pd.DataFrame({'Age of Person ': age_array[:, 0], 'Shoe Size': ss_array[:, 0]})
print(dataset)
plt.title('Age vs Shoe size')
plt.xlabel('Age')
plt.ylabel('Shoe size')
plt.plot(age_array, ss_array, 'o')
plt.show()
x_train,x_test,y_train,y_test=train_test_split(age,shoe_size,test_size=0.3,train_size=0.7,random_state=2)
lor=LinearRegression()
lor.fit(x_train,y_train)
print(lor.predict(x_test))
age_in=int(input('Age: '))
age_in_ar=[[age_in]]
shoe_size_pr=lor.predict(age_in_ar)
print(f'Predicted shoe size for a {age_in}-year-old person: {round(shoe_size_pr[0][0])}')