A beginner's machine learning project (Prediction of shoe size of a person using Linear Regression in the online market)
The problem described in this post will be very helpful for beginners in the machine learning area.
Title: Prediction of shoe size of a person using Linear Regression in the online market.
Description of problem: When buying shoes online from a brand such as Nike or Adidas, choosing the right shoe size is one of the major problems, since every customer wants a perfect fit. Shoe sizing can also differ from company to company depending on the manufacturing process, so a customer or dealer can run into trouble whenever a newly updated shoe comes onto the market. Such a problem can be approached with AI methods: a customer enters their age into the online store's software, and the system outputs a predicted shoe size.
Description of the dataset: The dataset was made by recording the real-life shoe sizes of 23 people. It has two columns: one is age and the other is shoe size.
Steps for solving the problem: Linear Regression is one of the most basic algorithms in AI (machine learning) for predicting a single numeric output.
(a) Prepare the dataset as NumPy arrays
(b) Train a Linear Regression model on the dataset
(c) Test the trained model using a new integer (age)
Input (Age): 55
Output (Shoe size): 7
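To make the idea behind the steps above concrete, here is a minimal sketch of what Linear Regression computes for a single input: a slope and an intercept chosen by least squares. It uses the 23 rows from this post, but it fits on all of them rather than on a training split, so its numbers will not match the scikit-learn model trained later. The names slope, intercept and predict are illustrative only.

```python
import numpy as np

# Data copied from this post (23 people).
age = np.array([14, 16, 12, 13, 14, 18, 22, 30, 25, 40, 45, 60,
                33, 16, 90, 40, 17, 41, 31, 44, 33, 26, 80], dtype=float)
shoe_size = np.array([5, 6, 5, 5, 5, 6, 7, 7, 6, 6, 7, 8,
                      6, 5, 5, 8, 6, 5, 8, 5, 8, 7, 6], dtype=float)

# Closed-form least squares for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
age_mean, size_mean = age.mean(), shoe_size.mean()
slope = np.sum((age - age_mean) * (shoe_size - size_mean)) / np.sum((age - age_mean) ** 2)
intercept = size_mean - slope * age_mean


def predict(a):
    """Predicted shoe size for age a, using the fitted straight line."""
    return slope * a + intercept


print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"prediction for age 55: {predict(55):.2f}")
```

Note that this sketch uses every row for fitting, whereas the solution below holds some rows back for testing.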
Solution for the prediction of shoe size using Python
Importing required packages
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
The dataset in table format
age=[[14], [16], [12],[13], [14], [18], [22], [30], [25], [40], [45], [60], [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size=[[5], [6], [5], [5], [5] ,[6] ,[7], [7], [6], [6], [7], [8], [6], [5], [5], [8], [6],[5], [8], [5], [8], [7], [6]]
age_array=np.array(age)
ss_array=np.array(shoe_size)
dataset = pd.DataFrame({'Age of Person': age_array[:, 0], 'Shoe Size': ss_array[:, 0]})
print(dataset)
    Age of Person  Shoe Size
0              14          5
1              16          6
2              12          5
3              13          5
4              14          5
5              18          6
6              22          7
7              30          7
8              25          6
9              40          6
10             45          7
11             60          8
12             33          6
13             16          5
14             90          5
15             40          8
16             17          6
17             41          5
18             31          8
19             44          5
20             33          8
21             26          7
22             80          6
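Before plotting, it can help to look at a numeric summary of the table. The sketch below rebuilds the same 23 rows (with a shortened, assumed column name 'Age') and prints pandas' built-in summary statistics, plus how often each shoe size occurs.

```python
import pandas as pd

# Same 23 rows as the table above, written as flat lists.
age = [14, 16, 12, 13, 14, 18, 22, 30, 25, 40, 45, 60,
       33, 16, 90, 40, 17, 41, 31, 44, 33, 26, 80]
shoe_size = [5, 6, 5, 5, 5, 6, 7, 7, 6, 6, 7, 8,
             6, 5, 5, 8, 6, 5, 8, 5, 8, 7, 6]

# 'Age' is a shortened column name chosen for this sketch.
dataset = pd.DataFrame({'Age': age, 'Shoe Size': shoe_size})

# count, mean, std, min, quartiles and max for each column
summary = dataset.describe()
print(summary)

# How many people wear each shoe size
print(dataset['Shoe Size'].value_counts().sort_index())
```

With only 23 rows and four distinct shoe sizes, any model trained on this data should be treated as a learning exercise rather than a reliable predictor.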
Visualization of data
plt.title('Age vs Shoe size')
plt.xlabel('Age')
plt.ylabel('Shoe size')
plt.plot(age_array, ss_array, 'o')
plt.show()
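A scatter plot shows the relationship visually. One way to put a number on it is the Pearson correlation coefficient, sketched below with NumPy. This step is an addition to the original workflow, not part of it. Values near +1 or -1 suggest a strong linear relationship, and values near 0 suggest a weak one.

```python
import numpy as np

# Data copied from this post.
age = np.array([14, 16, 12, 13, 14, 18, 22, 30, 25, 40, 45, 60,
                33, 16, 90, 40, 17, 41, 31, 44, 33, 26, 80])
shoe_size = np.array([5, 6, 5, 5, 5, 6, 7, 7, 6, 6, 7, 8,
                      6, 5, 5, 8, 6, 5, 8, 5, 8, 7, 6])

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the
# correlation between the two variables.
r = np.corrcoef(age, shoe_size)[0, 1]
print(f"Pearson correlation between age and shoe size: {r:.3f}")
```

If the coefficient turns out to be small, a straight line through age alone will explain only a small part of the variation in shoe size.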
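Training the model
# Hold out 30% of the rows (7 of 23) for testing; random_state makes the split repeatable
x_train, x_test, y_train, y_test = train_test_split(age, shoe_size, test_size=0.3, random_state=2)
# Fit a straight line (shoe size as a function of age) to the training rows
lor = LinearRegression()
lor.fit(x_train, y_train)
lor.predict(x_test)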
array([[6.16472167],
[5.7573916 ],
[6.0146527 ],
[5.928899 ],
[7.3867119 ],
[5.82170687],
[5.73595317]])
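The predictions above are easier to judge when compared with the true shoe sizes in the test split. The sketch below repeats the same split and fit, then prints the fitted line and two common error measures from sklearn.metrics. It assumes a flat (1-D) target list instead of the nested list used in the post, and the variable name model is chosen for this example.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Data copied from this post; the target is written as a flat list here.
age = [[14], [16], [12], [13], [14], [18], [22], [30], [25], [40], [45], [60],
       [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size = [5, 6, 5, 5, 5, 6, 7, 7, 6, 6, 7, 8,
             6, 5, 5, 8, 6, 5, 8, 5, 8, 7, 6]

# Same split settings as in the post: 30% test rows, random_state=2
x_train, x_test, y_train, y_test = train_test_split(
    age, shoe_size, test_size=0.3, random_state=2)

model = LinearRegression().fit(x_train, y_train)
y_pred = model.predict(x_test)

# The fitted line: shoe_size = slope * age + intercept
print("slope:", model.coef_[0])
print("intercept:", model.intercept_)

# Average absolute error in shoe sizes, and the R^2 score on the test rows
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("MAE:", mae)
print("R^2:", r2)
```

An R^2 close to 1 would indicate a good fit. A value near or below 0 means the line predicts no better than always guessing the average shoe size.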
Predicting new data from the trained model
# Read an age from the user and wrap it as a 2-D list, as scikit-learn expects
age_in = int(input('Age: '))
age_in_ar = [[age_in]]
shoe_size_pr = lor.predict(age_in_ar)
# Round the continuous prediction to the nearest whole shoe size
print(f'Predicted shoe size for a {age_in}-year-old person: {round(shoe_size_pr[0][0])}')
Output:
Age: 55
Predicted shoe size for a 55-year-old person: 7
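The interactive snippet above accepts any integer, including negative or unrealistic ages. As a hedged sketch, the prediction step could be wrapped in a small function that checks the input first. The function name predict_shoe_size and the age limits of 1 to 120 are assumptions made for this example, not part of the original solution.

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def predict_shoe_size(model, age, min_age=1, max_age=120):
    """Return the rounded shoe size predicted by model for age.

    min_age and max_age are illustrative limits, not values from the post.
    """
    if not (min_age <= age <= max_age):
        raise ValueError(f"age must be between {min_age} and {max_age}, got {age}")
    predicted = model.predict([[age]])
    # ravel() handles both 1-D and 2-D prediction arrays
    return int(round(float(predicted.ravel()[0])))


# Train a model the same way as in the post so the helper can be tried out.
age = [[14], [16], [12], [13], [14], [18], [22], [30], [25], [40], [45], [60],
       [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size = [[5], [6], [5], [5], [5], [6], [7], [7], [6], [6], [7], [8],
             [6], [5], [5], [8], [6], [5], [8], [5], [8], [7], [6]]
x_train, x_test, y_train, y_test = train_test_split(
    age, shoe_size, test_size=0.3, random_state=2)
model = LinearRegression().fit(x_train, y_train)

print(predict_shoe_size(model, 55))
```

Calling predict_shoe_size(model, -3) raises a ValueError instead of silently returning a meaningless size.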
Full Code
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Dataset: ages and shoe sizes of 23 people, one row per person
age = [[14], [16], [12], [13], [14], [18], [22], [30], [25], [40], [45], [60], [33], [16], [90], [40], [17], [41], [31], [44], [33], [26], [80]]
shoe_size = [[5], [6], [5], [5], [5], [6], [7], [7], [6], [6], [7], [8], [6], [5], [5], [8], [6], [5], [8], [5], [8], [7], [6]]
age_array = np.array(age)
ss_array = np.array(shoe_size)
dataset = pd.DataFrame({'Age of Person': age_array[:, 0], 'Shoe Size': ss_array[:, 0]})
print(dataset)

# Scatter plot of age against shoe size
plt.title('Age vs Shoe size')
plt.xlabel('Age')
plt.ylabel('Shoe size')
plt.plot(age_array, ss_array, 'o')
plt.show()

# Hold out 30% of the rows for testing, then fit the model on the rest
x_train, x_test, y_train, y_test = train_test_split(age, shoe_size, test_size=0.3, random_state=2)
lor = LinearRegression()
lor.fit(x_train, y_train)
print(lor.predict(x_test))

# Predict the shoe size for an age entered by the user
age_in = int(input('Age: '))
age_in_ar = [[age_in]]
shoe_size_pr = lor.predict(age_in_ar)
print(f'Predicted shoe size for a {age_in}-year-old person: {round(shoe_size_pr[0][0])}')