Title: Iris plant recognition using Gaussian Naive Bayes Classifier.
Problem: The three species of the iris plant (Iris-Setosa, Iris-Versicolour, Iris-Virginica) are easily confused with one another. The length and width of their sepals and petals can be used to tell them apart.
Programming Language: We will solve this problem in Python, using the iris dataset that ships with the scikit-learn (sklearn) library.
Solution steps:
1. Prepare the iris dataset for training and testing.
2. Fit a Gaussian Naive Bayes model to the training data.
3. Compare the predicted classes with the real classes of the test data to get the accuracy of the model.
4. Test the model with a new sample.
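To make step 2 more concrete, here is a minimal NumPy sketch of what a Gaussian Naive Bayes model computes: a prior, a mean and a variance per class and feature, and then the class with the highest log posterior. It is an illustration only, not scikit-learn's implementation; the names priors, means, variances and predict_sketch are made up for this example, and it omits the variance smoothing that GaussianNB applies through its var_smoothing parameter.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)

classes = np.unique(y_train)

# "Fit": estimate a prior, per-feature mean and per-feature variance per class.
priors = np.array([np.mean(y_train == c) for c in classes])
means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
variances = np.array([X_train[y_train == c].var(axis=0) for c in classes])


def predict_sketch(X_new):
    """Hypothetical helper: pick the class with the highest log posterior."""
    X_new = np.asarray(X_new, dtype=float)
    # Broadcast to shape (n_samples, n_classes, n_features).
    diff = X_new[:, None, :] - means[None, :, :]
    # Log of the Gaussian density, one term per feature.
    log_pdf = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    # "Naive" assumption: features are independent, so log densities add up.
    log_posterior = np.log(priors)[None, :] + log_pdf.sum(axis=2)
    return classes[np.argmax(log_posterior, axis=1)]


accuracy = np.mean(predict_sketch(X_test) == y_test)
print("Sketch accuracy (in %):", accuracy * 100)
```

Working in log space avoids multiplying many small probabilities together, which is why the per-feature terms are summed rather than multiplied.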
Dataset description: Each sample in the dataset has 4 input features: sepal length in cm, sepal width in cm, petal length in cm, petal width in cm.
The output classes are:
0. Iris-Setosa,
1. Iris-Versicolour,
2. Iris-Virginica
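The description above can be checked against the dataset object itself. The short sketch below prints the feature names, the class names and the array shapes, using only attributes of the object returned by load_iris.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Four numeric features per sample, all measured in cm.
print(iris.feature_names)

# Class names in label order 0, 1, 2.
print(list(iris.target_names))

# Feature matrix (samples x features) and one integer label per sample.
print(iris.data.shape, iris.target.shape)
```

Note that scikit-learn stores the class names as "setosa", "versicolor" and "virginica", so the longer names used in this write-up have to be mapped from the integer labels by hand.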
Sample input and output:
input: 7.9, 3.8, 6.4, 2.0
output: Iris-Virginica
Code:
-----------start------------
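# Load the iris dataset: X holds the 4 features, y the class labels (0, 1, 2)
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 40% of the samples for testing; random_state makes the split repeatable
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y,
    test_size=0.4, random_state=1)

# Fit the Gaussian Naive Bayes model on the training data
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Predict the test data and compare with the real classes
y_pred = gnb.predict(X_test)
from sklearn import metrics
print("Gaussian Naive Bayes model accuracy(in %):",
    metrics.accuracy_score(y_test, y_pred)*100)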
# Test with a new sample:
# [[sepal length in cm, sepal width in cm, petal length in cm, petal width in cm]]
Xnew = [[7.9, 3.8, 6.4, 2.0]]
ynew = gnb.predict(Xnew)
res = {0: "Iris-Setosa", 1: "Iris-Versicolour", 2: "Iris-Virginica"}
print("Predicted: ", res[ynew[0]])
-----------------end------------
Outputs of the print calls:
print("Gaussian Naive Bayes model accuracy(in %):",
metrics.accuracy_score(y_test, y_pred)*100)
Gaussian Naive Bayes model accuracy(in %): 95.0
print("Predicted: ", res[ynew[0]])
Predicted: Iris-Virginica
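A single accuracy figure does not show which species are confused with each other. As a possible follow-up to step 3, the sketch below rebuilds the same split and model, prints a confusion matrix, and then prints the class probabilities for the new sample from step 4. The variable names cm and proba are chosen for this example only.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)
gnb = GaussianNB().fit(X_train, y_train)

# Rows are the real classes, columns the predicted classes (labels 0, 1, 2).
cm = confusion_matrix(y_test, gnb.predict(X_test), labels=[0, 1, 2])
print(cm)

# Posterior probability of each class for the new sample.
res = {0: "Iris-Setosa", 1: "Iris-Versicolour", 2: "Iris-Virginica"}
proba = gnb.predict_proba([[7.9, 3.8, 6.4, 2.0]])[0]
for label, p in zip(gnb.classes_, proba):
    print(res[label], round(float(p), 4))
```

Off-diagonal entries of the matrix count the misclassified test samples, so they show directly which pairs of species the model mixes up.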