In the last section, we discussed the k-nearest neighbors (KNN) algorithm and where it is useful. Now we will see how to implement KNN in Python. For this, we are going to use scikit-learn (Sklearn), which is a standard Python library for machine learning.
So without wasting any time, let's dig into the code.
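from sklearn.datasets import load_iris
# train_test_split lives in sklearn.model_selection; the old
# sklearn.cross_validation module was removed in scikit-learn 0.20
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
from sklearn import metrics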
First of all, we import the modules we need. We start with load_iris from the datasets module. Sklearn provides many built-in datasets, which makes it easy for a new learner to get started with machine learning. The second import is train_test_split.
We use this to split our data into training and test sets. The advantage is that we don't have to code this step by hand, and it can also shuffle the data, so the order of the rows does not bias the split before the data is fed into the machine learning algorithm.
The third import is KNeighborsClassifier, the KNN classifier we are going to use. Then we import Matplotlib for plotting the data and, finally, the metrics module for calculating the accuracy of the model.
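To make the shuffle-and-split step less of a black box, here is a minimal hand-written sketch of the idea. The function name manual_train_test_split and the stand-in rows are assumptions made for illustration. This is not how Sklearn implements train_test_split, and it will not select the same rows as random_state=5 does there.

```python
import math
import random


def manual_train_test_split(rows, test_size=0.25, seed=5):
    """Shuffle the row indices, then use the first n_test of them as the test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)        # seeded, so the split is repeatable
    n_test = math.ceil(len(rows) * test_size)   # round the test share up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test


rows = list(range(150))                         # stand-in for 150 samples
train, test = manual_train_test_split(rows)
print(len(train), len(test))                    # 112 38
```

Every row ends up in exactly one of the two sets, which is the property that matters for an honest evaluation.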
data = load_iris() 
Then we load the iris dataset. The iris dataset contains measurements of different types of iris flowers. We feed these into the ML algorithm so that, when we later give it the attributes of a flower, the model can tell us which type of flower it is. There are three classes: 'setosa', 'versicolor', and 'virginica'. The following code gets the target names:
data.target_names

Output:

array(['setosa', 'versicolor', 'virginica'], dtype='
Now we separate the features and the target. The features are in the data field, and the class labels are in the target field.
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)
We then split the data. Now the question is how to know which number of neighbors k is good enough. There are two approaches to selecting k for our model. The first is to take the square root of the number of rows in the training data. Here there are approximately 110 rows in the training data, so taking the square root of 110 gives approximately 10. The second is trial and error, for which the following method is used.
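The square-root rule of thumb can be checked with a few lines of arithmetic. This sketch assumes the 150-sample iris dataset and the test_size of 0.25 used above, and it assumes the test share is rounded up, which gives 112 training rows. The variable names are chosen here for illustration.

```python
import math

n_samples = 150                              # rows in the iris dataset
test_size = 0.25                             # same value as in the split above
n_test = math.ceil(n_samples * test_size)    # assumed: test share is rounded up
n_train = n_samples - n_test

k_guess = math.isqrt(n_train)                # integer part of the square root
print(n_train, k_guess)                      # 112 10
```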
k_neighbors = range(1, 40)
current_score = {}
all_scores = []

for k in k_neighbors:
    # train a model with k neighbors and score it on the test set
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    pred = knn.predict(X_test)
    sc = metrics.accuracy_score(y_test, pred)
    current_score[k] = sc
    all_scores.append(sc)

# accuracy as a function of k
plt.plot(k_neighbors, all_scores)
plt.show()
We let k range from 1 to 39, train a model for each value, compute its accuracy on the test set, and then plot the results. The plot looks as follows:
Here we can see that the first highest peak is near ten, which is similar to our square root answer, so we will use k = 10.
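Reading the peak off the plot can also be done in code. The helper below is a small sketch, with first_best_k being a name made up here, that returns the smallest k that reaches the best accuracy. The example scores are invented for illustration and are not the values from the plot.

```python
def first_best_k(scores):
    """Return the smallest k whose accuracy equals the best accuracy seen."""
    best = max(scores.values())
    return min(k for k, score in scores.items() if score == best)


# Hypothetical accuracies for illustration only, not the values from the plot.
example_scores = {1: 0.92, 5: 0.95, 10: 0.97, 15: 0.97, 20: 0.95}
print(first_best_k(example_scores))   # 10
```

The current_score dictionary built in the loop above has the same shape (k mapped to accuracy), so first_best_k(current_score) could be applied to it directly.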
classes = data.target_names.tolist()
knn = KNeighborsClassifier(n_neighbors=10)
knn.fit(X_train, y_train)
pred = knn.predict(X_test)
sc = metrics.accuracy_score(y_test, pred)
print(f"Accuracy of the KNN model is: {sc}")

# classify one new flower from its four measurements
new_data = [[5.4, 3.5, 1.6, 0.35]]
pred = knn.predict(new_data)
print(f"Predicted class is: {classes[pred[0]]}")

The output is as follows:
Accuracy of the KNN model is: 0.9736842105263158
Predicted class is: setosa
The accuracy is about 97%, which is pretty good.
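The accuracy score is simply the fraction of test samples that were predicted correctly. A score of 0.9736842105263158 equals 37/38, that is, one wrong prediction among 38 test samples. The sketch below recomputes this by hand. The accuracy function and the label lists are made up for illustration and are not the actual test labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration: 38 test samples, one of them misclassified.
y_true = [0] * 13 + [1] * 13 + [2] * 12
y_pred = list(y_true)
y_pred[20] = 2          # the true label here is 1, so this prediction is wrong
print(accuracy(y_true, y_pred))
```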
Decision Tree
Introduction:
Decision tree learning is a method for approximating discrete-valued target functions, in which a decision tree represents the learned function. Learned trees can also be represented as a set of if-then rules to improve human readability.
These learning methods are among the most popular inductive inference algorithms and have been successfully applied to a broad range of tasks, from learning to diagnose medical cases to learning to assess the credit risk of loan applicants.
Decision Tree Representation:
Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values for this attribute.
Figure 1
An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, and then moving down the tree branch corresponding to the value of the attribute in the given example. This process is then repeated for the subtree rooted at the new node.
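The root-to-leaf procedure described above can be sketched in a few lines. The tree below is a made-up example, not the tree from Figure 1, and its attribute names, values, and the classify helper are assumptions chosen for illustration. Inner nodes test one attribute and have one branch per value, and leaves hold the class label.

```python
# A hypothetical tree: inner nodes are dicts with an attribute to test and one
# branch per attribute value; leaves are plain class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rain": {
            "attribute": "wind",
            "branches": {"strong": "no", "weak": "yes"},
        },
    },
}


def classify(node, instance):
    """Walk from the root to a leaf, following the branch that matches the instance."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]   # test the attribute at this node
        node = node["branches"][value]        # move down the matching branch
    return node


example = {"outlook": "sunny", "humidity": "normal", "wind": "weak"}
print(classify(tree, example))   # yes
```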
Why the Decision Tree is Called Inductive Learning:
In a decision tree, we make a series of Boolean decisions and follow the corresponding branch. For example:
 Did we leave at 10 AM?
 Did a car stall on the road?
 Is there an accident on the road?
By answering each of these yes/no questions, we can then conclude how long our commute might take.
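The same chain of yes/no questions can be written as plain if-then rules. The sketch below is only an illustration. The function name, the order in which the questions are asked, and the "short", "medium", and "long" outcomes are all assumptions, since a real tree would learn them from recorded commutes.

```python
def commute_estimate(left_at_10am, car_stalled, accident):
    """Answer one yes/no question at a time, like following a path down a tree."""
    if not left_at_10am:
        return "long"      # assumed: other departure times mean heavier traffic
    if accident:
        return "long"      # assumed: an accident blocks the road
    if car_stalled:
        return "medium"    # assumed: a stalled car slows traffic but does not block it
    return "short"


print(commute_estimate(left_at_10am=True, car_stalled=False, accident=False))  # short
```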
Appropriate Problems for Decision Tree Algorithm:
A decision tree can be applied to a number of problems, depending on the type of data we have. Decision trees are best suited to problems with the characteristics mentioned below:
 Instances are represented by attribute-value pairs.
 The target function has discrete output values.
 A disjunctive description is required.
 Training data may contain errors or missing values.
How Does a Tree Decide Where To Split:
The following are the methods we can use to decide where to split the data and which attribute becomes the root node of the tree. The split is based on the impurity of the data set, that is, the homogeneity or heterogeneity of a given feature set. A set is said to be pure if it contains only one class, and impure if it contains multiple classes.
Methods are the following:
 Gini Index: It is a measure of impurity used in building a decision tree. For a set D containing P examples of one class and N examples of the other, the Gini index is:
gini(D) = 1 - ( P / (P+N) )^{2} - ( N / (P+N) )^{2}
 Information Gain: The information gain is the decrease in entropy after a dataset is split based on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain.
 Reduction in Variance: It is an algorithm used for continuous target variables (regression problems). The split with the lower variance is selected as the criterion to split the population.
 Chi-Square: It is an algorithm to find out the statistical significance of the differences between sub-nodes and the parent node.
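The first three measures in the list can be sketched in plain Python. This is a minimal illustration with function names chosen here, not the code a library such as Sklearn uses internally, and the statistical-significance test from the last item is left out. Gini is computed as 1 minus the sum of squared class proportions, and entropy is measured in bits.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_reduction(parent, children):
    """Variance of the parent minus the size-weighted variance of the subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * variance(child) for child in children)
    return variance(parent) - weighted


# Toy data for illustration: a perfectly mixed set that one split separates cleanly.
labels = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
print(gini(labels))                      # 0.5
print(information_gain(labels, split))   # 1.0

targets = [1, 2, 3, 4]
print(variance_reduction(targets, [[1, 2], [3, 4]]))   # 1.0
```

A pure set such as ["yes", "yes"] has a Gini index of 0, while the evenly mixed two-class set above reaches the two-class maximum of 0.5.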