{ "cells": [ { "cell_type": "markdown", "id": "9eaa5a85", "metadata": {}, "source": [ "

This tutorial is prepared by Dr. Sanasam Ranbir Singh

\n" ] }, { "cell_type": "markdown", "id": "9b0da4a3", "metadata": {}, "source": [ "# Classification using the scikit-learn Python library\n", "\n", "[Scikit-learn (or sklearn)](https://scikit-learn.org/stable) is a free, open-source machine learning library for the Python programming language. It supports various machine learning methods such as [feature selection](https://scikit-learn.org/stable/modules/feature_selection.html), [classification](https://scikit-learn.org/stable/supervised_learning.html), [regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html), and [clustering](https://scikit-learn.org/stable/modules/clustering.html).\n", "\n", "In this lesson, we learn how to build several classification models using the sklearn library: Naive Bayes, K Nearest Neighbors, Decision Tree, and Support Vector Machine.\n", "\n", "To install sklearn, see the [installation guide](https://scikit-learn.org/stable/install.html).
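
The classifiers covered in this lesson all follow the same pattern: create the model, call `fit` on the training data, then call `predict` on new samples. The following is a minimal sketch of that shared pattern, assuming sklearn and numpy are installed. The helper `predict_with_each` and the dictionary keys are hypothetical names introduced only for this illustration; they are not part of sklearn.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Toy dataset used throughout this lesson: six 2-D samples, three per class
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([1, 1, 1, 2, 2, 2])


def predict_with_each(models, X_train, y_train, sample):
    # Hypothetical helper (not an sklearn function): fit every model on the
    # same training data and collect its predicted label for one sample.
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = int(model.predict([sample])[0])
    return results


# One instance of each classifier discussed in this lesson
models = {
    'naive_bayes': GaussianNB(),
    'knn': KNeighborsClassifier(n_neighbors=3),
    'decision_tree': DecisionTreeClassifier(random_state=0),
    'svm': SVC(),
}

predictions = predict_with_each(models, X, Y, [-0.8, -1])
print(predictions)
```

Because every model exposes the same two methods, swapping one classifier for another only requires changing the line that constructs the model.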
\n" ] }, { "cell_type": "markdown", "id": "32e6667d", "metadata": {}, "source": [ "# Naive Bayes Classifier" ] }, { "cell_type": "code", "execution_count": 7, "id": "18464a89", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1]\n" ] } ], "source": [ "from sklearn.naive_bayes import GaussianNB # Naive Bayes classifier with Gaussian feature likelihoods\n", "import numpy as np # numpy provides the array type used for the dataset\n", "\n", "# Define the dataset\n", "X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]]) # sample vectors\n", "Y = np.array([1, 1, 1, 2, 2, 2]) # class labels\n", "\n", "# Define the classification model\n", "model = GaussianNB()\n", "\n", "# Fit the model to the dataset\n", "# For large datasets, see also https://scikit-learn.org/0.15/modules/scaling_strategies.html\n", "model.fit(X, Y)\n", "\n", "# Predict the class of a new sample [-0.8, -1]\n", "print(model.predict([[-0.8, -1]]))\n" ] }, { "cell_type": "markdown", "id": "1ed50608", "metadata": {}, "source": [ "# K Nearest Neighbors" ] }, { "cell_type": "code", "execution_count": 4, "id": "ef341579", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1]\n" ] } ], "source": [ "from sklearn.neighbors import KNeighborsClassifier\n", "import numpy as np # numpy provides the array type used for the dataset\n", "\n", "# Define the dataset\n", "X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])\n", "Y = np.array([1, 1, 1, 2, 2, 2])\n", "\n", "# Define the classification model: vote among the 3 nearest training samples\n", "model = KNeighborsClassifier(n_neighbors=3)\n", "\n", "# Fit the model to the dataset\n", "model.fit(X, Y)\n", "\n", "# Predict the class of a new sample [-0.8, -1]\n", "print(model.predict([[-0.8, -1]]))" ] }, { "cell_type": "markdown", "id": "5f61dd37", "metadata": {}, "source": [ "# Decision Tree" ] }, { "cell_type": "code", "execution_count": 5, "id": "b0850028", 
"metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1]\n" ] } ], "source": [ "from sklearn.tree import DecisionTreeClassifier\n", "import numpy as np # numpy provides the array type used for the dataset\n", "\n", "# Define the dataset\n", "X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])\n", "Y = np.array([1, 1, 1, 2, 2, 2])\n", "\n", "# Define the classification model\n", "model = DecisionTreeClassifier()\n", "\n", "# Fit the model to the dataset\n", "model.fit(X, Y)\n", "\n", "# Predict the class of a new sample [-0.8, -1]\n", "print(model.predict([[-0.8, -1]]))" ] }, { "cell_type": "markdown", "id": "ca7db816", "metadata": {}, "source": [ "# Support Vector Machine" ] }, { "cell_type": "code", "execution_count": 6, "id": "3f4aaf1d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1]\n" ] } ], "source": [ "from sklearn import svm\n", "import numpy as np # numpy provides the array type used for the dataset\n", "\n", "# Define the dataset\n", "X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])\n", "Y = np.array([1, 1, 1, 2, 2, 2])\n", "\n", "# Define the classification model (support vector classifier with default settings)\n", "model = svm.SVC()\n", "\n", "# Fit the model to the dataset\n", "model.fit(X, Y)\n", "\n", "# Predict the class of a new sample [-0.8, -1]\n", "print(model.predict([[-0.8, -1]]))" ] }, { "cell_type": "code", "execution_count": null, "id": "0f486f9f", "metadata": {}, "outputs": [], "source": [ "from sklearn import tree\n", "\n", "# Equivalent way to create a decision tree through the tree module\n", "model = tree.DecisionTreeClassifier()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" } }, "nbformat": 4, "nbformat_minor": 5 }