Introduction to Support Vector Machines (SVM)
Support Vector Machines (SVMs) are a powerful machine learning technique that can be applied to both classification and regression. They are widely used in areas such as bioinformatics, text classification, and image classification. Their main strengths are that they handle high-dimensional datasets well and, through kernel functions, can solve non-linear classification problems. This article introduces the idea behind SVMs and shows how to use them in Python with scikit-learn.
Support Vector Machines (SVM)
Definition
A Support Vector Machine (SVM) is a machine learning algorithm that finds the optimal hyperplane for separating data into classes. The optimal hyperplane is the one that maximizes the margin, that is, the distance between the hyperplane and the nearest data points from each class. These nearest points are called support vectors. By using kernel functions to implicitly map the input data into a higher-dimensional feature space, where a linear hyperplane can separate the classes, SVM can handle both linear and non-linear classification problems.
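To make the terms margin and support vector concrete, here is a minimal sketch on a hypothetical six-point dataset (the points and the large value of C are assumptions chosen for illustration, not part of the original article). For a linear kernel, scikit-learn exposes the hyperplane weights as coef_, and the margin width can be computed as 2 / ||w||.

```python
import numpy as np
from sklearn import svm

# Hypothetical toy data: two linearly separable clusters in 2-D.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations,
# which approximates a hard-margin SVM on separable data.
clf = svm.SVC(kernel='linear', C=1e6)
clf.fit(X, y)

# For a linear kernel the hyperplane is w . x + b = 0.
w = clf.coef_[0]
b = clf.intercept_[0]

# Width of the margin between the two classes.
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:")
print(clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points listed in support_vectors_ determine the hyperplane; moving any other point slightly (without crossing the margin) would leave the fitted model unchanged.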
Syntax
from sklearn import svm
clf = svm.SVC(kernel='linear')
clf.fit(X_train, y_train)
Explanation of Syntax
‘from sklearn import svm’ − The SVM module is imported from the Scikit-Learn library by this line.
‘clf = svm.SVC(kernel='linear')’ − This line initialises an SVM classifier with a linear kernel. Other kernels that can be used include polynomial ('poly'), radial basis function ('rbf'), and sigmoid ('sigmoid').
‘clf.fit(X_train, y_train)’ − On the basis of the training data (X_train) and matching class labels (y_train), this line trains the SVM classifier.
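As a rough illustration of how the kernel argument changes the classifier, the sketch below fits one SVC per kernel on the iris dataset. The choice of dataset and the train_scores dictionary are assumptions made for this example. The scores are computed on the training data, so they only show that each kernel name is accepted and yields a fitted model; they are not an estimate of how well each kernel generalizes.

```python
from sklearn import svm
from sklearn.datasets import load_iris

# Load a small built-in dataset (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# Fit one classifier per kernel and record its accuracy on the training data.
train_scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = svm.SVC(kernel=kernel)
    clf.fit(X, y)
    train_scores[kernel] = clf.score(X, y)

for kernel, score in train_scores.items():
    print(kernel, score)
```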
Algorithm
Step 1: Data Preprocessing − The input data is cleaned by handling missing values and removing superfluous features.
Step 2: Feature Extraction − The input data is turned into a set of features that may be used to train the SVM classifier through the feature extraction process.
Step 3: Training the SVM Classifier − The input data are used to train the SVM classifier to identify the best hyperplane for categorizing the data.
Step 4: Testing the SVM Classifier − A collection of validation data is used to test the SVM classifier and assess its performance.
Step 5: Tuning the SVM Classifier − The SVM Classifier is tuned by adjusting its hyperparameters in order to enhance its performance on the validation data.
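The steps above can be sketched end to end with scikit-learn. This is one possible arrangement, not the only one: feature scaling with StandardScaler stands in for preprocessing, 5-fold cross-validation inside GridSearchCV plays the role of the validation data, and the parameter grid values are arbitrary assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the data and hold out a test set that is not used during tuning.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Steps 1-3: preprocessing (scaling) and the classifier in one pipeline.
# make_pipeline names the SVC step "svc", hence the "svc__" prefixes below.
pipeline = make_pipeline(StandardScaler(), SVC())

# Steps 4-5: evaluate each hyperparameter combination with cross-validation.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Report the chosen hyperparameters and the accuracy on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print("Test accuracy:", test_accuracy)
```

Putting the scaler inside the pipeline means it is refitted on each cross-validation training fold, so no information from the validation folds leaks into preprocessing.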
Approach
Approach 1 − Program to show Linear SVM
Approach 2 − Program to show Non Linear SVM
Approach 1: Program to show Linear SVM
Linear SVM is used when the input data can be separated by a linear hyperplane. The code below shows an example.
Example
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM classifier with a linear kernel
clf = svm.SVC(kernel='linear')
clf.fit(X_train, y_train)

# Predict on the test set and report accuracy
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Output
Accuracy: 1.0
Load the Dataset − We can use the scikit-learn library to load the dataset. The iris dataset is a popular dataset that can be used for classification tasks.
Split the Dataset into Training and Testing Sets − The dataset is split into a training set and a testing set using the train_test_split function from the scikit-learn library.
Create an SVM Classifier with a Linear Kernel − We create an SVM classifier with a linear kernel using the svm.SVC class from the scikit-learn library.
Train the SVM Classifier − The SVM classifier is trained on the training data using the fit method.
Test the SVM Classifier − The SVM classifier is tested on the testing data using the predict method.
Approach 2: Program to show Non Linear SVM
When a linear hyperplane cannot be used to separate the input data, non-linear SVM is used. In this method, the input data are transformed into a higher-dimensional feature space using the kernel trick so that the data can be separated using a linear hyperplane.
Example
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score

# Generate two concentric circles that no straight line can separate
X, y = make_circles(n_samples=100, noise=0.1, factor=0.5, random_state=42)

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM classifier with the RBF kernel
clf = svm.SVC(kernel='rbf')
clf.fit(X_train, y_train)

# Predict on the test set and report accuracy
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Output
Accuracy: 0.9
Load the Dataset − To create a toy dataset with two classes that cannot be separated by a linear hyperplane, we use the make_circles function from the scikit-learn library.
Create Training and Testing Sets from the Dataset − The train_test_split function from the scikit-learn library is used to divide the dataset into a training set and a testing set.
Create an SVM Classifier with a Non-Linear Kernel − Using the svm.SVC class from the scikit-learn library, we build an SVM classifier that has a non-linear kernel. In this example, the RBF kernel is used.
Train the SVM Classifier − The fit method is used to train the SVM classifier on the training data.
Test the SVM Classifier − The predict method is used to test the SVM classifier using the testing data.
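The idea of separating data linearly in a higher-dimensional space can be illustrated with an explicit feature map. The sketch below is an assumption-laden illustration, not what the RBF kernel does internally: it adds the squared distance from the origin as a hand-picked third feature and then fits a plain linear SVM. The dataset parameters and variable names are chosen for this example only.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_circles

# Two concentric circles: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)

# A linear SVM on the raw 2-D features.
linear_2d = svm.SVC(kernel="linear").fit(X, y)
acc_2d = linear_2d.score(X, y)

# Explicit feature map: append the squared radius x1^2 + x2^2 as a third feature.
# In this 3-D space the inner and outer circles sit at different "heights".
X_3d = np.column_stack([X, (X ** 2).sum(axis=1)])

# The same linear SVM, now on the mapped 3-D features.
linear_3d = svm.SVC(kernel="linear").fit(X_3d, y)
acc_3d = linear_3d.score(X_3d, y)

print("Linear SVM, original features:", acc_2d)
print("Linear SVM, with squared radius:", acc_3d)
```

A kernel such as RBF achieves a similar effect without constructing the extra features explicitly, since the SVM only needs inner products between mapped points, which the kernel function supplies directly.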
Conclusion
In conclusion, SVMs are a powerful machine learning technique that can be applied to both linear and non-linear classification problems. They are widely used across many fields. In this article, we introduced SVMs, covered their definition and basic syntax, and described how to implement linear and non-linear SVMs in Python. We also provided complete code for applying both approaches to two datasets, using accuracy as the evaluation metric. By learning how SVMs work and how to implement them in Python, machine learning practitioners can apply them to a variety of real-world problems.