What are the types of regression in data mining?


Regression is a supervised machine learning approach used to forecast a continuous-valued attribute. It helps business organizations explore the relationship between a target variable and one or more predictor variables, and it is an essential data exploration tool for financial forecasting and time-series modeling.

There are various types of regression which are as follows −

Linear Regression − Linear regression finds the “best” straight line relating two attributes (or variables) so that one attribute can be used to predict the other. Multiple linear regression is an extension of linear regression in which more than two attributes are involved and the data are fit to a multidimensional surface.

For example, the equation is

Y = a + b*X + e.

Where,

a is the intercept

b is the slope of the regression line

e is the error term

X and Y are the predictor and target variables, respectively. If X is made up of more than one variable, the model is termed a multiple linear regression.

In linear regression, the best-fit line is found using the least-squares method, which minimizes the total sum of the squared deviations from each data point to the regression line. Because the deviations are squared, positive and negative deviations cannot cancel each other out.
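The least-squares fit above can be sketched directly from the closed-form formulas for the slope and intercept. The data here are hypothetical, chosen only to illustrate the calculation:

```python
import numpy as np

# Hypothetical data: X is the predictor, Y the target.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# Least-squares estimates:
#   b = cov(X, Y) / var(X),  a = mean(Y) - b * mean(X)
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()

Y_pred = a + b * X
residuals = Y - Y_pred        # the error term e for each point
sse = np.sum(residuals ** 2)  # the quantity least squares minimizes
print(round(a, 3), round(b, 3))
```

Any other line through these points would give a larger `sse`, which is exactly what the least-squares criterion rules out.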

Polynomial Regression − If the power of the independent variable is greater than 1 in the regression equation, it is a polynomial regression equation.

For example, the equation is

Y = a + b*X^2

In polynomial regression, the best-fit line is not a straight line as in linear regression; instead, it is a curve fitted to the data points.
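A quick sketch of fitting such a curve, using NumPy's `polyfit` on hypothetical data generated from a roughly quadratic trend:

```python
import numpy as np

# Hypothetical data following a roughly quadratic trend.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([1.2, 2.9, 9.1, 19.2, 33.0])

# Degree-2 least-squares fit: Y ≈ c2*X^2 + c1*X + c0,
# i.e. a curve rather than a straight line.
c2, c1, c0 = np.polyfit(X, Y, 2)
curve = np.polyval([c2, c1, c0], X)
print(round(c2, 2), round(c1, 2), round(c0, 2))
```

A degree-1 fit (a straight line) would leave much larger residuals on this data, which is the practical difference between linear and polynomial regression.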

Logistic Regression − When the dependent variable is binary in nature, such as 0 and 1, true or false, or success or failure, logistic regression is used. The predicted value (Y) ranges from 0 to 1, so it is generally used for classification problems. Unlike linear regression, it does not require the independent and dependent variables to have a linear relationship.
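A minimal sketch of logistic regression trained by gradient descent on the log-loss. The data and learning rate are assumptions for illustration; a real workflow would typically use a library implementation:

```python
import numpy as np

# Hypothetical binary data: label is 1 when x is large, 0 otherwise.
x = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5])
y = np.array([0,   0,   0,   0,   1,   1,   1,   1  ])

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

w, b = 0.0, 0.0      # slope and intercept of the linear part
lr = 0.5             # assumed learning rate
for _ in range(5000):            # gradient descent on the log-loss
    p = sigmoid(w * x + b)       # predicted probabilities in (0, 1)
    w -= lr * np.mean((p - y) * x)
    b -= lr * np.mean(p - y)

# The output ranges between 0 and 1; threshold at 0.5 to classify.
preds = (sigmoid(w * x + b) >= 0.5).astype(int)
print(preds)
```

Note the model is linear in `w*x + b` but the sigmoid makes the output a probability, so no linear relationship between `x` and `y` is assumed.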

Ridge Regression − Ridge regression is a technique used to analyze regression data that suffer from multicollinearity. Multicollinearity is the existence of a near-linear correlation between two or more independent variables.
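A sketch of the ridge closed-form solution on deliberately collinear data. The penalty strength `alpha` and the data are assumptions; the key point is that the added diagonal term keeps the normal equations well conditioned even when two predictors are nearly identical:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = np.linspace(0, 1, 20)
x2 = x1 + rng.normal(0, 0.01, 20)     # almost a copy of x1: multicollinearity
X = np.column_stack([np.ones(20), x1, x2])
y = 3 * x1 + rng.normal(0, 0.1, 20)

# Ridge solution: (X^T X + alpha*I)^-1 X^T y
alpha = 1.0                            # assumed shrinkage strength
I = np.eye(X.shape[1])
I[0, 0] = 0                            # conventionally leave the intercept unpenalized
coef = np.linalg.solve(X.T @ X + alpha * I, X.T @ y)
print(coef.round(2))
```

Without the `alpha*I` term, `X.T @ X` is nearly singular here and the two slope estimates can explode in opposite directions; the penalty keeps them small and stable.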

Lasso Regression − LASSO stands for Least Absolute Shrinkage and Selection Operator. Lasso regression is a linear regression method that uses shrinkage: coefficient estimates are shrunk toward a central point, such as the mean, and some are driven exactly to zero. The lasso procedure is best suited to simple, sparse models with fewer parameters than other regressions, and it is also well suited to models that suffer from multicollinearity.
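The shrinkage-and-selection behavior can be sketched with coordinate descent and soft-thresholding, the standard way lasso is solved. The data and penalty `lam` are assumptions; the point is that the coefficient of the irrelevant feature is shrunk exactly to zero:

```python
import numpy as np

def soft_threshold(rho, lam):
    # Shrinks rho toward zero; returns exactly 0 when |rho| <= lam.
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso(X, y, lam, n_iter=200):
    # Coordinate descent for (1/2n)||y - Xw||^2 + lam*||w||_1.
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]   # residual excluding feature j
            rho = X[:, j] @ r / n
            z = X[:, j] @ X[:, j] / n
            w[j] = soft_threshold(rho, lam) / z
    return w

# Hypothetical data: only the first feature actually drives y.
rng = np.random.default_rng(1)
X = rng.normal(0, 1, (50, 2))
y = 2.0 * X[:, 0] + rng.normal(0, 0.1, 50)

w = lasso(X, y, lam=0.3)
print(w.round(2))
```

The irrelevant second coefficient lands at exactly zero (selection), while the first is shrunk somewhat below its true value of 2 (shrinkage) — both effects come from the same soft-threshold step.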

Updated on: 15-Feb-2022
