Data Mining Tutorials and Articles

Found 413 Articles for Data Mining

What is Outlier Detection?

Ginni

Updated on 10-Feb-2022 11:56:31

835 Views

An outlier is a data object that deviates significantly from the rest of the objects, as if it were generated by a different mechanism. For ease of presentation, data objects that are not outliers can be referred to as “normal” or expected data, and outliers as “abnormal” data. Outliers are data elements that cannot be grouped into a given class or cluster. They are data objects whose behavior differs from the usual behavior of the other data objects. Analyzing this kind of data can be important for mining knowledge. Outliers are fascinating because they are ... Read More
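As a rough illustration of "deviates significantly from the rest", the sketch below flags values using a modified z-score based on the median and the median absolute deviation (MAD). The excerpt does not name a detection method, so this technique, the function name `mad_outliers`, the constant 0.6745, and the threshold 3.5 (a common convention) are all assumptions made for the example.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    Assumption: a median/MAD rule is used here purely for illustration;
    the excerpt above does not prescribe any particular method.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the values are identical; the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0]
print(mad_outliers(readings))  # [25.0]
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value barely moves them, so the "normal" data still defines the baseline.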

What are the approaches of Unsupervised Discretization?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:54:18

653 Views

An attribute is discrete if it has a relatively small (finite) number of possible values, while a continuous attribute is treated as having a very large (infinite) number of possible values. In other words, a discrete data attribute can be viewed as a function whose range is a finite set, while a continuous data attribute is a function whose range is an infinite totally ordered set, generally an interval. Discretization aims to reduce the number of possible values a continuous attribute takes by partitioning them into several intervals. There are two approaches to the problem of discretization. One is to quantize every attribute ... Read More
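One simple way to partition a continuous attribute into intervals without looking at class labels is equal-width binning. The sketch below is an assumed illustration, not necessarily the approach the full article describes; the function name `equal_width_bins` and the sample ages are made up for the example.

```python
def equal_width_bins(values, k):
    """Map each value to a bin index 0..k-1 using k equal-width intervals.

    Assumption: equal-width binning is chosen only as a simple example of
    unsupervised discretization.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so one bin is enough.
        return [0] * len(values)
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

ages = [18, 22, 25, 31, 40, 47, 58, 66]
print(equal_width_bins(ages, 3))  # [0, 0, 0, 0, 1, 1, 2, 2]
```

Here the range 18 to 66 is split into three intervals of width 16, so the continuous attribute is reduced to three possible values.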

What are Generalizing Exemplars?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:52:27

84 Views

Generalized exemplars are rectangular regions of instance space, known as hyperrectangles because they are high-dimensional. When classifying new instances it is necessary to modify the distance function so that the distance to a hyperrectangle can be computed. When a new exemplar is classified correctly, it is generalized by simply merging it with the nearest exemplar of the same class. The nearest exemplar can be either a single instance or a hyperrectangle. If it is a single instance, a new hyperrectangle is created that covers the old and the new instance. If it is a hyperrectangle, the hyperrectangle is enlarged to encompass the new instance. Finally, if the prediction is incorrect ... Read More
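The two operations mentioned above, measuring the distance from an instance to a hyperrectangle and enlarging a hyperrectangle to cover a new instance, can be sketched as follows. This assumes axis-aligned hyperrectangles stored as lower and upper corner lists and a Euclidean distance; the function names are hypothetical.

```python
import math

def distance_to_rect(point, lower, upper):
    """Euclidean distance from `point` to an axis-aligned hyperrectangle.

    The distance is 0 when the point lies inside the rectangle.
    """
    gaps = [max(lo - p, 0.0, p - hi) for p, lo, hi in zip(point, lower, upper)]
    return math.sqrt(sum(g * g for g in gaps))

def merge(lower, upper, point):
    """Return the smallest hyperrectangle covering the old one and `point`."""
    new_lower = [min(lo, p) for lo, p in zip(lower, point)]
    new_upper = [max(hi, p) for hi, p in zip(upper, point)]
    return new_lower, new_upper

print(distance_to_rect([4, 6], [0, 0], [1, 2]))  # 5.0
print(merge([0, 0], [1, 2], [4, 6]))             # ([0, 0], [4, 6])
```

A single instance can be treated as a degenerate hyperrectangle whose lower and upper corners coincide, so the same `merge` covers both cases described in the excerpt.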

What are Radial Basis Function Networks?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:50:08

7K+ Views

A popular type of feed-forward network is the radial basis function (RBF) network. It has two layers, not counting the input layer, and differs from a multilayer perceptron in the way that the hidden units perform computations. Each hidden unit essentially represents a particular point in input space, and its output, or activation, for a given instance depends on the distance between its point and the instance, which is just another point. The closer these two points, the stronger the activation. This is achieved by using a nonlinear transformation function to convert the distance into a similarity measure. A bell-shaped Gaussian ... Read More
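The way a hidden unit turns distance into similarity can be sketched with a Gaussian function. This is a minimal illustration of a single hidden unit only; the function name `rbf_activation` and the `width` parameter are assumptions, and no training of centers or output weights is shown.

```python
import math

def rbf_activation(instance, center, width=1.0):
    """Gaussian activation of one hidden unit.

    Returns 1.0 when the instance coincides with the unit's center and
    decays toward 0 as the Euclidean distance grows.
    """
    sq_dist = sum((x - c) ** 2 for x, c in zip(instance, center))
    return math.exp(-sq_dist / (2 * width ** 2))

center = [1.0, 2.0]
print(rbf_activation([1.0, 2.0], center))  # 1.0
print(rbf_activation([1.5, 2.0], center) > rbf_activation([4.0, 2.0], center))  # True
```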

What are the estimation methods in data mining?

Data Mining Database Data Structure

Ginni

Updated on 15-Feb-2022 09:55:28

488 Views

Tenfold cross-validation is the standard way of measuring the error rate of a learning scheme on a particular dataset; for reliable results, tenfold cross-validation is repeated 10 times. Two other methods are leave-one-out cross-validation and the bootstrap. Leave-One-Out Cross-Validation: Leave-one-out cross-validation is simply n-fold cross-validation, where n is the number of instances in the dataset. Each instance in turn is left out, and the learning scheme is trained on all the remaining instances. It is judged by its correctness on the left-out instance: one or zero for success or failure, respectively. The results of all n judgments, one for each member of the dataset, are averaged, ... Read More
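The leave-one-out procedure described above can be sketched in a few lines. The learner used here, a one-dimensional nearest-neighbor rule, is only a stand-in chosen to keep the example self-contained; the function names and toy data are assumptions.

```python
def leave_one_out_accuracy(instances, labels, train_and_predict):
    """Average the 0/1 correctness over n rounds, holding out one instance each."""
    n = len(instances)
    correct = 0
    for i in range(n):
        # Train on everything except instance i.
        train_x = instances[:i] + instances[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        prediction = train_and_predict(train_x, train_y, instances[i])
        correct += 1 if prediction == labels[i] else 0
    return correct / n

def nearest_neighbor(train_x, train_y, query):
    """Stand-in learner: predict the label of the closest training value."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - query))
    return train_y[best]

xs = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1]
ys = ["a", "a", "a", "b", "b", "b"]
print(leave_one_out_accuracy(xs, ys, nearest_neighbor))  # 1.0
```

The error rate is one minus the returned accuracy. Because the learner is retrained n times, the cost grows with the size of the dataset.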

How to construct a decision tree?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:44:19

2K+ Views

A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and leaf nodes represent classes or class distributions. The topmost node in a tree is the root node. The problem of constructing a decision tree can be expressed recursively. First, select an attribute to place at the root node, and make one branch for each possible value. This splits the example set into subsets, one for each value of the attribute. The procedure can then be repeated recursively for every branch, using only those instances ... Read More
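The recursive construction can be sketched as below. The excerpt does not say how the attribute is selected, so this sketch simply takes attributes in the order given, which is an assumption; a practical learner would use a criterion such as information gain. Branches are created only for values that actually occur in the current subset. The function name and toy data are hypothetical.

```python
from collections import Counter

def build_tree(rows, attributes, target):
    """Recursively build a nested-dict decision tree.

    Leaves are class labels; internal nodes are {attribute: {value: subtree}}.
    """
    classes = [row[target] for row in rows]
    # Stop when the subset is pure or no attributes remain to test.
    if len(set(classes)) == 1 or not attributes:
        return Counter(classes).most_common(1)[0][0]
    # Assumption: pick the next attribute in list order (no selection criterion).
    attr, rest = attributes[0], attributes[1:]
    tree = {attr: {}}
    for value in sorted({row[attr] for row in rows}):
        subset = [row for row in rows if row[attr] == value]
        tree[attr][value] = build_tree(subset, rest, target)
    return tree

rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
print(build_tree(rows, ["outlook", "windy"], "play"))
```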

What is Instance-based representation?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:35:00

805 Views

The simplest form of learning is plain memorization, or rote learning. Once a set of training instances has been memorized, on encountering a new instance the memory is searched for the training instance that most strongly resembles the new one. The only problem is how to interpret “resembles”. First, note that this is a completely different way of representing the “knowledge” extracted from a set of instances: it stores the instances themselves and operates by relating new instances whose class is unknown to existing ones whose class is known. Rather than trying to create rules, it works directly from the instances themselves. ... Read More
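A rote learner of this kind can be sketched as a class that only stores instances and defers all work to classification time. How "resembles" is interpreted is left to a pluggable distance function; the class name `RoteLearner`, the Euclidean distance, and the sample data are assumptions for the example.

```python
import math

class RoteLearner:
    """Memorizes training instances and classifies by closest resemblance."""

    def __init__(self, distance):
        self.distance = distance  # defines what "resembles" means
        self.memory = []          # list of (instance, label) pairs

    def learn(self, instance, label):
        # Learning is plain memorization: no rules are derived.
        self.memory.append((instance, label))

    def classify(self, instance):
        # Search memory for the stored instance closest to the new one.
        _, label = min(self.memory, key=lambda pair: self.distance(pair[0], instance))
        return label

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

learner = RoteLearner(euclidean)
learner.learn((1.0, 1.0), "small")
learner.learn((9.0, 9.0), "large")
print(learner.classify((2.0, 1.5)))  # small
```

Swapping in a different distance function changes the notion of resemblance without touching the learner itself.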

What is the performance of discriminant analysis?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:32:28

235 Views

The discriminant analysis approach relies on two main assumptions to arrive at classification scores. First, it assumes that the predictor measurements in each class come from a multivariate normal distribution. When this assumption is reasonably met, discriminant analysis is a more powerful tool than other classification methods, such as logistic regression. It has been shown that discriminant analysis is 30% more efficient than logistic regression if the data are multivariate normal, in the sense that it needs 30% fewer records to arrive at the same results. It has also been shown that this method is relatively robust to departures from normality, in the sense that predictors can be non-normal ... Read More

What are the benefits of k-NN Algorithms?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:29:39

213 Views

A k-nearest-neighbors algorithm is a classification approach that makes no assumptions about the form of the relationship between the class membership (Y) and the predictors X1, X2, …, Xn. This is a nonparametric approach because it does not involve estimating parameters in an assumed functional form, such as the linear form assumed in linear regression. Instead, this method draws information from similarities among the predictor values of the records in the dataset. The advantage of k-NN methods is their simplicity and the lack of parametric assumptions. In the presence of a large training set, these approaches perform especially well, when each ... Read More

What is the K-nearest neighbors algorithm?

Data Mining Database Data Structure

Ginni

Updated on 10-Feb-2022 11:24:41

364 Views

A k-nearest-neighbors algorithm is a classification approach that makes no assumptions about the form of the relationship between the class membership (Y) and the predictors X1, X2, …, Xn. This is a nonparametric approach because it does not involve estimating parameters in an assumed functional form, such as the linear form assumed in linear regression. Instead, this approach draws information from similarities among the predictor values of the records in the dataset. The idea in k-nearest-neighbors methods is to identify k records in the training dataset that are similar to the new record that needs to be classified. It can ... Read More
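The idea of identifying the k most similar training records can be sketched as follows. The excerpt is cut off before it says how the k records are used, so the majority vote here, along with the Euclidean distance, the function name `knn_classify`, and the toy data, are assumptions. `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training records.

    `train` is a list of (predictors, label) pairs.
    """
    # Sort records by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b"),
]
print(knn_classify(train, (2, 2)))      # a
print(knn_classify(train, (8.5, 8.5)))  # b
```

No parameters are estimated: the training records themselves are the model, which is what makes the approach nonparametric.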