How is the derived model presented in data mining?


Classification is the process of finding a model that describes and distinguishes data classes or concepts. The model is derived from the analysis of a set of training data (i.e., data objects for which the class labels are known). The model can then predict the class label of objects for which the class label is unknown.

The derived model can be represented in several forms, including classification rules (i.e., IF-THEN rules), decision trees, mathematical formulae, or neural networks. A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute value, each branch represents an outcome of the test, and the tree leaves represent classes or class distributions.

Decision trees can easily be converted to classification rules. A neural network, when used for classification, is typically a collection of neuron-like processing units with weighted connections between the units. There are many other methods for constructing classification models, including naïve Bayesian classification, support vector machines, and k-nearest-neighbor classification.
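The relationship between a decision tree and IF-THEN rules can be sketched in a few lines of Python. This is a minimal illustration, not a tree-learning algorithm: the tree is written by hand as nested dictionaries, and the attribute names, attribute values, and class labels are all made up for the example. Each root-to-leaf path becomes one rule.

```python
# Hypothetical hand-built tree: internal nodes are dicts, leaves are class-label strings.
tree = {
    "attribute": "price",
    "branches": {
        "low": "good response",
        "medium": {
            "attribute": "brand",
            "branches": {
                "known": "good response",
                "unknown": "mild response",
            },
        },
        "high": "no response",
    },
}


def tree_to_rules(node, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path."""
    if isinstance(node, str):  # leaf: emit the conditions collected so far
        antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        return [f"IF {antecedent} THEN class = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = (node["attribute"], value)
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


def classify(node, item):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while not isinstance(node, str):
        node = node["branches"][item[node["attribute"]]]
    return node


rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

Calling `classify(tree, {"price": "medium", "brand": "unknown"})` walks the same path that the third rule describes, which is why the two representations are interchangeable for prediction.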

Whereas classification predicts categorical (discrete, unordered) labels, regression models continuous-valued functions. That is, regression is used to predict missing or unavailable numerical data values rather than (discrete) class labels.

The term prediction refers to both numeric prediction and class label prediction. Regression analysis is a statistical methodology that is most often used for numeric prediction, although other methods exist as well. Regression also encompasses the identification of distribution trends based on the available data.
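As a small illustration of numeric prediction, the sketch below fits a straight line by ordinary least squares and uses it to estimate a value that was not observed. Simple linear regression is only one of many regression techniques and is chosen here as an assumption; the function name `fit_line` and the data points are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations that happen to lie on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)

# Predict the continuous value for an input that was not in the training data.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

Unlike a classifier, the model returns a number on a continuous scale rather than one of a fixed set of labels.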

Classification and regression may need to be preceded by relevance analysis, which attempts to identify attributes that are significantly relevant to the classification and regression process. Such attributes will be selected for the classification and regression process. Other attributes, which are irrelevant, can then be excluded from consideration.
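One way to make relevance analysis concrete is to score each attribute by information gain, that is, by how much knowing the attribute reduces the entropy of the class labels. The text above does not prescribe a particular measure, so information gain, the toy records, the attribute names, and the cutoff of 0.1 are all assumptions made for this sketch.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its expected entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Made-up records: "price" separates the classes perfectly, "color" not at all.
rows = [
    {"price": "low", "color": "red", "response": "good"},
    {"price": "low", "color": "blue", "response": "good"},
    {"price": "high", "color": "red", "response": "none"},
    {"price": "high", "color": "blue", "response": "none"},
]

gains = {a: information_gain(rows, a, "response") for a in ("price", "color")}

# Keep only attributes whose gain exceeds an arbitrary illustrative cutoff.
relevant = [a for a, g in gains.items() if g > 0.1]
print(gains, relevant)
```

Attributes that fall below the cutoff would be dropped before the classification or regression model is built.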

Suppose that, as a sales manager of AllElectronics, you want to classify a large set of items in the store based on three kinds of responses to a sales campaign: good response, mild response, and no response.

You want to derive a model for each of these three classes based on the descriptive features of the items, such as price, brand, place made, type, and category. The resulting classification should maximally distinguish each class from the others, presenting an organized picture of the data set.

A decision tree derived for this task may, for instance, identify price as the single factor that best distinguishes the three classes. The tree may reveal that, in addition to price, other features that help to further distinguish objects of each class from one another include brand and place made. Such a decision tree may help you understand the impact of the given sales campaign and design a more effective campaign in the future.

Updated on: 17-Feb-2022
