A brief introduction to the main analysis methods of data mining (DM)

classification

First, select a training set of already classified records from the data. Apply a data mining classification technique to this training set to build a classification model, then use the model to classify the unclassified data.

Example:

A. Credit card applicants are classified as low, medium and high risk

B. Assign customers to predefined customer segments

Note: the number of classes is fixed, and the classes are predefined.
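The train-then-classify workflow can be sketched in a few lines. This is only an illustration, assuming a nearest-centroid model (one of many possible classification techniques; the text does not name one) and a made-up training set of credit card applicants described by income and number of late payments. The names `training_set`, `train`, and `classify` are hypothetical.

```python
from collections import defaultdict
from math import dist

# Hypothetical training set, already classified:
# (income in thousands, past late payments) -> predefined risk class
training_set = [
    ((90, 0), "low"), ((80, 1), "low"),
    ((50, 2), "medium"), ((45, 3), "medium"),
    ((20, 6), "high"), ((15, 7), "high"),
]

def train(examples):
    """Build the model: one centroid (mean feature vector) per class."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(model, features):
    """Assign the class whose centroid is nearest to the new record."""
    return min(model, key=lambda label: dist(model[label], features))

model = train(training_set)
print(classify(model, (48, 2)))  # an unclassified applicant
```

The set of possible answers is exactly the set of labels seen in the training data, which is what "fixed and predefined" means in practice.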

estimation

Estimation is similar to classification. The difference is that classification deals with discrete outcomes, while estimation deals with continuous ones. In classification the number of categories is fixed in advance; in estimation the set of possible output values is not.

Example:

A. Estimate the number of children in a family according to the purchase pattern

B. Estimate the income of a family according to the purchase pattern

C. Estimate the value of real estate

Generally speaking, estimation can serve as a preliminary step to classification. Given some input data, estimation produces the value of an unknown continuous variable, and the records are then classified according to preset thresholds. For example, in the household loan business, a bank uses estimation to give each customer a score (Score ~1) and then assigns loan levels according to thresholds on that score.
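The estimate-then-threshold idea can be illustrated as follows. This is a minimal sketch, assuming a one-variable least-squares line as the estimator, a score scale from 0 to 1, and cutoffs of 0.7 and 0.4; the data, the function names, and the loan levels "A", "B", "C" are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate(model, x):
    """Estimation step: produce a continuous score for one customer."""
    slope, intercept = model
    return slope * x + intercept

def loan_level(score, thresholds=((0.7, "A"), (0.4, "B"))):
    """Classification step: map the continuous score to a discrete level."""
    for cutoff, level in thresholds:
        if score >= cutoff:
            return level
    return "C"

# Hypothetical history: household income (thousands) and known scores
incomes = [20, 40, 60, 80]
scores = [0.2, 0.4, 0.6, 0.8]

model = fit_line(incomes, scores)
new_score = estimate(model, 50)
print(round(new_score, 2), loan_level(new_score))
```

The estimator outputs any value on a continuous scale, and only the thresholds turn it into a fixed number of classes.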

prediction

Prediction usually works through classification or estimation: a model is built by classification or estimation and is then used to predict unknown variables. In this sense, prediction does not need to be treated as a separate category. Its purpose is to predict the future values of unknown variables. Such predictions take time to verify; the accuracy of a prediction is known only after a certain period has passed.

affinity grouping or association rules

Affinity grouping determines which things tend to occur together.

Example:

A. Supermarket customers who buy A often buy B at the same time, that is, A => B (association rules)

B. Customers who have bought A later buy B (sequence analysis)
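A rule such as A => B is usually judged by how often it holds in the transaction data. The sketch below assumes the two common measures, support and confidence (the text does not name them), and uses a made-up list of shopping baskets; `baskets`, `count`, `support`, and `confidence` are hypothetical names.

```python
# Hypothetical supermarket baskets, one set of items per purchase
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk"},
    {"bread", "milk", "butter"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(set(itemset) <= t for t in transactions)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing A, the fraction that also contain B."""
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)

# Rule: bread => milk
print(support({"bread", "milk"}, baskets))
print(confidence({"bread"}, {"milk"}, baskets))
```

Sequence analysis would additionally need the order or time of purchases, which plain sets of items do not record.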

clustering

Clustering groups records so that similar records are placed in the same cluster. Unlike classification, clustering does not depend on predefined classes and does not need a training set.

Example:

A. A cluster of certain symptoms may indicate a specific disease

B. Clusters of customers who choose different types of VCDs may suggest that their members belong to different subcultures

Clustering is often the first step of data mining. Take the question of which kind of promotion customers respond to best. For questions like this, it may be better to cluster the whole customer base first, assign each customer to a cluster, and then answer the question separately for each cluster.
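Grouping customers without predefined classes can be sketched with k-means, one common clustering technique (the text does not prescribe a particular algorithm). The customer features, the starting centroids, and the function name `k_means` are assumptions made for illustration.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centroid  # keep an empty cluster's centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (store visits per month, average basket size)
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

# Assumption: start from two of the customers as initial centroids
centroids, clusters = k_means(customers, [customers[0], customers[3]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied anywhere: the groups emerge from the distances between records, and a question such as promotion response can then be studied within each group.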

description and visualization

Description and visualization are ways of presenting the results of data mining.
