The use of data mining

Analysis methods:

·Classification

·Estimation

·Prediction

· Affinity grouping or association rules

· Clustering

· Complex data type mining (text, Web, graphics, images, video, audio, etc.)

Method introduction:

·Classification

First, select from the data a training set whose records have already been assigned to classes. Apply a data mining classification technique to the training set to build a classification model, then use that model to classify unclassified data.

Example:

a. Credit card applicants, classified as low, medium and high risk

b. Fault diagnosis: China Baosteel Group and Shanghai Tianlu Information Technology Co., Ltd. used data mining technology to monitor and analyze quality across the entire steel production process, build a fault map, and analyze the causes of product defects in real time, effectively improving the product quality rate.

Note: The number of classes is fixed and defined in advance.
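The train-then-classify flow described above can be sketched in a few lines. This is only an illustration, not the method used in either example: the features (monthly income and count of past late payments), the sample values, and the nearest-centroid technique are all assumptions chosen to keep the sketch short.

```python
from collections import defaultdict
from math import dist

# Hypothetical training set: (monthly_income_k, past_late_payments) -> risk class.
# The three classes are predefined, as the note above requires.
TRAINING_SET = [
    ((12.0, 0), "low"),
    ((10.0, 1), "low"),
    ((6.0, 2), "medium"),
    ((5.0, 3), "medium"),
    ((2.5, 5), "high"),
    ((2.0, 6), "high"),
]


def fit_centroids(training_set):
    """Build the model: average the feature vectors of each class."""
    grouped = defaultdict(list)
    for features, label in training_set:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Assign the class whose centroid is closest to the new record."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING_SET)
new_applicant = (5.5, 2)  # an unclassified record
risk = classify(centroids, new_applicant)
print(risk)
```

In practice the features would normally be scaled before computing distances, since income and payment counts are on different scales.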

·Estimation

Estimation is similar to classification, except that classification produces a discrete output while estimation produces a continuous value. The set of classes is fixed in advance, whereas an estimated quantity is not restricted to a fixed set of values.

Example:

a. Estimate the number of children in a family based on purchasing patterns

b. Estimate the income of a family based on purchasing patterns

c. Estimate the value of real estate

Generally speaking, estimation can serve as a preliminary step for classification. Given some input data, estimation produces the value of an unknown continuous variable, which is then assigned to a class according to preset thresholds. For example, a bank handling home loans can use estimation to assign each customer a score between 0 and 1, and then assign the customer to a loan class based on thresholds.
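A minimal sketch of the score-then-threshold idea from the bank example follows. The logistic scoring function, the feature names, the weights, the thresholds, and the class labels "A", "B", and "C" are all hypothetical; a real bank would fit the model on historical data and set its own cutoffs.

```python
import math

# Hypothetical model weights (assumed, not fitted from real data).
WEIGHTS = {"income_k": 0.4, "debt_ratio": -3.0}
BIAS = -1.0

# Hypothetical preset thresholds, checked from highest to lowest.
# Any score below the last threshold falls into class "C".
THRESHOLDS = [(0.7, "A"), (0.4, "B")]


def estimate_score(customer):
    """Estimation step: map customer features to a continuous score in (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def loan_class(score):
    """Classification step: assign a class from the preset thresholds."""
    for lower_bound, label in THRESHOLDS:
        if score >= lower_bound:
            return label
    return "C"


customer = {"income_k": 8.0, "debt_ratio": 0.3}
score = estimate_score(customer)
grade = loan_class(score)
print(round(score, 3), grade)
```

Keeping the two steps separate means the thresholds can be changed without re-estimating the scores.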

·Prediction

Prediction usually works through classification or estimation: a model is derived through classification or estimation and then used to predict unknown variables. In this sense, prediction does not strictly need to be treated as a separate category. Its purpose is to predict the future values of unknown variables. Such a prediction can only be verified over time; its accuracy is known only after a certain period has passed.
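The point about delayed verification can be illustrated with a small sketch: predictions are recorded when they are made and scored only once the real outcomes are known. The customer identifiers and the default/no-default outcomes below are made-up illustrative values.

```python
# Hypothetical predictions made today: will each customer default?
predictions = {"c1": False, "c2": True, "c3": False, "c4": True}

# Outcomes that become known only after the waiting period has passed.
observed = {"c1": False, "c2": True, "c3": True, "c4": True}


def accuracy(predictions, observed):
    """Share of predictions that matched the outcomes observed later.

    Predictions whose outcome is not yet known are skipped; returns None
    when nothing can be checked yet.
    """
    checked = [cid for cid in predictions if cid in observed]
    if not checked:
        return None
    hits = sum(predictions[cid] == observed[cid] for cid in checked)
    return hits / len(checked)


result = accuracy(predictions, observed)
print(result)
```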

· Affinity grouping or association rules

Determine which things will happen together.

Example:

a. When supermarket customers buy A, they often also buy B, that is, A => B (association rule)

b. After purchasing A, a customer purchases B some time later (sequence analysis)
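A rule of the form A => B is usually judged by its support (how often A and B appear together) and its confidence (how often B appears when A does). The sketch below computes both for a handful of made-up shopping baskets; the items and baskets are illustrative only.

```python
# Hypothetical supermarket baskets, one set of items per transaction.
BASKETS = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent. Returns 0.0 if the antecedent never occurs."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0
    return support(baskets, set(antecedent) | set(consequent)) / base


# Evaluate the rule bread => milk.
rule_support = support(BASKETS, {"bread", "milk"})
rule_confidence = confidence(BASKETS, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Here bread and milk appear together in 3 of 5 baskets, and 3 of the 4 baskets containing bread also contain milk.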

· Clustering

Clustering groups records so that similar records are placed in the same cluster. The difference between clustering and classification is that clustering does not rely on predefined classes and does not require a training set.

Examples:

a. The clustering of some specific symptoms may indicate a specific disease

b. Clusters of customers who rent different types of VCDs may indicate that the members belong to different subcultures

Clustering is often used as the first step in data mining. Take the question "Which kinds of promotions do customers respond to best?" For this kind of problem, it may be better to first cluster the whole customer base, assign each customer to a cluster, and then answer the question separately for each cluster.
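As an illustration of grouping customers without predefined classes, here is a minimal k-means sketch. k-means is just one possible clustering technique, chosen here as an assumption; the customer features (visits per month, average spend) and the starting centroids are made up.

```python
from math import dist


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, and repeat."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(column) / len(column) for column in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (visits_per_month, average_spend).
customers = [(1, 10), (2, 12), (1.5, 11), (8, 50), (9, 55), (8.5, 52)]

# No labels and no training set: only starting guesses for the centroids.
centroids, clusters = kmeans(customers, initial_centroids=[(1, 10), (8, 50)])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Once customers are grouped this way, a question such as promotion response can be analyzed separately within each group.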

· Description and Visualization

This is the way data mining results are presented. The term generally refers to data visualization tools, which collectively include reporting tools and business intelligence (BI) analysis products. For example, using a tool such as Yonghong Z-Suite to display, analyze, and drill into data makes the results of data mining more vivid and easier to understand.