Calculation method of multiple linear regression
Created by Chi Pi Zhimei

Abstract

In real economic problems, one variable is often affected by several others. For example, household consumption expenditure depends not only on household disposable income but also on factors such as household wealth, price levels, and the deposit interest rates of financial institutions. A linear regression model then contains more than one explanatory variable, and such a model is called a multiple linear regression model.

The basic principles and calculation process of multiple linear regression are the same as those of simple (one-variable) linear regression. However, because there are more independent variables, the calculation is laborious by hand, and statistical software is generally used in practice. Here we introduce only some basic issues of multiple linear regression.
The units of the independent variables may differ. In a regression on consumption level, for example, wage level, education level, occupation, region, and family burden all affect consumption, and these factors are clearly measured in different units. The size of the coefficient on an independent variable therefore does not by itself indicate how important that factor is. Put simply, for the same wage income, the regression coefficient obtained with yuan as the unit is smaller (by a factor of 100) than the one obtained with hundreds of yuan as the unit, yet the effect of wage level on consumption has not changed. We therefore need a way to put all the independent variables on a common scale. The standard score introduced earlier does this: convert every variable, including the dependent variable, into standard scores first, and then run the linear regression. The regression coefficients obtained this way can reflect the relative importance of the corresponding independent variables. The resulting equation is called the standard-score regression equation, its coefficients are called standardized regression coefficients, and it is written as follows:
$$Z_y = \beta_1 Z_{x_1} + \beta_2 Z_{x_2} + \dots + \beta_k Z_{x_k}$$
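The effect of units on raw coefficients, and the way standard scores remove it, can be sketched numerically. The example below is only an illustration under stated assumptions: the data are synthetic, the variable names (`wage_yuan`, `education_years`, `consumption`) and the helper functions `zscore`, `raw_coefficients`, and `standardized_coefficients` are hypothetical, and NumPy's least-squares solver stands in for whatever statistical software is used in practice.

```python
import numpy as np

def zscore(a):
    """Convert each column to standard scores (mean 0, standard deviation 1)."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def raw_coefficients(X, y):
    """Least-squares fit with a constant term; returns [constant, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def standardized_coefficients(X, y):
    """Regress z-scored y on z-scored columns of X, with no constant term."""
    beta, *_ = np.linalg.lstsq(zscore(X), zscore(y), rcond=None)
    return beta

# Synthetic data (an assumption for illustration, not from the article).
rng = np.random.default_rng(0)
n = 200
wage_yuan = rng.normal(5000.0, 1000.0, n)     # hypothetical wage, in yuan
education_years = rng.normal(12.0, 3.0, n)    # hypothetical years of schooling
consumption = 0.6 * wage_yuan + 100.0 * education_years + rng.normal(0.0, 300.0, n)

# The same wage expressed in two units: yuan and hundreds of yuan.
X_yuan = np.column_stack([wage_yuan, education_years])
X_hundreds = np.column_stack([wage_yuan / 100.0, education_years])

raw_yuan = raw_coefficients(X_yuan, consumption)
raw_hundreds = raw_coefficients(X_hundreds, consumption)
std_yuan = standardized_coefficients(X_yuan, consumption)
std_hundreds = standardized_coefficients(X_hundreds, consumption)

print("raw wage coefficient (yuan):         ", raw_yuan[1])
print("raw wage coefficient (hundreds yuan):", raw_hundreds[1])
print("standardized coefficients (yuan):    ", std_yuan)
print("standardized coefficients (hundreds):", std_hundreds)
```

In this sketch the raw wage coefficient changes by a factor of 100 when the unit changes, while the standardized coefficients are the same for both units, which is what makes them comparable across variables.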
Note that because all variables have been converted into standard scores, there is no longer a constant term a. When every independent variable is at its mean, the dependent variable should also be at its mean, and the mean corresponds to a standard score of 0. With the variables on both sides of the equation equal to 0, the constant term must also be 0.

Establishment of multiple linear regression models

The general form of the multiple linear regression model is

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \upsilon_i, \quad i = 1, 2, \dots, n$$

where k is the number of explanatory variables and $\beta_j$ (j = 1, 2, …, k) are called the regression coefficients. The formula above is also called the random (stochastic) expression of the population regression function. Its non-random expression is

$$E(Y \mid X_{1i}, X_{2i}, \dots, X_{ki}) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki}$$
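As a rough illustration of how the coefficients in the general form might be computed, the sketch below generates data from the random expression with known coefficients and then estimates them by ordinary least squares. Everything here is an assumption made for the example: the data are synthetic, the "true" coefficient values are arbitrary, the function names `fit_ols` and `conditional_mean` are hypothetical, and least squares via `numpy.linalg.lstsq` is one common estimation method rather than necessarily the one the author has in mind.

```python
import numpy as np

def fit_ols(X, y):
    """Estimate [beta_0, beta_1, ..., beta_k] by ordinary least squares.

    X has one row per observation i and one column per explanatory variable j.
    A column of ones is prepended so that beta_0 (the constant term) is estimated.
    """
    A = np.column_stack([np.ones(X.shape[0]), X])
    beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta_hat

def conditional_mean(X, beta):
    """Non-random expression: beta_0 + beta_1*X_1i + ... + beta_k*X_ki."""
    return beta[0] + X @ beta[1:]

# Synthetic data with k = 2 explanatory variables (illustrative values only).
rng = np.random.default_rng(42)
n, k = 500, 2
beta_true = np.array([2.0, 0.5, -1.5])        # assumed beta_0, beta_1, beta_2
X = rng.normal(size=(n, k))
disturbance = rng.normal(scale=0.1, size=n)   # the random term in the model

# Random expression: Y_i = conditional mean + random disturbance.
y = conditional_mean(X, beta_true) + disturbance

beta_hat = fit_ols(X, y)
residuals = y - conditional_mean(X, beta_hat)

print("assumed coefficients:  ", beta_true)
print("estimated coefficients:", beta_hat)
```

With a constant term in the fit, the residuals sum to approximately zero, and the estimates should lie close to the assumed coefficients because the disturbance here is small relative to the signal.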