2022-07-11 11:52:48
Assumptions of Multiple Linear Regression
Multiple linear regression is based on the following assumptions:
1. Linearity
A linear relationship between the dependent and independent variables
The first assumption of multiple linear regression is that there is a linear relationship between the dependent variable and each of the independent variables.
A common way to check this is to plot the dependent variable against each independent variable and visually inspect the scatter plots for linearity.
If a scatter plot shows a relationship that is not linear, the analyst will need to either fit a non-linear regression or transform the data in statistical software such as SPSS.
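As a rough illustration of how a transformation can straighten a curved relationship, the sketch below compares the Pearson correlation before and after a log transform. The data are made up, the helper name pearson_r is hypothetical, and a high correlation is only a numeric companion to the scatter plot, not a replacement for inspecting it.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y grows exponentially with x, so the raw relationship is curved.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [math.exp(0.5 * xi) for xi in x]

r_raw = pearson_r(x, y)                            # curved relationship
r_log = pearson_r(x, [math.log(yi) for yi in y])   # straight line after the log transform

print(f"r with raw y: {r_raw:.3f}")
print(f"r with log y: {r_log:.3f}")
```

Here the log transform is chosen because the made-up data are exponential; with real data the appropriate transformation depends on the shape seen in the scatter plot.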
2. No multicollinearity
The independent variables are not highly correlated with each other
The data should not show multicollinearity, which occurs when the independent variables (explanatory variables) are highly correlated.
When independent variables show multicollinearity, it becomes difficult to identify which specific variable contributes to the variance in the dependent variable.
A standard way to test this assumption is the Variance Inflation Factor (VIF).
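The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. A minimal sketch for the special case of exactly two predictors, where R² is simply the squared Pearson correlation between them, is shown below. The data and the function names are made up for illustration; with more predictors a full regression is needed to obtain each R².

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Made-up predictors.
x1 = [1, 2, 3, 4, 5, 6]
x2_collinear = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]   # almost exactly 2 * x1
x2_unrelated = [5, 1, 4, 4, 1, 5]                 # no linear relation to x1

vif_collinear = vif_two_predictors(x1, x2_collinear)
vif_unrelated = vif_two_predictors(x1, x2_unrelated)

print(f"VIF, nearly collinear pair: {vif_collinear:.1f}")
print(f"VIF, unrelated pair:        {vif_unrelated:.1f}")
```

A VIF of 1 means the predictor is uncorrelated with the others. Values above roughly 5 or 10 are commonly treated as a warning sign, although these cut-offs are rules of thumb rather than strict limits.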
3. Homoscedasticity
The variance of the residuals is constant
Multiple linear regression assumes that the amount of error in the residuals is similar at each point of the linear model.
This scenario is known as homoscedasticity.
When analyzing the data, the analyst should plot the standardized residuals against the predicted values and check that the points are spread evenly across the whole range of predicted values.
This scatter plot can be produced for the entire model in statistical software.
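The points for such a plot could be prepared along the following lines. This is only a sketch with one predictor and made-up data: the residuals are standardized in a simplified way (divided by their overall standard deviation), and the spread ratio at the end is an informal numeric hint added for illustration, not a formal test.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_plot_points(xs, ys):
    """Return (predicted value, standardized residual) pairs, sorted by predicted value."""
    intercept, slope = fit_line(xs, ys)
    predicted = [intercept + slope * x for x in xs]
    residuals = [y - p for y, p in zip(ys, predicted)]
    scale = statistics.pstdev(residuals)   # simplified standardization (assumption)
    return sorted((p, e / scale) for p, e in zip(predicted, residuals))

# Made-up data: the noise grows with x, so the residual variance is not constant.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise = [0.1, -0.1, 0.2, -0.2, 0.5, -0.5, 1.0, -1.0, 2.0, -2.0]
y = [2 * xi + ni for xi, ni in zip(x, noise)]

points = residual_plot_points(x, y)

# Informal check: compare residual spread for low versus high predicted values.
half = len(points) // 2
spread_low = statistics.pstdev(e for _, e in points[:half])
spread_high = statistics.pstdev(e for _, e in points[half:])
spread_ratio = spread_high / spread_low

print(f"spread ratio (high / low predicted values): {spread_ratio:.2f}")
```

A ratio far from 1, like the fan shape it corresponds to in the scatter plot, suggests the variance of the residuals changes with the predicted value.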
4. Independence of observations
The model assumes that the observations should be independent of one another.
Simply put, the model assumes that the values of residuals are independent.
To test this assumption, we use the Durbin-Watson statistic.
The statistic takes values from 0 to 4: values below 2 indicate positive autocorrelation, and values above 2 indicate negative autocorrelation.
A value of 2, the midpoint, indicates no autocorrelation. In practice, this test is mostly applied to time series data.
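The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. A minimal sketch follows; the residual series are made up, and the residuals are assumed to be in time order.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals.
trending = [1.0, 0.8, 0.5, 0.1, -0.2, -0.6, -0.9]   # neighbours are similar
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]     # neighbours flip sign

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)

print(f"trending residuals:    DW = {dw_trending:.2f}")
print(f"alternating residuals: DW = {dw_alternating:.2f}")
```

The slowly drifting series gives a value well below 2 (positive autocorrelation), while the sign-flipping series gives a value above 2 (negative autocorrelation).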
5. Multivariate normality
Multivariate normality means that the residuals of the model are normally distributed.
To test this assumption, look at how the values of the residuals are distributed.
Two common methods are a histogram with a superimposed normal curve and a normal probability plot.
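The points of a normal probability plot could be computed roughly as follows: each sorted residual is paired with the standard normal quantile expected at its rank. The residuals are made up, the function name is hypothetical, and the plotting position (i + 0.5) / n is one of several conventions, chosen here as an assumption.

```python
import statistics

def normal_probability_points(residuals):
    """Pair each sorted residual with the normal quantile expected at its rank."""
    n = len(residuals)
    standard_normal = statistics.NormalDist()   # mean 0, standard deviation 1
    ordered = sorted(residuals)
    # (i + 0.5) / n keeps every probability strictly between 0 and 1.
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Made-up residuals.
residuals = [-1.2, 0.3, 0.8, -0.5, 0.1, 1.5, -0.9, 0.4, -0.2, 0.0]

qq_points = normal_probability_points(residuals)
for quantile, residual in qq_points:
    print(f"{quantile:6.2f}  {residual:5.2f}")
```

If the residuals are close to normal, plotting these pairs gives points that fall near a straight line; strong curvature at the ends suggests skewed or heavy-tailed residuals.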
@researchhealth
@Healthresearc
Health Researcher, 08:52