Logistic regression models the relationship between a binary (dichotomous) outcome, e.g. alive or dead, and other variables that are assumed to be related to it. In other words, given a fitted logistic regression model and the values of the predictor variables, one can predict the probability of a certain event.
Since most statistical software packages perform all the computations, users usually only need to understand the concepts and some of the fundamental mathematics behind logistic regression. The outcome of logistic regression is the predicted probability of the event:
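$$p = \frac{1}{1 + e^{-(b_0 + b_1 x_1 + \dots + b_n x_n)}}, \qquad \text{equivalently} \qquad \ln\frac{p}{1 - p} = b_0 + b_1 x_1 + \dots + b_n x_n$$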
Where b0 + b1 x1 + … + bn xn is called the logit, i.e. the log odds ln(p/(1−p)) of an event with probability p. This transformation enables logistic regression to reuse many of the mathematical elements of linear regression. The logit is also the link function for logistic regression: a link function transforms the expected value of the dependent variable so that it becomes a linear function of the independent variables.
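The relationship between the logit and the predicted probability can be sketched in a few lines of Python. This is only an illustration: the function names and the coefficient values are hypothetical and do not come from any fitted model.

```python
import math

def logistic(z):
    """Map a logit (linear predictor) z to a probability in (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(p):
    """Map a probability p in (0, 1) to its log odds."""
    return math.log(p / (1.0 - p))

def predict_probability(coefficients, x):
    """coefficients = [b0, b1, ..., bn]; x = [x1, ..., xn]."""
    b0, slopes = coefficients[0], coefficients[1:]
    z = b0 + sum(b * xi for b, xi in zip(slopes, x))
    return logistic(z)

# Hypothetical coefficients b0, b1, b2 (illustration only).
coefficients = [-1.0, 0.5, 0.25]
p = predict_probability(coefficients, [2.0, 0.0])  # logit = -1 + 1 + 0 = 0
print(p)  # 0.5
```

A logit of 0 corresponds to a probability of 0.5; positive logits give probabilities above 0.5 and negative logits give probabilities below it.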
b0, b1, …, bn are estimated by the regression, and each coefficient bi has its own significance level pi. As in other statistical analyses, pi indicates whether bi differs significantly from 0, i.e. from the null hypothesis bi = 0. There are different ways to assess significance, resulting in different p values, but they all have the same objective and only approach it differently. They include the deviance (likelihood ratio) test, the Wald test and the score test.
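As a rough sketch of how two of these tests arrive at a p value, the snippet below computes a Wald test from a coefficient and its standard error, and a deviance (likelihood ratio) test from the log-likelihoods of two nested models that differ by one coefficient. All input numbers are made up for illustration; in practice they come from the software's regression output.

```python
import math

def wald_test(b, se):
    """Return (z, two-sided p value) for the null hypothesis b = 0."""
    z = b / se
    # Two-sided standard normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

def deviance_test_1df(loglik_without, loglik_with):
    """Likelihood ratio test for adding a single coefficient (1 degree of freedom)."""
    g = 2.0 * (loglik_with - loglik_without)
    # For a chi-square variable with 1 degree of freedom, P(X > g) = erfc(sqrt(g / 2))
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p_value

# Hypothetical values (illustration only).
z, p_wald = wald_test(b=0.98, se=0.5)
g, p_lr = deviance_test_1df(loglik_without=-51.92, loglik_with=-50.0)
print(round(z, 2), round(p_wald, 3))
print(round(g, 2), round(p_lr, 3))
```

The two tests generally give similar but not identical p values for the same coefficient, which is why the reported significance can depend on the method the software uses.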
Model building typically follows these steps:
1. Use univariate analysis (t-test or logistic regression) to select relevant variables.
2. If necessary, assess the linearity of the variables.
3. If necessary, assess the interactions among the selected variables.
4. Build the multivariate model.
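One popular deterministic approach to selecting variables is stepwise logistic regression, which comes in backward and forward variants. Forward selection starts with no variables and adds them one at a time; backward elimination starts with all candidate variables and removes them one at a time.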
The essence of assessing a logistic model is the same as that of any other model assessment: comparing the predicted values with the observed ones. Common methods include:
1. Pearson chi-square statistic
2. Deviance
3. Hosmer-Lemeshow test, which is essentially a variant of the chi-square statistic but much more practical, because in most cases the assumptions of the chi-square test (e.g. m-asymptotics) are not met.
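The three statistics listed above can be sketched as follows, given observed outcomes (0 or 1) and the model's predicted probabilities. This is a minimal sketch on hypothetical data: the Pearson and deviance statistics are computed on individual records purely for illustration, and the Hosmer-Lemeshow grouping uses equal-sized groups of sorted predictions, which is one common choice rather than the only one. No p value is computed; the Hosmer-Lemeshow statistic is usually compared with a chi-square distribution with (groups − 2) degrees of freedom.

```python
import math

def pearson_chi_square(y, p):
    """Sum of squared Pearson residuals over individual records."""
    return sum((yi - pi) ** 2 / (pi * (1.0 - pi)) for yi, pi in zip(y, p))

def deviance(y, p):
    """-2 times the log-likelihood of binary outcomes y under probabilities p."""
    return -2.0 * sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi) for yi, pi in zip(y, p)
    )

def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow statistic using groups of sorted predicted probabilities."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)   # expected number of events
        observed = sum(yi for _, yi in chunk)   # observed number of events
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat

# Hypothetical, perfectly calibrated data: 1 of 4 events at p = 0.25, 3 of 4 at p = 0.75.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p = [0.25] * 4 + [0.75] * 4
print(pearson_chi_square(y, p), deviance(y, p), hosmer_lemeshow(y, p, groups=2))
```

Because the observed event counts in each group match the expected counts exactly, the Hosmer-Lemeshow statistic for this toy data set is zero, while the record-level Pearson and deviance statistics are not.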
It is desirable to test the model with external data, i.e. data that were not used to develop the model. The above tests can be used for this purpose.
Logistic model interpretation is quite straightforward. It is based on the definition of the predicted value: the probability that the event happens. The signs of the estimated coefficients tell which variables contribute positively to the occurrence of the event and which negatively. The most frequently used measure is the odds ratio: exp(bi) is the factor by which the odds of the event change for a one-unit increase in xi, with the other variables held constant.
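To illustrate, an odds ratio and an approximate 95% confidence interval can be derived from a coefficient and its standard error as sketched below. The coefficient and standard error are hypothetical, and the interval assumes the usual normal approximation for the coefficient estimate.

```python
import math

def odds_ratio(b, se, z_crit=1.96):
    """Return the odds ratio exp(b) and an approximate confidence interval."""
    estimate = math.exp(b)
    lower = math.exp(b - z_crit * se)
    upper = math.exp(b + z_crit * se)
    return estimate, lower, upper

# Hypothetical coefficient: b = ln(2), so a one-unit increase doubles the odds.
estimate, lower, upper = odds_ratio(b=math.log(2.0), se=0.2)
print(round(estimate, 3), round(lower, 3), round(upper, 3))
```

An odds ratio above 1 corresponds to a positive coefficient and an odds ratio below 1 to a negative one; a confidence interval that excludes 1 matches a coefficient that differs significantly from 0.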
Originally written in December 2003