Proc Logistic, Proc Probit and Proc Catmod in SAS
This note discusses the proper interpretation of the results of logit (logistic regression) and probit analyses performed with SAS's Logistic, Probit and Catmod procedures. For some inexplicable reason, SAS sets up these analyses differently from what is described in standard textbook treatments and differently from the way they are handled by other statistical packages. Unfortunately, this SAS quirk is not discussed either prominently or clearly in the relevant SAS documentation. But forewarned is forearmed--read on.
When performing a logit or probit analysis, we are usually interested in what factors influence the probability of an outcome Y, where Y has two possible values. In the social sciences, the values of the Y variable are typically assigned such that 1 represents a response or event which we are interested in explaining and 0 a non-response or non-event. For example, if we were interested in studying what factors explain retirement decisions using a cohort of 65-year-olds, we would set Y=1 if the individual were retired and 0 otherwise. The sign of the coefficient on one of our independent variables would then tell us whether an increase in that variable increased or decreased the probability that a person would be retired at age 65.
In looking at retirement decisions, we would define p=Prob(Y=1) for the purposes of our analysis. Here is where the problem lies. By default, SAS assigns p to be the probability of the lower value of Y, in our case Prob(Y=0). Thus, given the way we have defined the Y variable, SAS will use the independent variables to try to explain the decision to remain in the workforce, exactly the opposite of what we intended! All of our coefficient estimates will be correct in absolute value, but will have the wrong sign.
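The sign reversal follows from the symmetry of the logit and probit links. A short derivation, writing x'β for the linear predictor (notation assumed here, not taken from the SAS documentation) and Φ for the standard normal distribution function:

```latex
\begin{align*}
\text{Logit:}\quad  & \log\frac{p}{1-p} = x'\beta
  \;\Longrightarrow\; \log\frac{1-p}{p} = -x'\beta = x'(-\beta), \\
\text{Probit:}\quad & \Phi^{-1}(p) = x'\beta
  \;\Longrightarrow\; \Phi^{-1}(1-p) = -\Phi^{-1}(p) = x'(-\beta).
\end{align*}
```

Modeling 1-p instead of p therefore negates every coefficient, including the intercept, and leaves the magnitudes unchanged.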
Fortunately, we do not have to think about everything in reverse: there are simple ways to convince SAS to produce output according to social scientific coding conventions.
PROC LOGISTIC
PROC LOGISTIC descending; MODEL <dep var>=<ind vars>;
or, equivalently,
PROC SORT; by descending <dep var>; PROC LOGISTIC order=data; MODEL <dep var>=<ind vars>;
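As an illustration, here is a minimal sketch of the first form applied to the retirement example. The data set name retire, the variable names retired and pension, and all data values are hypothetical, invented only to make the example self-contained.

```sas
/* Hypothetical data, invented for illustration only:           */
/* retired = 1 if retired, 0 otherwise; pension in thousands.   */
data retire;
  input retired pension;
  datalines;
1 30
1 25
1 18
1 22
1 12
0 10
0 15
0 20
0 8
0 27
;
run;

/* DESCENDING reverses the default ordering of the response     */
/* levels, so SAS models Prob(retired=1), not Prob(retired=0).  */
proc logistic data=retire descending;
  model retired = pension;
run;
```

The response profile near the top of the output should indicate which level of the dependent variable is being modeled, which is a quick way to check that the option took effect.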
PROC PROBIT
PROC SORT; by descending <dep var>; PROC PROBIT order=data; CLASS <dep var>; MODEL <dep var>=<ind vars>;
PROC CATMOD
PROC SORT; by descending <dep var>; PROC CATMOD order=data; DIRECT <ind vars>; MODEL <dep var>=<ind vars>;
lm: December 20, 2001
