Proc Logistic, Proc Probit and Proc Catmod in SAS

This note discusses the proper interpretation of the results of logit (logistic regression) and probit analyses performed with SAS's Logistic, Probit and Catmod procedures. For some inexplicable reason, SAS sets up these analyses differently from what is described in standard textbook treatments and differently from the way they are handled by other statistical packages. Unfortunately, this SAS quirk is not discussed either prominently or clearly in the relevant SAS documentation. But forewarned is forearmed--read on.

When performing a logit or probit analysis, we are usually interested in what factors influence the probability of an outcome Y, where Y has two possible values. In the social sciences, the values of the Y variable are typically assigned such that 1 represents a response or event which we are interested in explaining and 0 a non-response or non-event. For example, if we were interested in studying what factors explain retirement decisions using a cohort of 65-year-olds, we would set Y=1 if the individual were retired and 0 otherwise. The sign of the coefficient on one of our independent variables would then tell us whether an increase in that variable increased or decreased the probability that a person is retired at age 65.

In looking at retirement decisions, we would define p=Prob(Y=1) for the purposes of our analysis. Here is where the problem lies. By default, SAS assigns p to be the probability of the lower value of Y, in our case Prob(Y=0). Thus, given the way we have defined the Y variable, SAS will use the independent variables to try to explain the decision to remain in the workforce, which is exactly the opposite of what we intended! All of our coefficient estimates will be correct in absolute value, but will have the wrong sign.
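The sign reversal follows from the symmetry of the logit and probit links. As a short sketch, write x'β for the linear predictor and Φ for the standard normal distribution function (this notation is assumed here and does not appear elsewhere in this note):

```latex
\begin{align*}
\operatorname{logit}(p)   &= \ln\frac{p}{1-p} = x'\beta \\
\operatorname{logit}(1-p) &= \ln\frac{1-p}{p} = -\ln\frac{p}{1-p} = -x'\beta \\
\Phi^{-1}(1-p)            &= -\Phi^{-1}(p)
\end{align*}
```

So a model fitted to Prob(Y=0) = 1-p instead of Prob(Y=1) = p yields the same coefficients with every sign reversed, the intercept included.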

Fortunately, we do not have to resort to reading everything in reverse: there are simple ways to make SAS produce output that follows social scientific coding conventions.

PROC LOGISTIC

PROC LOGISTIC descending;
  MODEL <dep var>=<ind vars>;
or, equivalently,
PROC SORT;  by descending <dep var>;

PROC LOGISTIC order=data;
  MODEL <dep var>=<ind vars>;
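
As a concrete sketch of the first form, applied to the retirement example: the data set name (retire), the 0/1 response (retired) and the predictors (income, health) are hypothetical names chosen for illustration and are not part of the original note.

```sas
/* Hypothetical data set and variable names, for illustration only. */
/* DESCENDING reverses the response ordering, so that SAS models    */
/* Prob(retired=1) rather than Prob(retired=0).                     */
proc logistic data=retire descending;
  model retired = income health;
run;
```

Checking the Response Profile table in the output is a quick way to confirm which value of the response is ordered first.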

PROC PROBIT

PROC SORT;  by descending <dep var>;

PROC PROBIT order=data;
  CLASS <dep var>;
  MODEL <dep var>=<ind vars>;

PROC CATMOD

PROC SORT;  by descending <dep var>;

PROC CATMOD order=data;
  DIRECT <ind vars>;
  MODEL <dep var>=<ind vars>;

Last modified: December 20, 2001