Help Using SSDA Data Files: SAS
Introduction
The instructions and examples below explain how to use data files from a computer directly connected to the StatLab NT Server.
Using ASCII Data Files (.dat, .raw, .txt or .csv files)
Column-delimited ASCII files (usually with the file extension .dat or .raw) and delimited ASCII files (usually with the file extension .txt or .csv) can be read by the statistical packages on the StatLab NT Server. The sections that follow provide information for reading in ASCII files in SAS for Windows. Since many of the ASCII files on the StatLab NT Server are too large to convert to SAS datasets, we recommend that users create extracts and write them to diskette, zip disk, their personal drive (y:\), or to the c:\temp directory of the StatLab computers. Note that the c:\temp directory can only be used for temporary storage, as files may be deleted by other users.
Using SAS System Data files
The system data files available are ready for analysis using StatLab software, with little additional data management. The sections below provide instructions for accessing system data files in SAS (typically extension .sd2 or .sas7bdat). If the analysis is designed to use other software, these data files are readily convertible to other data formats using the file conversion software StatTransfer or DBMSCOPY, both also available at the StatLab (Start>Programs>Statlab Packages).
Using ASCII Data Files with SAS
If ASCII data is "delimited" (there is a marker such as a comma separating columns, typically files with the extension .txt or .csv), use the program StatTransfer to transform the data into a SAS (.sd2) dataset. SAS can also import such files (select File>Import from the main screen, or open the file from the Analyst screen, SAS's user-friendly interface). However, these import tools are erratic and are not currently recommended.
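For a simple comma-separated file, a DATA step can also read the data directly. The following is a minimal sketch rather than a procedure recommended on this page: the path y:\mydata.csv, the dataset name and the four variables are hypothetical, and it assumes the first line of the file holds column names.

```sas
/* Sketch only: file path, dataset name and variables are hypothetical. */
DATA survey;
  /* DSD reads comma-separated values, treats two consecutive commas   */
  /* as a missing value and strips quotation marks around text fields. */
  /* FIRSTOBS=2 skips the header line.                                 */
  INFILE 'y:\mydata.csv' DSD FIRSTOBS=2;
  INPUT id age state $ income;   /* $ marks state as a character variable */
RUN;

PROC PRINT DATA=survey (OBS=10); /* inspect the first 10 observations */
RUN;
```

With this style of input, character variables default to a length of 8, so longer text fields would need a LENGTH statement before INPUT.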
If data is "column-delimited" (variables are in fixed positions and are of fixed length, typically with the file extension .dat or .raw), there are two options for reading in ASCII data: write a program to create a temporary SAS dataset (which can be saved) or use EFI, SAS's external file interface.
Program to create a temporary SAS dataset
When reading in the ASCII data with the DATA step in SAS, be sure to include the lrecl (logical record length) option on the INFILE statement if the records are longer than 132 columns. The record length can be found in the codebook for the file(s).
The following program will create a temporary SAS dataset. This program is useful for extracting variables for analysis from large ASCII files when the variables do not require extensive transformations.
DATA abcd;
INFILE 'h:\ssda\directory\filename' lrecl=nnn;
INPUT variable column_location;
transformations and/or recodings;
RUN;
- The DATA statement tells SAS that a DATA step is starting. The dataset name "abcd" is an internal name (max. 8 characters) by which SAS remembers the data entered in this DATA step.
- The INFILE statement identifies the external file which contains your raw data.
- The INPUT statement tells SAS how to read in the data from the external file. The options available on the INPUT statement are quite complex, but SAS is able to read in a wide variety of formatted data. See the SAS Basics page for more examples.
- The lrecl option on the INFILE statement is required for records longer than 132 columns.
- The RUN statement marks the end of the DATA step and tells SAS to execute it.
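As an illustration of the template above, here is a sketch with the placeholders filled in. The record length, variable names, column positions and the missing-value code 99 are all hypothetical; the real values come from the codebook for the file being read.

```sas
/* Sketch only: LRECL, variable names, columns and the code 99 are   */
/* hypothetical. Take the real values from the codebook.             */
DATA extract;
  INFILE 'h:\ssda\directory\filename' LRECL=400;
  INPUT id      1-5     /* numeric variable in columns 1-5            */
        state $ 6-7     /* character variable: $ precedes the columns */
        age     20-21
        income  45-50;
  IF age = 99 THEN age = .;   /* recode a missing-value code to SAS missing */
RUN;

PROC PRINT DATA=extract (OBS=10);  /* check the first 10 observations */
RUN;
```

Only the variables listed on INPUT are read, so the temporary dataset stays small even when the source file is large.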
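To save the dataset permanently, add a LIBNAME statement and give the dataset a two-part name. In the INPUT statement, a $ marks a character variable and goes between the variable name and its column location:
LIBNAME abcd "A:\";
DATA abcd.mydata;
INFILE 'a:\mydata.asc';
INPUT variable $ column_location;
transformations and/or recodings;
RUN;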
- The LIBNAME statement tells SAS where to store or find the permanent SAS datasets. You have to submit a LIBNAME statement only once per session.
- The DATA statement identifies the name of a permanent SAS dataset, which always has two parts separated by a period. The first part (abcd) shows the location, using the name you give it in the LIBNAME statement. The second part (mydata) is the actual dataset name. In the above example, the program will write out a file called mydata.sd2 on your diskette in the A: drive.
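A filled-in version of the permanent-dataset program might look like the sketch below. The library name mylib, the c:\temp location, the record length and the variables are assumptions chosen for illustration only.

```sas
/* Sketch only: library name, paths, LRECL and variables are hypothetical. */
LIBNAME mylib 'c:\temp';              /* where the permanent dataset is stored */

DATA mylib.extract;                   /* two-part name: library.dataset */
  INFILE 'h:\ssda\directory\filename' LRECL=400;
  INPUT id 1-5 state $ 6-7 age 20-21;
RUN;

PROC CONTENTS DATA=mylib.extract;     /* list the variables that were saved */
RUN;
```

Running PROC CONTENTS afterwards is a quick way to confirm that the file was written and that each variable has the expected type and length.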
For information on running procedures, consult the SAS for Windows help page.
Using EFI
Before starting the process, open or print out the codebook for your datafile (typically a text or .pdf file). This codebook will provide the information you need to separate the data into variables. Also, if you do not already have a SAS library name defined (library is SAS's term for a datapath), you should create one now. Open SAS. Select View>Explorer. Select File>New. Enter a library name of at most 8 characters (for example "mylib") and either browse for or type in a directory path (for example y:\ or c:\temp). Now that you have a library, SAS will know where to direct your files and you will know where to find them.
To use SAS's interface to import ASCII data, select File>Import Data.
1. Select "User Defined format". Select NEXT.
2. Browse or type in file location (for this example H:\SSDA\0060\cpsvs72.dat). Select NEXT.
3. Select a Library in which your data will be stored ("mylib") and type a name for your dataset into the "Member" space. Select NEXT.
4. Select FINISH.
5. A window will appear showing the first few rows of data for reference; it is at this point that the codebook is useful. First, for most data sets you will need to click the Options button, change "Style of Input" to Columns, and select OK. This will allow you to delimit variables when they are in fixed positions and of fixed lengths. [Select Style of Input as "List" for data separated by delimiters (e.g. commas); however, in this case you would be better off using StatTransfer to create your SAS dataset.] The initial screen will change to allow for the input of information on data positions (Begin, End, and Length). Second, using the codebook, enter the variable name into the field name window and the description into "descriptive label". Select whether the variable is a string (character based) or numeric. Now either enter the beginning and ending positions into the boxes provided or use your mouse in the data window to delimit the variable (the boxes will fill automatically). Finally, select "ADD" and the variable will appear in the window to the right of the data. Repeat the second step until all variables are defined. The informat and format boxes should change automatically to adjust for the variable length; if they do not, you can click on the arrows to the right to adjust them.

When you have delimited all the variables you need, select File>Save to save the dataset. Alternatively, you can close the box and SAS will automatically ask you to save your dataset.
Using SAS Transport Files (.xpt files)
These files are transport files for use in SAS to transfer SAS data files across computer platforms. In general, if data files are large, it is recommended that variables be selected using the KEEP statement and written onto a smaller file on the c:\temp directory, diskette or zip disk. If the data can be analyzed as is, without extensive subsetting or transformations of the variables, use the following program lines:
LIBNAME IN XPORT 'h:\ssda\directory\filename.xpt';
PROC procedure_name DATA=IN.statdata;
RUN;
- LIBNAME IN is a command to designate a SAS library with the name IN (the name of the library is arbitrary)
- XPORT is the SAS engine required to read in transport files
- the path in single quotes designates the transport file itself (with the XPORT engine the path must name the .xpt file, not just its directory)
- PROC invokes the procedure for analysis
- DATA=IN.statdata points the procedure to the file in the library designated as IN with filename statdata
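As a concrete sketch of analyzing a transport file in place: the file name filename.xpt and the variable region are hypothetical, and the example relies on the fact that the XPORT engine expects the path to name the transport file itself rather than a directory.

```sas
/* Sketch only: the transport file name and the variable region are hypothetical. */
LIBNAME in XPORT 'h:\ssda\directory\filename.xpt';

PROC CONTENTS DATA=in._ALL_;   /* list every dataset stored in the transport file */
RUN;

PROC FREQ DATA=in.statdata;    /* frequency table for one variable */
  TABLES region;
RUN;
```

Listing the contents first is useful because the dataset name inside a transport file does not have to match the name of the .xpt file.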
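If the variables need to be subset or transformed, use the following program lines to write out a permanent SAS data file:
LIBNAME IN XPORT 'h:\ssda\directory\filename.xpt';
LIBNAME OUT 'a:\';
DATA OUT.mydata;
SET IN.statdata;
transformations_here;
KEEP variable_names;
RUN;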
- LIBNAME OUT is a second library designation to write out the permanent SAS data file
- DATA OUT.mydata invokes the DATA step to write out a permanent SAS data file to library OUT with filename mydata (and extension sd2)
- SET IN.statdata names the source file for the DATA step: the SAS transport file in library IN with filename statdata
- KEEP retains only the variables in variable_names in the new dataset (mydata.sd2)
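The extract program can be sketched with hypothetical values as follows. The transport file name, the output location c:\temp, and the variables id, age and income are assumptions for illustration.

```sas
/* Sketch only: file names, output location and variables are hypothetical. */
LIBNAME in  XPORT 'h:\ssda\directory\filename.xpt';
LIBNAME out 'c:\temp';

DATA out.mydata;
  /* KEEP= as a dataset option reads only these variables from the source */
  SET in.statdata (KEEP=id age income);
  IF income < 0 THEN income = .;   /* treat negative codes as missing */
  income_k = income / 1000;        /* derived variable, in thousands  */
RUN;
```

Writing KEEP= as an option on SET limits what is read from the large source file, while a KEEP statement inside the DATA step limits what is written to the new dataset; here the derived variable income_k is still saved because no KEEP statement restricts the output.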
Using SAS System Files (.ssd or .sd2 files)
These files are ready for analysis in SAS for Windows. In general, if data files are large, it is recommended that variables be selected using the KEEP statement and written onto a smaller file on the c:\temp directory, diskette or zip disk. If the data can be analyzed as is, without extensive subsetting or transformations of the variables, use the following program lines:
LIBNAME IN 'h:\ssda\directory';
PROC procedure_name DATA=IN.statdata;
RUN;
- LIBNAME IN is a command to designate a SAS library with the name IN (the name of the library is arbitrary)
- the path in single quotes designates the location of the library
- PROC invokes the procedure for analysis
- DATA=IN.statdata points the procedure to the file in the library designated as IN with filename statdata
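If the variables need to be subset or transformed, use the following program lines to write out a permanent SAS data file (no XPORT engine is needed for system files):
LIBNAME IN 'h:\ssda\directory';
LIBNAME OUT 'a:\';
DATA OUT.mydata;
SET IN.statdata;
transformations_here;
KEEP variable_names;
RUN;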
- LIBNAME OUT is a second library designation to write out the permanent SAS data file
- DATA OUT.mydata invokes the DATA step to write out a permanent SAS data file to library OUT with filename mydata (and extension sd2)
- SET IN.statdata names the source file for the DATA step: the SAS system file in library IN with filename statdata
- KEEP retains only the variables in variable_names in the new dataset (mydata.sd2)
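For system files, a filled-in extract might look like the sketch below. The output location c:\temp, the variables id, age and income, and the age cutoff are hypothetical; the LIBNAME for a system file points at the directory and uses no XPORT engine.

```sas
/* Sketch only: output location, variables and the age cutoff are hypothetical. */
LIBNAME in  'h:\ssda\directory';   /* directory that holds the system file */
LIBNAME out 'c:\temp';

DATA out.mydata;
  SET in.statdata;
  WHERE age >= 18;                 /* keep only the observations needed */
  KEEP id age income;              /* keep only the variables needed    */
RUN;

PROC MEANS DATA=out.mydata;        /* summary statistics on the extract */
  VAR age income;
RUN;
```

Subsetting both rows (WHERE) and columns (KEEP) keeps the extract small enough for c:\temp or removable media.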
- For additional information on running procedures, refer to the SAS for Windows help page.
©2007 Yale University Social Science Statistical Laboratory. Certifying Authority: Themba Flowers. Last modified: Fri Apr 11 11:14:36 EDT 2003.


