Moving Excel Data into SAS
Options for reading data from Excel data to SAS include:
- Creating a.csv file in Excel (we recommend this option)
- Using the Import Wizard
Creating a .csv file in Excel:
Use an Excel spreadsheet as a quick data entry tool:
- Enter the name of the variables on the first row of the spreadsheet
- Enter your data in subsequent rows in the appropriate columns.
If a column is too narrow to contain the values of that variable, simply position your mouse on the right margin for that column and drag it as far to the right as necessary. - Do not have any blank rows in the spreadsheet.
- Enter data on only one sheet.
- If you have any dates: Select the cells containing the dates and format them with leading zeros:
- Select Format,Cells...,Number tab
- In the Category: list select: Date
- In the Type: list select 03/04/97.
- Select the OK button.
-
When you have entered all of your data and are ready to save it:
-
Select File, Save As.
- In the box labeled File name: enter the name of your file.
- In the Save as type box, choose CSV (Comma delimited) (*.csv).
- Select the Save button.
- You will see a message that the selected file type will save only the active sheet.
- Click OK.
-
If you need to upload to one of the UNIX systems (such as Koti), use SSH for secure file transfer.
Creating a SAS Program to read a .csv file:
SAMPLE SAS PROGRAM FOR PC SAS:
* This sample program reads a file named study.csv in c:\sasrawd |
|
SAMPLE SAS PROGRAM FOR UNIX SYSTEMS: *This sample program for UNIX reads a file study.csv in the data; |
In the above examples, in the INFILE statement dsd ignores delimiters in data values enclosed in quotation marks, does not read quotation marks as part of the data value, and treats two delimiters in a row as a missing value. It also tells SAS the data is comma-delimited. If your delimiter is not a comma, then you can use the DLM= option with the DSD option to specify the delimiter. Missover tells SAS there may be missing data at the end of a line and that SAS should not skip to the next line searching for that data. Firstobs=2 causes SAS to skip the first line of the data file, which contains the names of your variables.
The length statement is necessary because you are using list input to read your data. Without a preceding length statement, SAS assumes that character variables (designated by the $ following the variable name) are a maximum of eight characters long. Be sure the value specified on the length statement accommodates the maximum length for that variable, or the value will be truncated.
Note: A quick way to build your input statement with the same variables as in your Excel spreadsheet is to use Windows Copy and Paste. To do so on the PC, open your .csv file in Notepad (Start, Programs, Accessories, Notepad). To do so on our UNIX systems, use HostExplorer to login and open the .csv file in your favorite editor. Select the variable names. Select Edit, Copy. In your editor, open the SAS program file and position the insertion point on your input statement line. Choose Edit, Paste. To complete your input statement, delete the commas, insert $ for character variables, and add the ending semicolon. Use a length statement for variables longer than eight characters.
Using the Import Wizard:
The SAS Import Wizard is available in both the Windows and UNIX operating environments on the File menu. The Import Wizard will create SAS datasets from Excel spreadsheets as well as delimited and other types of files. The disadvantage of this method is that the Wizard chooses the amount of space to store the data. It is usually NOT the most efficient storage. For large files, this may become a severe handicap. Also, SAS has reported problems that can result in missing data. We recommend using the method of saving your Excel worksheet as a .csv (comma separated) file and writing your DATA step to import the .csv file instead of using the Import Wizard.
By default, the Import Wizard will scan the first 20 rows for delimited files and the first eight rows for Microsoft Excel files. The size and type of the SAS variable are determined by the data scanned. If you have missing data in all these rows, then the Import Wizard (and the IMPORT procedure) may not read the file correctly.
- To change the number of rows scanned from a delimited file:
- In the SAS command line, enter REGEDIT.
- From the ‘Edit’ menu, select ‘Find.’
- Check the Value Names box
- Search for “ GuessingRows”
- Double click on “Guessing Rows” to change the value
- To change the number of rows scanned from an Excel file:
- Go the ‘Start’ button and select ‘Run.’
- From the ‘Edit’ menu, select ‘Find.’
- Search for “ TypeGuessRows”
- Double click on “ TypeGuessRows” to change the value
This article was updated: 04/27/2020