
Introduction to SAS and Basic Concepts

Statistical Analysis System - Class 01


Introduction to the SAS System

SAS (Statistical Analysis System) is an integrated system of software
solutions that enables you to perform the following tasks:
➢ Data entry, retrieval, and management
➢ Report writing and graphics design
➢ Statistical and mathematical analysis
➢ Business forecasting and decision support
➢ Operations research and project management
➢ Applications development
Components of SAS
➢ Base SAS – Basic procedures and data management
➢ SAS/STAT – Statistical analysis
➢ SAS/GRAPH – Graphics and presentation
➢ SAS/OR – Operations research
➢ SAS/ETS – Econometrics and time series analysis
➢ Enterprise Guide – GUI-based code editor and project manager
➢ SAS EBI – Suite of business intelligence applications, and others
How you use SAS depends on what you want to accomplish. Some
people use many of the capabilities of the SAS System, and others
use only a few. At the core of the SAS System is Base SAS.
Overview of Base SAS Software
➢ Base SAS software contains the following:
   ✓ A data management facility
   ✓ Data analysis and reporting utilities
The functionality of SAS is built around the four data-driven tasks common
to virtually any application:
➢ Data Access - addresses the data required by the application.
➢ Data Management - shapes data into a form required by the application.
➢ Data Analysis - summarizes, reduces, or otherwise transforms raw data
  into meaningful and useful information.
➢ Data Presentation - communicates information in ways that clearly
  demonstrate its significance.
Data Management Facility

SAS organizes data into a rectangular form or table that is
called a SAS data set.
In a SAS data set, each row represents information about an
individual entity and is called an observation.
Each column represents the same type of information and is
called a variable.
Each separate piece of information is a data value.
Rectangular Form of a SAS Data Set

        IdNumber  Name     Team   StartWeight  EndWeight
Obs 1   1001      Anil     Red    80           60
Obs 2   1002      Arun     Red    90           75
Obs 3   1003      Polsani  Green  65           59
Obs 4   1004      Sani     Red    75           66
Obs 5   1005      Pols     Green  60           50

In the figure, the label "Variable" marks a column and the label "Value"
marks a single cell.
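To inspect this row-and-column structure in an actual data set, one option is the CONTENTS and PRINT procedures. The sketch below assumes the sample data set SASHELP.CLASS is available; it ships with most SAS installations but is not part of these slides.

```sas
/* List each variable (column) with its type and length,
   together with the number of observations (rows) */
proc contents data=sashelp.class;
run;

/* Show the first five observations and their data values */
proc print data=sashelp.class(obs=5);
run;
```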
Components of SAS Programs – DATA & PROC

DATA steps typically create or modify SAS data sets. They can
also be used to produce custom-designed reports. You can use
DATA steps to:
➢ Put your data into a SAS data set
➢ Compute values
➢ Check for and correct errors in your data
➢ Produce new SAS data sets by subsetting, merging, and
  updating existing data sets
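A minimal sketch of several of these tasks follows. The data values and the data set names MEMBERS and CHECKED are hypothetical and chosen only for illustration; they are loosely modelled on the weight club example in these slides.

```sas
/* Put hypothetical in-stream data into a SAS data set */
data members;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1001 Anil red 80 60
1002 Arun Red 90 75
1003 Sani Red 75 .
;

/* Produce a new data set from an existing one */
data checked;
   set members;                          /* read the existing data set */
   if Team = 'red' then Team = 'Red';    /* correct an inconsistent value */
   if EndWeight = . then delete;         /* drop rows with a missing end weight */
   Loss = StartWeight - EndWeight;       /* compute a new value */
run;
```

The first step uses list input (values separated by blanks), which is the simplest way to read a few short lines of data.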

Components of SAS Programs

PROC (procedure) steps are pre-written routines that enable you
to analyze and process the data in a SAS data set and to present
the data in the form of a report.
PROC steps sometimes create new SAS data sets that contain the
results of the procedure.
PROC steps can list, sort, and summarize data.
Use PROC steps to:
➢ Create a report that lists the data
➢ Produce descriptive statistics
➢ Create a summary report
➢ Produce plots and charts
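As a rough illustration of listing, sorting, and summarizing, the following sketch runs three procedures in sequence. It assumes the sample data set SASHELP.CLASS is available (it is not part of these slides), and CLASS_SORTED is a hypothetical output name.

```sas
/* Sort the data; OUT= writes the result to a new data set,
   which shows a PROC step creating a SAS data set */
proc sort data=sashelp.class out=class_sorted;
   by descending Height;
run;

/* Create a report that lists the data */
proc print data=class_sorted;
   var Name Sex Height;
run;

/* Create a summary report: counts of observations per value of Sex */
proc freq data=class_sorted;
   tables Sex;
run;
```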

SAS Data Set – Data Step
DATA step - the portion of a SAS program that begins with a DATA statement
and is used to create a SAS data set.
DATA WEIGHT_CLUB;
/* NAME is read from columns 6-24, so the data lines must be column-aligned */
INPUT IDNUMBER 1-4 NAME $ 6-24 TEAM $
STARTWEIGHT ENDWEIGHT;
LOSS=STARTWEIGHT-ENDWEIGHT;
DATALINES;
1001 Anil                Red 80 60
1002 Arun                Red 90 75
1003 Polsani             Green 65 59
1004 Sani                Red 75 66
1005 Ramesh              Green 60 50
;
Note: By default, the data set WEIGHT_CLUB is temporary; it exists only for the current job or session.
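To keep a data set beyond the current session, one common approach is to assign a library with a LIBNAME statement and use a two-level name. In the sketch below, the libref MYLIB and the directory path are hypothetical; substitute a folder that exists on your system. For brevity it uses list input and only two of the data lines.

```sas
/* Hypothetical directory; it must already exist */
libname mylib 'C:\sasdata';

/* Two-level name: the data set is stored permanently in MYLIB
   instead of the temporary WORK library */
data mylib.weight_club;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   Loss = StartWeight - EndWeight;
   datalines;
1001 Anil Red 80 60
1002 Arun Red 90 75
;
```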
SAS Program in Detail
➢ The DATA statement tells SAS to begin building a SAS data set
  named WEIGHT_CLUB.
➢ The INPUT statement identifies the fields to be read from the input
  data and names the SAS variables to be created from them
  (IdNumber, Name, Team, StartWeight, and EndWeight).
➢ The third statement is an assignment statement. It calculates the
  weight each person lost and assigns the result to a new variable,
  Loss.
➢ The DATALINES statement indicates that the data lines follow it.
  This approach to processing raw data is useful when you have only
  a few lines of data.
➢ The semicolon signals the end of the raw data, and is a step
  boundary. It tells SAS that the preceding statements are ready for
  execution.
Rules for SAS Statements

SAS statements end with a semicolon.
You can enter SAS statements in lowercase, uppercase, or
a mixture of the two.
You can begin SAS statements in any column of a line and
write several statements on the same line.
You can begin a statement on one line and continue it on
another line, but you cannot split a word between two lines.
Words in SAS statements are separated by blanks or by
special characters.
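The short program below is a sketch of these rules in practice. The data set name DEMO and its variables are hypothetical.

```sas
/* Mixed case, and several statements on the same line */
data Demo; x = 1; Y = 2; run;

/* One statement continued over several lines, starting in
   different columns; words are never split across lines */
proc print
      data=demo
   noobs;
run;
```

Both steps run as written, although placing one statement per line with consistent indentation is usually easier to read.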
Rules for SAS Names
Rules for SAS data set names:
➢ A SAS name can contain from one to 32 characters.
➢ The first character must be a letter or an underscore (_).
➢ Subsequent characters must be letters, numbers, or
  underscores.
➢ Blank spaces cannot appear in SAS names.
Rules for variable names: SAS remembers the combination
of uppercase and lowercase letters that you use when you
create the variable name. Internally, the case of letters does not
matter. “CAT,” “cat,” and “Cat” all represent the same
variable. But for presentation purposes, SAS remembers the initial
case of each letter and uses it when printing the variable name.
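A small sketch of the naming rules, assuming default system options. The data set name _CLUB_2024 and the values are hypothetical.

```sas
/* Valid data set name: starts with an underscore, then letters,
   digits, and underscores only */
data _club_2024;
   StartWeight = 80;                 /* first written in mixed case */
   ENDWEIGHT   = 60;                 /* first written in uppercase  */
   loss = startweight - EndWeight;   /* same two variables, different case */
run;

/* Column headings use the case in which each name first appeared */
proc print data=_club_2024;
run;
```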
Data Analysis and Reporting Utilities

➢ The SAS programming language is both powerful and flexible.
  SAS also provides a library of built-in programs known as
  SAS procedures.
➢ SAS procedures use data values from SAS data sets to
  produce pre-programmed reports.
Data Analysis and Reporting Utilities
A portion of a SAS program that begins with a PROC
(procedure) statement and ends with a RUN statement (or is
ended by another PROC or DATA statement) is called a
PROC step.
PROC PRINT DATA=WEIGHT_CLUB;
TITLE 'HEALTH CLUB DATA';
RUN;
This procedure, known as the PRINT procedure, displays the values of
the variables in a simple, organized listing.
Data Analysis and Reporting Utilities
A PROC step like the one above consists of the following:
➢ A PROC statement, which includes the word PROC, the
  name of the procedure that you want to use, and the name of
  the SAS data set that contains the values. (If you omit the
  DATA= option and data set name, the procedure uses the
  SAS data set that was most recently created in the program.)
➢ Additional statements that give SAS more information about
  what you want to do, for example, the CLASS, VAR, TABLE,
  and TITLE statements.
➢ A RUN statement, which indicates that the preceding group
  of statements is ready to be executed.
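The three parts can be seen together in a sketch such as the following, which uses the MEANS procedure. It assumes the sample data set SASHELP.CLASS is available; the title text is arbitrary.

```sas
/* PROC statement: procedure name, input data set, requested statistics */
proc means data=sashelp.class mean min max;
   class Sex;                          /* additional statement: grouping variable */
   var Height Weight;                  /* additional statement: analysis variables */
   title 'Height and Weight by Sex';   /* additional statement: report title */
run;                                   /* RUN statement: execute the step */

title;   /* clear the title so it does not carry over to later output */
```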
Thank You

http://mainframes-online-training.weebly.com/
Polsani Anil Kumar
