data thrdcls ;
input name$ age sex grade$;
datalines;
ali 21 0 2.78
hasan 22 0 3.21
yasemin 21 1 3.52
run;
data thrdcls (rename=(tmp=grade) drop=grade);
set thrdcls;
tmp=input(grade,5.2);
run;
data frthcls;
input name$ age sex grade;
datalines;
mustafa 21 0 3.15
fatam 22 1 2.95
emre 21 0 2.25
selin 22 1 2.70
run;
data std;
set thrdcls frthcls;
run;
proc print; run;
4/22/2008
conditional processing ( if )
example(general if statement)
data family ;
infile datalines dlm=',' missover;
input fname :$20. rship age sex occup income :dollar5.;
format income dollar5.;
datalines;
kaya,1,50,0,1,$1500
kaya,2,45,1,2
kaya,3,30,0,1,$2000
sahin,1,60,0,2
sahin,2,50,1,1,$1750
sahin,3,15,1,0,
run;
proc print data=family; run;
data incomeytl;
set family ;
incomeytl=income*1.2;
run;
proc print data=incomeytl; run;
data job;
set family;
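/* no LENGTH statement here: status gets length 7 from 'working', */
/* so 'not working' is truncated to 'not wor' in the output */
if occup eq 1 then status='working';
else status='not working';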
run;
proc print data=job; run;
example(length statement)
data job;
set family;
length status $ 20;
if occup eq 1 then status='working';
else if occup=2 or occup=0 then status='not working';
run;
proc print data=job; run;
example(do and end)
data job_status;
set family;
length status $ 20;
if occup eq 1 then status='working';
else do;
status='not working';
income=0;
end;
run;
proc print data=job_status; run;
example (multiple if else statement)
data job_status;
set family;
length status $ 20;
if occup eq 1 then status='working';
else if occup=2 then do;
status='not working';
income=0;
end;
else if occup=0 then do;
status ='student';
income=0;
end;
run;
proc print data=job_status; run;
example (more than one output dataset)
data working not_working student;
set family;
if occup=0 then output student;
else if occup=1 then output working;
else if occup=2 then output not_working;
run;
proc print data=not_working; run;
4/20/2008
reading data that requires special instructions (formatted input)
SAS can read numeric data stored in special formats such as binary, packed decimal, and date/time values.
example: formatted input
data std;
input name$ 1-15 age 17-18 class 20 sex 22 grade comma5.2;
datalines;
ali sahin       21 4 0 3,11
hüseyin karatas 20 2 1 2,90
;
run;
proc print;
run;
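Note that in this example the values of the grade variable contain the character ','. Counting the leading blank, the field is 5 columns wide and the value has 2 decimals, so the informat is comma5.2.
If the value were 31,12, the field would be 6 columns wide and we would have to write comma6.2.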
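The comma6.2 case mentioned in the note can be sketched as follows. This is only a minimal illustration: the dataset name std_wide and the data rows are hypothetical, and the @23 pointer control is added here just to make the starting column of the grade field explicit.

```sas
data std_wide;
   /* grade occupies columns 23-28 (leading blank included), so the width is 6 */
   input name $ 1-15 age 17-18 class 20 sex 22 @23 grade comma6.2;
   datalines;
ali sahin       21 4 0 31,12
huseyin karatas 20 2 1  2,90
;
run;

proc print data=std_wide;
run;
```

The comma informat strips the ',' and, because the remaining digits contain no decimal point, divides by 10**2, so 31,12 should be stored as 31.12 and 2,90 as 2.90.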
example: formatted input
data std;
input name$ 1-15 age 17-18 class 20 sex 22 grade 24-27 birthdate ddmmyy11.;
datalines;
ali sahin       21 4 0 3.11 21/02/1986
hüseyin karatas 20 2 1 2.90 13/06/1987
;
run;
proc print;
run;
reading data that is aligned in columns (column input) inline
In column input, the input statement lists the variable names and specifies the column positions that identify the location of the corresponding data fields. You can use column input when your raw data is in fixed columns and does not require informats to be read.
example: column input
data std;
input name$ 1-15 age 17-18 class 20 sex 22 grade 24-27;
datalines;
ali sahin       21 4 0 3.1
hüseyin karatas 20 2 1 2.90
;
run;
proc print;
run;
reading unaligned data (list input) inline
example: comma delimited data
data std;
infile datalines dlm=',' ;
input name$ age class sex grade;
datalines;
sahin,21,4,0,3.1
dasd,20,2,1,2.90
;
run;
proc print;
run;
10/18/2007
Creating and Using Dummy Variables
A dummy variable is a numerical variable used in regression analysis to represent subgroups of the sample in your study. Dummy variables are often used to distinguish different treatment groups. They are useful because they let us use a single regression equation to represent multiple groups, so we don't need to write a separate equation for each subgroup. If we have "n" subgroups, we need (n-1) dummy variables; the subgroup without its own dummy variable serves as the reference group.
EX:
Consider this simple data file with 9 subjects (sub) in 3 groups (iv), each with a score (dv).
sas program :
data dummy;
input sub iv dv;
cards;
1 1 48
2 1 49
3 1 50
4 2 17
5 2 20
6 2 23
7 3 28
8 3 30
9 3 32
;
run;
data dummy2;
set dummy;
if (iv=1) then iv1=1; else iv1=0;
if (iv=2) then iv2=1; else iv2=0;
if (iv=3) then iv3=1; else iv3=0;
run;
proc reg data=dummy2;
model dv=iv1 iv2;
run;
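To see what the model statement above estimates, the regression can be written out by hand. The symbols below are notation introduced here for illustration and are not part of the original post; the numbers come from the group means of the nine scores in the data step above. Group 3 is the reference group because iv3 is left out of the model.

```latex
\begin{align*}
dv &= \beta_0 + \beta_1\, iv1 + \beta_2\, iv2 + \varepsilon \\[4pt]
\bar{y}_1 &= \tfrac{48+49+50}{3} = 49, \qquad
\bar{y}_2 = \tfrac{17+20+23}{3} = 20, \qquad
\bar{y}_3 = \tfrac{28+30+32}{3} = 30 \\[4pt]
\hat{\beta}_0 &= \bar{y}_3 = 30, \qquad
\hat{\beta}_1 = \bar{y}_1 - \bar{y}_3 = 19, \qquad
\hat{\beta}_2 = \bar{y}_2 - \bar{y}_3 = -10
\end{align*}
```

So the intercept is the mean of the reference group, and each dummy coefficient is the difference between that group's mean and the reference mean.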
simple anova example
EX: the response time in milliseconds was determined for 3 different types of circuits.
sas program:
data time;
input circuit $ Time @@;
cards;
A 9 A 12 A 10 A 9 A 15
B 20 B 21 B 23 B 17 B 30
C 6 C 5 C 8 C 16 C 7
;
run;
proc anova data=time;
class circuit ;
model time= circuit;
means circuit / duncan scheffe alpha= 0.01 tukey lines;
run;