SAS Functions
Q1.
Create dataset ds1 as follows.
Find the difference between the two dates sdate and edate.
data ds1;
length sdate edate $20.;
input sdate $ edate $;
datalines;
12/02/1919 14/02/2019
13/05/1997 15/03/2007
13/06/2013 12/01/2006
;
run;
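One possible approach, as a sketch: since sdate and edate are read as character values in day/month/year order, they can be converted with the DDMMYY10. informat and then subtracted. The names ds1_diff, sdate_n, edate_n, diff_days, and diff_years are illustrative and not part of the exercise.

```sas
/* Sketch only: convert the character dates, then subtract. */
data ds1_diff;
   set ds1;
   sdate_n = input(sdate, ddmmyy10.);              /* character -> SAS date */
   edate_n = input(edate, ddmmyy10.);
   diff_days  = edate_n - sdate_n;                 /* difference in days */
   diff_years = intck('year', sdate_n, edate_n);   /* year boundaries crossed */
   format sdate_n edate_n ddmmyy10.;
run;
```

In the third observation edate is earlier than sdate, so both differences come out negative there.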
Q3.
From the data below:
investigator contactno designation
A-B-C (020)234 manager
P-Q-R (050)598 analyst
X-Y-X (239)789 testing
R-S-T (001)998 manager
K-L-M (991)887 analyst
Q4.
From the data set company.sas7bdat, extract the first name and
last name from the variable ‘Name’ and display them in uppercase.
In the variable SSN, replace the ‘-’ separator with ‘/’.
Q5.
Data
A single line of temperature values
32F 42C 137F 84F 20C
Directions
Temperatures in degrees Celsius and degrees Fahrenheit are
mixed together in an input data stream. Celsius temperatures
appear in the form nnnC, where nnn is a 1 to 3 digit number.
Fahrenheit temperatures are entered as nnnF. Write a program
to read the sample data and store all temperatures in degrees
Celsius. The conversion from Fahrenheit to Celsius is:
C = (F - 32)*5/9
where C = temperature in degrees Celsius and
F = temperature in degrees Fahrenheit.
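A minimal sketch of one way to apply the conversion: read each token as character, take the last character as the unit, keep only the digits for the value, and convert Fahrenheit readings with the formula above. The data set name temps and the variables temp, unit, value, and celsius are assumptions made for illustration.

```sas
/* Sketch only: one temperature token per observation. */
data temps;
   input temp $ @@;                                  /* e.g. 32F or 42C; @@ holds the line */
   unit  = upcase(substr(temp, length(temp), 1));    /* last character: C or F */
   value = input(compress(temp, , 'kd'), 8.);        /* keep digits, convert to numeric */
   if unit = 'F' then celsius = (value - 32) * 5 / 9;
   else if unit = 'C' then celsius = value;
   drop unit value;
datalines;
32F 42C 137F 84F 20C
;
run;
```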
Q6.
Data
SAS data set PACK created by running the following program
DATA PACK;
INPUT TEN $10.;
DATALINES;
0123456789
1 2 3 4 56
3428645889
;
Directions
You are given a SAS data set PACK that contains a ten-byte
character variable (TEN) where each byte is a numeral 0 to 9
or a blank (representing a missing value). Write a program
that will read this data set and create a new data set called
UNPACK that contains ten numeric variables (X1-X10) from the
ten-byte character variable.
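One way to approach this, sketched under the assumption that PACK was created by the program above: loop over the ten bytes with SUBSTR and convert each one with the INPUT function. The loop index i is an illustrative name.

```sas
/* Sketch only: one numeric variable per byte of TEN. */
data unpack;
   set pack;
   array x[10] x1-x10;
   do i = 1 to 10;
      x[i] = input(substr(ten, i, 1), 1.);   /* a blank byte gives a numeric missing value */
   end;
   drop i;
run;
```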
Q7.
Data
Five responses to a multiple choice test
12345
3 414
54321
Directions
Data values for five multiple choice questions were entered as
digits 1-5 (or blank) rather than the letters A-E. Write a
program that will read the digits (you may read them as
character values), convert them to the corresponding letters,
and create a SAS data set called CONVERT that contains five
character variables. For example, 1 would be converted to A,
2 to B, and so on.
Q8.
Data
Five responses to a multiple choice test
12345
aBcDe
xY73E
3 E w
Directions
The raw data consist of some numerals, some data already
entered as A-E, some data errors (numerals above 5 and letters
other than A-E), and a mixture of upper- and lowercase. Convert
all letters to uppercase and substitute the letters A through
E for the numerals 1 through 5. Convert any remaining data
values (letters other than A-E or numerals not in the range of
1 to 5) to a character missing value. You may want to use an
array to make this program more compact.
Q9.
Data
Raw data values for variables X, Y, and Z
X Y Z
-----------------
1 2 3
4 . 6
2.33 5 .
2.5 2.6 2.7
Directions
Given the data for variables X, Y, and Z, write a SAS DATA
step to create a SAS data set called XYZ which contains the
following new variables:
New Variable   Description
------------------------------------------------------------
ROUND_X        X rounded to the nearest tenth
LOG_X          The base e log of X
LOG_10X        The base 10 log of X
WHOLE_X        The integer part of X
SMALL          The smallest value of X, Y, or Z
BIG            The largest value of X, Y, or Z
AVE            The mean of the non-missing values of X, Y, and Z
SUM            The sum of the non-missing values of X, Y, and Z
NONMISS        The number of non-missing values of X, Y, and Z
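Each row of the table maps to a standard SAS function. The following is a sketch of one possible DATA step, not the only valid solution; the descriptive functions (MIN, MAX, MEAN, SUM, N) ignore missing arguments, which is what the table asks for.

```sas
/* Sketch only: one function call per requested variable. */
data xyz;
   input x y z;
   round_x = round(x, 0.1);   /* nearest tenth */
   log_x   = log(x);          /* natural (base e) log */
   log_10x = log10(x);        /* base 10 log */
   whole_x = int(x);          /* integer part */
   small   = min(x, y, z);
   big     = max(x, y, z);
   ave     = mean(x, y, z);
   sum     = sum(x, y, z);
   nonmiss = n(x, y, z);      /* count of non-missing values */
datalines;
1 2 3
4 . 6
2.33 5 .
2.5 2.6 2.7
;
run;
```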
Q10.
Data
Month and cost figures as shown
MONTH COST
---------------
JAN 125
FEB 120
MAR 130
APR 100
MAY 140
JUN 180
JUL 200
Directions
Compute a moving average of COST using the value for the
current month and the value from the two previous months. For
example, the moving average of COST for March would be the
mean of COST for January, February, and March. For the first
two months where you don’t have two previous months of data,
set the moving average to missing.
Hint
The moving average for JAN and FEB should be missing. For MAR
it should be the mean of 125, 120, and 130. For APR it should
be the mean of 120, 130, and 100, and so on.
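A minimal sketch using the LAG and LAG2 functions to carry the two previous values forward. The data set name moving and the variables cost_1, cost_2, and mov_ave are illustrative. The LAG calls are placed outside any IF so that they execute on every observation.

```sas
/* Sketch only: three-month moving average of COST. */
data moving;
   input month $ cost;
   cost_1 = lag(cost);     /* previous month's cost */
   cost_2 = lag2(cost);    /* cost from two months back */
   if _n_ >= 3 then mov_ave = mean(cost, cost_1, cost_2);   /* stays missing for JAN, FEB */
   drop cost_1 cost_2;
datalines;
JAN 125
FEB 120
MAR 130
APR 100
MAY 140
JUN 180
JUL 200
;
run;
```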
Q11.
Data
Sample survey data as follows
ID  QUES1  QUES2  QUES3  QUES4  QUES5  QUES6  QUES7  QUES8  QUES9  QUES10
-------------------------------------------------------------------------
1   3      4      3      2      .      5      5      4      4      3
2   .      .      .      2      1      1      1      2      1      2
3   5      4      5      3      3      4      5      4      .      5
Directions
A survey asking questions about the environment was
administered to some children. Compute an overall score by
taking the mean of the ten questions. However, the overall
score should only be computed if eight or more of the ten
questions are answered. Write a SAS DATA step to compute this
overall score.
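One way to express the "eight or more answered" rule, as a sketch: the N function counts the non-missing responses, and MEAN is only computed when that count is at least 8. The names survey and score are assumptions.

```sas
/* Sketch only: score is left missing when fewer than 8 items are answered. */
data survey;
   input id ques1-ques10;
   if n(of ques1-ques10) >= 8 then score = mean(of ques1-ques10);
datalines;
1 3 4 3 2 . 5 5 4 4 3
2 . . . 2 1 1 1 2 1 2
3 5 4 5 3 3 4 5 4 . 5
;
run;
```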
Q12.
Data
SAS data set CHAR created by running the following program
DATA CHAR;
INPUT AGE $ HEIGHT $ WEIGHT $;
DATALINES;
23 68 160
44 72 200
55 . 180
;
Directions
Given the data set CHAR, convert AGE, HEIGHT, and WEIGHT to
numeric values. If you would like a bit more of a challenge,
give the numeric variables in the new data set the same name
as the original names in data set CHAR.
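A sketch of the "same name" variant: rename the character variables as they are read, create numeric variables under the original names with the INPUT function, then drop the renamed copies. The data set name num and the c_ prefixed names are illustrative.

```sas
/* Sketch only: character-to-numeric conversion that keeps the original names. */
data num;
   set char (rename=(age=c_age height=c_height weight=c_weight));
   age    = input(c_age, 8.);
   height = input(c_height, 8.);
   weight = input(c_weight, 8.);
   drop c_age c_height c_weight;
run;
```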
Q13.
Data
Student quiz grades as follows
STUDENT QUIZ1 QUIZ2 QUIZ3 QUIZ4 QUIZ5
----------------------------------------------
Baggett 4 2 6 2 3
Ginn 9 9 10 . 9
Cody 10 10 9 10 10
Smith . . 2 3 4
Directions
Being the nice teacher that you are, you decide to drop the
lowest of five quiz grades if a student took all five quizzes.
Write a program that will compute the mean quiz grade based on
this decision. If the student took fewer than five quizzes,
compute the mean of the non-missing quizzes.
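The rule can be sketched as follows: when all five grades are present, subtract the minimum from the sum and divide by 4; otherwise fall back to the ordinary mean of the non-missing grades. The names quiz and quiz_ave are assumptions.

```sas
/* Sketch only: drop the lowest grade when all five quizzes were taken. */
data quiz;
   input student $ quiz1-quiz5;
   if n(of quiz1-quiz5) = 5 then
      quiz_ave = (sum(of quiz1-quiz5) - min(of quiz1-quiz5)) / 4;
   else quiz_ave = mean(of quiz1-quiz5);
datalines;
Baggett 4 2 6 2 3
Ginn 9 9 10 . 9
Cody 10 10 9 10 10
Smith . . 2 3 4
;
run;
```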
Q14.
Data
Date values in a variety of styles
---------------------------------------------
102146 10/21/46 21OCT46 46294 211046 10211946
122596 12/25/96 25DEC96 96360 251296 12251996
Directions
You are given raw data which contains dates in a variety of
styles (MM/DD/YY, Julian, and so on) as in the Data section.
Write a program to read these dates and create a SAS data set
called IN_DATE. List the contents of IN_DATE, printing all the
dates in MM/DD/YY format. The date formats are
Variable   Starting Column   Date Format
--------------------------------------------
DATE1      1                 MMDDYY
DATE2      8                 MM/DD/YY
DATE3      17                DDMONYY
DATE4      25                YYNNN (Julian)
DATE5      31                DDMMYY
DATE6      38                MMDDYYYY
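A sketch of how the table could translate into column pointers and date informats. The informat chosen for each column is my reading of the table, and how two-digit years are resolved (1946 versus 2046) depends on the session's YEARCUTOFF option, which is assumed here to place them in the 1900s.

```sas
/* Sketch only: one informat per date style, all printed as MM/DD/YY. */
data in_date;
   input @1  date1 mmddyy6.
         @8  date2 mmddyy8.
         @17 date3 date7.
         @25 date4 julian5.
         @31 date5 ddmmyy6.
         @38 date6 mmddyy8.;
   format date1-date6 mmddyy8.;
datalines;
102146 10/21/46 21OCT46 46294 211046 10211946
122596 12/25/96 25DEC96 96360 251296 12251996
;
run;

proc print data=in_date;
run;
```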
Q15.
Data
Month, day, and year data scattered about
---------------------------------------------
17 09 1990 04 96
30 11 1991 05 95
Directions
You inherited some interesting data which has date information
(month, day, and year) scattered about in different locations.
In addition, one of the date values only has year and month
(day is missing), and you want to create a SAS date using the
15th of the month in place of the missing day. The information
to create the first date (DATE1) is
Using the sample data, create a SAS data set called IN_DATE2,
containing the variables DATE1 and DATE2 as SAS date values,
and compute the number of years (rounded to the nearest
year) between DATE1 and DATE2. Print out the contents of
IN_DATE2 using the format WORDDATEw. for the variables DATE1
and DATE2.
Q16.
Data set Names_And_More contains a character variable called
Height.
A.
As you can see, the heights are in feet and inches. Assume
that these units can be in upper- or lowercase and there may
or may not be a period following the units. Create a temporary
SAS data set (Height) that contains a numeric variable
(HtInches) that is the height in inches.
Note: One of the Height values is missing an inches value. Be
sure that there are no character-to-numeric notes in the SAS
log.
Hints: You can use the KD modifiers of the COMPRESS function
to keep the digits and use a single blank in the second
argument to keep blanks. You can then use the SCAN function to
extract the feet and inches values.
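The hint can be sketched as follows, assuming Names_And_More is available in the WORK library and that Height holds values with the feet figure first and the inches figure second. The intermediate names digits, feet, and inches are illustrative, and treating a missing inches value as zero is an assumption about the intended result.

```sas
/* Sketch only: keep digits and blanks, then pick out feet and inches. */
data height;
   set names_and_more (keep=height);           /* assumed to exist */
   digits = compress(height, ' ', 'kd');       /* k = keep, d = digits; blank kept too */
   feet   = input(scan(digits, 1, ' '), ?? 8.);   /* explicit conversion, no log notes */
   inches = input(scan(digits, 2, ' '), ?? 8.);
   if not missing(feet) then
      HtInches = 12 * feet + sum(inches, 0);   /* missing inches treated as 0 */
   drop digits feet inches;
run;
```

Using the INPUT function for every conversion is what keeps character-to-numeric notes out of the log.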
B.
Data set Names_And_More contains a character variable called
Mixed that is either an integer or a mixed number (such as 50
1/8). Using this data set, create a new, temporary SAS data
set with a numeric variable (Price) that has a decimal value
equal to the variable Mixed. This number should be rounded to
the nearest .001.
Q17.
Data set Study (shown here) contains the character variables
Group and Dose. Create a new variable called GroupDose by
putting these two values together, separated by a hyphen.
Make sure that there are no blanks (except trailing blanks)
in this value.
Q18.
Data set Errors contains character variables Subj (3 bytes)
and PartNumber (8 bytes). (See the partial listing here.)
Create a temporary SAS data set (Check1) containing any
observation in Errors that violates either of the following
two rules:
1. Subj should contain only digits.
2. PartNumber should contain only the uppercase letters L and S
and digits.
Subj   PartNumber   Name
001    L1232        Nichole Brown
0a2    L887X        Fred Beans
003    12321        Alfred 2 Nice
004    abcde        Mary Bumpers
X89    8888S        Gill Sandford
Q19.
You have several lines of data, consisting of a subject number
and two dates (date of birth and visit date). The subject
number starts in column 1 (and is 3 bytes long), the date of
birth starts in column 4 and is in the form month-day-year,
and the visit date starts in column 14 and is in the form of a
two-digit day, a three-character month abbreviation, followed
by a four-digit year (see sample lines below). Read the
following lines of data to create a temporary SAS data set
called Dates. Format both dates using the DATE9. format.
Include the subject’s age at the time of the visit in this
data set.
0011021195011Nov2006
0020102195525May2005
0031225200525Dec2006
Q20.
Using the following lines of data, create a temporary SAS data
set called ThreeDates. Each line of data contains three dates,
the first two in the form mm/dd/yyyy and the last in the form
of a two-digit day, a three-character month abbreviation,
followed by a four-digit year. Name the three date variables
Date1, Date2, and Date3. Format all three using the MMDDYY10.
format. Include in your data set the number of years from
Date1 to Date2 (call it Year12) and the number of years from
Date2 to Date3 (call it Year23). Round these values to the
nearest year. Here are the lines of data (note that the
columns do not line up):
Q21.
A.
Using the values for Day, Month, and Year in the raw data
below, create a temporary SAS data set containing a SAS date
based on these values (call it Date) and format this value
using the MMDDYY10. format. Here are the Day, Month, and Year
values:
25 12 2005
1 1 1960
21 10 1946
B.
If there is a missing value for the day, substitute
the 15th of the month.
25 12 2005
. 5 2002
12 8 2006
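A minimal sketch for part B: the MDY function builds the date, and COALESCE supplies 15 whenever Day is missing. The data set name dates_b is an assumption; the same step handles part A, since COALESCE leaves non-missing days unchanged.

```sas
/* Sketch only: substitute the 15th when the day is missing. */
data dates_b;
   input day month year;
   date = mdy(month, coalesce(day, 15), year);
   format date mmddyy10.;
datalines;
25 12 2005
. 5 2002
12 8 2006
;
run;
```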