
Sas Functions

The document outlines a series of SAS programming tasks involving data manipulation, including creating datasets, calculating averages and totals, formatting variables, and converting data types. Each question provides specific data examples and instructions for processing, such as removing characters, splitting variables, and handling missing values. The tasks cover a wide range of data handling techniques, including date formatting, temperature conversion, and survey score calculations.

Uploaded by

akash behera
Copyright: © All Rights Reserved

SAS FUNCTIONS

Q1.
Create dataset ds1 as follows.
Find the difference between the two dates sdate and edate.

data ds1;
length sdate edate $20.;
input sdate $ edate $;
datalines;
12/02/1919 14/02/2019
13/05/1997 15/03/2007
13/06/2013 12/01/2006
;
run;
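
One possible approach is sketched below. It assumes both strings are in day/month/year order (the values 13/05/1997 and 14/02/2019 rule out month/day/year) and that the difference is wanted in days. The data set name ds1_diff and the variables sdate_n, edate_n, and diff_days are illustrative names, not part of the exercise.

```sas
data ds1_diff;
  set ds1;
  /* Convert the character dates to SAS date values (assumed dd/mm/yyyy) */
  sdate_n = input(sdate, ddmmyy10.);
  edate_n = input(edate, ddmmyy10.);
  /* SAS dates are day counts, so subtraction gives the difference in days */
  diff_days = edate_n - sdate_n;
  format sdate_n edate_n ddmmyy10.;
run;
```

In the third observation edate precedes sdate, so diff_days is negative there.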

Q2.
Suppose you have data as follows:

Id wt1 wt2 wt3
01 23 45 32
02 45 55 29
03 76 78 79
04 87 67 45
05 12 32 23

Create a dataset named ‘weight’.
Calculate the average weight (avg_wt), the sum of the weights (total_wt), and
finally a variable named cp that combines both.
Mock output:

Obs  id  wt1  wt2  wt3  total_wt  avg_wt   cp
1    1   23   45   32   100       33.3333  33.33(100)
2    2   45   55   29   129       43.0000  43.00(129)
3    3   76   78   79   233       77.6667  77.67(233)
4    4   87   67   45   199       66.3333  66.33(199)
5    5   12   32   23   67        22.3333  22.33( 67)
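
A minimal sketch of one way to build these variables, assuming id is read as a numeric variable (as the mock output suggests) and that cp is a character string assembled with the PUT function. The widths 5.2 and 3. are choices made here to match the mock output.

```sas
data weight;
  input id wt1 wt2 wt3;
  total_wt = sum(of wt1-wt3);
  avg_wt   = mean(of wt1-wt3);
  length cp $10;
  /* PUT keeps the fixed width, so 67 prints as " 67" inside the parentheses */
  cp = put(avg_wt, 5.2) || '(' || put(total_wt, 3.) || ')';
  datalines;
01 23 45 32
02 45 55 29
03 76 78 79
04 87 67 45
05 12 32 23
;
run;
```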

Q3.
From the below data:
investigator contactno designation
A-B-C (020)234 manager
P-Q-R (050)598 analyst
X-Y-X (239)789 testing
R-S-T (001)998 manager
K-L-M (991)887 analyst

1. Remove the ‘-’ from the variable investigator.
2. Split contactno into two variables, code and number.
   For example: (020)234 gives code=020, number=234.
3. The designation should be in proper case.
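
The three steps can be sketched with COMPRESS, SCAN, and PROPCASE. The data set name contacts is a placeholder, and code and number are kept as character variables here so that leading zeros such as 020 survive; that choice is an assumption, not a requirement of the exercise.

```sas
data contacts;
  length code number $3;
  input investigator $ contactno $ designation $;
  /* 1. Drop every hyphen */
  investigator = compress(investigator, '-');
  /* 2. Treat both parentheses as delimiters: word 1 is the code, word 2 the number */
  code   = scan(contactno, 1, '()');
  number = scan(contactno, 2, '()');
  /* 3. Capitalize the first letter of each word */
  designation = propcase(designation);
  datalines;
A-B-C (020)234 manager
P-Q-R (050)598 analyst
X-Y-X (239)789 testing
R-S-T (001)998 manager
K-L-M (991)887 analyst
;
run;
```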

Q4.
From the company.sas7bdat data set, extract the first name and last name
from the variable ‘Name’ and display them in uppercase.
For the variable SSN, replace the ‘-’ separator with ‘/’.

Q5.
Data
A single line of temperature values
32F 42C 137F 84F 20C

Directions
Temperatures in degrees Celsius and degrees Fahrenheit are mixed
together in an input data stream. Celsius temperatures appear in
the form nnnC, where nnn is a 1 to 3 digit number. Fahrenheit
temperatures are entered as nnnF. Write a program to read the
sample data and store all temperatures in degrees Celsius. The
conversion from Fahrenheit to Celsius is:
C = (F - 32)*5/9
where C = temperature in degrees Celsius and F = temperature in
degrees Fahrenheit.
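
A sketch of one way to do this, assuming every value ends in a single unit letter (C or F, in either case). The names temps, raw, unit, value, and celsius are illustrative.

```sas
data temps;
  /* The trailing @@ holds the line so every value on it becomes an observation */
  input raw $ @@;
  /* Assumption: the last character is the unit, everything before it is the number */
  unit  = upcase(substr(raw, length(raw), 1));
  value = input(substr(raw, 1, length(raw) - 1), 8.);
  if unit = 'F' then celsius = (value - 32) * 5 / 9;
  else celsius = value;
  drop raw unit value;
  datalines;
32F 42C 137F 84F 20C
;
run;
```

With the sample line, 32F becomes 0 and the two Celsius readings pass through unchanged.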
Q6.
Data
SAS data set PACK created by running the following program
DATA PACK;
INPUT TEN $10.;
DATALINES;
0123456789
1 2 3 4 56
3428645889
;

Directions
You are given a SAS data set PACK that contains a ten-byte
character variable (TEN) where each byte is a numeral 0 to 9 or a
blank (representing a missing value). Write a program that will
read this data set and create a new data set called UNPACK that
contains ten numeric variables (X1-X10) from the ten-byte
character variable.
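
One way to unpack the bytes is an array plus SUBSTR, sketched below on the assumption that PACK has already been created by the program shown in the Data section. A blank byte read with the 1. informat yields a numeric missing value.

```sas
data unpack;
  set pack;
  array x[10] x1-x10;
  do i = 1 to 10;
    /* Take byte i of TEN and convert it; a blank becomes missing */
    x[i] = input(substr(ten, i, 1), 1.);
  end;
  drop i;
run;
```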
Q7.
Data
Five responses to a multiple choice test
12345
3 414
54321

Directions
Data values for five multiple choice questions were entered as
digits 1-5 (or blank) rather than the letters A-E. Write a
program that will read the digits (you may read them as character
values), convert them to the corresponding letters, and create a
SAS data set called CONVERT that contains five character
variables. For example, 1 would be converted to A, 2 to B, and
so on.
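
A minimal sketch using the TRANSLATE function, whose argument order is (source, to, from). The variable names q1-q5 are an assumption made here; the exercise does not name them.

```sas
data convert;
  /* Read five one-byte character values; a blank stays blank */
  input (q1-q5) ($1.);
  array q[5] $ q1-q5;
  do i = 1 to 5;
    /* Map 1->A, 2->B, ..., 5->E; blanks are left untouched */
    q[i] = translate(q[i], 'ABCDE', '12345');
  end;
  drop i;
  datalines;
12345
3 414
54321
;
run;
```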
Q8.
Data
Five responses to a multiple choice test
12345
aBcDe
xY73E
3 E w

Directions
The raw data consist of some numerals, some data already entered
as A-E, some data errors (numerals above 5 and letters other than
A-E), and a mixture of upper- and lowercase. Convert all letters
to uppercase and substitute the letters A through E for the
numerals 1 through 5. Convert any remaining data values (letters
other than A-E or numerals not in the range of 1 to 5) to a
character missing value. You may want to use an array to make
this program more compact.
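
The cleanup can be sketched with UPCASE, TRANSLATE, and VERIFY inside an array loop. The data set name clean and the variables q1-q5 are placeholders chosen for this example.

```sas
data clean;
  input (q1-q5) ($1.);
  array q[5] $ q1-q5;
  do i = 1 to 5;
    /* Uppercase first, then map the numerals 1-5 to A-E */
    q[i] = translate(upcase(q[i]), 'ABCDE', '12345');
    /* VERIFY is nonzero when the value is anything other than A-E */
    if verify(q[i], 'ABCDE') then q[i] = ' ';
  end;
  drop i;
  datalines;
12345
aBcDe
xY73E
3 E w
;
run;
```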
Q9.
Data
Raw data values for variables X, Y, and Z
X Y Z
-----------------
1 2 3
4 . 6
2.33 5 .
2.5 2.6 2.7
Directions
Given the data for variables X, Y, and Z, write a SAS DATA
step to create a SAS data set called XYZ which contains the
following new variables:
New Variable  Description
------------------------------------------------------------
ROUND_X       X rounded to the nearest tenth
LOG_X         The base e log of X
LOG_10X       The base 10 log of X
WHOLE_X       The integer part of X
SMALL         The smallest value of X, Y, or Z
BIG           The largest value of X, Y, or Z
AVE           The mean of the non-missing values of X, Y, and Z
SUM           The sum of the non-missing values of X, Y, and Z
NONMISS       The number of non-missing values of X, Y, and Z
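
Each new variable maps onto a built-in SAS function. The sketch below is one possible DATA step; the descriptive statistic functions (MIN, MAX, MEAN, SUM, N) ignore missing arguments, which is what the last five rows of the table ask for.

```sas
data xyz;
  input x y z;
  round_x = round(x, 0.1);  /* nearest tenth */
  log_x   = log(x);         /* natural log */
  log_10x = log10(x);
  whole_x = int(x);         /* integer part */
  small   = min(x, y, z);
  big     = max(x, y, z);
  ave     = mean(x, y, z);
  sum     = sum(x, y, z);
  nonmiss = n(x, y, z);
  datalines;
1 2 3
4 . 6
2.33 5 .
2.5 2.6 2.7
;
run;
```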

Q10.
Data
Month and cost figures as shown
MONTH COST
---------------
JAN 125
FEB 120
MAR 130
APR 100
MAY 140
JUN 180
JUL 200

Directions
Compute a moving average of COST using the value for the current
month and the value from the two previous months. For example,
the moving average of COST for March would be the mean of COST
for January, February, and March. For the first two months where
you don’t have two previous months of data, set the moving
average to missing.
Hint
The moving average for JAN and FEB should be missing. For MAR
it should be the mean of 125, 120, and 130. For APR it should
be the mean of 120, 130, and 100, and so on.
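
A sketch using the LAG functions. The names moving, cost1, cost2, and mov_ave are illustrative. LAG is called on every observation, outside any IF, because it returns values from its own queue of previous calls.

```sas
data moving;
  input month $ cost;
  /* Previous month and the month before that */
  cost1 = lag(cost);
  cost2 = lag2(cost);
  /* Only from the third observation on are both lags available */
  if _n_ >= 3 then mov_ave = mean(cost, cost1, cost2);
  drop cost1 cost2;
  datalines;
JAN 125
FEB 120
MAR 130
APR 100
MAY 140
JUN 180
JUL 200
;
run;
```

For MAR this gives (125 + 120 + 130) / 3 = 125.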

Q11.
Data
Sample survey data as follows
ID  QUES1  QUES2  QUES3  QUES4  QUES5  QUES6  QUES7  QUES8  QUES9  QUES10
--------------------------------------------------------------------------
1   3      4      3      2      .      5      5      4      4      3
2   .      .      .      2      1      1      1      2      1      2
3   5      4      5      3      3      4      5      4      .      5

Directions
A survey asking questions about the environment was administered
to some children. Compute an overall score by taking the mean of
the ten questions. However, the overall score should only be
computed if eight or more of the ten questions are answered.
Write a SAS DATA step to compute this overall score.
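
One way to express the rule is to count the answered questions with the N function before taking the mean. The names survey and score are assumptions made for this sketch.

```sas
data survey;
  input id ques1-ques10;
  /* N counts non-missing values; score stays missing when fewer than 8 are answered */
  if n(of ques1-ques10) >= 8 then score = mean(of ques1-ques10);
  datalines;
1 3 4 3 2 . 5 5 4 4 3
2 . . . 2 1 1 1 2 1 2
3 5 4 5 3 3 4 5 4 . 5
;
run;
```

ID 2 answered only seven questions, so its score is left missing.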

Q12.
Data
SAS data set CHAR created by running the following program
DATA CHAR;
INPUT AGE $ HEIGHT $ WEIGHT $;
DATALINES;
23 68 160
44 72 200
55 . 180
;
Directions
Given the data set CHAR, convert AGE, HEIGHT, and WEIGHT to
numeric values. If you would like a bit more of a challenge, give
the numeric variables in the new data set the same name as the
original names in data set CHAR.
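
A sketch of the "challenge" version, assuming CHAR has been created by the program in the Data section. The output name NUM and the temporary names c_age, c_height, and c_weight are illustrative. Renaming on the SET statement frees the original names for the numeric variables.

```sas
data num;
  /* Rename the character variables as they are read in */
  set char(rename=(age=c_age height=c_height weight=c_weight));
  age    = input(c_age, 8.);
  height = input(c_height, 8.);
  weight = input(c_weight, 8.);
  drop c_age c_height c_weight;
run;
```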
Q13.
Data
Student quiz grades as follows
STUDENT QUIZ1 QUIZ2 QUIZ3 QUIZ4 QUIZ5
----------------------------------------------
Baggett 4 2 6 2 3
Ginn 9 9 10 . 9
Cody 10 10 9 10 10
Smith . . 2 3 4

Directions
Being the nice teacher that you are, you decide to drop the
lowest of five quiz grades if a student took all five quizzes.
Write a program that will compute the mean quiz grade based on
this decision. If the student took fewer than five quizzes,
compute the mean of the non-missing quizzes.
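
The decision rule can be sketched as follows; quizzes and quiz_mean are names chosen for this example. Dropping the lowest of five grades is the same as subtracting the minimum from the sum and dividing by four.

```sas
data quizzes;
  input student $ quiz1-quiz5;
  if n(of quiz1-quiz5) = 5 then
    /* All five taken: remove the lowest grade, average the other four */
    quiz_mean = (sum(of quiz1-quiz5) - min(of quiz1-quiz5)) / 4;
  else
    /* Otherwise average whatever is present */
    quiz_mean = mean(of quiz1-quiz5);
  datalines;
Baggett 4 2 6 2 3
Ginn 9 9 10 . 9
Cody 10 10 9 10 10
Smith . . 2 3 4
;
run;
```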
Q14.
Data
Date values in a variety of styles

---------------------------------------------
102146 10/21/46 21OCT46 46294 211046 10211946
122596 12/25/96 25DEC96 96360 251296 12251996

Directions
You are given raw data which contains dates in a variety of
styles (MM/DD/YY, Julian, and so on) as in the Data section.
Write a program to read these dates and create a SAS data set
called IN_DATE. List the contents of IN_DATE, printing all the
dates in MM/DD/YY format. The date formats are

Variable  Starting Column  Date Format
--------------------------------------------
DATE1     1                MMDDYY
DATE2     8                MM/DD/YY
DATE3     17               DDMONYY
DATE4     25               YYNNN (Julian)
DATE5     31               DDMMYY
DATE6     38               MMDDYYYY
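
Each row of the table corresponds to a date informat and a column pointer. The sketch below is one possible reading; how the two-digit years resolve (1946 versus 2046, for instance) depends on the session's YEARCUTOFF option, which is assumed here to place them in the 1900s.

```sas
data in_date;
  input @1  date1 mmddyy6.
        @8  date2 mmddyy8.
        @17 date3 date7.
        @25 date4 julian5.
        @31 date5 ddmmyy6.
        @38 date6 mmddyy8.;
  format date1-date6 mmddyy8.;
  datalines;
102146 10/21/46 21OCT46 46294 211046 10211946
122596 12/25/96 25DEC96 96360 251296 12251996
;
run;

proc print data=in_date;
run;
```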
Q15.
Data
Month, day, and year data scattered about
---------------------------------------------
  17     09   1990 04 96
  30     11   1991 05 95
Directions
You inherited some interesting data which has date information
(month, day, and year) scattered about in different locations.
In addition, one of the date values only has year and month
(day is missing), and you want to create a SAS date using the
15th of the month in place of the missing day. The information
to create the first date (DATE1) is

Day    columns 3-4
Month  columns 10-11
Year   columns 15-18

For the second date (DATE2)

Day    is missing
Month  columns 20-21
Year   columns 23-24

Using the sample data, create a SAS data set called IN_DATE2,
containing the variables DATE1 and DATE2 as SAS date values,
and compute the number of years (rounded to the nearest
year) between DATE1 and DATE2. Print out the contents of
IN_DATE2 using the format WORDDATEw. for the variables DATE1
and DATE2.
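
A sketch using column pointers and the MDY function. The data lines are laid out here so that each field sits in the columns listed above, which is an assumption about the original spacing. The helper variables (day, month, year, month2, year2) and the result name years are illustrative, and the year difference is approximated as days divided by 365.25.

```sas
data in_date2;
  input @3  day    2.
        @10 month  2.
        @15 year   4.
        @20 month2 2.
        @23 year2  2.;
  date1 = mdy(month, day, year);
  /* Day is missing for the second date, so use the 15th */
  date2 = mdy(month2, 15, year2);
  /* Approximate number of years, rounded to the nearest year */
  years = round((date2 - date1) / 365.25);
  format date1 date2 worddate18.;
  drop day month year month2 year2;
  datalines;
  17     09   1990 04 96
  30     11   1991 05 95
;
run;

proc print data=in_date2;
run;
```

The two-digit year in DATE2 is resolved by the YEARCUTOFF option, assumed here to give 1996 and 1995.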

Q16.
Data set Names_And_More contains a character variable called
Height.

Name              Phone           Height      Mixed
Roger Cody        (908)782-1234   5ft. 10in.  50 1/8
Thomas Jefferson  (315) 848-8484  6ft. 1in.   23 1/2
Marco Polo        (800)123-4567   5Ft. 6in.   40
Brian Watson      (518)355-1766   5ft. 10in   89 3/4
Michael DeMarco   (445)232-2233   6ft.        76 1/3

A.
As you can see, the heights are in feet and inches. Assume
that these units can be in upper- or lowercase and there may or
may not be a period following the units. Create a temporary SAS
data set (Height) that contains a numeric variable (HtInches)
that is the height in inches.
Note: One of the Height values is missing an inches value. Be
sure that there are no character-to-numeric notes in the SAS
log.
Hints: You can use the KD modifiers of the COMPRESS function to
keep the digits and use a single blank in the second argument to
keep blanks. You can then use the SCAN function to extract the
feet and inches values.
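
Following the hints, one possible sketch is shown below. To keep it self-contained it builds a small stand-in for Names_And_More holding only the Height values from the listing; the helper variables digits, feet, and inches are illustrative. A missing inches value is treated as 0, which is an assumption.

```sas
/* Stand-in for Names_And_More with just the Height column */
data names_and_more;
  input Height $12.;
  datalines;
5ft. 10in.
6ft. 1in.
5Ft. 6in.
5ft. 10in
6ft.
;

data height;
  set names_and_more;
  length digits $12;
  /* Keep digits (KD) plus the blank given in the second argument */
  digits = compress(Height, ' ', 'kd');
  /* Explicit INPUT calls avoid character-to-numeric notes in the log */
  feet   = input(scan(digits, 1, ' '), 8.);
  inches = input(scan(digits, 2, ' '), 8.);
  HtInches = 12 * feet + sum(inches, 0);
  drop digits feet inches;
run;
```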
B.
Data set Names_And_More contains a character variable called
Mixed that is either an integer or a mixed number (such as 50
1/8). Using this data set, create a new, temporary SAS data
set with a numeric variable (Price) that has a decimal value
equal to the variable Mixed. This number should be rounded to
the nearest .001.

E.g.: 50 1/8 can be converted as ((50*8)+1)/8 = 50.125
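
A sketch of the conversion using SCAN with both blank and slash as delimiters. The stand-in data set mixed_values, the output name prices, and the helpers whole, numer, and denom are assumptions made so the example runs on its own.

```sas
/* Stand-in holding only the Mixed values from the listing */
data mixed_values;
  input Mixed $10.;
  datalines;
50 1/8
23 1/2
40
89 3/4
76 1/3
;

data prices;
  set mixed_values;
  /* Words are separated by a blank or a slash: whole, numerator, denominator */
  whole = input(scan(Mixed, 1, ' /'), 8.);
  numer = input(scan(Mixed, 2, ' /'), 8.);
  denom = input(scan(Mixed, 3, ' /'), 8.);
  if missing(numer) then Price = whole;
  else Price = round(whole + numer / denom, 0.001);
  drop whole numer denom;
run;
```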

Q17.
Data set Study (shown here) contains the character variables
Group and Dose. Create a new variable called GroupDose by
putting these two values together, separated by a hyphen.
Make sure that there are no blanks (except trailing blanks)
in this value.

Data Set Study

Subj  Group  Dose  Weight   Subgroup
001   A      Low   220lbs.  2
002   A      High  90Kg.    1
003   B      Low   88kg     1
004   B      High  165lbs.  2
005   A      Low   88kG     1
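
The CATX function fits this task, since it strips leading and trailing blanks from each argument before joining them with the separator. The sketch below re-creates Study from the listing so it can run on its own; the length of 10 for GroupDose is a choice made here.

```sas
data study;
  input Subj $ Group $ Dose $ Weight $ Subgroup;
  datalines;
001 A Low 220lbs. 2
002 A High 90Kg. 1
003 B Low 88kg 1
004 B High 165lbs. 2
005 A Low 88kG 1
;

data study_gd;
  set study;
  length GroupDose $10;
  /* CATX trims each piece, then joins with the hyphen: A-Low, A-High, ... */
  GroupDose = catx('-', Group, Dose);
run;
```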

Q18.
Data set Errors contains character variables Subj (3 bytes)
and PartNumber (8 bytes). (See the partial listing here.)
Create a temporary SAS data set (Check1) containing any
observation in Errors that violates either of the following
two rules:
1. Subj should contain only digits.
2. PartNumber should contain only the uppercase letters L and S
   and digits.

Subj  PartNumber  Name
001   L1232       Nichole Brown
0a2   L887X       Fred Beans
003   12321       Alfred 2 Nice
004   abcde       Mary Bumpers
X89   8888S       Gill Sandford
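
Both rules can be checked with NOTDIGIT and VERIFY, each of which returns the position of the first offending character (0 when there is none). The sketch re-creates only Subj and PartNumber from the listing; TRIM keeps trailing blanks from being counted as violations.

```sas
data errors;
  input Subj $ 1-3 PartNumber $ 5-12;
  datalines;
001 L1232
0a2 L887X
003 12321
004 abcde
X89 8888S
;

data check1;
  set errors;
  /* Keep the observation when either rule is violated */
  if notdigit(trim(Subj)) or verify(trim(PartNumber), 'LS0123456789');
run;
```

With the listed rows, subjects 0a2, 004, and X89 end up in Check1.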

Q19.
You have several lines of data, consisting of a subject number
and two dates (date of birth and visit date). The subject
number starts in column 1 (and is 3 bytes long), the date of
birth starts in column 4 and is in the form month-day-year,
and the visit date starts in column 14 and is in the form of a
two-digit day, a three-character month abbreviation, followed
by a four-digit year (see sample lines below). Read the
following lines of data to create a temporary SAS data set
called Dates. Format both dates using the DATE9. format.
Include the subject’s age at the time of the visit in this
data set.

00110/21/195011Nov2006
00201/02/195525May2005
00312/25/200525Dec2006
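
A sketch using column pointers and date informats. It assumes the date of birth occupies columns 4-13 in mm/dd/yyyy form, which is what makes the visit date start in column 14. The variable names Subj, DOB, VisitDate, and Age are illustrative, and age is computed with YRDIF using the 'ACTUAL' basis as one reasonable choice.

```sas
data dates;
  input @1  Subj      $3.
        @4  DOB       mmddyy10.
        @14 VisitDate date9.;
  /* Age in years (with fraction) at the time of the visit */
  Age = yrdif(DOB, VisitDate, 'ACTUAL');
  format DOB VisitDate date9.;
  datalines;
00110/21/195011Nov2006
00201/02/195525May2005
00312/25/200525Dec2006
;
run;
```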

Q20.
Using the following lines of data, create a temporary SAS data
set called ThreeDates. Each line of data contains three dates,
the first two in the form mm/dd/yyyy and the last in the form
of a two digit day, a three-character month abbreviation,
followed by a four-digit year. Name the three date variables
Date1, Date2, and Date3. Format all three using the MMDDYY10.
format. Include in your data set the number of years from
Date1 to Date2 (call it Year12) and the number of years from
Date2 to Date3 (call it Year23). Round these values to the
nearest year. Here are the lines of data (note that the
columns do not line up):

01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005
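
Because the columns do not line up, list input with the colon modifier is a natural fit: each informat is applied to the next blank-delimited value. The sketch below uses YRDIF with the 'ACTUAL' basis for the year counts, which is an assumption about how "number of years" should be measured.

```sas
data threedates;
  /* The colon lets an informat be used with list (blank-delimited) input */
  input Date1 : mmddyy10.
        Date2 : mmddyy10.
        Date3 : date9.;
  Year12 = round(yrdif(Date1, Date2, 'ACTUAL'));
  Year23 = round(yrdif(Date2, Date3, 'ACTUAL'));
  format Date1 Date2 Date3 mmddyy10.;
  datalines;
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005
;
run;
```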

Q21.
A.
Using the values for Day, Month, and Year in the raw data
below, create a temporary SAS data set containing a SAS date
based on these values (call it Date) and format this value
using the MMDDYY10. format. Here are the Day, Month, and Year
values:

25 12 2005
1 1 1960
21 10 1946

B.
If there is a missing value for the day, substitute
the 15th of the month.
25 12 2005
. 5 2002
12 8 2006
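
Both parts can be handled by one DATA step built around the MDY function, sketched here with the part B data. The data set name dates21 is a placeholder. For part A the same step applies with the first set of values, where the missing-day branch simply never fires.

```sas
data dates21;
  input Day Month Year;
  /* Part B: fall back to the 15th when the day is missing */
  if missing(Day) then Date = mdy(Month, 15, Year);
  else Date = mdy(Month, Day, Year);
  format Date mmddyy10.;
  datalines;
25 12 2005
. 5 2002
12 8 2006
;
run;
```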
