SAS Interview Questions You'll Most Likely Be Asked
About this ebook
· 645 SAS Interview Questions
· 113 HR Interview Questions
SAS Interview Questions You'll Most Likely Be Asked - Vibrant Publishers
1: How do you achieve scalability in SAS programming?
Answer:
SAS program scalability can be achieved in two ways: by scaling up and by scaling out. Scalability means ensuring the lowest time to solution, especially for the most vital tasks. Typically, when you want to speed up task completion, you either run multiple processes at the same time or distribute the task across several processors and process in parallel. This sometimes involves overlapping certain processes. Scaling up requires better hardware that is capable of multiprocessing, known as symmetric multiprocessing (SMP). Scaling out requires more servers that can handle distributed processing.
2: How do SQL views improve efficiency?
Answer:
A View typically consists of a subset of the entire table and hence is more efficient as it accesses a smaller set of data which is required. View also lets you hide the sensitive columns and complex queries from the user by choosing only what needs to be shown. Views always fetch fresh data from the table as they do not store any data.
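As a minimal sketch of the points above, the following PROC SQL step defines a view that exposes only some columns and rows of a table. It assumes the sample table sashelp.class that ships with SAS; the view name work.class_public is hypothetical and chosen only for illustration.

```sas
/* Sketch only: sashelp.class is a SAS sample table;        */
/* work.class_public is a hypothetical view name.           */
proc sql;
    /* The view stores the query, not the data */
    create view work.class_public as
        select name, sex, age          /* height and weight stay hidden */
        from sashelp.class
        where age >= 13;               /* only a subset of the rows */
quit;

/* Each reference to the view reads the current contents of sashelp.class */
proc print data=work.class_public;
run;
```

Because the view holds no data of its own, any change to the underlying table is visible the next time the view is referenced.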
3: What do you know about the SPD Engine?
Answer:
The SPD Engine, or SAS Scalable Performance Data Engine, was developed for SAS 9 to speed up the processing of large data sets by splitting them into smaller physical files called partitions. Several parallel processors have exclusive access to each partition and process the partitions in parallel using threads. Partitions are created when the SAS data sets are created. When a WHERE clause is specified, it is split across the partitions and processed in parallel. Data blocks are also read in parallel. Multiple connections are created based on the partitions, which further reduces I/O bottlenecks. The SPD Engine also performs an implicit sort if the query contains a BY clause.
4: What resources are used to run a SAS program?
Answer:
The six resources used to run a SAS program are:
a) Programmer time – The amount of time taken by the programmer for writing, testing and maintaining the program
b) Real time – The time elapsed while executing a job
c) CPU time – The amount of time the CPU takes to perform a task. The task can be reading data, writing data, calculations or implementation of a logic
d) Memory – The work area memory space used for holding executable programs, data, etc
e) Data storage space – The disk space for storing the data. This is measured in terms of bytes, kilobytes, gigabytes etc.
f) I/O – The read and write operations performed to move data from the memory to any output device, and vice versa
5: List the factors that need to be considered while assessing the technical environment.
Answer:
The four factors that need to be considered while assessing a technical environment are:
a) Hardware – Available memory, number of CPUs, number of devices connected, network bandwidth, I/O bandwidth, and capability to upgrade
b) Operating environment – The resource allocation and I/O methods
c) System load – The number of users sharing the system, the network traffic, and the predicted increase in load
d) SAS environment – All SAS software products installed, number of CPUs, and memory allocated for SAS programming
6: Explain the functionality of the system option STIMER in the Windows environment.
Answer:
STIMER option in the Windows environment specifies that CPU time and real time statistics are tracked and written to the SAS log throughout the SAS session.
Example: The following line of code turns on the STIMER option.
options stimer;
7: What is the function of the option FULLSTIMER in the Windows operating environment?
Answer:
The FULLSTIMER option in the Windows environment specifies that all the available resource usage statistics are tracked and written to the SAS log throughout the SAS session.
Example:
options fullstimer;
8: Explain the MEMRPT option.
Answer:
The MEMRPT option in the z/OS environment specifies that the memory usage statistics are tracked and written to the SAS log throughout the SAS session. This is not available as a separate option in the Windows operating environment.
9: While benchmarking the programming techniques in SAS, why is it necessary to execute each programming technique in separate sessions?
Answer:
Each programming technique should be executed in a separate SAS session while benchmarking. The first time a program is read, the operating system might have to load the code into memory; on later references it can retrieve the code from the cache, which takes less time. The resource usage necessary to perform the initial load is referred to as overhead. Using separate sessions minimizes the effect of overhead on the resource statistics.
10: While doing benchmark tests, when is it advisable to run the code for each programming technique several times?
Answer:
It is advisable to run the code for each programming technique several times if the system is executing other jobs at the same time. Running the code several times evens out the variations in resource consumption caused by the other jobs, so that the average resource usage of the task can be determined.
11: How do you turn off the FULLSTIMER option?
Answer:
The FULLSTIMER option can be turned off with the following line of code.
options nofullstimer;
12: What steps can be taken to reduce the programmer time?
Answer:
Programmer time is the amount of time required for the programmer to determine the specifications, write, submit, test and maintain the program. It is difficult to calculate the exact time, but it can be reduced by the use of well-documented programming practices and reuse of SAS code modules.
*****
Memory Usage
13: What is PDV? How does it work?
Answer:
The PDV, or Program Data Vector, is a logical area of memory created during compilation, after the input buffer is created. SAS builds the data set in the PDV one observation at a time. Two automatic variables, _N_ and _ERROR_, are created by SAS during compilation. These variables are used for processing but are never written to the data set.
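The behaviour of the automatic variables can be observed with a small DATA step. This is a minimal sketch; the data set name work.pdv_demo and the sample values are hypothetical.

```sas
/* Sketch only: work.pdv_demo and its values are hypothetical */
data work.pdv_demo;
    input name $ score;
    /* Write the automatic variables and the current PDV values to the log */
    put _n_= _error_= name= score=;
    datalines;
Ann 90
Bob 85
;
run;

/* The variable list shows only name and score: */
/* _N_ and _ERROR_ are not written to the data set */
proc contents data=work.pdv_demo;
run;
```

The log shows _N_ increasing by one for each iteration of the DATA step, while the PROC CONTENTS output lists only the two variables that were read.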
14: How would you choose between DATA step and PROC SQL?
Answer:
With small data sets, PROC SQL works better since it loads the entire data set into the memory and works with the data. So there’s less need to go back and forth into the database. But with large data sets DATA step will work better as loading the entire data set with PROC SQL will block a huge chunk of memory. DATA step will always take one record at a time and hence, the number of records or large volume of data will not matter as long as the database connectivity remains good.
15: Explain memory management in SAS.
Answer:
SAS, unlike Java and .NET, does not have garbage collection for memory management. Instead, it manages memory through units of work called steps. Memory is allocated when a step begins and released when the step completes, so no memory is left loosely allocated at run time. When dealing with large volumes of data, there may be cases when ample memory is not available. In such cases, SAS issues an error message stating that memory is not available, which is logged for reference. Hash objects in SAS let you handle a considerable number of objects quickly. The DATA step is also efficient in memory management as it takes only one record at a time. Most SAS programs depend on a work area to store objects temporarily; this area can run out of space, so it needs to be managed efficiently.
16: What is the sequence of actions performed in the background while trying to create a data set from another data set?
Answer:
While creating a data set from another data set the following actions take place in the background:
a) The data gets copied from the input data set to a buffer in memory
b) From the input buffer an observation at a time is written to PDV (Program Data Vector)
c) Each observation from PDV is written to output buffer when processing is complete
d) The contents of the output buffer are written to disk when the buffer is full.
17: Define PAGE and PAGESIZE.
Answer:
A PAGE is the unit of data transfer between a storage device and memory. PAGESIZE is the amount of data that can be transferred to one buffer in a single I/O operation.
18: What procedure is used to indicate the PAGESIZE of a data set?
Answer:
The CONTENTS procedure is used to know the PAGESIZE associated with a data set.
Example: The following CONTENTS procedure produces a report that shows the PAGESIZE associated with the data set exam.clinic1. The report also gives the number of data set pages.
proc contents data = exam.clinic1;
run;
19: Is it possible to control the PAGESIZE of an output data set?
Answer:
It is possible to control the PAGESIZE of an output data set by using BUFSIZE= option, which specifies the PAGESIZE in bytes.
Example: The following program creates a data set exam.clinic1 from the data set exam.clinic2. In the following program the BUFSIZE= option specifies a PAGESIZE of 30720 bytes.
options bufsize=30720;
libname exam 'c:\myprog';
data exam.clinic1;
set exam.clinic2;
run;
20: What is the default value of the BUFSIZE= option?
Answer:
The default value of the BUFSIZE= option is 0. If BUFSIZE= option is set to zero SAS uses the optimal page size determined by SAS for that operating environment.
21: Is it necessary to specify the BUFSIZE= option every time a data set is processed?
Answer:
No. The BUFSIZE= option is set when the data set is created, and that value becomes a permanent attribute of the data set. Once it is specified, it is used every time the data set is processed.
22: What does the BUFNO= option signify?
Answer:
The BUFNO= option is used along with a SAS data set to lay down how many buffers are available for reading, writing, or updating. The larger the value of BUFNO= the faster the input/output function would be since more values will be stored in the buffer which avoids an actual input/output function. You can specify a larger number of pages to include in the BUFNO= and accordingly that many pages will be loaded into the memory.
Example: The following program creates the data set MyExam.MyClinic from the data set MyExam.MyClinic2. The BUFNO= option is given the value 6, which denotes 6 buffers.
options bufno=6;
libname MyExam 'D:\MyProgram';
data MyExam.MyClinic;
set MyExam.MyClinic2;
run;
23: How do you set the BUFNO= option to the maximum possible number?
Answer:
To set the BUFNO= option to its maximum value, specify BUFNO=MAX, which sets the maximum number of buffers available in the current operating environment. The largest possible value of MAX is approximately 2 billion (2^31 - 1).
Example: The following program creates the data set MyExam.MyClinic from the data set MyExam.MyClinic2. The BUFNO= option is given the value MAX, which denotes the maximum number of buffers available in the current environment.
options bufno=max;
libname MyExam 'D:\MyProgram';
data MyExam.MyClinic;
set MyExam.MyClinic2;
run;
24: Is it necessary to specify the BUFNO= option every time a data set is processed?
Answer:
It is mandatory to specify the BUFNO= option every time a data set is processed. This is required since the buffer varies every time a data set is opened and closed. Moreover, the BUFNO= value set is valid only while a data set is open in the current session.
25: What are the general guidelines for specifying the buffer size and buffer number in the case of small data sets?
Answer:
The main objective behind specifying the buffer size and buffer number is to reduce the number of I/O operations. In the case of small data sets, care must always be taken to allocate as many buffers as there are pages in the data set. This ensures that the entire data set can be loaded into the memory using a single I/O operation.
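The guideline above can be sketched as follows. The data set work.small and the value BUFNO=4 are hypothetical; in practice, the number of pages would be read from the PROC CONTENTS report before choosing the buffer count.

```sas
/* Sketch only: work.small and bufno=4 are hypothetical */
data work.small;
    do id = 1 to 1000;
        value = id * 2;
        output;
    end;
run;

/* The report lists the page size and the number of data set pages */
proc contents data=work.small;
run;

/* Allocate as many buffers as there are pages (4 is assumed here) */
data work.small_copy;
    set work.small (bufno=4);
run;
```

If the report shows a different page count, the BUFNO= value on the SET statement would be adjusted to match it.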
26: How do the BUFSIZE= and BUFNO= options impact the following program?
data exam.clinic1 (bufsize=12288 bufno=10);
set exam.clinic2;
run;
Answer:
The above program reads the data set exam.clinic2 and creates exam.clinic1. The BUFSIZE= option specifies that exam.clinic1 is created with a buffer size of 12288 bytes. The BUFNO= option specifies that 10 pages of data are loaded into memory with each I/O transfer.
27: Explain the SASFILE statement.
Answer:
The SASFILE statement opens a SAS data file, allocates enough buffers to hold it, and loads it into memory so that it remains available to the rest of the program. The buffers are not freed and reallocated between steps. Instead, the file is kept in system memory, and the program accesses it there until the file is closed.
The following example explains the use of SASFILE in a simple way. The SASFILE statement opens the data set MyExam.MyClinic and allocates the buffer. It reads the file and loads it into the memory so that it is available to both the PROC PRINT as well as the PROC MEANS step. Finally, the SASFILE data file is closed with the CLOSE statement and the buffer is cleared.
sasfile MyExam.MyClinic load;
proc print data= MyExam.MyClinic;
var Serial No result;
run;
proc means data= MyExam.MyClinic;
run;
sasfile MyExam.MyClinic close;
28: What happens if the size of file in the memory increases during the execution of SASFILE statement?
Answer:
When the SASFILE statement is executed, SAS assigns some buffer to the data file based on the number of pages to be loaded and the size of the index file. Once this is done, the file data is loaded into the memory for updates. The buffer size is automatically increased as the file size to be saved increases. The initial buffer memory size allocated is only the minimum memory allocated to load the file. It automatically increases provided there is ample memory left in the current operating system.
29: Mention the guidelines to be followed while using SASFILE statement.
Answer:
While using the SASFILE statement, the following guidelines are to be followed:
a) There should be sufficient real memory to load the file.
b) If there is not enough memory to load the entire file, the DATA step should be used to create a subset of the file that fits into the available memory. Since that part of the file is already loaded into the memory, the program can access it without further I/O. This reduces the CPU time significantly.
30: When is the buffer allocated by the SASFILE statement freed?
Answer:
The buffer allocated by the SASFILE statement to load the data file is freed in two instances:
a) When the SASFILE CLOSE statement is executed, the file is closed, and the buffer allocated for the data file is freed.
Example: In the following program the SASFILE statement opens the data set MyExam.MyClinic and allocates the buffer. It reads the data into the memory which is available through the PROC PRINT and PROC MEANS steps. The last SASFILE statement closes the SAS data file and frees the buffer allocated for the file.
sasfile MyExam.MyClinic load;
proc print data= MyExam.MyClinic;
var Serial No result;
run;
proc means data= MyExam.MyClinic;
run;
sasfile MyExam.MyClinic close;
b) The SASFILE buffer is allocated only as long as the session is open. When the SAS session ends, SAS frees the buffer and closes the data file.
31: Which operations are not allowed in a file opened with SASFILE statement?
Answer:
There are certain operations that cannot be performed on a file opened with SASFILE statement, such as replacing the file and renaming the variables.
32: How do you calculate the total number of bytes occupied by a data file if you know the page size?
Answer:
The total number of bytes that a data file occupies can be calculated by multiplying the page size by the number of pages.
Example: If the data file exam.clinic1 has a page size of 8192 bytes and the number of pages is 900, then the data file occupies 7372800 bytes (8192 * 900).
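The multiplication can be checked in a DATA step that creates no data set. This is a minimal sketch; the variable names are hypothetical, and the two input values are the page size and page count given in the example.

```sas
/* Sketch only: variable names are hypothetical */
data _null_;
    page_size   = 8192;   /* bytes per page, from the example */
    num_pages   = 900;    /* number of data set pages, from the example */
    total_bytes = page_size * num_pages;
    put total_bytes=;     /* writes total_bytes=7372800 to the log */
run;
```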
*****
Data Storage Space
33: What compresses the data storage space required to store a data set?
Answer:
SAS programs use many temporary data sets that hold information at run time. You can choose to hold the data permanently in one or more data sets, depending on the available space and program requirements. Ideally, you save space by reducing the number and size of data sets and by clearing everything unnecessary from the storage space. SAS uses compression algorithms to reduce the size of data sets. The COMPRESS=YES or COMPRESS=BINARY option is used to compress a data set. COMPRESS=YES is used with data sets that primarily contain character data. COMPRESS=BINARY is used with data sets that primarily contain numeric data. REUSE=YES is used when you want SAS to reuse the space freed by deleted observations in a compressed data set.
34: How does the WHERE statement help in reducing data storage space?
Answer:
The WHERE statement lets you remove all unnecessary observations or records being fetched into the data set. When using the WHERE statement, only those records that satisfy the WHERE condition will be fetched by the data set. So, it helps to filter the data being fetched thereby reducing the data storage space.
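A minimal sketch of this filtering, assuming the sample table sashelp.class that ships with SAS; the output data set name work.teens and the age cut-off are hypothetical.

```sas
/* Sketch only: work.teens and the cut-off of 13 are hypothetical */
data work.teens;
    set sashelp.class;
    /* Only observations that satisfy the condition are read */
    where age >= 13;
run;

/* The new data set holds fewer observations than sashelp.class */
proc contents data=work.teens;
run;
```

The log note for the DATA step reports how many observations were read with the WHERE condition applied.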
35: How do you clean up the storage space?
Answer:
You can clean up the storage space by using the DATASETS or DELETE procedures. With PROC DATASETS, you specify the library and then the data set to delete. With PROC DELETE, you specify the exact data set to be deleted. The more popular method is PROC DATASETS. Deleting a temporary data set as soon as it is no longer required frees the space it occupied.
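Both methods can be sketched as follows. The data set names work.temp1 and work.temp2 are hypothetical and are created only so that there is something to delete.

```sas
/* Sketch only: work.temp1 and work.temp2 are hypothetical */
data work.temp1 work.temp2;
    x = 1;
run;

/* PROC DATASETS: name the library, then the member to delete */
proc datasets library=work nolist;
    delete temp1;
quit;

/* PROC DELETE: name the exact data set to delete */
proc delete data=work.temp2;
run;
```

The NOLIST option only suppresses the directory listing in the log; it does not change what is deleted.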
36: Explain COMPRESS= System option.
Answer:
The COMPRESS= system option is used to compress all data files created during a particular session. It is specified as COMPRESS=NO | YES | CHAR | BINARY. By default, COMPRESS= is set to NO, which means no compression. When you set it to YES or CHAR, SAS uses the RLE (Run Length Encoding) algorithm, which reduces runs of repeated characters such as trailing blanks and zeros. It basically compresses character data. The BINARY value runs Ross Data Compression (RDC), which uses a combination of RLE and sliding-window compression wherein a dictionary of frequently used words or character patterns is stored. The dictionary assigns a number to each phrase and replaces the phrase with that number on each occurrence. A map of these numbers and phrases is maintained separately. Thus, the main data set is compressed.
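The difference between the session-wide system option and the data set option of the same name can be sketched as below. It assumes the sample table sashelp.class; the output data set names are hypothetical.

```sas
/* Sketch only: output data set names are hypothetical */

/* Session-wide: data sets created from here on use RLE compression */
options compress=yes;

data work.class_char;
    set sashelp.class;
run;

/* The data set option overrides the system option for one data set */
data work.class_bin (compress=binary reuse=yes);
    set sashelp.class;
run;

/* Restore the default so that later steps are not compressed */
options compress=no;
```

For a table as small as sashelp.class the log may report little or no saving, since compression adds some overhead to each observation.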
37: I have a compressed data set. I want to add an observation to it. Will it allow me to add the new observation? If yes, where will it be added?
Answer:
Yes, you can add new observations to an already compressed data set. The new observation will be added to the end of the existing observations. If an observation is deleted, the space it occupied is not reused or tracked by default. Instead, new observations are added at the end of the current data set.
38: Explain POINTOBS= data set option.
Answer:
Typically, the data set is traversed sequentially. But the POINTOBS= option provides you direct access to a particular observation using the observation number. You can set whether direct access is allowed or not by using the Data
39: Explain LENGTH, ATTRIB, KEEP and DROP statements.
Answer:
The LENGTH, ATTRIB, KEEP and DROP statements are used to reduce the space needed to store variables. Both the LENGTH and ATTRIB statements can be used to limit the size of a variable. ATTRIB can also be used to assign a format to the variable. If the length of a variable is not adequate, the data may be truncated. KEEP specifies the variables that are written to the output data set. DROP specifies the variables that are excluded from the output data set because they will not be needed again.
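The four statements can be sketched together in one DATA step. The example assumes the sample table sashelp.class; the output data set name and the new variables age_group and height_cm are hypothetical.

```sas
/* Sketch only: work.class_small, age_group and height_cm are hypothetical */
data work.class_small;
    /* LENGTH limits the character variable to 5 bytes */
    length age_group $ 5;
    /* ATTRIB sets length, format and label in one statement */
    attrib height_cm length=8 format=6.1 label='Height (cm)';
    set sashelp.class;

    if age < 13 then age_group = 'child';
    else age_group = 'teen';
    height_cm = height * 2.54;   /* height in sashelp.class is in inches */

    /* DROP excludes these variables from the output data set */
    drop height weight;
run;

/* KEEP has the opposite effect: only the listed variables are written */
data work.class_names;
    set work.class_small;
    keep name age_group;
run;
```

A value longer than 5 characters assigned to age_group would be truncated, which is the risk the answer above mentions.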
40: What factors are considered by SAS when calculating the data storage space required for a SAS data file?
Answer:
The following factors are considered by SAS when calculating the data storage space required for a SAS data file:
a) Storage space required by the descriptor portion
b) Storage space required by the observations
c) Any storage overhead
d) Storage space required for associated indexes
41: How does a SAS character variable store data and what is the default length of a character variable?
Answer:
SAS character variables store data as one character per byte. The default length of a character variable is 8 bytes.
42: Which step can be taken to reduce the length of a character variable?
Answer:
A LENGTH statement can be used to control the length of character variable.
Example: In the following program the data set exam.clinic1 is created from the data set exam.clinic2. The variable name is assigned a length of 5, so the variable name in the data set exam.clinic1 will have a length of 5.
data exam.clinic1;
length name $ 5;
set exam.clinic2;
run;
43: How does SAS store numeric values and what is the default length of a numeric variable?
Answer:
SAS stores numeric values in double-precision floating-point representation, which is a form of scientific notation. This makes it possible to store numbers of large magnitude and to perform computations that require precision after the decimal point. The default length of a numeric variable is 8 bytes.
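Example: The LENGTH statement can also shorten a numeric variable. The sketch below uses hypothetical names (work.num_len, flag, amount). It assumes a platform where 3 bytes is the minimum numeric length. Reduced lengths are only safe for small integer values; fractional values should keep the default 8 bytes.

```sas
/* flag holds only 0 or 1, so a short length is sufficient.   */
/* amount holds a fractional value and keeps the full 8 bytes. */
data work.num_len;
   length flag 3 amount 8;
   flag = 1;
   amount = 12345.678;
run;
```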
44: Explain the significance of PROC COMPARE.
Answer:
PROC COMPARE is used to compare the contents of two SAS data sets. It compares the following:
a) Data set attributes
b) Variables
c) Observations
d) Variable attributes and values of matching variables
Example: The following PROC COMPARE step compares the two data sets exam.result1 and exam.result2 and writes the comparison report to the procedure output:
proc compare base= exam.result1
compare= exam.result2;
run;
45: What conditions make a data file an ideal candidate for compression?
Answer:
A data file becomes an ideal candidate for compression if it satisfies one or more of the following conditions:
a) It is large
b) It has many missing values
c) It has many lengthy character values
d) It has repeated characters or binary zeroes
e) It has repeated values in variables that are physically stored next to one another
46: Explain the compression of a data set.
Answer:
By default, a SAS data file is uncompressed. It can be compressed to conserve disk space by using the COMPRESS= option.
Example: The following program creates a compressed data set exam.result1 from the data set exam.result2. When the data set is created, SAS writes a note to the log indicating the percentage reduction in size obtained by compressing it. With COMPRESS=YES, SAS uses the RLE (Run Length Encoding) algorithm, which compresses observations by reducing repeated consecutive characters to 2-byte or 3-byte representations.
data exam.result1 (compress= yes);
set exam.result2;
run;
47: Which option is used for accessing an observation directly in an uncompressed data set?
Answer:
The POINT= option can be used for accessing an observation directly in an uncompressed data set.
Example: The following program creates the data set exam.result1 from the data set exam.result2 by reading the 5th observation directly. number is a temporary variable that holds the number of the observation to be read. It must be assigned a value before the SET statement executes, and it is not written to the output data set. The OUTPUT statement writes the observation explicitly, because the STOP statement ends the step before the automatic output would occur. The STOP statement is required because POINT= never reaches an end-of-file condition, so without it the step would loop continuously.
data exam.result1 ;
number=5;
set exam.result2 point=number;
output;
stop;
run;
48: Which option is used for controlling direct access in a compressed data set?
Answer:
The POINTOBS= option can be used for controlling whether direct access is allowed or not in a compressed data set.
Example: The following program creates a compressed data set exam.result1 from the data set exam.result2. The option POINTOBS= YES ensures that random access to the compressed data set exam.result1 is allowed.
data exam.result1 (compress= yes pointobs= yes);
set exam.result2;
run;
49: Once a SAS data file is compressed, is it possible to change the setting to uncompressed?
Answer:
Once a data file is compressed, the setting becomes a permanent attribute of the file. To change the setting to uncompressed, the file has to be created again.
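Example: A minimal sketch of creating the file again without compression. It assumes exam.result1 is the compressed data set; the output name exam.result1_unc is hypothetical.

```sas
/* Copy the compressed data set into a new, uncompressed data set. */
/* exam.result1_unc is a hypothetical name used for illustration.  */
data exam.result1_unc (compress=no);
   set exam.result1;
run;
```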
50: Explain the significance of REUSE= option.
Answer:
The REUSE= option specifies whether SAS reuses the space freed by deleted observations when new observations are added to a compressed data set. If REUSE= is set to YES, SAS tracks and reuses the free space in the compressed data set that is created. If it is set to NO, new observations are added at the end of the data set.
Example: The following program creates a compressed data set exam.result1 from the data set exam.result2. Since the option REUSE= is set to YES, SAS tracks and reuses the free space in the exam.result1 data set.
data exam.result1 (compress= yes reuse= yes);
set exam.result2;
run;
51: What is the main difference between a SAS data file and a SAS data view?
Answer:
A SAS data file and a SAS data view are both SAS data sets. The main difference is that a SAS data file contains both descriptor information and data values. Descriptor