OS FIle System
OS FIle System
OS FIle System
Files
– Data collections created by users
– The File System is one of the most important parts of
the OS to a user
– Desirable properties of files:
• Long-term existence - Files are stored on disk or other
secondary storage and do not disappear when a user logs off
• Sharable between processes - Files have names and can have
associated access permissions that permit controlled sharing
• Structure - Files can be organized into hierarchical or more
complex structure to reflect the relationships among files
File Systems
File Structure
• Four terms are commonly used when discussing
files:
– Field
– Record
– File
– Database
Structure Terms
• File
– Basic element of data
– Contains a single value
– Fixed or variable length
• Database
– Collection of related data
– Relationships among elements of data are explicit
– Designed for use by a number of different applications
– Consists of one or more types of files
Structure Terms
• File
– Collection of similar records
– Treated as a single entity
– May be referenced by name
– Access control restrictions usually apply at the file level
• Record
– Collection of related fields that can be treated as a unit
by some application program
– Fixed or variable length
File Management System
• Meet the data management needs of the user
• Guarantee that the data in the file are valid
• Optimize performance
• Provide I/O support for a variety of storage device
types
• Minimize the potential for lost or destroyed data
• Provide a standardized set of I/O interface routines
to user processes
• Provide I/O support for multiple users in the case
of multiple-user systems
Indexed
Pile Sequential Indexed Hashed
Sequential
Logical I/O
Device Drivers
• Lowest level
• Communicates directly with peripheral devices
• Responsible for starting I/O operations on a device
• Processes the completion of an I/O request
• Considered to be part of the operating system
Basic File System
• Also referred to as the physical I/O level
• Primary interface with the environment
outside the computer system
• Deals with blocks of data that are
exchanged with disk or tape systems
• Concerned with the placement of blocks on
the secondary storage device
Access Method
• Level of the file system closest to the user
• Provides a standard interface between
applications and the file systems and
devices that hold the data
• Different access methods reflect different
file structures and different ways of
accessing and processing the data
Physical blocks Physical blocks
Records in main memory in secondary
buffers storage (disk)
File
Structure
Directory Access
management method Disk
Blocking scheduling
File
allocation
User access
control
n
Index Main File
levels
Index
2
1
Overflow
File
The Pile
• Least complicated form of file organization
• Data are collected in the order they arrive
• Each record consists of one burst of data
• Purpose is simply to accumulate the mass of
data and save it
• Record access is by exhaustive search
Variable-length records
Variable set of fields
Chronological order
Overflow
File
Indexed File
• Records are accessed only through their indexes
• Variable-length records can be employed
• Exhaustive index contains one entry for every
record in the main file
• Partial index contains entries to records where the
field of interest exists
• Used mostly in applications where timeliness of
information is critical
• Examples would be airline reservation systems
and inventory control systems
Exhaustive Exhaustive Partial
index index index
Primary File
(variable-length records)
B-Trees
• A balanced tree structure with all branches
of equal length
• Standard method of organizing indexes for
databases
• Commonly used in OS file systems
• Provides for efficient searching, adding, and
deleting of items
Key1 Key2 Keyk–1
B-Tree Characteristics
• A B-tree is characterized by its minimum
degree d and satisfies the following
properties:
– Every node has at most 2d – 1 keys and 2d
children or, equivalently, 2d pointers
– Every node, except for the root, has at least d –
1 keys and d pointers, as a result, each internal
node, except the root, is at least half full and
has at least d children
B-Tree Characteristics
• A B-tree is characterized by its minimum
degree d and satisfies the following
properties:
– The root has at least 1 key and 2 children
– All leaves appear on the same level and contain
no information. This is a logical construct to
terminate the tree; the actual implementation
may differ
– A nonleaf node with k pointers contains k – 1
keys
23 51 61 71
2 10 30 32 39 43 44 52 59 60 67 68 73 85 88 96
23 51 61 71
2 10 30 32 39 43 44 52 59 60 67 68 73 85 88 90 96
23 39 51 61 71
2 10 30 32 43 44 45 52 59 60 67 68 73 85 88 90 96
(c) Key = 45 inserted. This requires splitting a node into two parts and promoting one key to the root node.
51
23 39 61 71 88
2 10 30 32 43 44 45 52 59 60 67 68 73 84 85 90 96
(d) Key = 84 inserted. This requires splitting a node into two parts and promoting one key to the root node
This then requires the root node to be split and a new root created.
File Name Name as chosen by creator (user or program). Must be unique within a specific
directory.
Address Information
Information
Volume
Starting Address
Indicates device on which file is stored
Starting physical address on secondary storage (e.g., cylinder, track, and block
number on disk)
Elements of
Size Used
Size Allocated
Current size of the file in bytes, words, or blocks
User who is assigned control of this file. The owner may be able to grant/deny
Directory
access to other users and to change these privileges.
Access Information A simple version of this element would include the user's name and password for
each authorized user.
Usage Information
Date Last Read Access Date of the last time a record was read
Date of Last Backup Date of the last time the file was backed up on another storage medium
Current Usage Information about current activity on the file, such as process or processes that
have the file open, whether it is locked by a process, and whether the file has been
updated in main memory but not yet on disk
Operations Performed on a
Directory
• To understand the requirements for a file
structure, it is helpful to consider the types
of operations that may be performed on the
directory:
– Search
– Create files
– Delete files
– List directory
– Update directory
Two-Level Scheme
• There is one directory for each user and a master directory
• Master directory has an entry for each user directory
providing address and access control information
• Each user directory is a simple list of the files of that user
• Names must be unique only within the collection of files of
a single user
• File system can easily enforce access restriction on
directories
Master Directory
Directory Directory
"User_C" Directory "User_B" "User_A"
Draw
Word
Unit_A ABC
Directory "Unit_A"
ABC File
"ABC"
Pathname: /User_B/Word/Unit_A/ABC
File Sharing
• Two issues arise when allowing files to be
shared among a number of users:
– Access rights
– Management of simultaneous access
Access Rights
• None - The user would not be allowed to read the
user directory that includes the file
• Knowledge - The user can determine that the file
exists and who its owner is and can then petition
the owner for additional access rights
• Execution - The user can load and execute a
program but cannot copy it
• Reading - The user can read the file for any
purpose, including copying and execution
Access Rights
• Appending - The user can add data to the file but
cannot modify or delete any of the file’s contents
• Updating - The user can modify, delete, and add
to the file’s data
• Changing protection - The user can change the
access rights granted to other users
• Deletion - The user can delete the file from the
file system
User Access Rights
• Owner
– Usually the initial creator of the file
– Has full rights
– May grant rights to others
• Specific Users
– Individual users who are designated by user ID
• User Groups
– A set of users who are not individually defined
• All
– All users who have access to this system
– These are public files
Record Blocking
• Blocks are the unit of I/O with secondary
storage
– For I/O to be performed records must be
organized as blocks
• Given the size of a block, three methods of
blocking can be used:
– Fixed-Length Block
– Variable-Length Spanned Bloc
– Variable-Length Unspanned Blocking
Fixed-Length Block
– Fixed-Length Blocking – fixed-length records
are used, and an integral number of records are
stored in a block
– Internal fragmentation – unused space at the
end of each block
Variable-Length Blocking
• Variable-Length Spanned Blocking –
variable-length records are used and are
packed into blocks with no unused space
• Variable-Length Unspanned Blocking –
variable-length records are used, but
spanning is not employed
Record 1 Record 2 Record 3 Record 4 Track 1
File Allocation
On secondary storage, a file consists of a
collection of blocks
The operating system or file management system
is responsible for allocating blocks to files
The approach taken for file allocation may
influence the approach taken for free space
management
File Allocation
• Space is allocated to a file as one or more
portions (contiguous set of allocated
blocks)
• File allocation table (FAT)
– Data structure used to keep track of the portions
assigned to a file
Portion Size
• Items to be considered:
1) Contiguity of space increases performance,
especially for Retrieve_Next operations, and greatly
for transactions running in a transaction-oriented
operating system
2) Having a large number of small portions increases
the size of tables needed to manage the allocation
information
3) Having fixed-size portions simplifies the reallocation
of space
4) Having variable-size or small fixed-size portions
minimizes waste of unused storage due to
overallocation
Alternatives
• Two major alternatives:
– Variable, large contiguous portions
• Provides better performance
• The variable size avoids waste
• The file allocation tables are small
– Blocks
• Small fixed portions provide greater flexibility
• They may require large tables or complex structures for their
allocation
• Contiguity has been abandoned as a primary goal
• Blocks are allocated as needed
15 16 17 18 19
File C
20 21 22 23 24
File E
25 26 27 28 29
File D
30 31 32 33 34
20 21 22 23 24
25 26 27 28 29
30 31 32 33 34
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29
30 31 32 33 34
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29
30 31 32 33 34
10 11 12 13 14
15 16 17 18 19
1
20 21 22 23 24 8
3
25 26 27 28 29 14
28
30 31 32 33 34
10 11 12 13 14
15 16 17 18 19
Start Block Length
20 21 22 23 24 1 3
28 4
25 26 27 28 29 14 1
30 31 32 33 34
Bit Tables
• This method uses a vector containing one
bit for each block on the disk
• Each entry of a 0 corresponds to a free
block, and each 1 corresponds to a block in
use
• Advantages:
– Works well with any file allocation method
– It is as small as possible
Chained Free Portions
• The free portions may be chained together
by using a pointer and length value in each
free portion
• Negligible space overhead because there is
no need for a disk allocation table
• Suited to all file allocation methods
Volumes
• A collection of addressable sectors in secondary
memory that an OS or application can use for data
storage
• The sectors in a volume need not be consecutive
on a physical storage device
– They need only appear that way to the OS or
application
• A volume may be the result of assembling and
merging smaller volumes
UNIX File Management
• In the UNIX file system, six types of files
are distinguished:
– Regular, or ordinary - Contains arbitrary data
in zero or more data blocks
– Directory - Contains a list of file names plus
pointers to associated inodes (index nodes)
– Special - Contains no data but provides a
mechanism to map physical devices to file
names
mode
Data Data Data
owners (2)
Data
timestamps (4)
Data Data Data
size
direct(0) Data
Pointers
Data Data
Pointers
direct(12)
triple indirect
Pointers
block count Data
reference count
Pointers Pointers
flags (2)
Data
Pointers
generation number
extended
attribute Pointers
blocks Data
Inode
Direct 12 48K
i1 Name1
i2 Name2
i3 Name3
i4 Name4
Volume Structure
• A UNIX file system resides on a single logical
disk or disk partition and is laid out with the
following elements:
– Boot block - Contains code required to boot the
operating system
– Superblock - Contains attributes and information
about the file system
– Inode table - Collection of inodes for each file
– Data blocks - Storage space available for data files
and subdirectories