0% found this document useful (0 votes)
29 views

DBMS Notes

DBMS Notes

Uploaded by

Dustin Henderson
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
29 views

DBMS Notes

DBMS Notes

Uploaded by

Dustin Henderson
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 20
DBMS NOTES What is Database? The database is a collection of interrelated data which is used to retrieve, insert and delete the data jently. Itis also used to organize the data in the form of a table, schema, views, and reports, etc. Database Management System ‘© Database management system is a software which is used to manage the database. For example: MySQL Oracle ete are a very popular commercial database which is used in different applications. © DBMS provides an interface to perform various operations like database creation, storing data init, updating data, creating a table in the database and a lot more. * Itprovides protection and security to the database. In the case of multiple users, it also maintains data consistency. Characteristics of DBMS © Ituses a digital repository established on a server to store and manage the information. Itcan provide a clear and logical view of the process that manipulates data. ‘© DBMS contains automatic backup and recovery procedures. ‘© Itcontains ACID properties which maintain data in a healthy state in case of failure. © Itcan reduce the complex relationship between data, ‘* _Itis used to support manipulation and processing of data. * Itis used to provide security of data. ‘© It can view the database from different viewpoints according to the requirements of the user. Advantages of DBMS It can control data redundancy because it stores all the data in one single database file and that recorded data is placed in the database. ‘© Data sharing: In DBMS, the authorized users of an organization can share the data among ‘multiple users. ‘* Easily Maintenance: It ean be easily maintainable due to the centralized nature of the database system. ‘© Reduce time: It reduces development time and maintenance need. ‘* Backup: It provides backup and recovery subsystems which create automatic backup of data from hardware and software failures and restores the data if required. ‘© multiple user interface: It provides different types of user interfaces like graphical user interfaces, application program interfaces Disadvantages of DBMS . ware and Software: It requires a high speed of data processor and large memory Size: Itoccupies a large space of disks and large memory to run them efficiently. . : Database system creates additional complexity and requirements. ‘© Higher impact of failure: Failure is highly impacted the database because in most of the organization, all the data stored in a single database and if the database is damaged due to electric f lure or database corruption then the data may be lost forever. ‘What is RDSMS (Relational Database Management System) RDBMS stands for Relational Database Management System. All modern database management systems like SQL, MS SQL Server, IBM DB2, ORACLE, My-SQL, and Microsoft Access are based on RDBMS. It is called Relational Database Management System (RDBMS) because it is based on the relational model introduced by E.F. Codd. How it works Data is represented in terms of tuples (rows) in RDBMS. A relational database is the most commonly used database. It contains several tables, and each table has its primary key. Due toa collection of an organized set of tables, data can be accessed easily in RDBMS. DBMS Architecture + The DBMS design depends upon its architecture. The basic client/server architecture is used to deal with a large number of PCs, web servers, database servers and other components that are connected with networks. * The client/server architecture consists of many PCs and a workstation which are connected via the network. © DBMS architecture depends upon how users are connected to the database to get their request done. ‘Types of DBMS Architecture (ons Database architecture can be seen as a single tier or multi-tier. But logically, database architecture is of two types. 7 4-Tier Architecture \ ) * Inthis architecture, the database is directly available to the user. It means the user can directly sit on the DBMS and uses it. Any changes done here will directly be done on the database itself. It doesnt provide a handy tool for end users. The 1-Tier architecture is used for development of the local application, where programmers can directly communicate with the database for the quick response. 2-Tier Architecture © The 2-Tier architecture is same as basic client-server. In the two-tier architecture, g Databsse system Server applications on the client end can directly commur For th cate with the database at the server si interaction, API's like: ODBC, JDBC are used. ‘* The user interfaces and application programs are run on the client-side. Application © The server side is responsible to provide the I Client functionalities like: query processing and transaction 7 management. _ © To communicate with the DBMS, client-side application establishes a connection with the server side. 3-Tier Architecture The 3-Tier architecture contains another layer betw n the client and server. In this architecture, client can't rectly communicate with the server, The application on the client-end interacts with an application server which further communicates with the database system. End user has no idea about the existence of the database beyond the application server. The database also has no idea about any other user beyond the application. The 3-Tier architecture is used in case of large web application. Three schema Architecture ‘The three-schema architecture is as follows: Applicaton Server ‘Application Gent —_ The three schema architecture is also called ANSI/SPARC architecture or three-level architecture, This framework is used to describe the structure of a specific database system. Cllent The three schema architecture is also used to separate the user applications and physical database. The three schema architecture contains three-levels. It breaks the database down into three different categories. It shows the DBMS architecture. Conceptual Schema Mapping is used to transform the request and response between various database levels of architecture. Internal schema Mapping is not good for small DBMS because it takes more time. EsternalSehemna [ External Level ) / ‘Concepiual Level | Database = | External Conceptual Mapping Mapping ‘* In External / Conceptual mapping, it is necessary to transform the request from external level to conceptual schema, © In Conceptual / Internal mapping, DBMS transform the request from the conceptual to internal levThe ER-model (Entity Relationship model) is a conceptual model used in database design. It represents entities as rectangles, relationships as diamonds, and attributes as ovals. The basic concepts of the ER-model include entities, attributes, and relationships. Entities are objects or concepts in the real world that have a unique identity and can be distinguished from other entities. Examples of entities might include customers, orders, products, or employees. Attributes describe the properties or characteristics of entities. Examples of attributes might include a customer's name, address, or phone number. Relationships describe how two or more entities are related to each other. For example, a customer might place an order, which involves a relationship between the customer and the order. Relationships can have cardinality constraints, such as one-to-one, one-to-many, or many-to-many. Weak entity types are entities that do not have a unique identity of their own and depend on another entity for identification. For example, a line item in an order might depend on the order itself for identification. Weak entity types are represented using double rectangles. ER-diagrams are diagrams that depict entities, relationships, and attributes. They are used to model the structure of a database and provide a visual representation of the data. Subclasses and superclasses are used to represent hierarchical relationships between entities. Subclasses inherit attributes and relationships from their parent superclass, For example, a car might be a subclass of a vehicle superclass, inheriting attributes such as make, model, and year. Inheritance, specialization, and generalization are used to model complex relationships between entities. Inheritance involves a subclass inheriting all the attributes and relationships of its parent superclass. Specialization involves creating new entities from a superclass, while generalization involves combining entities into a more general entity Overall, the ER-model provides a framework for understanding the structure and relationships of data in a database, allowing for effective database design and management. ‘Types of entities, types of relationshi types of attributes In database design, entities are objects or concepts that are significant and independent and are represented in the database. Relationships define how entities relate to each other. Attributes describe the properties of entities and relationships. Here are some common types of entities, relationships, and attributes: Types of Entities: 1. Strong Entity: Itis an entity that can be uniquely identified by its attributes alone. 2. Weak Entity: It is an entity that cannot be uniquely identified by its attributes alone and depends on its relationship with a strong entity. 3. Associative Entity: Its an entity that connects the relationships between two or more other entities. Types of Relationships: 1. One-to-One Relationship: Itis a relationship in which one entity is related to exactly one other entity. 2. One+to-Many Relationship: tis a relationship in which one entity is related to multiple instances of another entity. 3. Many-to-Many Relationship: Its a relationship in which multiple instances of one entity are related to multiple instances of another entity. Types of Attributes: 1. Simple Attribute: It is an attribute that cannot be divided into smaller parts 2. Composite Attribute: It is an attribute that can be divided into smaller parts, 3. Derived Attribute: It is an attribute that can be calculated or derived from other attributes. 4, Key Attribute: itis an attribute that uniquely identifies an entity in a database 5. Null Attribute: Itis an attribute that does not have a value for a particular entity or relationship instance. DATA MODELS Data models define how the logical structure of a database is modeled. Data Models are fundamental entities to introduce abstraction in a DBMS. Data models define how data is connected to each other and how they are processed and stored inside the system. The very first data model could be flat data-models, where all the data used are to be kept in the same plane. Earlier data models were not so scientific, hence they were prone to introduce lots of duplication and update anomalies Relational Model The most popular data model in DBMS is the Relational Model. It is more scientific a model attributes, predicate logic and defines a table as an n-ary relation. ‘The main highlights of this model are — © Datais stores tables called vabe (lator) relations. ‘* Relations can be normalized. ed relations, values saved are atomic values. * Each row in a relation contains a unique value. © Each column in a relation contains values from a same domain Hierarchical model A hierarchical database model is a data model in which the data are organized into a tree-like structure. The data are stored as records which are connected to one another through links. A record is a collection of fields, with each field containing only one value. The type of a record defines which fields the record contains. The hierarchical database model mandates that each child record has only one parent, whereas each parent record can have one or more child records. In order to retrieve data from a archical database, the whole tree needs to be traversed starting from the root node. This model is recognized as the first database model created by IBM in the 1960s Applications of hierarchical model : ‘* Hierarchical models are generally used as semantic models in practice as many real-world occurrences of events are hierarchical in nature like biological structures, Political, or social structures. ‘* Hierarchical models are also commonly used as physical models because of the inherent jerarchical structure of the disk storage system like tracks, cylinders, ete. ‘There are various examples such as Information Management System (IMS) by IBM, NOMAD by NCSS, ete. NETWORK MODEL The network database model was created to solve the shortcomings of the hierarchical database model. In this type of model, a child can be Linked to multiple parents, a feature that was not supported by the hierarchical data model. The parent nodes are known as owners and the child nodes are called members. ‘The network data model can be represented as Advantages of Network Model The network model can support many to many relationships as seen in the diagram. D2 and C3 ‘each have multiple masters. The masters for D2 are C1 and C2 while for C3 are B1 and B2. In this way, the network data model can handle many to many relationships where the hierarchical data model didn't. Disadvantages of Network Model There are some disadvantages in the network model even though it is an improvement over the hierarchical model. These are ¢ The network model is much more complicated than the Hierarchical model. As such, it is difficult to handle and maintain. * Although the Network model is more flexible than the Hierarchical model, it still has flexi xy problems. Not all relations can be handled by assigning them in the form of ‘owners and members. The structure of the Network Model is quite complicated and so the programmer has ‘to understan« well in order to implement or modify it. Keys are one of the basic requirements of a relational database model. It is widely used to identify the tuplesirows) uniquely in the table. We also use keys to set up relations amongst various columns and tables of a relational database. Different Types of Key: ‘the Relational Model 1. Candidate Key: The minimal set of attributes that can uniquely identify a tuple is known as a candidate key. For Example, STUD_NO in STUDENT relation. It is a minimal super key. It is a super key with no repeated data is called a candidate key. The minimal set of attributes that can uniquely identify a record. t must contain unique values. lt can contain NULL values. Every table must have at least a single candidate key. A table can have multiple candidate keys but only one primary key (the primary key cannot have a NULL value, so the candidate key with a NULL value can't be the primary key). The value of the Candidate Key is unique and may be null for a tuple. There can be more than one candidate key in a relationship. The candidate key can be simple (having only one attribute) or composite as well. 2. Primary Key: There can be more than one candidate key in relation out of which one can be chosen as the primary key. For Example, STUD_NO, as well as STUD_PHONE, are candidate keys for relation STUDENT but STUD_NO can be chosen as the primary key (only one out of many cani keys). Itis a unique key. lt cannot be NULL. bea primary key for a table. 3. Super Key: The set of attributes that can uniquely identify a tuple is known as Super Key. For Example, STUD_NO, (STUD_NO, STUD_NAME), etc. A super key is a group of single or multiple keys that identifies rows in a table. It supports NULL values. * Adding zero or more attributes to the candidate key generates the super key. It can identify only one tuple (a record) at a time. thas no duplicate values, it has unique values. late Primary keys are not necessarily to be a single column; more than one column can also ‘* Acandidate key is a super key but vice versa is not true. 4, Alternate Key: The candidate key other than the primary key is called an alternate key. ‘* All the keys which are not primary keys are called alternate keys. ¢ Its a secondary key. ‘¢ It contains two or more fields to identify two or more records. ‘* These values are repeated. © Eg SNAME, and ADDRESS is Alternate keys Candidate Key ~——~——!____ a == [nee | preree_oonoe primal ay ‘Alternate Key 5. Foreign Key: If an attribute can only take the values which are present as values of some other attribute, it will be a foreign key to the attribute to which it refers. The relation which is being referenced is called referenced relation and the corresponding attribute is called referenced attribute the relation which refers to the referenced relation is called referencing relation and the corresponding attribute is called referencing attribute. The referenced attribute of the referenced relation should be the primary key to it. It is a key it acts as a primary key in one table and it acts as secondary key in another table. ‘¢ It combines two or more relations (tables) at a time. © They act as a cross-reference between the tables. ‘¢ For example, DNO is a primary key in the DEPT table and a non-key in EMP. It may be worth noting that, unlike the Primary Key of may contain duplicate tuples i.e. it need not follow ee al STUDENT_COURSE relation is not unique. It has been Ld — 7 na = repeated for the first and third tuples. However, the =} —_— STUD_NO in STUDENT relation is a primary key and it Student Details Student Marks needs to be always unique, and it cannot be null. 6. Composite Key: Sometimes, a table might not have a single column/attribute that uniquely identifies all the records of a table. To uniquely identify rows of a table, a combination of two or more columns/attributes can be used. It still can give duplicate values in rare cases. So, we need to find the optimal set of attributes that can uniquely identify rows in a table. © Itacts as a primary key if there is no primary key in a table ‘* Two or more attributes are used together to make a composite key. '* Different combinations of attributes may give different accuracy in terms of identifying the rows uniquely. tablet ———F patie ‘Atertte Key Different Types of Keys Why are keys necessary for DBMS? ‘* Keys are one of the important aspects of DBMS. Keys help us to find the tuples(rows) uniquely in the table. It is also used in developing various relations amongst columns or tables of the database. What is a Unique Key? ® Unique Keys are the keys that define the record uniquely in the table. It is different from Primary Keys, as Unique Key can contain one NULL value but Primary Key does not any NULL values. What is an Artificial Key? ‘© Artificial Keys are the keys that are used when no attributes contain all the properties of the Primary Key or if the Primary key is very large and complex. Data Independence © Data independence can be explained using the three-schema architecture. © Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level. ‘There are two types of data independence: 1. Logical Data Independence ‘© Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema. ‘© Logical data independence is used to separate the external level from the conceptual view. © Ifwe do any changes in the conceptual view of the data, then the user view of the data would not be affected. ‘© Logical data independence occurs at the user interface level. 2. Physical Data Independence ‘* Physical data independence can be defined as the capacity to change the internal schema without Loker Data ndenendence having to change the conceptual schema. Locate ‘If we do any changes in the storage size of the Physic Dats napa database system server, then the Conceptual Physical vol structure of the database will not be affected. ‘© Physical data independence is used to separate conceptual levels from the internal levels. © Physical data independence occurs at the logical interface level. DBMS vs. File System File System Approach File based systems were an early attempt to computerize the manual system. It is also called a traditional based approach in which a decentralized approach was taken where each department stored and controlled its own data with the help of a data processing specialist. The main role of a data processing specialist was to create the necessary computer file structures, and also manage the data within structures and design some appl ion programs that create reports based on file data. Consider an example of a student's file system. The student file will contain information regarding the student roll no, student name, course etc.). Similarly, we have a subject file that contains information about the subject and the result file which contains the information regarding the result. Some fields are duplicated in more than one file, which leads to data redundancy. So to overcome this problem, we need to create a centralized system, i.e. DBMS approach. Spanned Unspanned Mapping In computer storage, spanned and unspanned mapping are two different methods of storing data on a disk, Unspanned mapping is a simple method where each file is stored contiguously on the disk. This means that the file is stored in a single location and can be accessed easily. The main advantage of this method is that itis fast to access the file as the operating system only needs to access one location on the disk to retrieve the entire file, However, the downside is that if the file size changes, it may not fit in the original location and additional space may need to be allocated, leading to fragmentation of the disk Spanned mapping, on the other hand, is a method where a file is stored in non-contiguous locations on the disk. This is necessary when a file is too large to fit into a single contiguous location. In this case, the file is broken into smaller parts, which are then stored in different locations on the disk. The operating system keeps track of the different parts of the file and retrieves them as needed. This method reduces fragmentation, but it takes longer to access the file because the operating system needs to access multiple locations on the disk to retrieve the entire file. Sorted vs unsorted file structure In a computer system, data is often stored in files for easy access and retrieval. Two common file structures are sorted and unsorted files. In an unsorted file, the records are stored in no particular order. This means that records can be added or deleted from the file without affecting the order of the other records. However, searching for a specific record can take longer since all records must be scanned until the desired record is found. Ina sorted file, records are arranged in a specific order, such as alphabetically or numerically. This makes searching for a specific record much faster since binary search or other efficient search algorithms can be used. However, adding or deleting records from a sorted file requires rearranging the order of the records, which can be time-consuming and require more resources. ‘The choice between sorted and unsorted file structure depends on the specific application requirements. If the main operation is searching for specific records, then a sorted file structure may be more appropriate. If the main operation is adding or deleting records, then an unsorted file may be more efficient, Dems: A database approach is a well-organized collection of data that are related in a meaningful way which can be accessed by different users but stored only once in a system. The various operations performed by the DBMS system are: Insertion, deletion, selection, sorting etc. sere Users In the above figure, duplication of data is reduced due to centralization of data. ‘There are the following differences between DBMS and File systems: Basle DBMS Approach Fle System Approach Meaning DBMS is a collection of data. In DBMS, ‘The file system is a collection of data, In this the user is not required to write the system, the user has to rite the procedures for procedures. managing the database. ‘Sharing of data Due to the centralized approach, data Data is distributed in many files, and it may be of sharing is easy. diferent format, sot isnt easy to she Data Abstraction DBMS gives an abstract view of data ‘The file system provides the detall of the data that hides the details representation and storage of data, Security and Protection Recovery Mechanism ‘Manipulation Techniques ‘concurrency Problems Where to use cost Data Redundancy Inconsistency DBMS provides @ good protection mechanism. DBMS. provides a crash recovery mechanism, Le., DBMS protects the user from system f DBMS contains a wide variety of sophisticated techniques to store and retrieve the data DBMS takes care of Concurrent access of data using some form of locking. Database approach used in large systems which interrelate many files The database system is expensive to design. Due to the centralization of the database, the problems of data redundancy and Inconsistency are controled It Iont easy to protect a file under the file system, The file system doesnt have @ crash mechanism, ie, if the system crashes while centering some data, then the content ofthe file ‘The fle system cant efficiently store and retrieve the data, Inthe File system, concurrent access has many problems like redirecting the fle while deleting ‘some information or updating some information. File system approach used in large systems which interrelate many files ‘The filesystem approach is cheaper to design. In this, the files and application programs are created by diferent programmers so that there ‘exists a fot of duplication of data which may lead to inconsistency. structure Data Independence Integrity constraints Data Models Flexibility Examples ‘The database structure is complex to design In this system, Data Independence exists, andi can be of two types. Logical Data Independence Physical Data Independence Inegrity Constraints In the database approach, 3 types of data models exist: Hierarchal data models Network data models Relational data models, Changes are often a necessity to the content of the data stored in any system, and these changes casily with a database approach, Oracle, SQL Server, Sybase et. ‘The fle system approach has a simple structure, In the Filesystem approach, there exists no Data Independence. fe system. Inthe fil system approach, ther is no concept of data models exists ‘The flexibility of the system is less as compared tothe DBMS approach. Cobol, c+ ete Database Management Challenges 1. Increased problem in the appearance of data. 2. Limitations on mitigation 3. Increase data volume.4. Data security.5. Data Management and Distribution. ‘What are the components of DBMS? Hardware, Software, Data, Database Access Language, Procedures and Users all together form Indexing in DBMS. © Indexing is used to optimize the performance of a database by mit the number of | m= | accesses required when a query is processed. © The index is a type of data structure. It is used to locate and access the data in a database table quickly. Index structure: Indexes can be created using some database columns. Search key Dam Reference Fig: Structure of Index ©The first column of the database is the search key that contains a copy of the primary key or candidate key of the table. The values of the primary key are stored in sorted order so that the corresponding data can be accessed easily. ‘The second column of the database is the data reference. It contains a set of pointers holding the address of the disk block where the value of the particular key can be found. ees Indexing Methods \ / \. /\ JN \ = = The indices are usually sorted to make searching faster. The indices which are sorted are known as ordered ini Primary Index © Ifthe index is created on the basis of the primary key of the table, then it is known as primary Indexing. These primary keys are unique to each record and contain 1:1 relation between the records. As primary keys are stored in sorted order, the performance of the searching operation is quite efficient. The primary index can be classified into two types: Dense index and Spars Dense index © The dense index contains an index record for every search key value in the data fi searching faster. © Inthis, the number of records in the index table is same as the number of records in the main table. w 1 ‘Agra | 1604300 © Itneeds more space to store index Usa usa | Chicago [2,789,378 record itself. The index records have Nepal Nepal | Kathmandu | 1,456,634 the search key and a pointer to the UK te UK | Cambridge | 1.360368 actual record on the disk. Sparse index © Inthe data file, index record appears only for a few items. Each item points to a block. e a Nepal sy Chie 2789378 main table, the index points to the records in the a ~ main table in a gap. WK Se Kathmandu | 1456634 UW | Cami | 130036 © Inthis, instead of pointing to each record in the Clustering Index © Acluster index can be defined as an ordered data file. Sometimes the index is cr non-primary key columns which may not be unique for each record. © Inthis case, to identify the record faster, we will group two or more columns to get the unique value and create index out of them. This method is called a clustering index. 9 The records which have si lar character jos are grouped, and indexes are created for these group. ‘The previous schema is little confusing because one disk block is shared by records which belong to the different cluster. If we use separate disk block for separate clusters, then it is called better technique. Secondary Index In the sparse indexing, as the size of the table grows, the size of mapping also grows. These mappings are usually kept in the primary memory so that address fetch should be faster. Then the secondary memory searches the actual data based on the address got from mapping. If the mapping size grows then fetching the address itself becomes slower. In this case, the sparse index will not be efficient. To overcome this problem, secondary indexing is introduced. In secondary indexing, to reduce the size of mapping, another level of indexing is introduced. In this method, the huge range for the columns is selected initially so that the mapping size of the first level becomes small. Then ich range is further divide into smaller ranges. The mapping of the first level is stored in the primary memory, so that address fetch is faster. The mapping of the second level and actual data are stored in the secondary memory (hard disk).

You might also like