1181 BD Databases-Ramaz
EQUIJOIN and NATURAL JOIN overview of, 245 LIS (local internal schema), 889
variations, 159–161 specialization and generalization, List constructor, 359
hybrid hash-join, 696 269 Literals (values), 378–382
implementing, 689–690 Label-based security atomic formulas as, 973
join selection factors, 693–694 administrator defining policy for, atomic literals, 378
multiple relation queries and 853 collection literals, 382
JOIN ordering, 718–719 Oracle Label Security, 868–870 complex types for, 358–360
nested-loop joins, 690–693 Label Security administrator, 853 in OO systems, 358
overview of, 157–158 Labels, semistructured data and, 417 structured literals, 378
partition-hash joins, 694–696 LANs (local area networks), 44, 879 Loading/converting data, in data-
Join operations, in QBE, 1094–1095 Large databases, 304 base application life cycle, 308
Join selection factors, 693–694 Latches, for short term locks, 802 Loading databases, initial state and,
Join selectivity ratio, 160, 715 Late (dynamic) binding, in ODMS, 33
Joined tables (relations), 123–124 368 Loading utility, for loading data files
JPEG image format, 966 Latency. See Rotational delay (rd) to database, 42–43
K-means clustering algorithm, Lattices, for specialization, 255–256 Local area networks (LANs), 44, 879
1055–1056 LCS (local conceptual schema), 889 Local conceptual schema (LCS), 889
KDD (Knowledge Discovery in LDAP (Lightweight Directory Access Local internal schema (LIS), 889
Databases), 1036 Protocol), 919–921 Local query optimization, 902
Key attribute, 209 Leaf classes, 265 Local schema, in federated database
Key constraints Left-deep trees, 718 architecture, 890
on entity attributes, 208–209 Leaf nodes, of tree structures, 646 Localization, in distributed query
integrity constraints in databases, Learning approaches processing, 901
21 classification and, 1051 Location analysis, for spatial data-
overview of, 68–70 clustering and, 1054 bases, 959
specifying in SQL, 95–96 neural networks and, 1058 Location transparency, 880
Key field, sorted files and, 603 Legacy data models, 51 Lock compatibility table, 792
Keys Legacy database systems, 49, 60 Lock manager subsystem, in DBMS,
candidate and primary in rela- Legal relation states (legal exten- 779
tional databases, 518–519 sions), 514 Lock table, 779
indexes on multiple, 660–661 Levels of isolation, of transaction, Locking. See also Two-phase locking
methods for simple selection, 686 755 for concurrency control, 777
in ODMG object model, 388–390 Libraries of functions. See Function granularity level in, 795–796
specifying in XML schema, 430 calls, database programming index locking and predicate lock-
Keyword queries with ing, 801
overview of, 39 Lifespan temporal attributes, multiple granularity level,
searching with, 995 953–954 796–798
types of queries in IR systems, Lightweight Directory Access used in indexes, 798–800
1007 Protocol (LDAP), 919–921 Locks
Kleinberg, Jon, 1021 LIKE comparison operator, in string binary, 778–780
Knowledge-based systems, 932, 1007 pattern matching, 105 certify locks, 792–793
Knowledge Discovery in Databases Linear hashing, 614–616 conversion of, 782
(KDD), 1036 Linear regression, 1058 shared/exclusive (read/write),
Knowledge discovery process Linear searches 780–782
data mining in, 1036–1037 with brute force algorithm, in two-phase locking, 778
goals of, 1037–1038 685–686 Log buffers, 753–754
types of knowledge discovered, cost functions for SELECT opera- Log records, 753
1038–1039 tions, 713 Log sequence number (LSN), 822
KR (knowledge representation) of file blocks on disk, 597 Logic databases, 932. See also
aggregation and association, of files, 602 Deductive database systems
269–271 Lines, on maps, 960 Logic programming, 970
classification and instantiation, Link structure, of Web pages, Logical (conceptual) level, goodness
268 1020–1021 of relation schemas and, 501
compared with semantic data Linked allocation, of file blocks on Logical data independence, in three-
models, 267–268 disk, 597 schema architecture, 35–36
identification, 269 Links, in UML class diagrams, 227 Logical data models, 341
Index 1153
collection operators in OQL, Order preserving functions, hashing Parser, checking query syntax with,
403–405 and, 609 679
comparison operators in SQL, 98 Ordered (indexed) Partial categories, 260
concatenate operator in PHP, 485 collection expressions in OQL, 405 Partial dependencies, 523
grouping operator in QBE, cost functions for SELECT opera- Partial keys, 219
1095–1098 tions, 713 Partial order, of transaction sched-
logical connectives. See AND, OR, query results in SQL, 106–107 ule, 757
NOT connectives Ordered (sorted files), in records, Partial specialization, 253–254, 264
overloading. See Polymorphism 603–606 Partially committed state, transac-
(operator overloading) Ordering field, file organization and, tions, 752
relational, 980–981, 983 603 Partially replicated catalogs, 913
SELECT operator (σ), 147–149 Ordering key, sorted files and, 603 Participation constraints, on binary
spatial, 960–961 Organization context, for database relationships, 217
Operators, database workers behind systems, 304–307 Partition algorithm, for local fre-
the scene, 17 OSs (operating systems) quent itemsets, 1047
Optical jukebox memories, 586 DBMS access and disk read/write, Partition-hash joins
Optimization, in data mining, 1038 40 methods for implementing joins,
Optimizing queries. See Query pro- multiprogramming, 744–745 690
cessing and optimizing support for transaction process- overview of, 694–696
Optional fields, in file records, 595 ing in distributed databases, Partitioned hashing, 661–662
OQL (object query language) 909 Passwords, DBAs assigning, 839
collection operators and, 403–405 OUTER JOIN operations Path expressions
extracting single elements from implementing, 699–700 dot notation for building path
singleton collections, 403 vs. inner joins, 160–161 expressions in SQL, 376
group by clause in, 405–406 overview of, 169–170 in OQL, 400
in ODMG standard, 376, 398 in SQL, 123–124 specifying with XPath, 432–434
ordered (indexed) collection Outer queries, 117 Patterns
expressions, 405 OUTER UNION operation, in rela- analysis phase of Web usage
overview of, 398–399 tional algebra, 170–171 analysis, 1027
query results and path expres- Outliers, spatial, 965 data mining for discovering, 1057
sions, 400–402 Overflow (transaction) files, 605 substring pattern matching in
simple OQL queries, database Overlapping SQL, 105–106
entry points, and iterator vari- entity sets, 253 within time series, 1039
ables, 399–400 specialization and, 264 PEAR (PHP Extension and
specifying views as named OWL (Web Ontology Language), Application Repository), 491
queries, 402–403 969 Peer-to-peer database systems, 915
OR logical connective. See AND, OR, Owner accounts, granting/revoking Performance
NOT connectives privileges, 843–844 advantages of distributed data-
Oracle Package diagrams, UML, 330 bases for, 882
Cartridge, 931 PageRank algorithm, 1021 DBMS utilities for monitoring, 43
distributed databases, 915–919 Parallel architecture, for servers, Persistence
query optimization in, 721–722 1079 collections, 367
Oracle Internet Directory, 919–921 Parallel database management data, 586
Oracle Label Security systems, vs. distributed archi- objects, 363–364, 378
architecture of, 869 tecture, 887–888 Persistent storage, of program
combining data labels and user Parallel processing objects in databases, 19
labels, 869–870 on disks, 593–594 Persistent stored modules (PSM),
overview of, 868 handling multiple processes, 745 474–476
virtual private database technolo- Parameterized statements (bind Personal databases, 305
gy, 868–869 variables), protecting against Personalization, of information in
ORDBMS (object-relational data- SQL injection, 858 Web searches, 1019
base management systems), 354 Parameters Personnel costs, in choosing a
ORDER BY clause, SQL disk blocks (pages), 1087–1089 DBMS, 323–324
ordering query results, 106–107 SQL/PSM (SQL/Persistent Stored PGP (Pretty Good Privacy), 854
in retrieval queries, 129–130 Modules), 474 Phantom records, concurrency con-
sorting query results, 682–683 Parametric users, interfaces for, 39 trol techniques, 800–801
converting query trees into query distributed query processing RDBs (relational databases)
execution plans, 709–710 using semijoin operation, 904 designing. See relational database
cost components of query execu- overview of, 901–902 design
tion, 711–712 query update and decomposition, overview of, 395–396
cost functions for JOIN, 715–718 905–907 schemas. See relational database
cost functions for SELECT, Query results schemas
713–715 cursors for looping over tuples in, RDF (Resource Description
DBMS module for, 20 450 Framework), 436
disjunctive selection conditions, ordering, 106–107 Reachability, of objects, 363
688 path expressions and, 400–402 Read command, hard disks, 591
external sorting, 682–685 retrieval queries from database Read-only transaction, 745
heuristic algebraic optimization tables, 494–495 READ operation, transactions, 751
algorithm, 708–709 Query (transaction) server, in two- Read (or Get) operation, on files, 600
heuristic optimization of query tier client/server architecture, Read phase, of optimistic concur-
trees, 703–706 47 rency control, 794
heuristics used in query optimiza- Query trees Read-set, of transaction, 747
tion, 700–701 converting into query execution Read timestamp, 789
hybrid hash-join, 696 plans, 709–710 Read-write conflicts, in transaction
implementing JOIN operations, creating, 679 schedules, 757
689–690 notation for, 163–165, 701–703 Read/write heads, on hard disks, 591
implementing SELECT opera- optimization of, 703–706 Read/write, OSs controlling disk
tions, 685 R-Trees, for spatial indexing, 962 read/write, 40
join selection factors, 693–694 RAID (Redundant Array of Read-write transactions, 745–747
multiple relation queries and Inexpensive Disks) read_item(X), 746
JOIN ordering, 718–719 levels, 620–621 Real-time database technology, 3
nested-loop joins, 690–693 overview of, 617–619 Reasoning mechanisms, in knowl-
notation for query trees and performance improvements, edge representation, 268
query graphs, 701–703 619–620 Recall metrics, in IR, 1015–1017,
operations, 700 reliability improvements, 619 1019
OUTER JOIN operations, 699–700 RAM (Random Access Memory), Recall/precision curve, in IR, 1017
overview of, 679–681 585 Record-at-a-time DMLs, 38
partition-hash joins, 694–696 Random access storage devices, 592 Record-based data models, 31
PROJECT operations, 696–697 Randomizing function (hash func- Record pointers, 609
query optimization in Oracle, tion), 606 Records. See also Files (of records)
721–722 Range queries, 686, 961 anchor record (block anchor), 633
search methods for complex Range relations, of tuple variables, blocking, 597
selection, 686–687 175–176 catalog information used in query
search methods for simple selec- Rational Rose cost estimation, 712
tion, 685–686 data modeler, 338 fixed-length and variable-length,
selectivity and cost estimates in database design with, 337 595–597
query optimization, 710–711 tools and options for data model- inserting, 493–494
selectivity of conditions and, ing, 338–342 mixed, 616–617
687–688 RBAC (role-based access control), ordered (sorted files), 603–606
semantic query optimization, 851–852 phantom records, concurrency
RGB (red, green, blue) colors, 967 control techniques, 800–801
set operations, 697–698 RDBMS (relational database man- placing file records on disk, 594
summary and exercises, 723–725 agement systems) spanned/unspanned, 597–598
transformation rules for relation- creating indexes, 731 in SQL/CLI, 464–468
al algebra operations, 706–708 ORDBMS (object-relational data- types of, 594–595
translating SQL queries into rela- base management systems), unordered (heap files), 601–602
tional algebra, 681–682 354 Recoverability, transaction
Query processing and optimizing, in providing application flexibility, schedules based on, 757–759
distributed databases 23–24 Recovery. See also Backup and
data transfer costs for distributed two-tier client/server architec- recovery; Database recovery
query processing, 902–904 tures and, 46 techniques
Reset operations, on files, 599 Root tag, XML documents, 423 testing conflict serializability of,
Resource Description Framework Roots, of tree structures, 646 763–765
(RDF), 436 Rotation. See Pivoting (rotation) Schema
Response time, physical database Rotation invariant feature transform conceptual design, 313–321
design and, 326 (RIFT), 968 entity type describing for entity
Restrict option, of delete operation, Rotational delay (rd) sets, 208
77 as disk parameter, 1087 instances and database state and,
Result equivalence, of transaction on hard disks, 591 32–33
schedules, 762 Row-level access control, 852–853 ontologies and, 272
Result relations, 75 Row-level triggers, 937 relational. See Relational database
Result tables, in QBE, 1095 Rows. See Tuples (rows) schemas
Retrieval operations Rows, in SQL, 89 relational data model and, 70–73
database design and, 728 RSA encryption algorithm, 865 three-schema architecture. See
from database tables, 494–495 Rule consideration, in active Three-schema architecture
on files, 599 databases Schema construct, 32, 222
modes of interaction in IR deferred consideration, 942 Schema diagram, 32
systems, 999 overview of, 938–939 Schema evolution, 33
objects, 362 Rule-defined predicates (views), Schema matching, types of Web
QBE (Query-By-Example), 978 information integration, 1023
1091–1095 Rule sets, in active database systems, Schema, SQL
types of relational data model 938 change statements, 137–139
operations, 75 Rules, in deductive databases names, 89
Retrieval transactions, 322 interpretation of, 975–977 overview of, 89–90
Retroactive update, valid time rela- overview of, 21, 932 Schema (view) integration, 316–317,
tions and, 949 in Prolog/Datalog notation, 319–321
Return values, of PHP functions, 970–972 Schemaless XML documents, 422
490 safe, 979–980 Scientific applications, 25
Reverse engineering, Rational Rose Runtime database processor Scope, variable, 490
and, 338 DBMS component modules, 42 Scripting languages, PHP as, 482
Revoking privileges, 844, 845–846 query execution and, 679 SCSI (Small Computer System
Rewrite blocks, file organization Runtime, specifying SQL queries at, Interface), 591
and, 602 458–459 SDL (storage definition language),
Rewrite time, as disk parameter, Safe expressions, in tuple relational 37, 110
1089 calculus, 182–183 Search engines
RIFT (rotation invariant feature Safe rules, in deductive databases, overview of, 998–999
transform), 968 979–980 vertical and metasearch, 1018
Rigorous two-phase locking, 785 Sampling algorithm, in data mining, Search fields, 648
Rivest, Ron, 865 1042 Search trees, 647–649
ROLAP (relational OLAP), 1079 SANs (Storage Area Networks), Searches
Role-based access control (RBAC), 621–622 conversational, 1029–1030
851–852 Saturation, hue, saturation, and faceted, 1028–1029
Role hierarchy, in role-based access value (HSV), 967 information retrieval. See IR
control, 851 SAX (Simple API for XML), 423 (Information Retrieval)
Role names, and recursive relation- Scale-invariant feature transform measures of relevance, 1014–1015
ships, 215 (SIFT), 968 methods for complex selection,
Roll-up display Scan operations, files, 600 686–687
functionality of data warehouses, Scanner, for SQL, 679 methods for simple selection,
1078 Schedules (histories), of transactions 685–686
working with data cubes, characterizing based on recover- navigational, informational, and
1070–1072 ability, 757–759 transactional, 996
ROLLBACK (or ABORT) operation, characterizing based on serializ- social searches, 1029
752 ability, 759–760 Web. See Web search and analysis
Rollbacks, in database recovery, equivalence of, 768–770 Second normal form (2NF)
813–815, 950 overview of, 755–757 general definition of, 526–527
Root element, XML schema lan- serial, nonserial, and conflict- overview of, 523
guage, 429 serializable schedules, 761–763 Secondary access path, 631
Secondary file organization, 587 Selective inheritance, in ODBs describing knowledge discovered
Secondary indexes (object databases), 368 by data mining, 1039
advantages of, 668 Selectivity and cost estimates, in discovery of, 1057
cost functions for SELECT, 714 query optimization in pattern discovery phase of Web
methods for simple selection, 686 catalog information used in cost usage analysis, 1027
overview of, 636–642 functions, 712–713 Serial schedules, 761
tables comparing index types, 642 cost components of query execu- Serializability, of transaction
types of ordered indexes, 632–633 tion, 711–712 schedules
Secondary keys, 636 cost functions for JOIN, 715–718 characterizing schedules based
Secondary storage, 584, 711 cost functions for SELECT, on, 759–760
Secret key algorithms, 863 713–715 serial, nonserial, and conflict-
Sectors, of hard disk, 589 multiple relation queries and serializable schedules, 761–763
Security JOIN ordering, 718–719 testing conflict serializability of
vs. precision, 841 overview of, 710–711 schedules, 763–765
Web security, 1028 Selectivity, of conditions, 687–688 used for concurrency control,
Security and authorization subsys- Self-describing data, 10–11, 416 765–768
tem, DBMS, 19 Semantic constraints view serializability, 768–769
Security, database. See Database relational model constraints, 68 Serialization (precedence) graph,
security template dependencies and, 572 763–765
Seek time (s) types of constraints, 74 Servers
as disk parameter, 1087 Semantic data models client program calling database
on hard disks, 591 abstraction concepts in, 268 server, 451
Segmentation, automatic analysis of aggregation and association, database servers, 42
images, 967 269–271 DBMS module for, 29
SELECT command, SQL classification and instantiation, 268 parallel architecture for, 1079
aggregate functions used in, 125 compared with knowledge repre- PHP variables, 490–491
basic form of, 97–98 sentation, 267–268 server level in two-tier client/
FROM clause, 107 ER (Entity-Relationship) model, server architecture, 47
DISTINCT keyword with, 103 245 specialized servers in client/server
information retrieval with, 97 identification, 269 architecture, 45–46
projection attributes and selec- for information retrieval, Set-at-a-time DMLs, 38
tion conditions, 98, 100 1006–1007 Set constructor, 359
in SQL retrieval queries, 129–130 specialization and generalization, SET DIFFERENCE operation
SELECT-FROM-WHERE structure, 269 algorithms for, 697–698
of SQL queries, 98–100 Semantic query optimization, in relational algebra, 152–155
SELECT operations 722–723 Set null (set default) option, in
cost functions for, 713–715 Semantic relationships, in semantic delete operations, 77–78
disjunctive selection conditions, model for IR, 1006 Set operations
688 Semantic Web, 272–273 algorithms for, 697–698
on files, 599 Semantics query processing and optimizing,
implementing, 685 approach to IR, 1000 697–698
in relational algebra, 147–149 of attributes, 503–507, 514 SQL, 104
search methods for complex equivalence of transaction sched- Set types, in network data model, 51
selection, 686–687 ules and, 769–770 Sets
search methods for simple selec- heterogeneity of in federated equivalence of, 549
tion, 685–686 databases, 886–887 explicit sets of values in SQL, 122
selectivity of conditions, 687–688 integrity constraints and, 21 SQL table as multiset of tuples, 97
SELECT operator (σ), 147 tagging images, 969 tables as, 103–105
Select-project-join queries, 179 Semijoin operation, 904 Shadow directory, 820
Selection cardinality, 712 Semistructured data, 416–417 Shadow paging, 820–821
Selection conditions Separators, XPath, 432 Shamir, Adi, 865
in domain calculus, 184 Sequence diagrams, UML, 329, 331 Shape, automatic analysis of images,
SELECT command and, 98, 100 Sequential order, in accessing data 967
SELECT operation and, 147 blocks, 592 Shape descriptors, 965
Selection, functionality of data Sequential patterns Shared nothing architecture,
warehouses, 1079 in data mining, 1037 887–888
Shared subclasses (multiple inheri- Social searches, 1029 Specialized servers, in client/server
tance), 256, 297 Software costs, choosing a DBMS, architecture, 45
Shared variables, embedded SQL 323 Specific attributes (local attributes),
and, 452 Software developers, 16 of subclass, 249
Sharing data and multiuser transac- Software engineers Specific relationship types, sub-
tions, 13–14 database actors on the scene, 16 classes and, 249–250
Sharing databases, 6 design and testing of applications, Specification, conceptualization and,
Shrinking (second) phase, in two- 199 272
phase locking, 782 Sort-merge joins Speech input and output, queries
SIFT (scale-invariant feature cost functions for, 717 and, 39
transform), 968 methods for implementing joins, SQL-99, 942–943
Simple API for XML (SAX), 423 689–690 SQL/CLI (Call Level Interface)
Simple (atomic) attributes, in ER Sort-merge strategy, 683 database programming with,
model, 205–207 Sorting 464–468
Simple Object Access Protocol external, 682–685 library of functions, 448
(SOAP), 436 functionality of data warehouses, SQL injection attacks
Simultaneous update, 949 1078 code injection, 856
Single inheritance, subclasses and, implementing aggregate opera- function call injection, 856–857
256–257 tions, 699 protecting against, 858
Single-level indexes ordered records (sorted files), risks associated with, 857–858
clustering indexes, 635–636 603–606 SQL manipulation, 856
overview of, 632–633 Space utilization, physical database types of, 855
primary indexes, 633–635 design and, 326 SQL programming techniques
secondary indexes, 636–642 Spamming, Web spamming, 1028 approaches to database program-
tables comparing index types, 642 Spanned/unspanned organization, ming, 449–450
Single-loop joins of records, 597 bibliographic references, 479
cost functions for, 716 Sparse indexes, 633 database programming tech-
methods for implementing joins, Spatial analysis, 959 niques and issues, 448–449
689 Spatial applications, 25 dynamic SQL, 448, 458–459
Single-quoted strings, PHP text Spatial databases embedded SQL. See Embedded
processing, 485–486 applications of spatial data, SQL
Single-relation options, for mapping 964–965 function calls. See Function calls,
specialization or generalization, data indexing, 961–963 database programming with
295 data mining, 963–964 impedance mismatch, 450
Single-sided disks, 589 data types and models, 959–960 overview of, 447–448
Single time points, in temporal dynamic operators, 961 sequence of interactions in, 451
databases, 946 operators, 960–961 SQL/PSM (SQL/Persistent Stored
Single-user systems, 49 overview of, 957–959 Modules). See SQL/PSM (SQL/
Single-user transaction processing Spatial joins/overlays, 961 Persistent Stored Modules)
system, 744–745 Spatial outliers, 965 summary and exercises, 477–478
Single-valued attributes, in ER Special purpose DBMSs, 50 SQL/PSM (SQL/Persistent Stored
model, 206 Specialization/generalization Modules)
Singular value decompositions constraints on, 251–254 overview of, 473
(SVD), 967 definitions, 264 specifying persistent stored
Slice and dice, functionality of data design choices for, 263–264 modules, 475–476
warehouses, 1078 EER-to-Relational mapping, stored procedures and functions,
Small Computer System Interface 294–297 473–475
(SCSI), 591 generalization, 250–251 SQL (Structured Query Language).
SMART document retrieval system, hierarchies and lattices, 254–257 See also Embedded SQL
998 in knowledge representation, 269 * (asterisk) for retrieving all
SMP (symmetric multiprocessor), notation for, 1084–1085 attribute values of selected
1079 refining conceptual schemas, tuples, 102–103
Snowflake schema, for multidimen- 257–258 aliases, 101–102
sional data models, 1073–1074 specialization, 248–250 bibliographic references, 114
SOAP (Simple Object Access UML (Unified Modeling CHECK clauses for specifying
Protocol), 436 Language), 265–266 constraints on tuples, 97
clauses in simple SQL queries, CREATE VIEW command, Statechart diagrams, UML, 329, 333
107 134–135 Statement-level active rules, in
common data types, 92–94 DROP command, 138 STARBURST example, 940–942
CREATE TABLE command, 90–92 EXISTS and NOT EXISTS func- Statement-level triggers
data definition in, 89 tions, 120–122 overview of, 937
dealing with ambiguous attribute explicit sets and renaming of in STARBURST example, 940
names, 100–101 attributes, 122 Statement records, in SQL/CLI,
DELETE command, 109 GROUP BY clause, 126–129 464–468
embedding SQL commands in HAVING clause, 127–129 Static (early) binding, in ODMS,
Java, 459–461 inline views, 137 368
external sorting, 682–685 nested queries, 117–119 Static files, 601
INSERT command, 107–109 outer and inner joins, 123–124 Static hashing, 610
list of features in, 110–111 overview of, 115 Static Web pages, 420
manipulation by SQL injection schema change statements, 137 Statistical analysis, in pattern dis-
attacks, 856 summary and exercises, 139–143 covery phase of Web usage
missing or unspecified WHERE UNIQUE function, 122 analysis, 1026
clauses, 102 view implementation and update, Statistical approach, to IR,
naming constraints, 96–97 135–137 1000–1002
object-relational features in, 354 views (virtual tables) in, 133–134 Statistical database security, 859–860
ordering query results, 106–107 SQL (Structured Query Language), Statistical databases, 837–838, 874
overview of, 87–89 ODB extensions to Statistical queries, 859
QBE compared with, 1098 dot notation for building path
schema and catalog concepts in, expressions, 376 in database recovery, 811–812
89–90 encapsulation of operations, UNDO/REDO recovery algorithm,
SELECT-FROM-WHERE structure 374–375 819
of queries, 98–100 inheritance and polymorphism, Stem, of words, 1010
servers, 47 375–376 Stemming, text preprocessing in
specifying attribute constraints OIDs (object identifiers) using information retrieval, 1010
and default values, 94–95 reference types, 373–374 Stopwords
specifying key and referential overview of, 369–370 in keyword queries, 1007
integrity constraints, 95–96 specifying relationships via refer- removal, 1009–1010
substring pattern matching and ence, 376 text/document sources, 966
arithmetic operators, 105–106 tables based on UDTs, 374 Storage
summary and exercises, 111–114 UDTs and complex structures for allocation of file blocks on disk,
tables as sets in, 103–105 objects, 370–373 598
temporal data types, 945 SQLJ bibliographic references, 630
transaction support, 770–772 embedding SQL commands in buffer management and, 593–594
translating SQL queries into rela- Java, 459–461 column-based storage of rela-
tional algebra, 681–682 retrieving multiple tuples using tions, 669–670
UDT (user-defined types) in, 111 iterators, 461–464 cost components of query execu-
UPDATE command, 109–110 SQLCODE communication variable, covert channels, 861
SQL (Structured Query Language), 454 covert channels, 861
advanced features SQLSTATE communication variable, database storage, 586–587
aggregate functions, 124–126 454 database storage reorganization,
ALTER command, 138–139 Standards 43
bibliographic references, 143 database approach and, 22 database tuning and, 733
clauses in retrieval queries, database design specification, 328 file headers (descriptors) and, 598
129–130 SQL, 88 file systems and. See Files (of
comparisons involving NULL and Star schema, 1073 records)
three-valued logic, 116–117 Starvation, concurrency control files, fixed-length records, and
correlated nested queries, and, 788 variable-length records,
119–120 State 595–597
CREATE ASSERTION command, in ODMG object model, 382 hardware structures of disk
131–132 relational database state, 70–72 devices, 588–592
CREATE TRIGGER command, transaction, 751–752 iSCSI (Internet SCSI), 623–624
132–133 State constraints, 75 magnetic tape devices, 592–593
measuring capacity, 585 specific attributes (local attrib- recovery needed due to system
memory hierarchies and, 584–586 utes) of, 249 error, 750
NAS (network-attached storage), specific relationship types and, security issues at system level, 836
622–623 249–250 System designers, 16
overview of, 583–584 union types or categories, System environment
parallelization of access. See RAID 258–260 DBMS module, 40–42
(Redundant Array of Subset of Cartesian product, 63 tools, application environments,
Inexpensive Disks) Subsets, of attributes, 68–69 and communication facilities,
placing file records on disk, 594 Substring pattern matching, in SQL, 43–44
record blocking and, 597 105–106 utilities for, 42–43
records and record types, 594–595 Subtrees, 646 System independent mapping, in
SANs (Storage Area Networks), Subtypes, 247, 365–366 choosing a DBMS, 326
621–622 SUM function System logs. See also Logs/logging
secondary storage devices, 587 aggregate functions in SQL, auditing and, 839–840
spanned/unspanned records, 124–125 database recovery and, 808
597–598 grouping and, 166, 168 tracking transaction operations,
summary and exercises, 624–630 implementing aggregate opera- 753–754
Storage Area Networks (SANs), tions, 698 Systems analyst, 16
621–622 Superclass/subclass relationships Table inheritance, in SQL, 376
Storage definition language (SDL), in EER model, 264 Tables
37, 110 overview of, 247 ALTER TABLE command, 138–139
Storage medium, physical, 584 union types or categories, assigning privileges at table level,
Stored attributes, in ER model, 206 258–260 842–843
Stored data manager module, Superclasses base tables (relations) vs. virtual
DBMS, 40, 42 base class and, 265 relations, 90
Stored procedures, 21, 473–475 in EER model, 246–248, 264 basing on UDTs, 374
Stream-based processing, 700 generalization and, 250 DROP TABLE command, 138
Streaming XML documents, 423 options for mapping specializa- in relational model, 60, 61
Strict hierarchies, 255 tion or generalization, 294 retrieval queries from database
Strict schedules, 759 specialization and, 248 tables, 494–495
Strict timestamp ordering, 790–791 Superkeys in SQL, 89
Strict two-phase locking, 784–785 defined, 518 SQL table as multiset of tuples,
Strings relational model constraints, 69 97, 103–105
pattern matching, 105 Supertypes, 247, 365 virtual. See Views
PHP text processing, 485 Superuser accounts, 838 Tags
Strong entity types, 219, 287 Supervised learning HTML, 418–419
Struct (tuple) constructors, 358–359 classification and, 1051 semistructured data and, 417
Structural constraints, of relation- neural networks and, 1058 Tape jukeboxes, 586
ships, 218 Support, for association rules, 1040 Tape, magnetic, 592–593
Structural diagrams, UML, 329 Surrogate keys, 298 Tape reel, 592
Structured data Survivability, challenges in database Taxonomies, 272
extracting, 1022 security, 867 Technical metadata, in data ware-
overview of, 416 SVD (singular value decomposi- housing, 1078
vs. unstructured, 993–994 tions), 967 Templates
Structured domains, in UML class Symmetric key algorithms, 863 dependencies, 572
diagrams, 227 Symmetric multiprocessor (SMP), in Query-By-Example, 1091
Structured literals, 378 1079 Temporal aggregation, 957
Subclasses Synonyms, thesaurus as collection Temporal databases
in EER model, 246–248, 264 of, 1010 attribute versioning for incorpo-
generalizing into superclasses, 250 Syntactic analysis, in semantic rating time in OODBs,
as leaf classes in UML, 265 model for IR, 1006 953–954
options for mapping specializa- System bitemporal time relations,
tion or generalization, 294 accounts, 838 950–952
predicate-defined and user- catalog, 42 options for storing tuples in tem-
defined, 252 definition in database application poral relations, 952–953
shared, 256 life cycle, 308 overview of, 943–945
  querying constructs using TSQL2 language, 954–956
  time representation, calendars and time dimensions, 945–947
  time series data, 957
  transaction time relations, 949–950
  valid time relations, 947–949
Temporal intersection join, 952
Temporal normal form, 952
Temporal variables, 948
Temporary updates (dirty reads), concurrency control and, 748–749
Term frequency-inverse document frequency. See TF-IDF (term frequency-inverse document frequency)
Terminated state, transactions, 752
Terms (keywords)
  modes of interaction in IR systems, 999
  sets of terms in Boolean model for IR, 1002
Ternary relationships
  choosing between binary and ternary relationships, 228–231
  constraints on, 232
  in ER (Entity-Relationship) model, 213–214
Tertiary storage, 584, 586
Testing
  conflict serializability of schedules, 763–765
  in database application life cycle, 308
Texels (texture elements), 967
Text
  preprocessing in information retrieval, 1009–1012
  sources in multimedia databases, 966
  storing XML document as, 431
Texture, automatic analysis of images, 967
TF-IDF (term frequency-inverse document frequency)
  applying to inverted indexing, 1013
  in vector space model for IR, 1003–1004
Thematic analysis, for spatial databases, 959
Theorem proving, in deductive databases, 976
Thesaurus
  ontologies, 272
  text preprocessing in information retrieval, 1010–1011
Third normal form (3NF)
  dependency-preserving and non-additive join decomposition into, 558–563
  dependency-preserving decomposition into, 558–559
  general definition of, 528
  overview of, 523–525
Thomas’s write rule, 791
Threats, to database security, 836–837
Three-phase commit (3PC) protocol, 908
Three-schema architecture
  data independence and, 35–36
  levels of, 34–35
  overview of, 33
Three-tier architectures
  client/server architecture, 892–894
  PHP, 482
  for Web applications, 47–49
Three-valued logic, 116–117
Time constraints, on queries and transactions, 729
TIME data type, 945
Time dimensions, in temporal databases, 945–947
Time periods, in temporal databases, 946
Time representation, in temporal databases, 945–947
Time series
  management systems, 957
  patterns in, 1039, 1057
  as specialized database applications, 25
  in temporal databases, 946, 957
Time-varying attributes, 953
Timeouts, for dealing with deadlocks, 788
TIMESTAMP data type, SQL, 93, 945
Timestamp ordering (TO)
  basic, 789–790
  for concurrency control, 777
  multiversion technique based on, 792
  strict timestamp ordering, 790–791
  Thomas’s write rule, 791
Timestamps
  overview of, 789
  read and write, 789
  transaction time relations and, 949
Timing channels, covert, 861
TO. See Timestamp ordering (TO)
Tool developers, 17
Tools, DBMS, 43–44
Top-down methodology
  for conceptual refinement, 257
  for database design, 502
  for schema design, 315–316
Topical relevance, in IR, 1015
Topological operators, 960
Topological relationships, among spatial objects, 959
Topologies, network, 879
Total categories, 260
Total participation, binary relationships and, 217
Total specialization constraint, 253
Tracks, on hard disks, 589
Trade-off analysis, 345
Training costs, in choosing a DBMS, 323–324
Transaction-id, 753
Transaction processing systems
  ACID properties, 754–755
  bibliographic references, 775
  characterizing schedules based on recoverability, 757–759
  characterizing schedules based on serializability, 759–760
  commit point of transactions, 754
  concurrency control, 747–750
  database design and, 306
  equivalence of schedules, 769–770
  overview of, 743–744
  recovery, 750–751
  schedules (histories) of transactions, 756–757
  serial, nonserial, and conflict-serializable schedules, 761–763
  serializability used for concurrency control, 765–768
  single-user vs. multiuser, 744–745
  SQL support for transactions, 770–772
  summary and exercises, 772–774
  system log, 753–754
  testing conflict serializability of schedules, 763–765
  transaction states and operations, 751–752
  transactions, database items, read/write operations, and DBMS buffers, 745–747
  view equivalence and view serializability, 768–769
Transaction processing systems, in distributed databases
  catalog management, 913
  concurrency control, 909–912
  operating system support, 909
  overview of, 907–908
  recovery, 912–913
  two-phase and three-phase commit protocols, 908–909
Transaction Table, in ARIES recovery algorithm, 822
Transaction time, in temporal databases, 946
Transaction time relations, in temporal databases, 949–950
Transaction timestamp, 786
Transactional databases, distinguishing data warehouses from, 1069
Transactional searches, 996
Transactions
  ACID properties, 754–755
  canned, 15
  commit point of, 754
  committed and aborted, 750
  defined, 6
  designing, 322–323
  interactive, 801
  multiuser, 13–14
  recovery needed due to transaction error, 750
  relational data model and, 79
  schedules (histories) of, 756–757
  SQL transaction control commands, 111
  states and operations, 751–752
  throughput in physical database design, 327
  types of, 745
Transfer rate (tr), disk blocks, 1088
Transformation approach, to image database queries, 966
Transience
  collections, 367
  data, 586
  object lifetime and, 378
  objects, 355, 363
Transition constraints, 75
Transition tables, in STARBURST example, 940
Transitive closure, of relations, 168
Transitive dependencies, in 3NF, 523–524
Transparency
  autonomy as complement to, 882
  in distributed databases, 879–881
Tree data models. See Hierarchical data models
Tree structures. See also B+-trees; B-trees
  decision making in database design, 730
  FP-tree (frequent-pattern tree) algorithm, 1043–1045
  leaf-deep trees, 718
  overview of, 646–647
  R-trees, 962
  search trees, 647–649
  specialization hierarchy, 255
  TV-trees (telescoping vector trees), 967
Triggers
  active rules specified by, 933
  associating with database tables, 21
  before, after, and instead triggers, 938
  CREATE TABLE command, 132–133
  CREATE TRIGGER command, 936
  creating in SQL, 111
  overview of, 932
  row-level and statement-level, 937
  specifying constraints, 74
  in SQL-99, 942–943
Truth values, of atoms, 184
TSQL2 language, 954–956
Tuning databases
  design, 735–736
  guidelines for, 738–739
  implementation and, 311
  indexes, 734–735
  overview of, 733–734
  queries, 736–738
  system implementation and tuning, 327–328
Tuple-based constraints, 97
Tuple relational calculus
  examples of queries in, 178–179
  existential and universal quantifiers, 177–178
  expressions and formulas, 176–177
  notation for query graphs, 179–180
  overview of, 174–175
  safe expressions, 182–183
  SQL based on, 88
  transforming universal and existential quantifiers, 180
  tuple variables and range relations, 175–176
  universal quantifier used in queries, 180–182
Tuple versioning approach, to implementing temporal databases, 947–953
  bitemporal time relations, 950–952
  implementation considerations, 952–953
  transaction time relations and, 949–950
  valid time relations and, 947–949
Tuples (rows)
  classification in mandatory access control, 848
  combining using JOIN operation, 157–158
  comparison of values in, 118
  component values of, 67
  dangling tuples in relational design, 563–565
  defined, 61
  disallowing spurious, 510–513
  eliminating duplicates, 150
  hypothesis tuples, 572
  n-tuple for relations, 62
  ordering in relations, 64
  ordering values within, 64–65
  reducing NULL values in, 509–510
  reducing redundant information in, 507–509
  retrieving all attribute values of selected, 102–103
  retrieving multiple tuples in SQLJ, 461–464
  retrieving multiple tuples using cursors, 455–457
  SQL table as multiset of, 97
  storing in temporal relations, 952–953
  unspecified WHERE clause and, 102
  valid time relations and, 948
  values and NULLS in, 65–66
  versioning for incorporating time in relational databases, 953
Tuple variables
  aliases and, 101
  looping with iterators, 98
  range relations and, 175–176
TV-trees (telescoping vector trees), 967
Two-phase commit (2PC) protocol
  recovery in multidatabase systems, 825–826
  transaction management in distributed databases, 908
Two-phase locking
  basic locks, 784
  binary locks, 778–780
  conversion of locks, 782
  overview of, 777–778
  serializability guaranteed by, 782–784
  shared/exclusive (read/write) locks, 780–782
  variations on two-phase locking, 784–785
Two-tier client/server architecture, 46–47
Two-way joins, 689
Type (class) hierarchies
  constraints on extents corresponding to, 366–367
  inheritance and, 369
  in OO systems, 356
  simple model for inheritance, 364–366
Type-compatible relations, 697
Type constructors
  atom constructor, 358
  collection constructor, 359
  defined, 369
  ODB features included in SQL, 370
  ODL and, 359–360
  struct (tuple) constructor, 358–359
Type generator, 358–359
UDT (user-defined types)
  creating, 370–373
  in SQL, 111
  tables based on, 374
UML (Unified Modeling Language)
  class diagrams, 226–228
  for database application design, 329
  as design specification standard, 328
  diagram types, 329–334
  notation for ER diagrams, 224
  object modeling with, 200
  representing specialization/generalization in, 265–266
  University student database example, 334–337
UMLS metathesaurus, 1010–1011
Unary relational operations
  CARTESIAN PRODUCT operation, 155–157
  overview of, 146
  PROJECT operation, 149–150
  SELECT operation, 147–149
  UNION, INTERSECTION, and MINUS operations, 152–155
Unbalanced trees, 646
Unconstrained write assumption, 769
UNDO/NO-REDO recovery
  immediate update techniques, 818–819
  overview of, 807, 809
Undo operations, transactions, 753
UNDO phase, of ARIES recovery algorithm, 823
UNDO/REDO recovery
  immediate update techniques, 819
  overview of, 807, 809
UNDO, write-ahead logging and, 810–811
Unidirectional associations, in UML class diagrams, 227
Unified Modeling Language. See UML (Unified Modeling Language)
UNION operation
  algorithms for, 697–698
  in relational algebra, 152–155
  SQL set operations, 104
Union types (categories)
  EER-to-Relational mapping, 297–299
  modeling, 258–260
UNIQUE function, SQL, 122
Unique identity, in ODMS, 357
UNIQUE KEY clause, CREATE TABLE command, 96
Unique keys, in relational models, 70
Uniqueness constraints
  on entity attributes, 208–209
  factors influencing physical database design, 729
  integrity constraints in databases, 21
  overview of, 68–70
  specifying in SQL, 95–96
Universal quantifiers
  transforming, 180
  in tuple relational calculus, 177–178
  used in queries, 180–182
Universal relation assumption, 552
Universal relation schema, 552
Universal relations, 544
Universe of discourse (UoD), 4
University student database example
  data records in, 6–9
  EER schema applied to, 260–263
Unordered (heap files) records, 601–602
Unrepeatable read problem, 750
Unstructured data
  HTML and, 418–420
  information retrieval dealing with, 993–994
Unsupervised learning
  clustering and, 1054
  neural networks and, 1058
UoD (universe of discourse), 4
Update anomalies, avoiding redundant information in tuples, 507
UPDATE command, SQL
  active rules and, 936
  overview of, 109–110
Update operations
  bitemporal databases and, 950
  database design and, 728
  factors influencing physical database design, 729
  operations on files, 599
  query processing in distributed databases, 905–907
  in relational data model, 78–79
  types of relational data model operations, 75
Update transactions, 322
Usage projections, data warehousing and, 1080
Use case diagrams, UML, 329–331
User accounts, database security and, 839–840
User-defined subclasses, 252, 264
User-defined time, 947
User-defined types. See UDT (user-defined types)
User-friendly interfaces, 38
User interfaces
  GUIs (graphical user interfaces), 20, 39, 1061
  multiple users, 20
User labels, combining with data labels, 869–870
Users
  classifying DBMSs by number of, 49
  database actors on the scene, 15–16
  measures of relevance in IR, 1015
  multiuser transactions, 13–14
  types of users in information retrieval, 995–996
Utilities, DBMS system, 42–43
Valid event data, 957
Valid state
  database states, 33
  relational databases, 71
Valid time databases, 946
Valid time, in temporal databases, 946
Valid time relations, in temporal databases, 947–949
Valid XML documents, 422–425
Validation
  in database application life cycle, 307–308
  of queries, 679
Validation (optimistic) concurrency control, 777, 794–795
Validation phase, of optimistic concurrency control, 794
Value, hue, saturation, and, 967
Value references, in RDBs, 396
Value sets (domains), of attributes, 209–210
Values
  stored in records, 594
  in tuples, 65–66
Values (literals)
  atomic formulas as, 973
  atomic literals, 378
  collection literals, 382
  complex types for, 358–360
  in OO systems, 358
  structured literals, 378
Variable-length records, 595–597
Variables
  bind variables (parameterized statements), 858
  communication variables in SQL, 454
  domain, 183
  instance, 356
  iterator variables, in OQL, 399–400
  limited, 980
  PHP, 485–486
  PHP server, 490–491
  PHP variable names, 484–485
  program, 599
  in Prolog languages, 971
  scope, 490
  shared, 452
  temporal, 948
  tuple, 98, 101, 175–176
VDL (view definition language), 37
Vector space model, for information retrieval, 1003–1005
Vertical fragmentation, in distributed databases, 881, 895
Vertical partitioning, database tuning and, 735
Vertical propagation, of privileges, 847
Vertical search engines, 1018
Very large databases, 586
Victim selection algorithm, for deadlock prevention, 788
Video applications, 25
Video clips, in multimedia databases, 932, 965
Video segments, in multimedia databases, 966
Video sources, in multimedia databases, 966
View definition language (VDL), 37
View equivalence, of transaction schedules, 768–769
View integration approach, in conceptual schema design, 315
View materialization, 135
View serializability, of transaction schedules, 768–769
Views
  data warehouses compared with, 1079–1080
  database designers creating, 15
  granting/revoking privileges, 844
  multiple views of data supported in databases, 12
  specifying as named queries in OQL, 402–403
Views (virtual tables), SQL
  vs. base tables, 134
  CREATE VIEW command, 134–135
  implementation and update, 135–137
  inline views, 137
  overview of, 89, 133–134
Virtual data, in views, 12
Virtual data warehouses, 1070
Virtual private databases (VPDs), 868–869
Virtual relations, specifying with CREATE VIEW command, 90
Virtual tables. See Views (virtual tables), SQL
Visible/hidden attributes, of objects, 361
Vocabularies
  in inverted indexing, 1012
  searching, 1013–1014
Volatile storage, 586
Voting method, distributed concurrency control based on, 912
VPDs (virtual private databases), 868–869
Wait-die transaction timestamp, 786
Wait-for graph, 787
WAL (write-ahead logging), 810–812
WANs (wide area networks), 879
Weak entity types, 219–220, 288–289
Web
  access control policies for, 854–855
  hypertext documents and, 415
  interchanging data on, 24
Web analysis, 1019, 1027
Web applications, architectures for, 47–49
Web-based user interfaces, 38
Web browsers, 38
Web clients, 38
Web content analysis
  agent-based approach to, 1024–1025
  concept hierarchies in, 1024
  database-based approach to, 1025
  ontologies and, 1023–1024
  overview of, 1022
  segmenting Web pages and detecting noise, 1024
  structured data extraction, 1022
  types of Web analysis, 1019
  Web information integration, 1022–1023
Web crawlers, 1028
Web databases, programming. See PHP
Web forms, collecting data from/inserting record into, 493–494
Web interface, for database applications, 449
Web Ontology Language (OWL), 969
Web pages
  analyzing link structure of, 1020–1021
  content analysis, 1024
  ranking, 1000
Web query interface integration, 1023
Web search and analysis
  analyzing link structure of Web pages, 1020–1021
  comparing with information retrieval, 1018–1019
  HITS ranking algorithm, 1021–1022
  overview of, 1018
  PageRank algorithm, 1021
  practical uses of Web analysis, 1027–1028
  searching the Web, 1020
  Web content analysis, 1022–1025
  Web searches combining browsing and retrieval, 1000
  Web usage analysis, 1025–1027
Web security, 1028
Web servers
  middle tier in three-tier architecture, 48
  specialized servers in client/server architecture, 45
Web Services Description Language (WSDL), 436
Web spamming, 1028
Web structure analysis
  analyzing link structure of Web pages, 1020–1022
  types of Web analysis, 1019
Web usage analysis
  pattern analysis phase of, 1027
  pattern discovery phase of, 1026–1027
  preprocessing phase of, 1025–1026
  types of Web analysis, 1019
Well-formed XML, 422–425
WHERE clause
  DELETE command, 109
  explicit sets of values in, 122
  missing or unspecified, 102
  in SQL retrieval queries, 129–130
  UPDATE command, 109–110
Wide area networks (WANs), 879
Wildcard (*)
  types of queries in IR systems, 1008–1009
  using with XPath, 433
WITH CHECK OPTION, view updates and, 137
WordNet thesaurus, 1011
Wound-wait transaction timestamp, 786
Wrappers, structured data extraction and, 1022
Write-ahead logging (WAL), 810–812
Write command, hard disks and, 591
Write phase, of optimistic concurrency control, 794
Write-set, of transactions, 747
Write timestamp, 789
Write-write conflicts, in transaction schedules, 757
write_item(X), 746
WSDL (Web Services Description Language), 436
XML access control, 853–854
XML declaration, 423
XML (eXtended Markup Language)
  data model, 51
  interchanging data on Web using, 24
XML (Extensible Markup Language)
  bibliographic references, 443
  converting graphs into trees, 441
  hierarchical (tree) data model, 420–422
  hierarchical XML views over flat or graph-based data, 436–440
  languages, 432
  languages related to, 436
  overview of, 415–416
  storing/extracting XML documents from databases, 431–432, 442
  structured, semistructured, and unstructured data, 416–420
  summary and exercises, 442–443
  well-formed and valid documents, 422–425
  XML schema language, 425–430
  XPath, 432–434
  XQuery, 434–435
XML schema language, 425–430
  example schema file, 426–428
  list of concepts in, 428–429
  overview of, 425
XPath, 432–434
XQuery, 434–435
XSL (Extensible Stylesheet Language), 415, 436
XSLT (Extensible Stylesheet Language Transformations), 415, 436