The Relational Database Model
* Tthe relational database model takes a logical view of data
* The relational model's basic components are entities, attributes, and relationships among entities
* How entities and their attributes are organized into tables
* About relational database operators, the data dictionary, and the system catalog
* How data redundancy is handled in the relational database model
* Why indexing is important
A logical View of Data
* Relational model
o Enables us to view data logically rather than physically
o Reminds us of simpler file concept of data storage or spreadsheet concept of table
* Table
o Two-dimensional structure composed of rows and columns
o Has advantages of structural and data independence
o Resembles a file from conceptual point of view
o Easier to understand than its hierarchical and network database predecessors
Tables
* Table also called a relation because the relational model's creator, Codd, used this term.
* A relation (and a relational DBMS's table) differs from a table in that there is no default order of rows or columns
o Note SQL does use a column order but this is a limitation of SQL, not the model.
* Each row (tuple) represents a single entity in an entity set
* Each column represents an attribute and each must have a distinct name
* Each row/column intersection (cell) represents a single atomic value
* All values in a column must have same data format
* Each column has a specific range of values known as a domain
* Each row must have some combination of attributes that uniquely identifies it (key)
Student Table Example
Student Table
Keys
* A key consists of one or more attributes that determine other attributes
* Primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row)
* Key's role is based on determination
o If you know the value of attribute A, you can look up (determine) the value of attribute B
Types of Keys
Composite key
Key consisting of multiple attributes
Key attribute
Any attribute that is part of a key
Superkey
Any key that uniquely identifies each entity
The entire row (all attributes) is a superkey
Candidate key
A superkey without redundancies
Null Values
* Can represent
o An unknown attribute value
o A known, but missing, attribute value
o A 'not applicable' condition
* Not permitted in Primary key attributes
* Not the same as 0 or ""
* Can create problems in logic and using formulas
Controlled Redundancy
* Makes the relational database work
* Tables within the database share common attributes that enable us to link tables together
* Multiple occurrences of values in a table are not redundant when they are required to make the relationship work
* Uncontrolled Redundancy is unnecessary duplication of data
* In general any values duplicated between tables should be part of a Foreign Key -- Primary key relation (see below)
Foreign Keys
Foreign key (FK)
An attribute whose values match primary key values in the related table
Referential integrity
FK contains a value that refers to an existing valid tuple (row) in another relation
Secondary key
Key used strictly for data retrieval purposes
Keys (Summary)
Key Type Definition
Superkey Any attribute or combination of attributes that uniquely identifies a row in the table
Candiate Key aka Minimal Superkey. A superkey that does not contain a subset of attributes that is itself a superkey
Primary Key The candidate key selected to uniquely identify all rows
Cannot contain null values
Foreign Key An attribute (or combination of attributes) in one table that must either match the primary key of another table or be null
Secondary Key An attribute or combination of attributes used to make data retrieval more efficient.
Integrity Rules
Entity Integrity
* All entities are unique
* Each entity has unique primary key
Referential Integrity
* Foreign key must match primary key of referenced table or be null
* Impossible to delete row whose primary key has matching foreign key in another table
Relational Algebra
Key operators:
* SELECT
* PROJECT
* JOIN
Other operators
* INTERSECT
* UNION
* DIFFERENCE
* PRODUCT
* DIVIDE
SELECT
Subset of rows based on condition
Select
PROJECT
Subset of columns
PRODUCT
All possible pairings of two tables
Product
JOIN
* Combines information from two or more tables
* Real power behind the relational database, allowing the use of independent tables linked by common attributes
* Join is actually a combination of Product, Select, and Project
Fig 3-11 Join tables
Natural JOIN
Links tables by selecting rows with common values in common attribute(s) (typically equality on foreign keys).
Three stage process:
* Product creates one table
* Select yields appropriate rows
* Project yields single copy of each attribute to eliminate duplicate columns
Natural Join: Step 1 Product
Natural Join: Select
Natural Join Step 2: Select
Natural Join: Step 2 Select
Natural Join Step 3: Project
Natural Join Step 3: Project
Natural Join: Outcome
* Final outcome yields table that
o Does not include unmatched pairs
o Provides only copies of matches
* If no match is made between the table rows,
o the new table does not include the unmatched row
Other Joins
EquiJOIN
* Links tables based on equality condition that compares specified columns of tables
* Does not eliminate duplicate columns
* Join criteria must be explicitly defined
Theta JOIN
* EquiJOIN that compares specified columns of each table
Outer JOIN
* Matched pairs are retained
* Unmatched values in other tables left null
* Right and left
Intersect
Difference
Yields all rows in one table not found in the other table�that is, it subtracts one table from the other
Union
Divide
Requires user of single-column table and two-column table (I have never seen a DIVIDE)
Divide
Data Dictionary and System Catalog
Data Dictionary
* Metadata
* Detailed account of all tables in database
System Catalog
* System-created database
* Stores database characteristics and contents
* The database stores its own description
* Tables can be queried just like any other tables
* Automatically produces database documentation
Relationships within Relational Database
Classified by cardinalities of entities participating
* 1:M
o Relational modeling ideal
o Should be the norm in any relational database design
* M:N
o Must be avoided because they lead to data redundancies
o Split into a pari of 1:M Relationships
* 1:1
o Should be rare
The 1:1 Relationship
* One entity can be related to only one other entity, and vice versa
* Often means that entity components were not defined properly
* Could indicate that two entities actually belong in the same table
* Sometimes 1:1 relationships are appropriate to help avoid nulls
E-R Diagram
* Graphical representation of model
* Chen
* Crows Feet
ERD Symbols
* Rectangles represent entities
* Chen:
o diamonds represent relations
o numbers or m and n are cardinalities
* Crows Foot
o Lines represent relationships
o bar crossing line is "1" side
o 3 pronged crows foot is many side
Example 1:M Relationship
ERD diagram 1:M Relationship
Implementation of 1:M Relationship
M:N Relationships
M:N Relationship ERD
M:N Relationship
* Cannot be directly implemented in a Relational DBMS
Student and Class Tables
Linking Table
* Implementation of a composite entity
* Yields required M:N to 1:M conversion
* Composite entity table must contain at least the primary keys of original tables
* Linking table contains multiple occurrences of the foreign key values
* Additional attributes may be assigned as needed
Composite ERD
Composite Entities
Linking Tables: Implementation
Linking Tables
Data Redundancy Revisited
Foreign keys control and reduce redundancy.
It is not that there is no redundancy in a properly designed database: the redundancy there is is controlled.
Indexes
Note: I do not consider indexes part of the relational model. They are an implementation artifact to improve performance
* Arrangement used to logically access rows in a table
* Index key
o Index�s reference point
o Points to data location identified by the key
* Unique index
o Index in which the index key can only have one pointer value (row) associated with it
* Each index is associated with only one table
Subscribe to:
Post Comments (Atom)


No comments:
Post a Comment