What is
Database Integrity?
Integrity
1. Steadfast adherence to a strict moral or ethical code.
2. The state of being unimpaired; soundness.
3.
The quality or condition of being whole or undivided; completeness.
The data within our databases must
constantly follow the rules and constraints placed within the data models.
Without constraints, the data would hold no meaning. The reliability and
integrity of that data would always be in question and the users of such data
would always question the validity of it. If a database were to neglect the
rules and overstep the boundaries that were placed upon it, there would be
mayhem within the database and that database would cease to exist for any value.
Data Integrity is an umbrella
term that refers to the consistency, accuracy, and
correctness of data stored in a database. Think of data
integrity in terms of the old adage: 'garbage in, garbage out;' data integrity
is about keeping the garbage out.
Types of Data Integrity
There are four primary types of data integrity:
entity, domain, referential, and user-defined. In general terms; entity
integrity applies at the row level; domain integrity applies at the column
level; and referential integrity applies at the table level.
Entity Integrity
Key Points
|
Constraints
|
Entity integrity ensures
that each row in the table is uniquely identified |
Entity integrity is most
often enforced by placing a primary key (PK) constraint on a specific column
(although it can also be enforced with a UNIQUE constraint, a unique index,
or the IDENTITY property) |
Entity integrity ensures a table does not have any duplicate rows |
A
PK will not allow any NULLs to be inserted into the column(s) (A NULL entry
would be disallowed even if it would be the only NULL in the column and
therefore unique.) |
Entity integrity ensures that each row in a table has a unique identifier
that allows one row to be distinguished from another |
A composite PK is
a PK that consists of more than one column; it is used when none of the
columns in the composite key is unique by itself |
Physical independence is achieved by being able to reference each row by a
unique value, sometimes referred to as a 'key' |
There can be only one PK
per table
|
Example:
If the ' student ID' column is the PK in the 'students' table, two students
cannot have the same student numbers, this would violate entity integrity.
Table Student
Student ID |
Name |
Address |
Major |
123456789 |
Saenz, Lupe |
123 Mesa |
Accounting |
234567890 |
Chung, Mel |
456 Spring |
Marketing |
123456789 |
Adams, John |
548 Hague |
Economics |
456789012 |
Elam, Mary |
156 Law |
Education |
|
Baker, Chris |
555 Austin |
Nursing |
Domain Integrity
Key Points
|
Constraints
|
Domain integrity
requires that a set of data values fall within a specific range (domain) in
order to be valid |
Data types limit fields
to broad categories (e.g., integers) |
Domain integrity defines
the permissible entries for a given column by restricting the data type,
format, or range of possible values |
A default is a
definition of a value that can be inserted into a column; a rule is a
definition of acceptable values that can be inserted into a column |
A domain in database
terminology refers to a set of permissible values for a column |
|
Domain integrity is
sometimes referred to as 'attribute' integrity |
|
Example:
Domain integrity might be used to ensure an entry in the 'age' field is an
integer and must be between the values of 17 and 75. In this example, integers
between 17 and 75 are the only permissible entries for the ‘age’ field.
Table Student
Customer ID |
Name |
Address |
Age |
123456789 |
Saenz, Lupe |
123 Mesa |
23 |
234567890 |
Chung, Mel |
456 Spring |
16 |
477568995 |
Adams, John |
548 Hague |
18 |
456789012 |
Elam, Mary |
156 Law |
83 |
|
Baker, Chris |
555 Austin |
29 |
Referential Integrity
Key Points
|
Constraints
|
Referential integrity is
concerned with keeping the relationships between tables synchronized |
An Foreign Key (FK)
is a column or combination of columns in one table (referred to as the 'child
table') that takes its values from the PK in another table (referred to
as the 'parent table') |
Referential integrity is
typically enforced with a Primary Key (PK) and Foreign Key (FK) combination |
In order for referential
integrity to be maintained, the FK in the 'child' table can only accept
values that exists in the PK of 'parent' table |
Once this relationship
is established, it is possible to 'relate' or 'tie' each order to a
particular customer |
Enforcing referential
integrity means the relationship between the tables must be preserved when
records are added (INSERT), changed (UPDATE), or deleted (DELETE) |
Note that while PK-FK
combinations represent logical relationships among data, they do not
necessarily limit the possible access paths through the data |
|
The primary objective of referential integrity is to prevent 'orphans;'
i.e., records in the child table that cannot be related to a record in the
parent table |
|
Example:
To maintain referential integrity, the PhysID in the ‘patient’ can only accept
values that exist in the PhysID of the ‘physician’ table. You cannot change a
PhysID in the 'physician' table if that change would produce an 'orphan’; in
other words, that change would leave records in the 'patient' table that did not
reference to a valid PhysID.
Physician Table
Patient Table
PhysID |
Name |
Speciality
|
Address |
User-Defined
Integrity
| User-defined integrity refers to specific business rules not
covered by the other integrity categories. |
| It is typically implemented through triggers and stored procedures. |
To get more information on Data Integrity,
visit:
http://www.frick-cpa.com/ss7/Theory_DataIntegrity.asp
http://www.databasejournal.com/features/oracle/article.php/3083291
http://www.utexas.edu/its/windows/database/datamodeling/rm/rm5.html
You should be able to answer the following
questions.
1. What is Data Integrity?
Data Integrity is an umbrella term that
refers to the consistency, accuracy, and correctness of data stored in a
database.
2. What are the different types of Data
Integrity?
There are four primary types of data
integrity: entity, domain, referential, and user-defined.
3. When dealing with the relation of
tables, what type of data integrity is being enforced?
Referential Integrity
4. If two (2) separate students had the
same primary key, would there be a violation of data integrity, if so, what type
of integrity is being violated?
a)
Entity
Integrity
b)
Domain
c)
Referential
d)
User-Defined
(Answer: a)
|