According to
webopedia“Normalization is the formal process of determining which attributes
should be grouped together in a relation” What it means is that you are trying to group attributes
that are related and that are key attributes.
You should try to get rid of non-key attributes and also make sure that
relations make sense. Normalization is
used to be used as a tool to assist in grouping attributes in a relation. We do normalization to have domain
constraints, which is to make sure that the are no
illegal values. Also to ensure entity
integrity which checks to make sure 3 things:
1)Every
table/relation has only one primary key/unique identifier
2)We to
make sure that the primary key is a unique identifier.
3)Last we
must make sure there is a value for the primary key and that it is not null.
Now you ask how do I do this?
We need to establish a couple of general rules first.
Relational integrity
§
Relational rows
must be consistent.
§
A foreign key
cannot exist unless it is a primary key in another table and/relation. The foreign key can also be null.
Here the situation:
A patient can only have one physician.
A physician can have many patients.
ONE to MANY relationship
The first major problem that we need to get rid of is repeating columns. Repeating columns happen cause there many
repeating values that are the same in many columns.
Example:
PhysID |
Name |
PatID
|
PatID2 |
PatID3 |
PatID n |
5 |
Paco |
123456 |
234567 |
345678 |
|
6 |
Dave |
666666 |
|
|
|
17 |
Larry |
999999 |
444444 |
|
|
§
This is a one to
many relationship but the problem is that we are putting PatID as the foreign
key in the physician table, which is causing the multiple groups because as we
said a physician can have multiple patients. As you saw in the example you see
that a Physician has many patients and they can come multiple times for the
same reason, which creates many problems and repeating groups.
We need to put it in (1NF)
Example:
PhysID |
Name |
PatID
|
5 |
Paco |
123456 |
5 |
Dave |
234567 |
5 |
Larry |
345678 |
6 |
Joe |
456789 |
§
Because there was
a one to many relationship which caused the multiple groups we need to adjust
our table. We need to make sure that we
have the PhysID in the patient table to eliminate the repeating groups. Now we see a patient can only one physician
and one physician can have many patients.
IMPORTANT TIPS:
1
All attributes of
the Primary Key must be defined.
2
Must eliminate
repeating groups.
3
Attributes must
be functionally dependent on the primary key
Now we must look at FUNCTIONAL DEPENDENCY
When attributes can be determined by another attribute it means they are functionally dependent. In the example you see that the attributes are not no longer functionally dependent on PhysID but on PatID as well. This problem creates duplication and possibly errors.
Now we look at: (2NF)
We must assume certain things.
Example:
PhysID
|
PhysName |
PhysAddr |
PatID
|
PatName |
PatAddr |
Front
Desk |
|
|
|
|
|
|
Personnel |
5 |
Paco |
1615
Oregon |
123456 |
David |
999
Street |
Betty |
Example:
PhysID
|
PhysName |
PhysAddr |
PatID
|
PatName |
PatAddr |
5 |
Paco |
1615
Oregon |
123456 |
David |
999
Street |
WHAT DO WE DO?????????
Can we get rid of those non-key attributes? Maybe, if we put PhysID into our patient table as a foreign key then yes we can eliminate all non-key attributes.
Physician
Table |
PhysID |
PhysName |
PhysAddr |
Patient
Table |
PatID |
PatName |
PatAddr |
PhysID |
As you can see we put PhysID in the physician table which allows to not have to have two primary keys. Remember we had to have PatID and PhysID as composite keys. Now you can get all the Physician information from the physician table.
PhysID |
PatID |
PatName |
PatAddr |
|
|
|
|
5 |
123456 |
David |
999
Street |
(3NF)
We must assume:
If we assume we have put it in 2NF then all that is left is to eliminate transitive relationships.
“Transitive Relationship is a functional dependency between two (or more) non-key attributes.”
Kirs –Lecture
Slides
If we have a situation where we need to know a patient in order to find out a physician’s specialty then we have a transitive dependency.
Insertion Anomalies – If we need something to be entered in order to find out something else then if it is not entered we will never know the outcome
Deletion anomalies-If you create a table that- shows that if you delete a boy and the only way to know the father is with the boy then you lose in on the father.
Modification Anomalies-If you make modifications than it could cause problems if you modify one thing and you will have to modify other things.
PhysID |
PatID |
PatName |
PatAddr |
Specialty |
|
|
|
|
|
5 |
123456 |
David |
999
Street |
Cancer |
Example:
Physician
Table |
PhysID |
PhysName |
PhysAddr |
Patient
Table |
PatID |
PatName |
PatAddr |
PhysID |
NO!
NO! NO!!!!!!
One more to go
Of course assuming
What if you have an attribute and you must modify it? Won’t this modification cause you to need to modify it some where else.
Example:
Student ID
|
Major |
GPA |
Advisor |
123 |
CIS |
3.6 |
David |
456 |
MGMT |
3 |
BOB |
789 |
MKT |
2.5 |
David |
444 |
ACCT |
2 |
Larry |
If David was to retire we would have to make two changes and if the table were larger we have to make many changes. Wouldn’t it be better to a second table?Hmmmm Why not?
Yes, if we create a second table then a modification only has to be done once.
Student
Student ID
|
Major |
GPA |
Advisor |
||
123 |
CIS |
3.6 |
Have an Advisor ID so you can just make the change in
the advisor table and it will change the other tables that refer to it. |
||
456 |
MGMT |
3 |
888 |
||
789 |
MKT |
2.5 |
999 |
||
444 |
ACCT |
2 |
777 |
Advisor
AdvID
|
Name |
999 |
David |
888 |
BOB |
777 |
Larry |
YES!!!!!!!!!!
Short Answer Questions
What is Normalization?
Answer:
“Normalization is the formal process of determining which attributes should be grouped together in a relation”
Briefly explain (1NF)?
Answer:
The goal is to not have repeating columns. You eliminate these columns by making sure all attributes are defined for the primary key. Also, the attributes must be functionally dependent on the primary key.
Briefly explain (2NF)?
Answer:
The goal is to eliminate all the non-key attributes because it causes problems. The best way to do that is to decompose a table into two and have the non-key information on the other table. Also, if you decompose the tables you make sure that all attributes are fully functionally dependent on the primary key. You want them dependent on a single primary key.
Briefly explain (3NF)?
Answer:
The goal is to eliminate transitive relationships because
they cause anomalies. Also
“Transitive Relationship is a functional dependency between two (or more) non-key attributes.” By decomposing the tables you will eliminate them because you will eliminate the non-key attributes and move them to the tables that they need to go to.
What is (BNF) and what happens in (BNF)?
Answer:
The goal is to eliminate a modification anomaly. If you have a table and you need to modify it then you will have to modify every occurrence of it. If you decompose the table and link it to another all you need to do is modify in one table and it will modify everywhere else.
What is Functional Dependency?
Answer:
When attributes can be
determined by another attribute it means they are functionally dependent. In the example you see that the attributes
are not no longer functionally dependent on PhysID but on PatID as well. This problem creates duplication and possibly
errors.
Name three problems with transitive dependencies?
Answer:
Insertion Anomalies – If we need something to be entered in order to find out something else then if it is not entered we will never know the outcome
Deletion anomalies-If you create a table that- shows that if you delete a boy and the only way to know the father is with the boy then you lose in on the father.
Modification Anomalies-If you make modifications than it could cause problems if you modify one thing and you will have to modify other things.
Multiple Choice
1)My goal it to eliminate all transitive dependencies which form am I going into?
A.(2NF) B.(3NF)
C.(4NF) D.Insertion Anomaly
Answer: B
2)If we need something to be entered in order to find out something else then if it is not entered we will never know the outcome what kind of anomaly is this?
A.Insertion B.Deletion
C.Modification D.Separation
Answer: A
3)If we say a foreign key cannot exist unless it is a primary key in
another table and/relation. The foreign
key can also be null. What is this an example of?
A.Relational Integrity B.Entity
Integrity
C.Functional Dependency D.Insertion
anomaly
Answer: A
4) If you make modifications than
it could cause problems if you modify one thing and you will have to modify
other things what kind of anomaly is this?
A.Insertion B.Deletion
C.Modification D.Separation
Answer: C
5) If you create a table that- shows that if you delete a boy and the only way to know the father is with the boy then you lose in on the father what kind of anomaly is this?
A.Insertion B.Deletion
C.Modification D.Separation
Answer: B
References: