Normalization 

 


According to webopedia“Normalization is the formal process of determining which attributes should be grouped together in a relation  What it means is that you are trying to group attributes that are related and that are key attributes.  You should try to get rid of non-key attributes and also make sure that relations make sense.  Normalization is used to be used as a tool to assist in grouping attributes in a relation.  We do normalization to have domain constraints, which is to make sure that the are no illegal values.  Also to ensure entity integrity which checks to make sure 3 things:

 

1)Every table/relation has only one primary key/unique identifier

2)We to make sure that the primary key is a unique identifier.

3)Last we must make sure there is a value for the primary key and that it is not null.

 

Now you ask how do I do this?

 

We need to establish a couple of general rules first.

 

Relational integrity

 

§         Relational rows must be consistent.

 

§         A foreign key cannot exist unless it is a primary key in another table and/relation.  The foreign key can also be null.

 

 

 

First Normal Form (1NF) 

 

 

 


Here the situation:

A patient can only have one physician.  A physician can have many patients.

ONE to MANY relationship

 

The first major problem that we need to get rid of is repeating columns.  Repeating columns happen cause there many repeating values that are the same in many columns.

 

 

 

 

 

 

 

 

 

 


Example:

PhysID

Name

PatID

PatID2

PatID3

PatID n

5

Paco

123456

234567

345678

 

6

Dave

666666

 

 

 

17

Larry

999999

444444

 

 

 

 

§         This is a one to many relationship but the problem is that we are putting PatID as the foreign key in the physician table, which is causing the multiple groups because as we said a physician can have multiple patients. As you saw in the example you see that a Physician has many patients and they can come multiple times for the same reason, which creates many problems and repeating groups.

 

We need to put it in (1NF)

Example:

PhysID

Name

PatID

5

Paco

123456

5

Dave

234567

5

Larry

345678

6

Joe

456789

 

§         Because there was a one to many relationship which caused the multiple groups we need to adjust our table.  We need to make sure that we have the PhysID in the patient table to eliminate the repeating groups.  Now we see a patient can only one physician and one physician can have many patients.

 

IMPORTANT TIPS:

 

1        All attributes of the Primary Key must be defined.

2        Must eliminate repeating groups.

3        Attributes must be functionally dependent on the primary key

 

Now we must look at FUNCTIONAL DEPENDENCY

 

 

When attributes can be determined by another attribute it means they are functionally dependent.  In the example you see that the attributes are not no longer functionally dependent on PhysID but on PatID as well.  This problem creates duplication and possibly errors.

 

 

Now we look at:  (2NF)

 

 

 

Second Normal Form (2NF) 

 

 

 

 


We must assume certain things.

 

 

Example:

PhysID

PhysName

PhysAddr

PatID

PatName

PatAddr

Front Desk

 

 

 

 

 

 

Personnel

5

Paco

1615 Oregon

123456

David

999 Street

Betty

 

 

Example:

PhysID

PhysName

PhysAddr

PatID

PatName

PatAddr

5

Paco

1615 Oregon

123456

David

999 Street

 

 

 

WHAT DO WE DO?????????

 

Can we get rid of those non-key attributes?  Maybe, if we put PhysID into our patient table as a foreign key then yes we can eliminate all non-key attributes.

 

 

SO WE DID IT                

 

 

Physician Table

PhysID

PhysName

PhysAddr

 

 

Patient Table

PatID

PatName

PatAddr

PhysID

 

As you can see we put PhysID in the physician table which allows to not have to have two primary keys.  Remember we had to have PatID and PhysID as composite keys.  Now you can get all the Physician information from the physician table.

 

 

 

 

 

PhysID

PatID

PatName

PatAddr

 

 

 

 

5

123456

David

999 Street

 

 

 

 

MORE OH NO!!!!!!!!!!!!!

 

 

Third Normal Form (3NF) 

 

 

 


(3NF)

We must assume:

 

If we assume we have put it in 2NF then all that is left is to eliminate transitive relationships.

 

“Transitive Relationship is a functional dependency between two (or more) non-key attributes.”

Kirs –Lecture Slides

 

If we have a situation where we need to know a patient in order to find out a physician’s specialty then we have a transitive dependency.

 

Problems with Transitive Dependencies

 

Insertion Anomalies – If we need something to be entered in order to find out something else then if it is not entered we will never know the outcome

 

Deletion anomalies-If you create a table that- shows that if you delete a boy and the only way to know the father is with the boy then you lose in on the father.

 

Modification Anomalies-If you make modifications than it could cause problems if you modify one thing and you will have to modify other things.

 

 

 

 

 

PhysID

PatID

PatName

PatAddr

Specialty

 

 

 

 

 

5

123456

David

999 Street

Cancer

 

 

 

WHAT DO WE DO????????????aaaaaaaaaaaahhhh!!!!!!!!!!!!

 

 

Example:

Physician Table

PhysID

PhysName

PhysAddr

 

 

Patient Table

PatID

PatName

PatAddr

PhysID

 

 

 


MORE????????

NO! NO! NO!!!!!!

 

 

 

 

 

 

One more to go

 

 

 

 

Boyce Codd Normal Form (BCNF) 

 

 

 

 


Of course assuming

 

 

What if you have an attribute and you must modify it?  Won’t this modification cause you to need to modify it some where else.

 

Example:

Student ID

Major

GPA

Advisor

123

CIS

3.6

David

456

MGMT

3

BOB

789

MKT

2.5

David

444

ACCT

2

Larry

 

 

 

 

 

 

 

If David was to retire we would have to make two changes and if the table were larger we have to make many changes.  Wouldn’t it be better to a second table?Hmmmm Why not?

Yes, if we create a second table then a modification only has to be done once.

 

 

 

 

Student

Student ID

Major

GPA

Advisor

123

CIS

3.6

Have an Advisor ID so you can just make the change in the advisor table and it will change the other tables that refer to it.

 

 
999

456

MGMT

3

888

789

MKT

2.5

999

444

ACCT

2

777

 

 

Advisor

AdvID

Name

999

David

888

BOB

777

Larry

 

 

 

 YES!!!!!!!!!!

 

 

 

 

 

 

 

 

 

Short Answer Questions

 

 

What is Normalization?

 

Answer:

“Normalization is the formal process of determining which attributes should be grouped together in a relation”

 

 

 

Briefly explain (1NF)?

 

Answer:

The goal is to not have repeating columns.   You eliminate these columns by making sure all attributes are defined for the primary key.  Also, the attributes must be functionally dependent on the primary key.

 

 

Briefly explain (2NF)?

 

Answer:

The goal is to eliminate all the non-key attributes because it causes problems.  The best way to do that is to decompose a table into two and have the non-key information on the other table.  Also, if you decompose the tables you make sure that all attributes are fully functionally dependent on the primary key.  You want them dependent on a single primary key.

 

 

Briefly explain (3NF)?

 

Answer:

The goal is to eliminate transitive relationships because they cause anomalies. Also

“Transitive Relationship is a functional dependency between two (or more) non-key attributes.”  By decomposing the tables you will eliminate them because you will eliminate the non-key attributes and move them to the tables that they need to go to.

 

 

 

What is (BNF) and what happens in (BNF)?

 

Answer:

The goal is to eliminate a modification anomaly.  If you have a table and you need to modify it then you will have to modify every occurrence of it.  If you decompose the table and link it to another all you need to do is modify in one table and it will modify everywhere else.

 

What is Functional Dependency?

 

Answer:

When attributes can be determined by another attribute it means they are functionally dependent.  In the example you see that the attributes are not no longer functionally dependent on PhysID but on PatID as well.  This problem creates duplication and possibly errors.

 

 

 

Name three problems with transitive dependencies?

 

Answer:

Insertion Anomalies – If we need something to be entered in order to find out something else then if it is not entered we will never know the outcome

 

Deletion anomalies-If you create a table that- shows that if you delete a boy and the only way to know the father is with the boy then you lose in on the father.

 

Modification Anomalies-If you make modifications than it could cause problems if you modify one thing and you will have to modify other things.

 

 

Multiple Choice

 

1)My goal it to eliminate all transitive dependencies which form am I going into?

 

A.(2NF)                       B.(3NF)

C.(4NF)                       D.Insertion Anomaly

 

Answer: B

 

2)If we need something to be entered in order to find out something else then if it is not entered we will never know the outcome what kind of anomaly is this?

 

A.Insertion                   B.Deletion

C.Modification D.Separation

 

Answer: A

 

3)If we say a foreign key cannot exist unless it is a primary key in another table and/relation.  The foreign key can also be null. What is this an example of?

 

A.Relational Integrity                B.Entity Integrity

C.Functional Dependency         D.Insertion anomaly

 

Answer:  A

4) If you make modifications than it could cause problems if you modify one thing and you will have to modify other things what kind of anomaly is this?

 

A.Insertion                   B.Deletion

C.Modification D.Separation

 

Answer:  C

5) If you create a table that- shows that if you delete a boy and the only way to know the father is with the boy then you lose in on the father what kind of anomaly is this?

A.Insertion                   B.Deletion

C.Modification D.Separation

 

Answer:  B

 

References:

           

 

  1. http://databases.about.com/library/weekly/aa080501a.htm
  2. http://itmanagement.webopedia.com/TERM/N/normalization.html
  3. http://troubleshooters.com/littstip/Itnorm.html
  4. Kirs, Peeter Database Design, Lecture slides.