MIT5314: Database Applications
                                                               Fall, 2003

                         What is 2NF and what is an example of it?


  Definition:  A table is in 2NF, (Second Normal Form), if it is in 1NF, (First Normal Form), and all non-key attributes  are dependent only on the primary key.

  If a partial dependency occurs when a non-key attribute is dependent on only a part of the (composite) key, the definition of 2NF is sometimes phrased as, "A table is in 2NF if it is in 1NF and if it has no partial dependencies."

      ???What is Normalization???

            Normalization is the process of efficiently organizing data in a database.
           The goal is to eliminate redundant data in a database and to ensure data dependencies make sense.

            The basic ideas of normalization are simple enough.
                  We want to create  a design for a relational database files where:
                       1.  All the data necessary for the purposes  that the database is to serve
                       2.  Have as little redundancy as possible
                       3.  Accommodate multiple values for types of data that require them
                       4.  Permit efficient updates of the data in the database
                       5.  Avoid the danger of losing data unknowingly

 

???Why do we need normalization???

 Normalization is needed to prevent possible corruption of a database stemming from update anomalies such as
insertion, deletion, and modification.

              Insertion anomaly is a failure to place information about a new database entry into all the places in the database where
                              information for that entry needs to be stored.
              Deletion anomaly
is a failure to remove information about an existing database when it is time to remove that entry.
              Update anomalies
may be additions, deletions or both.

 

??? Ok, remind me again.  What does 1NF mean ???
 

   First Normal Form (1NF)   
         A table (relation) is in 1NF if:
               1.  There are no duplicated rows in the table
               2.  Entries in a column, (attribute or field), are of the same kind.
               3.  Each cell is single-valued or there are no repeating groups or arrays


   Example:  The following relation is in  First Normal Form. 

Table 1  
Order # Company  Purchase Agent Item Number Price
121 Best Windows Tom Jones A2351  $              165.20
122 AAA Construction James Smith A5675  $           2,553.60
123 Able Doors Oscar Sanchez A6689  $                56.00
124 Best Windows Tom Jones A12356  $           4,568.10
125 Don Home Improvement Don Whatshisname A2351  $              165.20
126 Able Windows Oscar Sanchez A7956  $              900.50

       ???What is wrong with this???   

         There are several problems with this table. 

                1.   Insertion anomaly   If you wanted to insert an order you would have to know or look up all
                        of the information needed. This leaves room for many errors in the entry process.

                2.   Deletion anomaly  If you wanted to delete order # 122 you would lose several pieces of information:
                        The company name of AAA Construction.
                        Their purchase agent's name for the AAA Construction Company.
                        The price of item number A5675.
                2a.  If you just removed one piece of data, say the item number the order would still exist but
                         what was the order for?
                      If you just removed the company name the order would still exist but who ordered it?

                3.  Update anomaly  If you wanted to modify  order # 122 to Item Number A2351 you could lose
                        item number A5675 and its price.

 

To avoid these problems you should put your database in Second Normal Form!!!

 

???But how do we make it Second Normal Form???    

  In the table above, Order # serves as the primary key.    

                   Did you notice that the Company and  Price are dependent on the Order #.
                                  This information is specific to each order. 

                       The Purchase Agent is only dependent on the  Company.   

                      The  Price is dependent on the Item Numbers.

!!! The solution is to make 3 separate tables!!!

 Table 2  
Order # Company  Price
121 Best Windows  $               165.20
122 AAA Construction  $            2,553.60
123 Able Doors  $                 56.00
124 Best Windows  $            4,568.10
125 Don Home Improvement  $            1,234.00
126 Able Windows  $               900.50

In table 2, all of the information included is unique to each order.

 

Table 3
Company  Purchase Agent
Best Windows Tom Jones
AAA Construction James Smith
Able Doors Oscar Sanchez
Don's Home Improvement Don Whatshisname

 In table 3,  the Purchase Agent is dependent on the primary key which is the Company.

 

Table 4
Item Number  Price
A2351  $               165.20
A5675  $            2,553.60
A6689  $                 56.00
A12356  $            4,568.10
A7956  $               900.50

In table 4, the the Price is dependent on the Item Number.

!!!Now we have Second Normal Form!!!

???How do we know it is 2NF???

We started with a table in 1NF, and then we broke it up until all non-key attributes were dependent only on the primary key.

    Now lets look at each table and see how the data is related:
            In table 2, all of the information included is unique to each order.
            In table 3,  the Purchase Agent is dependent on the primary key which is the Company.
            In table 4, the the Price is dependent on the Item Number.

      !!!So we do have 2NF!!!

 

???Why is 2NF good???

             By creating three separate tables we eliminated the dependency problem found in table 1. 

                       Lets look at the results:

                         1.  If you wanted to insert an order you would not have to know or look up all of the information needed. 
                              All you would need is the Company and the Item Number they wanted. 
                              You could automatically get an Order #.
                              The Purchase Agent and Price could be easily called up.           

                         2.  If you wanted to delete order # 122 you would only be deleting the Order #, Company, and the Price.
                              You would still have the Item Number listing, the Company's Purchase Agent, the Item Number, and Price.
                              You would also not have order #122 taking up space because it would be deleted completely.

                        3.  If you wanted to modify  Order # 122 to item number A2351 you would not lose item number A5675
                               and its price. You would just be modifying what they ordered.

   ???Where can I find out More???

                             http://www.shsu.edu/~csc_tjm/summer2000/cs334/Chapter05/Chapter5.html
                             http://databases.about.com/library/glossary/bldef-2nf.htm
                             http://www.utexas.edu/its/windows/database/datamodeling/rm/rm7.html
                             http://www.cs.jcu.edu.au/Subjects/cp1500/1998/Lecture_Notes/normalisation/2nf.html