Why do we need to estimate?

                                                          

        Data volume and usage estimation is crucial for the proper administration of databases. As you all know, we need a storage space to store and maintain our database. In order to make the proper storage size decision for our database we need to estimate the data volume and usage.

 

What happens if we don't estimate?

 

The consequences of NOT estimating data volume and usage frequency is severe. Think about an e-tailer (web-based retailer). Let's assume that the e-tailer's management  chose a database storage space using the cost as the sole criterion. Since the e-tailer wants to save bucks from the initial set-up costs, they chose the smallest storage space available by the vendor. After a serious advertising campaign using web and other media (!), they started their online operations. Everything was going fine, until one day they found out that their web site is crashed due to data overload and high level of usage frequency. Now the company ended up having:

 

This is a simple example, but it is enough to show how important this estimation process is...

 

                    When do we need to estimate data volume and usage?

 

 

Data volume and usage analysis is an integral part of the physical design of the database. Whenever, you want to develop a database, you should estimate the data volume and usage frequency.

 

               

                How can we estimate?

 

 

Estimating data volume and usage frequency is not an easy task. However, at least putting the necessary effort to estimate this activity is far better than designing the database without estimating the activity. We do not necessarily need precise numbers but we need to do our best to estimate the numbers.

Let's see how we can estimate data volume and usage using the sample database tables below:

 

STEP.1. Estimating Total Bytes per Record

 

TABLE Customer

CustomerID        CHAR(9)              Field Size Total = 9+30+40= 79 

CustomerName   CHAR(30)            # of Fields =  3

Address              CHAR(40)            Total Bytes per Record= 82

 

TABLE Item

ItemID                CHAR(12)            Field Size Total = 12+20= 32

ItemName           CHAR(20)            # of Fields = 2

                                                      Total Bytes per Record= 34

 

TABLE Order

CustomerID        CHAR(9)            Field Size Total = 9+12+11(*)=32

ItemID               CHAR(12)           # of Fields = 3

Quantity             INTEGER            Total Bytes per Record= 35

 

(*) 11 bytes assigned to INTEGER Data type

 

STEP.2. Analyze The Current Usage

 

            20 x 82 bytes = 1640 bytes required for Customer Table

            30 x 34 bytes = 1020 bytes required for Item Table

            20 x 5 x 35 bytes = 3500 bytes required for Order Table

     We currently need 6160 bytes of storage space.

 

        If there is no current data in regards to the process activity  then you can develop some of these statistics such as average items purchased by customer by using the industrial benchmarks.

 

STEP.3. Estimate The Future Activity

 

          First of, we need to determine our forecasting period using information such as the economic life of the system we are using or recordkeeping period required by law.

    Let's say our forecasting period is 3 years in this example;

 

    20  x (1+10%)3 = 27 customers x 82 bytes = 2183 bytes required for Customer Table

    30 x (1+5%)3 =35 items x 34 bytes = 1180 bytes required for Item Table

    27 x 7 x 35 bytes =6615 bytes required for Order Table

    Under this scenario, we will need 9978 bytes of storage space, estimating 62% increase in the volume in 3 years.

 

    We can develop alternative scenarios using different growth rates for statistics (i.e. # of customers will grow by 15% instead of 10% per year). This will enable us to make proper decisions about the storage requirements.

 

Related Websites:

Physical Database Design, www.acsu.buffalo.edu/~zelli/chap06a.ppt

Business Systems Design, http://bear.cba.ufl.edu/teets/ISM4113/MDMChapt7.ppt