Database Aggregation to Improve Database Performance

By Reuven Bakalash

"The only problem with dollars," a respected financial technology executive once noted, "is that they are all green." He was referring to the problems business analysts encounter when they attempt to identify the real sources of profitability across large enterprises. To operational data processing systems, every dollar looks alike. As the dollar flows through the organization from customer to investor, its origin and history are often left behind, along with much of its meaning.

As a result, large business organizations have great difficulty telling which products, customer relationships, or business units actually generate the greatest and most sustainable profits. Operational systems can tell us how much money we are making, but they can’t necessarily tell us how to make more. It is this last question, the one that senior executives really want answered, that draws the database administrator into the realm of business intelligence.

Over the past ten years, businesses have made enormous investments in systems designed to transform operating and accounting data into business intelligence. They have created terabyte-scale data warehouses, developed powerful analytical programs to aggregate and query these huge data stores, and spent millions on multiprocessor servers powerful enough to process the resulting data volumes.

This demand for data warehouses and their associated business intelligence applications shows no signs of abating. IDC, a leading research firm, estimates that the worldwide market for data warehousing software will grow at an average of 24 percent annually through 2004. The market for customer relationship management (CRM) data warehousing software will grow even faster, by nearly 33 percent a year. IDC’s survey of 11,000 contacts in 17 countries revealed that the majority of organizations planning to adopt e-commerce, customer relationship management, and supply chain management applications also plan to adopt data warehousing.

Increased Data Volumes and Performance Degradation

As data volumes continue to grow, performance problems will proliferate. Current business trends will drive up data storage and analysis requirements to previously unimaginable levels. E-commerce operations require businesses to record and store every click on every web page. CRM applications demand repeated analyses of detailed transaction histories for millions of customers. New supply chain management systems drive up storage and traffic loads.

The more data processing capacity we create, the more data we are required to keep. Business organizations merge, demand rapid integration of their disparate systems and databases, and then restructure before the integration effort is complete. Firms initiate global operations that boldly go where no database has ever gone before, introducing new business, regulatory, and reporting requirements that can wreak havoc on database structures.

Even the most powerful database servers cannot always keep pace with the analytical demands imposed by scores of financial analysts, marketing researchers, strategic planners, and customer relationship managers. As a result, many business organizations are all but choking on the terabytes of data they have collected. Some queries create processing loads that drive resource use beyond acceptable reserve limits, and a few may even exceed system capacity outright. Complex queries can take hours, even days, to aggregate and process. System performance slows to a crawl, and the lights on the DBA’s phone begin to flash as the complaints roll in.
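The remedy the article’s title points to is pre-aggregation: compute the expensive summaries once, ahead of time, so that interactive queries read a small summary table instead of scanning every detail row. The following minimal sketch illustrates the idea using Python’s standard sqlite3 module; the table names (sales, sales_by_region) and the generated data are invented for illustration and are not taken from any particular warehouse product.

    # A minimal sketch of pre-aggregation using only Python's standard library.
    # Table names and data are illustrative assumptions, not from the article.
    import sqlite3
    import time

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Detail-level fact table: one row per transaction.
    cur.execute("CREATE TABLE sales (day INTEGER, region TEXT, amount REAL)")
    cur.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(d % 365, "region%d" % (d % 8), float(d % 100)) for d in range(500000)],
    )

    # Without pre-aggregation: every analytical query scans all detail rows.
    start = time.perf_counter()
    cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    detail = cur.fetchall()
    print("aggregate over detail rows: %.4f s" % (time.perf_counter() - start))

    # With pre-aggregation: compute the summary once, then query it cheaply.
    cur.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )
    start = time.perf_counter()
    cur.execute("SELECT region, total FROM sales_by_region")
    summary = cur.fetchall()
    print("read from summary table:  %.4f s" % (time.perf_counter() - start))

    # Both paths answer the same question with the same numbers.
    assert sorted(detail) == sorted(summary)

At warehouse scale the same principle appears as summary tables or materialized views maintained off-line, and the savings grow with the ratio of detail rows to summary rows; here, eight summary rows stand in for half a million transactions.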

The Real Impact on Business Intelligence

The impact of degraded processing performance can be subtle and far-reaching. Business intelligence is not really a property of an information system, but of the interaction between the data in the system and the intelligence, curiosity, and expertise of the analyst and the decision maker. To get the right answer, one must ask the right question. Unfortunately, the right question usually makes its appearance at the end of a long line of wrong, or at least partially wrong, questions. Good business intelligence, like good science, proceeds by a process of trial and error. The analyst must almost always work through repeated iterations, those celebrated "what if" scenarios that materialize so quickly in commercials and so slowly in real life.

Interaction is a direct function of processing performance. The analyst or decision maker who can initiate and receive timely responses to a dozen or more queries in one morning has a reasonable chance of unearthing something genuinely useful to the enterprise. The analyst or decision maker who must wait overnight is fighting against much higher odds. He or she must constantly cut corners, reduce sample populations, and neglect promising alternatives along with the off-the-wall speculations that sometimes pay off in revolutionary understanding. The relationship is direct: the better the processing performance, the better and more frequent the interaction; the better and more frequent the interaction, the better and more profitable the business intelligence.