Following a rigorous methodology is key to delivering customer satisfaction and expanding analytics use cases across the business.
The purpose of capacity planning is to determine the amount of disk space needed for a production data warehouse (DW). Informatica recommends taking the size of known data feeds expected in the near future and adding a 20% buffer to accommodate secondary and join indexes. Database parameters and vendor specific architecture best practices have not been considered for this article.
Database capacity planning is an ongoing activity that should be flexible to changing data-centric business scenarios. Effective database capacity planning is directly proportional to the availability and scalability of any OLAP application consuming data from the DW. Insufficient capacity planning can increase unplanned DW outages, adversely impacting the decision support systems and decision-making capabilities of the organization.
The following matrix depicts a sample current state resource utilization report from a production Teradata database. The grid should be updated to represent the specific database technology that hosts the data warehouse.
Database/Instance Name |
Resource Name |
Peak Utilization/Month |
Threshold |
Threshold breach in a month (No of days) |
PROD_DB |
AMP CPU |
<Peak AMP CPU utilization> |
< Threshold assigned for AMP CPU> |
10 |
PROD_DB |
PERM SPACE |
<Peak PERM space Usage in TB> |
< Threshold assigned for PERM Space> |
16 |
PROD_DB |
SPOOL SPACE |
<Peak SPOOL Space usage in TB> |
< Threshold assigned for SPOOL Space> |
16
|
An historic approach for capacity planning is used. The planning relies heavily on past system performance metrics to predict forthcoming workloads. The data growth for the next 60 months has been calculated based on historical consumption.
The flowchart bellows illustrates a sequential flow of preceding events culminating in the planning phase.
The attached sizing model provides a template for capturing the estimated data growth over a five-year period for Teradata, Oracle, and DB2. The template can be altered to include other database technologies.
{fa-file-excel-o fa-2x color=rgb(0,169,187)} Database Sizing Model
Refer to specific database vendor documentation for estimating bytes assigned to data types.
The following matrix depicts space requirements over a period of 5 years. The summary is based on the capacity sizing model.
Current Year |
12 Months |
24 Months |
36 Months |
48 Months |
60 Months |
|
Total Disk Space (In GB) |
22.93 |
24.93 |
25.83 |
27.73 |
29.63 |
31.43 |
Cumulative Total |
22.93 |
47.86 |
73.69 |
101.42 |
131.05 |
162.48
|
Success
Link Copied to Clipboard