TPC Benchmark Status
May 1999

By Kim Shanley, TPC Chief Operating Officer

TPC Benchmark Status is published about every two months. Its first and primary purpose is to keep interested parties informed about the content, issues, and schedule of the TPC's benchmark development efforts. Its second purpose is to invite new members to join these important development efforts; we've already outlined most of the reasons for joining the TPC in another article, Why Join.

Last Meeting
The TPC held a General Council meeting April 13-14 in Colorado Springs. There wasn’t any substantial change in the direction of the TPC-C or TPC-W Subcommittees, so I will simply list their updated development schedules. As was the case in our March newsletter, I will focus primarily on the activities surrounding decision support and TPC-D.

TPC-C Milestones
  • 8/1999: TPC-C Version 4 (V.4) submitted for TPC company review.
  • 12/1999: TPC-C V. 4 submitted for TPC mail ballot approval.
  • 2/2000: TPC-C V. 4 approved as an official TPC benchmark.
  • 4/2000: First TPC-C V. 4 results can be published.
TPC-W Milestones
  • 10/1999: TPC-W Version 1 (V.1) submitted for TPC company review.
  • 4/2000: TPC-W V. 1 submitted for TPC mail ballot approval.
  • 6/2000: TPC-W V. 1 approved as an official TPC benchmark.
  • 8/2000: First TPC-W V. 1 results can be published.
TPC-R and TPC-H Benchmarks Become Official TPC Benchmarks, TPC-D Obsoleted
As documented in some detail in our March report, the TPC’s decision support benchmark, TPC-D, was split into two different benchmarks, TPC-R (business reporting) and TPC-H (ad-hoc querying). After struggling for some time to preserve a single, unified decision support benchmark, the Council decided at the February 1999 General Council meeting that TPC-D was one benchmark trying to represent two different business environments. The first is the environment in which users know the queries very well and can optimize their DBMS to execute them very rapidly (the business reporting environment). The other is the original TPC-D ad-hoc environment, in which users don’t know the queries in advance and execution times can be very long. After the February meeting, a mail ballot was sent to all TPC members so the membership could officially vote to obsolete TPC-D and create TPC-R and TPC-H. The ballot measure passed by the required two-thirds majority, and therefore the TPC now has three official benchmarks:

  • TPC-C (on-line transaction processing)
  • TPC-R (business reporting, decision support)
  • TPC-H (ad-hoc querying, decision support)
New Primary Performance Metric for TPC-R and TPC-H
The other major change now in place with the passage of these new benchmarks is that the primary performance metric in TPC-R and TPC-H is represented by one number rather than the two numbers in TPC-D. In TPC-D, the primary metrics were the power metric (QppD), representing the performance of the system for a single power user, and the throughput metric (QthD), representing the throughput capacity of the system with multiple users. While the two metrics truly reflected two different performance evaluation perspectives, ultimately the dual metrics generated more confusion than illumination in the marketplace. Vendors and users alike want one primary performance number, and the TPC responded by creating one composite performance metric for TPC-R and TPC-H. The composite metric (queries per hour) equally weighs the contribution of the old TPC-D single-user power metric and the old TPC-D multi-user throughput metric. For TPC-R, the composite performance metric is QphR (queries per hour); for TPC-H, it is QphH.
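
As a rough illustration of how such a composite can work: one natural way to weigh two components equally is a geometric mean. The sketch below (in Python) assumes exactly that; the combining formula and the input figures are illustrative assumptions, not taken from the benchmark specifications.

    import math

    # Sketch of a composite queries-per-hour metric that weighs a
    # single-user power result and a multi-user throughput result
    # equally. ASSUMPTION: the combination is a geometric mean, and
    # the input figures below are made up for the example.
    def composite_qph(power_qph, throughput_qph):
        """Geometric mean of the power and throughput components."""
        return math.sqrt(power_qph * throughput_qph)

    print(composite_qph(1200.0, 800.0))  # ~979.8 queries per hour

One appealing property of a geometric mean is that it penalizes imbalance: a system cannot earn a high composite number by excelling on one component while neglecting the other.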

A few other similarities and one difference to note between TPC-R and TPC-H:

  • Both benchmarks use the same database.
  • Both benchmarks calculate the metrics the same way. However, these are two different workloads and the metrics from these benchmarks should not be compared.
  • To implement the intention of each benchmark, TPC-R and TPC-H use different partitioning and indexing schemes.
  • In TPC-H, horizontal partitioning is allowed with some restrictions, but in general, these partitioning schemes cannot rely on knowledge of the data stored in the partitioned columns. Because TPC-H is an ad-hoc benchmark, auxiliary data structures that pre-compute the answers to the queries at database load time cannot be used.
  • In TPC-R, horizontal partitioning is also allowed. While there are fewer restrictions than in TPC-H, the same general rule applies: these partitioning schemes cannot rely on knowledge of the data stored in the partitioned columns. Finally, as TPC-R is a benchmark where the general phrasing of most queries is well known in advance, the use of auxiliary data structures that pre-compute the answers to the queries is allowed (see the sketch after this list).
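
To make the pre-computation rule concrete, here is a toy sketch in Python. The data, names, and functions are hypothetical; in a real system these structures would live inside the DBMS, not in application code.

    # Toy illustration of the TPC-R vs. TPC-H rule on pre-computed
    # answers. All data and names here are hypothetical.
    orders = [("1995-03-01", 100.0), ("1995-03-02", 250.0), ("1996-01-15", 75.0)]

    # TPC-R style: the query shape is known in advance, so the answer
    # can be materialized while the database is loaded.
    revenue_by_year = {}
    for date, price in orders:
        year = date[:4]
        revenue_by_year[year] = revenue_by_year.get(year, 0.0) + price

    def reporting_query(year):
        return revenue_by_year.get(year, 0.0)   # near-instant lookup

    # TPC-H style: the query is ad hoc, so it must scan the base data
    # at run time; no pre-computed answer may exist.
    def adhoc_query(year):
        return sum(price for date, price in orders if date.startswith(year))

    print(reporting_query("1995"), adhoc_query("1995"))  # 350.0 350.0

The reporting path answers from the pre-built structure in constant time, while the ad-hoc path pays the full scan cost at run time; that asymmetry is one reason results from the two benchmarks should not be compared.
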
TPC-D in Retrospect
Benchmarks follow a predictable cycle. When a benchmark is new, it usually uncovers mainstream performance issues, frequently referred to as low-hanging fruit. Over time, all of the low-hanging fruit is picked and the most difficult problems are left. For a benchmark to continue to drive technology improvements, the workload must adapt to changing technology.

TPC-D was born in April 1995. It was the industry’s first well-accepted decision support benchmark. Since that time, the markets and technologies for data mining, OLAP, data marts, and data warehousing have all evolved considerably. The set of 17 read-only queries that formed the core workload of TPC-D represented a very significant challenge to the most powerful decision support systems of 1995. From a hardware perspective, even today, TPC-D represents a significant performance challenge. However, over time, the computational task of executing any pre-defined set of queries, no matter how complex, can be optimized using new software technologies. In the case of TPC-D, the introduction of new database technologies enabled vendors to build structures (similar to indexes) that contained pre-computed or nearly pre-computed answers to TPC-D’s queries. At run time, these indexes could be accessed transparently and the answers produced almost instantaneously. In this sense, the history of TPC-D follows the general pattern of how many decision support environments evolve.

As long as the schema in a decision support warehouse is not well understood, and the types of queries that will commonly be run against the schema are not yet defined, the queries are truly ad-hoc and the query times are quite long. As the schema becomes better known and a common set of frequent queries is defined, database administrators generate intermediate structures and query patterns (indexes, etc.) that produce faster response times. The system can then support significant numbers of users running variations of these well-known and optimized queries. This business reporting component of a decision support environment is modeled by TPC-R.

Interestingly, as reported by several decision support specialists at the last TPC meeting, a typical decision support environment doesn’t simply devolve into a business reporting environment where canned queries are executed very quickly. While a common set of queries is defined and their answers are optimized, those answers lead users to ask new questions of the system and its data warehouse. These new ad-hoc questions generate queries for which there may be no specialized indexes, or for which the schema is not particularly optimized. As a result, query execution times may be quite lengthy. TPC-H models this ad-hoc query component of a decision support workload.

Though this may be interesting from a technologist’s perspective, what does the evolution of TPC-D mean to the end-user? Simply this: TPC-D has driven substantial performance gains in the decision support arena, particularly among DBMS vendors. TPC-D set a common performance bar and asked the vendors to leap over it. In clearing that bar, vendors improved their hardware and software and reduced the time end-users spent waiting for answers to critical business questions. The evolution of TPC-D into TPC-H and TPC-R is the next stage in the lifecycle of the TPC’s decision support benchmarking.
