MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1
- by Mat Keep
Summary
The scalability
enhancements delivered by extensions to the multi-threaded data nodes enable MySQL
Cluster 7.2 to deliver over 8x higher
performance than the previous MySQL Cluster 7.1 release on a recent benchmark.
What’s New in MySQL Cluster 7.2
MySQL Cluster 7.2
was released as GA (Generally Available) in February 2012, delivering many
enhancements: higher performance on complex queries, a new NoSQL Key/Value API,
cross-data center replication and improved ease-of-use. These enhancements are
summarized in the Figure below, and detailed in the MySQL Cluster New Features
whitepaper.
Figure 1: Next
Generation Web Services, Cross Data Center Replication and Ease-of-Use
One of the key
enhancements delivered in MySQL Cluster 7.2 is the set of extensions made to the
multi-threaded architecture of the data nodes.
Multi-Threaded Data Node Extensions
The MySQL Cluster 7.2 data node is now functionally divided into seven thread
types:
1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads.
2) Transaction Coordinator threads (tc)
3) Asynchronous Replication threads (rep)
4) Schema Management threads (main)
5) Network receiver threads (recv)
6) Network send threads (send)
7) IO threads
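As a sketch of how these thread types map onto configuration, the ThreadConfig parameter in the [ndbd default] section of the cluster's config.ini lets you set a count (and optionally CPU binding) per thread type when running the multi-threaded ndbmtd binary. The counts and cpubind ranges below are purely illustrative, not recommendations, and the exact set of thread names accepted should be checked against the documentation for your MySQL Cluster version:

```ini
[ndbd default]
# Illustrative only: one entry per thread type; counts are examples.
# cpubind (optional) pins a thread type to specific CPU IDs.
ThreadConfig=ldm={count=4,cpubind=1-4},tc={count=2,cpubind=5-6},recv={count=1,cpubind=7},send={count=1,cpubind=8},main={count=1},rep={count=1}
```

Binding the busiest thread types (ldm, tc) to dedicated cores helps avoid the OS scheduler migrating them, which matters most on multi-socket servers.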
Each of these thread
types is discussed in more detail below.
MySQL Cluster 7.2
increases the maximum number of LDM threads from 4 to 16. The LDM threads contain the
actual data, which means that when using 16 threads the data is more heavily partitioned
(this partitioning is automatic in MySQL Cluster). Each LDM thread maintains its own set of
data partitions, index partitions and REDO log. The number of LDM partitions
per data node is not dynamically configurable, but it is possible to
map more than one partition onto each LDM thread, providing flexibility in modifying
the number of LDM threads.
The TC domain stores
the state of in-flight transactions, which means that every new transaction can
easily be assigned to a new TC thread. Testing has shown that in most cases 1
TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread
per 4 LDM threads is acceptable. Testing also demonstrated that workloads
sustaining very high update loads can require 3 to 4 TC threads per 4 LDM
threads. In the previous MySQL Cluster 7.1 release, only one TC thread was
available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2.
The TC domain also manages the Adaptive Query Localization functionality
introduced in MySQL Cluster 7.2, which significantly enhances complex query
performance by pushing JOIN operations down to the data nodes.
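The TC-to-LDM ratios above can be expressed directly in ThreadConfig. The fragment below is a hypothetical sketch of a write-heavy configuration using 1 TC thread per 2 LDM threads; the counts are assumptions to illustrate the ratio, and should be tuned against your own workload:

```ini
[ndbd default]
# Write-heavy sketch: 1 TC per 2 LDM (8 LDM, 4 TC).
# A read-mostly workload may be fine with 1 TC per 4 LDM (e.g. tc={count=2}).
ThreadConfig=ldm={count=8},tc={count=4},recv={count=2},send={count=2},main={count=1},rep={count=1}
```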
Asynchronous Replication was separated into its own thread with the release of
MySQL Cluster 7.1, and has not been modified in the latest 7.2 release.
To scale the number of TC threads, it was necessary to separate the Schema
Management domain from the TC domain. The schema management thread carries little
load, so it is implemented as a single thread.
The Network receiver
domain was bound to 1 thread in MySQL Cluster 7.1. With the increase in thread counts
in MySQL Cluster 7.2, it is also necessary to increase the number of recv
threads, up to a maximum of 8. Each receive thread services one or more of the sockets used
to communicate with other nodes in the Cluster.
The Network send thread is a new thread type introduced in MySQL Cluster 7.2.
Previously, the other threads handled send operations themselves, which can
provide lower latency. To achieve the
highest throughput, however, it has been necessary to create dedicated send
threads, of which up to 8 can be configured. It is still possible to configure MySQL Cluster 7.2 in a legacy mode
that does not use any send threads – useful for those workloads that are
most sensitive to latency.
The IO thread is the final thread type, and there have been no changes to this
domain in MySQL Cluster 7.2. Multiple IO threads were already available, and
can be configured as either one thread per open file, or a fixed number of
IO threads that handle all the IO traffic. Except when compression is used for
data on disk, the IO threads typically carry a very light load.
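For the fixed-pool case, the size of the IO thread pool used for Disk Data file access can be set with the DiskIOThreadPool parameter. The value below is illustrative only; the default is small, and raising it is only worthwhile if disk-data IO is actually a bottleneck:

```ini
[ndbd default]
# Illustrative: size of the pool of unbound IO threads
# used for Disk Data file access.
DiskIOThreadPool=8
```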
Benchmarking the Scalability Enhancements
The scalability
enhancements discussed above have made it possible to scale the CPU usage of each
data node to more than 5x that possible in MySQL Cluster 7.1. In addition, a
number of bottlenecks have been removed, making it possible to scale data node
performance by even more than 5x.
Figure 2: MySQL Cluster 7.2
Delivers 8.4x Higher Performance than 7.1
The flexAsynch
benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node
cluster of dual-socket commodity servers based on the Intel Xeon X5670 (6 cores per socket).
As the results
demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data
node than MySQL Cluster 7.1.
More details of this
and other benchmarks will be published in a new whitepaper – coming soon, so
stay tuned!
In a following blog
post, I’ll provide recommendations on optimum thread configurations for
different types of server processor. You
can also learn more from the Best Practices Guide to Optimizing Performance of
MySQL Cluster.
Conclusion
MySQL Cluster has
achieved a range of impressive benchmark results, and, set in context against the previous 7.1 release, is able to deliver over 8x
higher performance per node.
As a result, the
multi-threaded data node extensions not only serve to increase performance of
MySQL Cluster, they also enable users to achieve significantly improved levels
of utilization from current and future generations of massively multi-core,
multi-thread processor designs.