mirror of
https://github.com/boostorg/mysql.git
synced 2025-05-12 14:11:41 +00:00
Added one_small_row, one_big_row, many_rows, stmt_params benchmarks against libmysqlclient and libmariadb
Added a CI build to compile and run benchmarks
Added a Python script to run the benchmarks
Refactored the connection_pool benchmark to use data independent from examples
close #458
[/
    Copyright (c) 2019-2025 Ruben Perez Hidalgo (rubenperez038 at gmail dot com)

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
]

[section:benchmarks Benchmarks against the official connectors]
[nochunk]

MySQL and MariaDB ship with official C connectors:
[@https://dev.mysql.com/downloads/c-api/ libmysqlclient] and
[@https://mariadb.com/kb/en/mariadb-connectorc-api-functions/ libmariadb].
Both implement the client/server protocol, as Boost.MySQL does.
The question then arises: is Boost.MySQL as fast as the official drivers?

[note
    TL;DR: Boost.MySQL is as fast as the official C APIs, and may be faster under some circumstances.
]


[heading Design decisions]

These benchmarks focus on [*the speed of the protocol implementation], in an attempt to
answer the question above. The measurements should therefore cover, at least,
(de)serialization and buffering. They should exclude features
unique to Boost.MySQL, like the static interface or connection pooling.

Both libmysqlclient and libmariadb offer a connection type, similar to [reflink any_connection],
with both sync and async primitives. Sync functions are similar to the ones in Boost.MySQL (although C-flavored).
Async functions are much lower-level, and often require either integration into a framework
(like Asio or libuv) or writing `poll`/`epoll` code by hand. Neither option is trivial.
Additionally, sync functions have less overhead, so they're best suited to answer our question.
For these reasons, [*we only use sync functions] in the benchmarks.

The benchmarks [*use prepared statements only]. The official drivers handle text
queries (issued by `mysql_real_query`) and prepared statements differently.
Rows generated by text queries are returned as strings, and must be parsed by
the user. Boost.MySQL performs this parsing automatically.
For this reason, comparing text queries doesn't make much sense.
Prepared statements are handled similarly across the libraries, and are better suited for
big rows and datasets.

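To illustrate why the two query types are hard to compare, the following minimal sketch
contrasts decoding an integer that arrives as text with decoding one that arrives in binary form.
It assumes a 32-bit `INT` column, which the binary protocol sends as 4 little-endian bytes.
The function names `parse_text_int32` and `decode_binary_int32` are hypothetical and belong to
none of the three libraries.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Text query result: the field arrives as decimal characters, so it has to be
// parsed. Returns std::nullopt if the field is not a valid 32-bit integer.
// (Hypothetical helper, for illustration only.)
inline std::optional<std::int32_t> parse_text_int32(std::string_view field)
{
    std::int32_t value = 0;
    const char* first = field.data();
    const char* last = first + field.size();
    const auto res = std::from_chars(first, last, value);
    if (res.ec != std::errc() || res.ptr != last)
        return std::nullopt;
    return value;
}

// Prepared statement result: the field arrives as 4 little-endian bytes,
// so decoding is a handful of shifts. (Hypothetical helper.)
inline std::int32_t decode_binary_int32(const unsigned char* bytes, std::size_t size)
{
    assert(size >= 4);
    const std::uint32_t raw = static_cast<std::uint32_t>(bytes[0]) |
                              (static_cast<std::uint32_t>(bytes[1]) << 8) |
                              (static_cast<std::uint32_t>(bytes[2]) << 16) |
                              (static_cast<std::uint32_t>(bytes[3]) << 24);
    return static_cast<std::int32_t>(raw);
}
```

With text queries, the C APIs leave the first kind of work to the caller, while Boost.MySQL
does it internally, so a text query benchmark would not compare equivalent work.
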
[*All tests use a real database]. Neither Boost.MySQL nor the official C clients
expose (de)serialization functions. Buffering and the number of system
calls issued are also critical for efficiency, and can only be measured with real communication.
The downside is that database processing introduces delays, and might end up
being the bottleneck.

The benchmarks try to [*minimize communication overhead by using UNIX sockets].


[heading Benchmark procedure]

Benchmark source code can be found in the [@https://github.com/boostorg/mysql/tree/master/bench bench/]
folder of the repo. The following benchmarks are performed:

* One small row. Executes a statement yielding a single row with 15 fields,
  including most of the possible types. Each row weighs around 500 bytes.
  Execution is repeated 10000 times. The Boost.MySQL version uses [refmem any_connection execute].
* One big row. Like the above, but rows have 17 fields, and each row weighs between 72 and 108 KB.
  The Boost.MySQL version uses [refmem any_connection start_execution], which allows zero-copying.
* Many rows. Executes a statement that yields 5000 of the "big rows" described above.
  The statement is executed only once. The Boost.MySQL version uses [refmem any_connection start_execution]
  because the resultset is large.
* Statement with parameters. Executes a statement with 17 parameters, roughly matching the "big row"
  structure described above. Intended to measure serialization speed.
  The statement is executed 1000 times.

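The repeated-execution pattern used by these benchmarks can be sketched as follows.
This is a minimal illustration and not the harness found in the `bench/` folder.
`time_repeated` is a hypothetical helper, and `op` stands for whatever executes the statement
with the library under test.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Runs `op` exactly `iterations` times and returns the total elapsed time.
// steady_clock is used because it is monotonic, so the measurement is not
// affected by system clock adjustments. (Hypothetical helper.)
template <class Op>
std::chrono::nanoseconds time_repeated(std::size_t iterations, Op&& op)
{
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        op();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Timing the whole loop, instead of each iteration separately, keeps clock overhead out of the
measurement. This matters most for the "one small row" case, where a single execution is short.
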
Benchmark conditions:

* Database: MySQL 8.4.1, running in a Docker container on localhost.
* OS: Ubuntu 24.04
* CPU: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz, 8 logical cores.
* Compiler: g++-14, using CMake's Release config, C++23.
* MySQL C API: libmysqlclient24 (as included in the official MySQL 8.4.4 release).
* MariaDB C API: libmariadb3 1:10.11.11-0ubuntu0.24.04.2 (official Ubuntu package).
* Boost.MySQL: Boost 1.87.0. The header-only version is used
  (without defining `BOOST_MYSQL_SEPARATE_COMPILATION`), since it's slightly faster.



[heading Results]

[$mysql/images/bench-protocol.png [align center]]

The three libraries exhibit a similar level of performance, which is expected
from a correctly implemented binary protocol. Boost.MySQL outperforms libmysqlclient
in the single-row benchmarks, and is on par with libmariadb. Differences in the
other benchmarks don't appear to be statistically significant.

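This document doesn't state how significance was assessed. One common way to compare two sets
of timings is Welch's t statistic, sketched below as an assumption and not as the method used here.
`sample_stats`, `summarize` and `welch_t` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean, unbiased sample variance and sample count of a set of timings.
struct sample_stats
{
    double mean;
    double variance;
    std::size_t count;
};

// Summarizes a set of timings. Requires at least two samples.
inline sample_stats summarize(const std::vector<double>& samples)
{
    assert(samples.size() >= 2);
    double sum = 0.0;
    for (double s : samples)
        sum += s;
    const double mean = sum / static_cast<double>(samples.size());
    double squared_dev = 0.0;
    for (double s : samples)
        squared_dev += (s - mean) * (s - mean);
    return {mean, squared_dev / static_cast<double>(samples.size() - 1), samples.size()};
}

// Welch's t statistic: difference of means divided by its standard error.
inline double welch_t(const sample_stats& a, const sample_stats& b)
{
    const double standard_error = std::sqrt(
        a.variance / static_cast<double>(a.count) + b.variance / static_cast<double>(b.count)
    );
    assert(standard_error > 0.0);
    return (a.mean - b.mean) / standard_error;
}
```

Given the per-run times of two libraries for the same benchmark, a value of `welch_t` close to zero
suggests that the observed difference is small compared to the run-to-run noise.
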
During these benchmarks, some potential performance improvement areas
have been identified. See [@https://github.com/boostorg/mysql/issues/458 this issue]
for details.

Remember that the protocol implementation is just one piece of the puzzle.
Correctly using features like [reflink connection_pool], [reflink with_params],
multi-function operations and multi-queries can make a huge performance difference
in your application. Never assume anything and always measure!

Acknowledgments: thanks to [@https://github.com/LowLevelMahn LowLevelMahn] for proposing the benchmarks.

[endsect]