mysql/doc/qbk/04_overview.qbk
Anarthal (Rubén Pérez) 33660e2710
Starting operations while engaged in multi-function ops is no longer UB
Attempting to start an operation involving server communication after a
multi-function operation has been started and before all its associated
packets have been read now fails with
client_errc::engaged_in_multi_function, instead of causing undefined
behavior.
Attempting to start a read_some_rows or read_resultset_head operation
with no in-progress multi-function operation now fails with
client_errc::not_engaged_in_multi_function.
Attempting to start an operation that requires an established session
before a successful connect now fails with client_errc::not_connected.
Added the three client_errc enum values mentioned above.
Renamed detail::connection_status (connection_pool) to node_status.
Added detail::connection_status (connection_state_data).
Removed redundant spotcheck tests.

close #448 
close #450
2025-02-22 19:15:03 +01:00

333 lines
10 KiB
Plaintext

[/
Copyright (c) 2019-2025 Ruben Perez Hidalgo (rubenperez038 at gmail dot com)
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
]
[section:overview Overview]
[nochunk]
This section briefly explains the library main classes and functions, and how to use them.
Boost.MySQL exposes sync and async functions implementing functionality involving I/O.
As explained [link mysql.tutorial_async in the second tutorial],
it's advisable to use async functions when possible, because they are more flexible.
Boost.MySQL supports the Boost.Asio universal async model. This means that
a variety of async programming paradigms can be used with the library,
including callbacks, stackful coroutines and C++20 coroutines.
We will use C++20 coroutines throughout the document because they're
easy to use.
[note
Still not using C++20? Don't worry, you can use
[link mysql.examples.coroutines_cpp11 stackful coroutines] and
[link mysql.examples.callbacks callbacks] even in C++11.
]
[section Connection establishment]
[reflink any_connection] is the most primitive I/O object in the library.
It can establish and close connections, run queries and manage prepared statements.
Like most I/O objects, `any_connection` can be constructed from an execution context:
[tutorial_sync_connection]
`any_connection` is named like this for historic reasons:
a templated connection class came before it.
We currently recommend using `any_connection` for new
code because it's simpler and more powerful.
The MySQL client/server protocol is session-oriented. Before anything else,
you must perform session establishment by calling [refmem any_connection async_connect]:
[overview_connect]
[refmemunq any_connection async_connect] performs the hostname resolution,
TCP session establishment, TLS handshake and MySQL handshake.
By default, TLS is used if the server supports it.
You can configure a number of parameters here, including
the database to use, TLS options and buffer sizes.
See [link mysql.connection_establishment this section] for more info.
Boost.MySQL also supports
[link mysql.connection_establishment.unix using UNIX-domain sockets].
To cleanly terminate a connection, use [refmemunq any_connection async_close].
This sends a packet informing of the imminent close and shuts down TLS.
The connection destructor will also close the socket, so no leak occurs.
[endsect]
[section Running queries]
The simplest way to run a SQL query is using [refmem any_connection async_execute].
You can execute queries by passing a string as first parameter:
[overview_text_query]
Most queries contain user-supplied input. [*Never use raw string concatenation]
to build queries, since this is vulnerable to SQL injection.
Boost.MySQL provides two interfaces to run queries with parameters:
[table
[
[Feature]
[Code]
]
[
[
[link mysql.text_queries Client-side SQL formatting]:
* Securely expands queries client-side.
* Text-based protocol.
* Adequate for general use.
]
[
[overview_with_params]
]
]
[
[
[link mysql.prepared_statements Prepared statements]:
* Parsed and executed in two different operations.
* Binary protocol.
* Adequate when running a query several times or retrieving
lots of numeric data.
]
[
[overview_statement]
]
]
]
By default, we recommend using [reflink with_params] because it's
simpler and entails less round-trips to the server.
See [link mysql.text_queries.comparison the comparison section] for more info.
Client-side SQL formatting can also be used to
[link mysql.sql_formatting_advanced.expand expand queries]
without sending them to the server.
[endsect]
[section The dynamic and the static interfaces]
In MySQL, a ['resultset] refers to the results generated by a SQL query.
A resultset is composed of rows, [link mysql.meta metadata] and
additional info, like the last insert ID.
There are two different interfaces to access resultsets.
You can use the [reflink results] class to access rows using a dynamically-typed interface,
using variant-like objects to represent values retrieved from the server. On other other hand,
[reflink static_results] is statically-typed. You specify the shape of your rows at compile-time,
and the library will parse the retrieved values for you into the types you provide.
You can use almost every feature in this library (including text queries and prepared statements) with both interfaces.
For example, given the following table :
[/ (TODO: this code is duplicated.) ]
[!teletype]
```
CREATE TABLE employee(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(100) NOT NULL,
last_name VARCHAR(100) NOT NULL,
... -- other fields not relevant for us
);
```
This is how you would access its contents using either of the interfaces:
[table
[
[Interface]
[Description]
[Example]
]
[
[
Dynamic interface: [reflink results]
]
[
* Variant based
* Available in C++11
* [link mysql.dynamic_interface Learn more]
]
[
[overview_ifaces_dynamic]
]
]
[
[
Static interface: [reflink static_results]
]
[
* Parses rows into your own types
* Requires C++20 when using Boost.Pfr, C++14 when using Boost.Describe
* [link mysql.static_interface Learn more]
]
[
[overview_static_struct][br]
[overview_ifaces_static]
]
]
]
Prefer using the static interface when possible.
[endsect]
[section Running INSERT, UPDATE and DELETE statements]
The same APIs explained above can be used for SQL statements
that don't retrieve data:
[overview_update]
When performing INSERTs, you might find [refmem results last_insert_id]
handy, which retrieves the last AUTO INCREMENT ID generated by the executed statement.
See [link mysql.tutorial_updates_transactions our tutorial on UPDATEs and transactions]
for more info.
[endsect]
[section:async Single outstanding async operation per connection]
At any given point in time, an `any_connection` can only have a single async operation outstanding.
In other words, connections implement no asynchronous locking or queueing, which
keeps code simple and efficient. If you need to perform several operations in parallel,
you can open more connections or use [reflink connection_pool].
Trying to run operations concurrently on a single connection is detected at
runtime and generates a `client_errc::operation_in_progress` error:
[overview_async_parallel]
[endsect]
[section:errors Error handling]
An operation fails if a network error happens,
a protocol violation is encountered, or the server reports an error.
For instance, SQL syntax errors make `async_execute` fail.
When the server reports an error, it provides a diagnostic string
describing what happened. The [reflink diagnostics] class encapsulates
this message. Some library functions generate diagnostics strings, too.
Both the sync functions in [link mysql.tutorial_sync the first tutorial]
and the coroutines in this exposition throw exceptions when they fail.
The exception type is [reflink error_with_diagnostics], which inherits
from `boost::system::system_error` and adds a [reflink diagnostics] object.
Async functions use [reflink with_diagnostics], a completion token adapter,
to transparently include diagnostics in exceptions.
You can avoid exceptions when using coroutines with `asio::redirect_error`:
[overview_no_exceptions]
[reflink2 error_code mysql::error_code] is an alias for `boost::system::error_code`.
[endsect]
[section Multi-function operations]
Until now, we've been using [refmemunq any_connection async_execute], which
executes some SQL and reads all generated data into an in-memory object.
Some use cases may not fit in this simple pattern. For example:
* When reading a very big resultset, it may not be efficient (or even possible) to completely
load it in memory. Reading rows in batches may be more adequate.
* If rows contain very long `TEXT` or `BLOB` fields, it may not be adequate to copy these values
from the network buffer into the `results` object. A view-based approach may be better.
For these cases, we can break the execute operation into several steps,
using a ['multi-function operation] (the term is coined by this library). This example reads an entire
table in batches, which can be the case in an ETL process:
[overview_multifn]
[note
Once you start a multi-function operation with [refmemunq any_connection async_start_execution],
the server immediately sends all rows to the client. [*You must read all rows] before engaging in further operations.
If you try to initiate an unrelated operation before reading all rows, you will get a
`client_errc::engaged_in_multi_function` error.
]
Multi-function operations are powerful but complex. Only use them when there is a strong reason to do so.
Multi-function operations also work with the static interface.
Please refer to [link mysql.multi_function this section] for more information on these operations.
[endsect]
[section Connection pools]
Connection pooling is a technique where several long-lived connections
are re-used for independent logical operations. When compared to
establishing individual connections, it has the following benefits:
* It provides better performance. Please consult [link mysql.connection_pool.benchmarks our benchmarks]
for more info.
* It simplifies connection management. The connection pool will establish sessions,
perform retries and apply timeouts out of the box.
This is how you can create a pool of connections:
[connection_pool_create]
[refmem connection_pool async_run] must be called exactly once per pool.
This function takes care of actually keeping connections healthy.
To retrieve a connection, use [refmem connection_pool async_get_connection]:
[connection_pool_get_connection]
For more info, see [link mysql.connection_pool this section].
[endsect]
[endsect]