mirror of
https://github.com/boostorg/mysql.git
synced 2025-05-12 06:01:42 +00:00
228 lines
11 KiB
Plaintext
228 lines
11 KiB
Plaintext
[/
|
|
Copyright (c) 2019-2025 Ruben Perez Hidalgo (rubenperez038 at gmail dot com)
|
|
|
|
Distributed under the Boost Software License, Version 1.0. (See accompanying
|
|
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
|
|
]
|
|
|
|
[section:multi_function Multi-function operations]
|
|
[nochunk]
|
|
|
|
Multi-function operations allow running operations as a set of separate
|
|
steps, which gives you better control over execution. They work by splitting
|
|
some of the reads and writes into several function calls.
|
|
|
|
You can use multi-function operations to execute text queries and prepared statements,
|
|
and through the dynamic or the static interface.
|
|
|
|
[heading Protocol primer]
|
|
|
|
To make a good use of multi-function operations, you should have a basic understanding
|
|
of the underlying protocol.
|
|
|
|
The protocol uses ['messages] to communicate. These are delimited by headers containing the message length.
|
|
All operations are initiated by the client, by sending a single ['request message], to which
|
|
the server responds with a set of ['response messages].
|
|
|
|
The diagram below shows the message exchange between client and server for text queries and statement
|
|
executions. Each arrow represents a message.
|
|
|
|
[$mysql/images/protocol.svg [align center]]
|
|
|
|
|
|
There are two separate cases:
|
|
|
|
* If your query retrieved at least one column (even if no rows were generated), we're in ['case 1]. The server sends:
|
|
* An initial packet informing that the query executed correctly, and that we're in ['case 1].
|
|
* Some metadata packets describing the columns that the query retrieved. These become available
|
|
to the user and are necessary to parse the rows.
|
|
* The actual rows.
|
|
* An OK packet, which marks the end of the resultset and contains information like `last_insert_id` and
|
|
`affected_rows`.
|
|
* If your query didn't retrieve any column, we're in ['case 2]. The server will just send an OK packet,
|
|
with the same information as in ['case 1].
|
|
|
|
[refmem connection execute] handles the full message exchange. In contrast,
|
|
[refmem connection start_execution] will not read the rows, if any.
|
|
|
|
Some takeaways:
|
|
|
|
* The distinction between single-function and multi-function operations exists only
|
|
in the client. The wire messages exchanged by both are the same.
|
|
* There is no way to tell how many rows a resultset has upfront. You need to read row by row until
|
|
you find the OK packet marking the end of the resultset.
|
|
* When the server processes the request message, [*it sends all the response messages immediately].
|
|
These responses will be waiting in the client to be read. If you don't read [*all] of them,
|
|
subsequent operations will mistakenly read them as their response, causing packet mismatches.
|
|
Be careful and don't let this happen!
|
|
|
|
|
|
|
|
[heading Using multi-function operations through the dynamic interface]
|
|
|
|
[reflink execution_state] is the main class for the dynamic interface in multi-function
|
|
operations. An execution state holds information required to progress the execution operation,
|
|
like metadata (required to parse the rows) and protocol state. Contrary to `results`, it doesn't contain
|
|
the rows.
|
|
|
|
Given the following table definition:
|
|
|
|
[multi_function_setup]
|
|
|
|
You can start a multi-function operation using [refmem connection start_execution]:
|
|
|
|
[multi_function_dynamic_start]
|
|
|
|
We now [*must] read all the generated rows
|
|
by calling [refmem connection read_some_rows], which will return a batch of an unspecified size:
|
|
|
|
|
|
[multi_function_dynamic_read]
|
|
|
|
Some remarks:
|
|
|
|
* If there are rows to be read, `read_some_rows` will return at least one, but may return more.
|
|
* [refmem execution_state complete] returns `true` after we've read the final OK packet for this operation.
|
|
* The final `row_batch` may or may not be empty, depending on the number of rows and their size.
|
|
* Calling `read_some_rows` after reading the final OK packet returns an empty batch.
|
|
|
|
`read_some_rows` returns a [reflink rows_view] object pointing into the connection's internal buffer.
|
|
This view is valid until the connection performs any other operation involving a network transfer.
|
|
|
|
Note that there is no need to distinguish between ['case 1] and ['case 2] in the diagram above in our code,
|
|
as reading rows for a complete operation is well defined.
|
|
|
|
|
|
[heading Using multi-function operations through the static interface]
|
|
|
|
The mechanics are similar to what's been exposed above. The static interface uses
|
|
[reflink static_execution_state] to carry state. As with [reflink static_results], we must define and
|
|
pass a type describing our rows:
|
|
|
|
[describe_post]
|
|
|
|
We can now start our operation using the same [refmem connection start_execution]:
|
|
|
|
[multi_function_static_start]
|
|
|
|
We now [*must] read all the generated rows by calling [refmem connection read_some_rows]:
|
|
|
|
[multi_function_static_read]
|
|
|
|
Some remarks:
|
|
|
|
* [reflink static_execution_state] doesn't store rows anyhow. It uses the row types passed
|
|
as template parameters to validate the metadata returned by the server, and ensure it is compatible
|
|
with the C++ data structures that will be used with `read_some_rows`.
|
|
* We must pass `read_some_rows` a `boost::span` of the appropriate row type.
|
|
We've used `std::array` to place rows on the stack, but you can use any other contiguous range.
|
|
* `read_some_rows` returns the number of read rows. At maximum, this will be the size
|
|
of the span, but there may be less, depending on row and network buffer sizes.
|
|
* If there are rows to be read, `read_some_rows` will return at least one, but may return more.
|
|
* [refmem execution_state complete] returns `true` after we've read the final OK packet for this operation.
|
|
* The final read may or may not return rows, depending on the number of rows and their size.
|
|
* Calling `read_some_rows` after reading the final OK packet always reads zero rows.
|
|
|
|
|
|
|
|
[heading Accessing metadata and OK packet data]
|
|
|
|
You can access metadata at any point, using [refmem execution_state meta] or [refmem static_execution_state meta].
|
|
This function returns a collection of [reflink metadata] objects. For more information, please refer to [link mysql.meta this section].
|
|
|
|
You can access OK packet data using functions like [refmemunq execution_state last_insert_id]
|
|
and [refmemunq execution_state affected_rows] in both `execution_state` and `static_execution_state`.
|
|
As this information is contained in the OK packet,
|
|
[*it can only be accessed once the [refmemunq execution_state complete] function returns `true`].
|
|
|
|
[heading Using multi-function operations with stored procedures and multi-queries]
|
|
|
|
When using operations that return more than one resultset (e.g. when calling stored procedures),
|
|
the protocol is slightly more complex:
|
|
|
|
[$mysql/images/protocol_multi_resultset.svg [align center]]
|
|
|
|
The message exchange is as follows:
|
|
|
|
* A single execution request is sent to the server.
|
|
* The server sends a first resultset, as in the single resultset case.
|
|
The OK packet contains a flag indicating that another resultset follows.
|
|
* Another resultset is sent, with the same structure as the previous one. The process is
|
|
repeated until an OK packet indicates that no more resultsets follow.
|
|
|
|
For example, given the following stored procedure:
|
|
|
|
[/ This is an actual procedure. Code imports from SQL don't work. Make sure it doesn't go out of sync ]
|
|
[!teletype]
|
|
```
|
|
CREATE PROCEDURE get_company(IN pin_company_id CHAR(10))
|
|
BEGIN
|
|
START TRANSACTION READ ONLY;
|
|
SELECT id, name, tax_id FROM company WHERE id = pin_company_id;
|
|
SELECT first_name, last_name, salary FROM employee WHERE company_id = pin_company_id;
|
|
COMMIT;
|
|
END
|
|
```
|
|
|
|
We can write:
|
|
|
|
[table
|
|
[
|
|
[Dynamic interface]
|
|
[Static interface]
|
|
]
|
|
[
|
|
[[multi_function_stored_procedure_dynamic]]
|
|
[[multi_function_stored_procedure_static]]
|
|
]
|
|
]
|
|
|
|
Note that we're using [refmemunq execution_state should_read_rows] instead of [refmemunq execution_state complete]
|
|
as our loop termination condition. `complete()` returns true when all the resultsets have been read,
|
|
while `should_read_rows()` will return false once an individual result has been fully read.
|
|
|
|
When using the static interface with multi-function operations, not all schema mismatches can be found
|
|
by the `start_execution` function, since not all the information is available at this point. Errors may
|
|
be reported by `read_some_rows` and `read_resultset_head`, too. Overall, the same checks are performed as
|
|
when using [refmem connection execute], but at different points in time.
|
|
|
|
[heading Multi-resultset in the general case]
|
|
|
|
`execution_state` and `static_execution_state` can be seen as state machines with four states. Each state
|
|
describes which reading function should be invoked next:
|
|
|
|
* [refmemunq execution_state should_start_op]: the initial state, after you default-construct an `execution_state`.
|
|
You should call [refmem connection start_execution] or [refmem connection async_start_execution] to start the operation.
|
|
* [refmemunq execution_state should_read_rows]: the next operation should be [refmem connection read_some_rows],
|
|
to read the generated rows.
|
|
* [refmemunq execution_state should_read_head]: the next operation should be [refmem connection read_resultset_head],
|
|
to read the next resultset metadata. Only operations that generate multiple resultsets go into this state.
|
|
* [refmemunq execution_state complete]: no more operations are required.
|
|
|
|
For multi-function operations, you may also access OK packet data ever time a resultset has completely
|
|
been read, i.e. when [refmemunq execution_state should_read_head] returns `true`.
|
|
|
|
[link mysql.examples.source_script This example] shows this general case. It uses
|
|
multi-queries to run an arbitrary SQL script file.
|
|
|
|
[heading:read_some_rows More on read_some_rows]
|
|
|
|
To properly understand `read_some_rows`, we need to know that every [reflink connection]
|
|
owns an *internal buffer*, where packets sent by the server are stored.
|
|
It is a single, flat buffer, and you can configure its initial size when creating
|
|
a `connection`, passing a [reflink buffer_params] object as the first argument to `connection`'s constructor.
|
|
The read buffer may be grown under certain circumstances to accommodate large messages.
|
|
|
|
`read_some_rows` gets the maximum number of rows that fit in the internal buffer (without growing it)
|
|
performing a single `read_some` operation on the stream (or using cached data).
|
|
If there are rows to read, `read_some_rows` guarantees to read at least one. This means that,
|
|
if doing what we described yields no rows (e.g. because of a large row that doesn't fit
|
|
into the buffer), `read_some_rows` will grow the buffer or perform more reads until at least
|
|
one row has been read. If you're using the static interface, the number of read rows is limited
|
|
by the size of span you passed, too.
|
|
|
|
If you want to get the most of `read_some_rows`, customize the initial buffer size
|
|
to maximize the number of rows that each batch retrieves.
|
|
|
|
[endsect]
|