mirror of
https://github.com/CLIUtils/CLI11.git
synced 2025-05-07 23:33:52 +00:00
Escape transform and docs (#970)
Update some documentation and add a string escape transformer so escaped strings can be handled on the command line as well as in the config files.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
parent 91101604d5
commit de1c6a1207
39 README.md
@@ -451,8 +451,8 @@ Before parsing, you can set the following options:
 This equivalent to calling `->delimiter(delim)` and `->join()`. Valid values
 are `CLI::MultiOptionPolicy::Throw`, `CLI::MultiOptionPolicy::Throw`,
 `CLI::MultiOptionPolicy::TakeLast`, `CLI::MultiOptionPolicy::TakeFirst`,
-`CLI::MultiOptionPolicy::Join`, `CLI::MultiOptionPolicy::TakeAll`, and
-`CLI::MultiOptionPolicy::Sum` 🆕.
+`CLI::MultiOptionPolicy::Join`, `CLI::MultiOptionPolicy::TakeAll`,
+`CLI::MultiOptionPolicy::Sum` 🆕, and `CLI::MultiOptionPolicy::Reverse` 🚧.
 - `->check(std::string(const std::string &), validator_name="",validator_description="")`:
 Define a check function. The function should return a non empty string with
 the error message if the check fails
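The `->check(...)` contract quoted in the hunk above (an empty string means success, a non-empty string carries the error message) can be illustrated without CLI11. This is a minimal sketch; `positive_int_check` is a hypothetical name chosen for illustration and is not part of the library.

```cpp
#include <cassert>
#include <string>

// Hypothetical check function with the shape std::string(const std::string &).
// It returns an empty string on success and an error message on failure.
std::string positive_int_check(const std::string &input) {
    if(input.empty()) {
        return "value is empty";
    }
    for(char c : input) {
        if(c < '0' || c > '9') {
            return "value '" + input + "' is not a positive integer";
        }
    }
    // A string made only of zeros is numerically zero, which is not positive.
    if(input.find_first_not_of('0') == std::string::npos) {
        return "value must be greater than zero";
    }
    return std::string{};
}
```

Going by the signature listed in the README, a function of this shape would presumably be attached to an option with `->check(positive_int_check)`.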
@@ -702,6 +702,17 @@ filters on the key values is performed.
 `CLI::FileOnDefaultPath(default_path, false)`. This allows multiple paths to
 be chained using multiple transform calls.

+- `CLI::EscapedString`: 🚧 can be used to process an escaped string. The
+processing is equivalent to that used for TOML config files, see
+[TOML strings](https://toml.io/en/v1.0.0#string). With 2 notable exceptions.
+\` can also be used as a literal string notation, and it also allows binary
+string notation see
+[binary strings](https://cliutils.github.io/CLI11/book/chapters/config.html).
+The escaped string processing will remove outer quotes if present, `"` will
+indicate a string with potential escape sequences, `'` and \` will indicate a
+literal string and the quotes removed but no escape sequences will be
+processed. This is the same escape processing as used in config files.
+
 ##### Validator operations

 Validators are copyable and have a few operations that can be performed on them
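The quoting rules described for `CLI::EscapedString` above (outer quotes removed; `"` enables escape processing; `'` and `` ` `` mark literal strings) can be sketched in isolation. This is a simplified stand-in, not CLI11's implementation: `process_escaped_sketch` and `unescape_subset` are hypothetical names, and only a handful of escapes are handled (no unicode or binary strings).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Simplified escape handling: supports only \n, \t, \\, \" and \'.
// Anything else is rejected with std::invalid_argument.
std::string unescape_subset(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for(std::size_t i = 0; i < in.size(); ++i) {
        if(in[i] != '\\') {
            out.push_back(in[i]);
            continue;
        }
        if(i + 1 >= in.size()) {
            throw std::invalid_argument("dangling backslash");
        }
        switch(in[++i]) {
        case 'n': out.push_back('\n'); break;
        case 't': out.push_back('\t'); break;
        case '\\': out.push_back('\\'); break;
        case '"': out.push_back('"'); break;
        case '\'': out.push_back('\''); break;
        default: throw std::invalid_argument("unsupported escape sequence");
        }
    }
    return out;
}

// Mirrors the rules in the text: matching outer quotes are stripped,
// double quotes enable escapes, single quotes and backticks are literal.
std::string process_escaped_sketch(std::string str) {
    const bool quoted = str.size() > 1 &&
                        (str.front() == '"' || str.front() == '\'' || str.front() == '`') &&
                        str.front() == str.back();
    if(quoted) {
        const char quote = str.front();
        str = str.substr(1, str.size() - 2);
        return quote == '"' ? unescape_subset(str) : str;
    }
    if(str.find('\\') != std::string::npos) {
        return unescape_subset(str);
    }
    return str;
}
```

Under these assumptions, `"a\nb"` (with the double quotes) yields a two-line string, while `'a\nb'` keeps the backslash and `n` as written.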
@@ -873,9 +884,11 @@ through the `add_subcommand` method have the same restrictions as option names.
 - `--subcommand1.subsub.f val` (short form nested subcommand option)

 The use of dot notation in this form is equivalent `--subcommand.long <args>` =>
-`subcommand --long <args> ++`. Nested subcommands also work `"sub1.subsub"`
-would trigger the subsub subcommand in `sub1`. This is equivalent to "sub1
-subsub"
+`subcommand --long <args> ++`. Nested subcommands also work `sub1.subsub` would
+trigger the subsub subcommand in `sub1`. This is equivalent to "sub1 subsub".
+Quotes around the subcommand names are permitted 🚧 following the TOML standard
+for such specification. This includes allowing escape sequences. For example
+`"subcommand".'f'` or `"subcommand.with.dots".arg1 = value`.

 #### Subcommand options

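The quoted dot notation described above (for example `"subcommand.with.dots".arg1`) implies that a `.` inside quotes does not separate name parts. A minimal sketch of that splitting rule follows. `split_dotted_key` is a hypothetical helper, not a CLI11 function, and it deliberately skips escape-sequence handling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split a dotted key on '.' characters that sit outside
// quotes, dropping the surrounding quote characters. Escapes are not processed.
std::vector<std::string> split_dotted_key(const std::string &key) {
    std::vector<std::string> parts;
    std::string current;
    char open_quote = '\0';  // '\0' means "not inside a quoted segment"
    for(char c : key) {
        if(open_quote != '\0') {
            if(c == open_quote) {
                open_quote = '\0';  // closing quote ends the quoted segment
            } else {
                current.push_back(c);  // dots inside quotes are kept verbatim
            }
        } else if(c == '"' || c == '\'') {
            open_quote = c;
        } else if(c == '.') {
            parts.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    parts.push_back(current);
    return parts;
}
```

With this sketch, `"subcommand".'f'` splits into `subcommand` and `f`, and `"subcommand.with.dots".arg1` splits into `subcommand.with.dots` and `arg1`.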
@@ -1209,19 +1222,22 @@ option (like `set_help_flag`). Setting a configuration option is special. If it
 is present, it will be read along with the normal command line arguments. The
 file will be read if it exists, and does not throw an error unless `required` is
 `true`. Configuration files are in [TOML][] format by default, though the
-default reader can also accept files in INI format as well. It should be noted
-that CLI11 does not contain a full TOML parser but can read strings from most
-TOML files, including multi-line strings 🚧, and run them through the CLI11
-parser. Other formats can be added by an adept user, some variations are
-available through customization points in the default formatter. An example of a
-TOML file:
+default reader can also accept files in INI format as well. The config reader
+can read most aspects of TOML files including strings both literal 🚧 and with
+potential escape sequences 🚧, digit separators 🚧, and multi-line strings 🚧,
+and run them through the CLI11 parser. Other formats can be added by an adept
+user, some variations are available through customization points in the default
+formatter. An example of a TOML file:

 ```toml
 # Comments are supported, using a #
 # The default section is [default], case insensitive

 value = 1
+value2 = 123_456 # a string with separators
 str = "A string"
+str2 = "A string\nwith new lines"
+str3 = 'A literal "string"'
 vector = [1,2,3]
 str_vector = ["one","two","and three"]

@@ -1229,6 +1245,7 @@ str_vector = ["one","two","and three"]
 [subcommand]
 in_subcommand = Wow
 sub.subcommand = true
+"sub"."subcommand2" = "string_value"
 ```

 or equivalently in INI format
@@ -113,7 +113,9 @@ app.set_config("--config")

 will read the files in the order given, which may be useful in some
 circumstances. Using `CLI::MultiOptionPolicy::TakeLast` would work similarly
-getting the last `N` files given.
+getting the last `N` files given. The default policy for config options is
+`CLI::MultiOptionPolicy::Reverse` which takes the last expected `N` and reverses
+them so the last option given is given precedence.

 ## Configure file format

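The `Reverse` policy described above ("takes the last expected `N` and reverses them") amounts to a small list operation. The sketch below shows that operation on plain strings. `take_last_reversed` is a hypothetical name, and the function only models the ordering described in the text, not CLI11's option machinery.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Keep the last `expected` entries of `given` and reverse them, so the value
// supplied last ends up first (and therefore takes precedence).
std::vector<std::string> take_last_reversed(const std::vector<std::string> &given, std::size_t expected) {
    const std::size_t count = std::min(expected, given.size());
    std::vector<std::string> result(given.end() - static_cast<std::ptrdiff_t>(count), given.end());
    std::reverse(result.begin(), result.end());
    return result;
}
```

For three config files given as `a.toml b.toml c.toml` with `N = 2`, the sketch returns `c.toml` followed by `b.toml`.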
@@ -204,14 +206,18 @@ str3 = """\
 ```

 The key is that the closing of the multiline string must be at the end of a line
-and match the starting 3 quote sequence.
+and match the starting 3 quote sequence. Multiline sequences using `"""` allow
+escape sequences. Following [TOML](https://toml.io/en/v1.0.0#string) with the
+addition of allowing '\0' for a null character, and binary Strings described in
+the next section. This same formatting also applies to single line strings.
+Multiline strings are not allowed as part of an array.

 ### Binary Strings

 Config files have a binary conversion capability, this is mainly to support
 writing config files but can be used by user generated files as well. Strings
 with the form `B"(XXXXX)"` will convert any characters inside the parenthesis
-with the form \xHH to the equivalent binary value. The HH are hexadecimal
+with the form `\xHH` to the equivalent binary value. The HH are hexadecimal
 characters. Characters not in this form will be translated as given. If argument
 values with unprintable characters are used to generate a config file this
 binary form will be used in the output string.
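The `B"(XXXXX)"` form described above can be decoded with a short loop: each `\xHH` becomes one byte, and everything else is copied as given. This is a minimal sketch under those stated rules. `decode_binary_string` and `hex_value` are hypothetical names, only a lowercase `x` is recognized, and input that is not in the `B"(...)"` form is returned unchanged.

```cpp
#include <cassert>
#include <string>

// Value of a hexadecimal digit, or -1 if `c` is not one.
int hex_value(char c) {
    if(c >= '0' && c <= '9') return c - '0';
    if(c >= 'a' && c <= 'f') return c - 'a' + 10;
    if(c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical decoder for strings of the form B"(...)".
std::string decode_binary_string(const std::string &s) {
    const std::string prefix = "B\"(";
    const std::string suffix = ")\"";
    if(s.size() < prefix.size() + suffix.size() || s.compare(0, prefix.size(), prefix) != 0 ||
       s.compare(s.size() - suffix.size(), suffix.size(), suffix) != 0) {
        return s;  // not in binary-string form
    }
    const std::string body = s.substr(prefix.size(), s.size() - prefix.size() - suffix.size());
    std::string out;
    for(std::size_t i = 0; i < body.size(); ++i) {
        // A complete \xHH sequence needs three more characters after the backslash.
        if(body[i] == '\\' && i + 3 < body.size() && body[i + 1] == 'x' && hex_value(body[i + 2]) >= 0 &&
           hex_value(body[i + 3]) >= 0) {
            out.push_back(static_cast<char>(16 * hex_value(body[i + 2]) + hex_value(body[i + 3])));
            i += 3;
        } else {
            out.push_back(body[i]);  // characters not in \xHH form are kept as given
        }
    }
    return out;
}
```

For instance, `B"(\x35\xa7\x46)"` decodes to the three bytes `0x35 0xa7 0x46`.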
@@ -274,8 +280,8 @@ char arraySeparator = ',';
 char valueDelimiter = '=';
 /// the character to use around strings
 char stringQuote = '"';
-/// the character to use around single characters
-char characterQuote = '\'';
+/// the character to use around single characters and literal strings
+char literalQuote = '\'';
 /// the maximum number of layers to allow
 uint8_t maximumLayers{255};
 /// the separator used to separator parent layers
@@ -296,8 +302,8 @@ These can be modified via setter functions
 an array
 - `ConfigBase *valueSeparator(char vSep)`: Specify the delimiter between a name
 and value
-- `ConfigBase *quoteCharacter(char qString, char qChar)` :specify the characters
-to use around strings and single characters
+- `ConfigBase *quoteCharacter(char qString, char literalChar)` :specify the
+characters to use around strings and single characters
 - `ConfigBase *maxLayers(uint8_t layers)` : specify the maximum number of parent
 layers to process. This is useful to limit processing for larger config files
 - `ConfigBase *parentSeparator(char sep)` : specify the character to separate
@@ -410,3 +416,6 @@ will create an option name in following priority.
 2. Positional name
 3. First short name
 4. Environment name
+
+In config files the name will be enclosed in quotes if there is any potential
+ambiguities in parsing the name.
@@ -26,18 +26,18 @@ app.add_option("-i", int_option, "Optional description")->capture_default_str();
 You can use any C++ int-like type, not just `int`. CLI11 understands the
 following categories of types:

-| Type | CLI11 |
-| -------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| number like | Integers, floats, bools, or any type that can be constructed from an integer or floating point number. Accepts common numerical strings like `0xFF` as well as octal, and decimal |
-| string-like | std::string, or anything that can be constructed from or assigned a std::string |
-| char | For a single char, single string values are accepted, otherwise longer strings are treated as integral values and a conversion is attempted |
-| complex-number | std::complex or any type which has a real(), and imag() operations available, will allow 1 or 2 string definitions like "1+2j" or two arguments "1","2" |
-| enumeration | any enum or enum class type is supported through conversion from the underlying type(typically int, though it can be specified otherwise) |
-| container-like | a container(like vector) of any available types including other containers |
-| wrapper | any other object with a `value_type` static definition where the type specified by `value_type` is one of the type in this list, including `std::atomic<>` |
-| tuple | a tuple, pair, or array, or other type with a tuple size and tuple_type operations defined and the members being a type contained in this list |
-| function | A function that takes an array of strings and returns a string that describes the conversion failure or empty for success. May be the empty function. (`{}`) |
-| streamable | any other type with a `<<` operator will also work |
+| Type | CLI11 |
+| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| number like | Integers, floats, bools, or any type that can be constructed from an integer or floating point number. Accepts common numerical strings like `0xFF` as well as octal[\0755, or \o755], decimal, and binary(0b011111100), supports value separators including `_` and `'` |
+| string-like | std::string, or anything that can be constructed from or assigned a std::string |
+| char | For a single char, single string values are accepted, otherwise longer strings are treated as integral values and a conversion is attempted |
+| complex-number | std::complex or any type which has a real(), and imag() operations available, will allow 1 or 2 string definitions like "1+2j" or two arguments "1","2" |
+| enumeration | any enum or enum class type is supported through conversion from the underlying type(typically int, though it can be specified otherwise) |
+| container-like | a container(like vector) of any available types including other containers |
+| wrapper | any other object with a `value_type` static definition where the type specified by `value_type` is one of the type in this list, including `std::atomic<>` |
+| tuple | a tuple, pair, or array, or other type with a tuple size and tuple_type operations defined and the members being a type contained in this list |
+| function | A function that takes an array of strings and returns a string that describes the conversion failure or empty for success. May be the empty function. (`{}`) |
+| streamable | any other type with a `<<` operator will also work |

 By default, CLI11 will assume that an option is optional, and one value is
 expected if you do not use a vector. You can change this on a specific option
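The "number like" row above mentions hexadecimal (`0xFF`), binary (`0b011111100`), and the value separators `_` and `'`. The sketch below illustrates how such strings could be normalized before conversion. `parse_integer_sketch` is a hypothetical helper built on `std::stoll`, it is not CLI11's conversion code, and it leaves out octal and negative prefixed values to stay short.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical parser for non-negative integers with optional 0x / 0b prefixes
// and '_' or '\'' digit separators. Throws on malformed input.
std::int64_t parse_integer_sketch(const std::string &text) {
    // Drop the digit separators first.
    std::string digits;
    for(char c : text) {
        if(c != '_' && c != '\'') {
            digits.push_back(c);
        }
    }
    // Detect a base prefix and strip it.
    int base = 10;
    if(digits.size() > 2 && digits[0] == '0') {
        if(digits[1] == 'x' || digits[1] == 'X') {
            base = 16;
        } else if(digits[1] == 'b' || digits[1] == 'B') {
            base = 2;
        }
        if(base != 10) {
            digits.erase(0, 2);
        }
    }
    std::size_t used = 0;
    const long long value = std::stoll(digits, &used, base);  // throws on empty or invalid input
    if(used != digits.size()) {
        throw std::invalid_argument("trailing characters in number");
    }
    return value;
}
```

Under these assumptions `0xFF` parses as 255, `0b011111100` as 252, and `123_456` as 123456.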
@@ -129,10 +129,10 @@ class ConfigBase : public Config {
         valueDelimiter = vSep;
         return this;
     }
-    /// Specify the quote characters used around strings and characters
-    ConfigBase *quoteCharacter(char qString, char qChar) {
+    /// Specify the quote characters used around strings and literal strings
+    ConfigBase *quoteCharacter(char qString, char literalChar) {
         stringQuote = qString;
-        literalQuote = qChar;
+        literalQuote = literalChar;
         return this;
     }
     /// Specify the maximum number of parents
@@ -218,6 +218,11 @@ class IPV4Validator : public Validator {
     IPV4Validator();
 };

+class EscapedStringTransformer : public Validator {
+  public:
+    EscapedStringTransformer();
+};
+
 } // namespace detail

 // Static is not needed here, because global const implies static.
@@ -237,6 +242,9 @@ const detail::NonexistentPathValidator NonexistentPath;
 /// Check for an IP4 address
 const detail::IPV4Validator ValidIPV4;

+/// convert escaped characters into their associated values
+const detail::EscapedStringTransformer EscapedString;
+
 /// Validate the input as a particular type
 template <typename DesiredType> class TypeValidator : public Validator {
   public:
@@ -9,8 +9,8 @@
 // [CLI11:version_hpp:verbatim]

 #define CLI11_VERSION_MAJOR 2
-#define CLI11_VERSION_MINOR 3
-#define CLI11_VERSION_PATCH 2
-#define CLI11_VERSION "2.3.2"
+#define CLI11_VERSION_MINOR 4
+#define CLI11_VERSION_PATCH 0
+#define CLI11_VERSION "2.4.0"

 // [CLI11:version_hpp:end]
@@ -339,7 +339,11 @@ inline std::vector<ConfigItem> ConfigBase::from_config(std::istream &input) cons
         item.pop_back();
     }
     if(keyChar == '\"') {
-        item = detail::remove_escaped_characters(item);
+        try {
+            item = detail::remove_escaped_characters(item);
+        } catch(const std::invalid_argument &ia) {
+            throw CLI::ParseError(ia.what(), CLI::ExitCodes::InvalidError);
+        }
     }
 } else {
     if(lineExtension) {
@@ -229,10 +229,29 @@ CLI11_INLINE IPV4Validator::IPV4Validator() : Validator("IPV4") {
                 return std::string("Each IP number must be between 0 and 255 ") + var;
             }
         }
-        return std::string();
+        return std::string{};
     };
 }

+CLI11_INLINE EscapedStringTransformer::EscapedStringTransformer() {
+    func_ = [](std::string &str) {
+        try {
+            if(str.size() > 1 && (str.front() == '\"' || str.front() == '\'' || str.front() == '`') &&
+               str.front() == str.back()) {
+                process_quoted_string(str);
+            } else if(str.find_first_of('\\') != std::string::npos) {
+                if(detail::is_binary_escaped_string(str)) {
+                    str = detail::extract_binary_string(str);
+                } else {
+                    str = remove_escaped_characters(str);
+                }
+            }
+            return std::string{};
+        } catch(const std::invalid_argument &ia) {
+            return std::string(ia.what());
+        }
+    };
+}
 } // namespace detail

 CLI11_INLINE FileOnDefaultPath::FileOnDefaultPath(std::string default_path, bool enableErrorReturn)
@@ -50,7 +50,7 @@ TEST_CASE("file_fail") {
     CLI::FuzzApp fuzzdata;
     auto app = fuzzdata.generateApp();

-    int index = GENERATE(range(1, 6));
+    int index = GENERATE(range(1, 7));
     auto parseData = loadFailureFile("fuzz_file_fail", index);
     std::stringstream out(parseData);
     try {
@@ -308,15 +308,6 @@ TEST_CASE("StringTools: binaryStrings", "[helpers]") {
     CHECK(result == "\\XEM\\X7K");
 }

-/// these are provided for compatibility with the char8_t for C++20 that breaks stuff
-std::string from_u8string(const std::string &s) { return s; }
-std::string from_u8string(std::string &&s) { return std::move(s); }
-#if defined(__cpp_lib_char8_t)
-std::string from_u8string(const std::u8string &s) { return std::string(s.begin(), s.end()); }
-#elif defined(__cpp_char8_t)
-std::string from_u8string(const char8_t *s) { return std::string(reinterpret_cast<const char *>(s)); }
-#endif
-
 TEST_CASE("StringTools: escapeConversion", "[helpers]") {
     CHECK(CLI::detail::remove_escaped_characters("test\\\"") == "test\"");
     CHECK(CLI::detail::remove_escaped_characters("test\\\\") == "test\\");
@@ -706,6 +706,53 @@ TEST_CASE_METHOD(TApp, "NumberWithUnitBadInput", "[transform]") {
     CHECK_THROWS_AS(run(), CLI::ValidationError);
 }

+static const std::map<std::string, std::string> validValues = {
+    {"test\\u03C0\\u00e9", from_u8string(u8"test\u03C0\u00E9")},
+    {"test\\u03C0\\u00e9", from_u8string(u8"test\u73C0\u0057")},
+    {"test\\U0001F600\\u00E9", from_u8string(u8"test\U0001F600\u00E9")},
+    {R"("this\nis\na\nfour\tline test")", "this\nis\na\nfour\tline test"},
+    {"'B\"(\\x35\\xa7\\x46)\"'", std::string{0x35, static_cast<char>(0xa7), 0x46}},
+    {"B\"(\\x35\\xa7\\x46)\"", std::string{0x35, static_cast<char>(0xa7), 0x46}},
+    {"test\\ntest", "test\ntest"},
+    {"\"test\\ntest", "\"test\ntest"},
+    {R"('this\nis\na\nfour\tline test')", R"(this\nis\na\nfour\tline test)"},
+    {R"("this\nis\na\nfour\tline test")", "this\nis\na\nfour\tline test"},
+    {R"(`this\nis\na\nfour\tline test`)", R"(this\nis\na\nfour\tline test)"}};
+
+TEST_CASE_METHOD(TApp, "StringEscapeValid", "[transform]") {
+
+    auto test_data = GENERATE(from_range(validValues));
+
+    std::string value{};
+
+    app.add_option("-n", value)->transform(CLI::EscapedString);
+
+    args = {"-n", test_data.first};
+
+    run();
+    CHECK(test_data.second == value);
+}
+
+static const std::vector<std::string> invalidValues = {"test\\U0001M600\\u00E9",
+                                                       "test\\U0001E600\\u00M9",
+                                                       "test\\U0001E600\\uD8E9",
+                                                       "test\\U0001E600\\uD8",
+                                                       "test\\U0001E60",
+                                                       "test\\qbad"};
+
+TEST_CASE_METHOD(TApp, "StringEscapeInvalid", "[transform]") {
+
+    auto test_data = GENERATE(from_range(invalidValues));
+
+    std::string value{};
+
+    app.add_option("-n", value)->transform(CLI::EscapedString);
+
+    args = {"-n", test_data};
+
+    CHECK_THROWS_AS(run(), CLI::ValidationError);
+}
+
 TEST_CASE_METHOD(TApp, "NumberWithUnitIntOverflow", "[transform]") {
     std::map<std::string, int> mapping{{"a", 1000000}, {"b", 100}, {"c", 101}};

@@ -71,6 +71,15 @@ inline void unset_env(std::string name) {
 #endif
 }

+/// these are provided for compatibility with the char8_t for C++20 that breaks stuff
+CLI11_INLINE std::string from_u8string(const std::string &s) { return s; }
+CLI11_INLINE std::string from_u8string(std::string &&s) { return std::move(s); }
+#if defined(__cpp_lib_char8_t)
+CLI11_INLINE std::string from_u8string(const std::u8string &s) { return std::string(s.begin(), s.end()); }
+#elif defined(__cpp_char8_t)
+CLI11_INLINE std::string from_u8string(const char8_t *s) { return std::string(reinterpret_cast<const char *>(s)); }
+#endif
+
 CLI11_INLINE void check_identical_files(const char *path1, const char *path2) {
     std::string err1 = CLI::ExistingFile(path1);
     if(!err1.empty()) {
BIN tests/fuzzFail/fuzz_file_fail6 (Normal file)
Binary file not shown.