I totally have no idea when std::count returns a negative value, but the
result type of `std::count` is a differnce_type. So when it is added
with size_t value, implicit sign conversion happens. This changes check
this kind of (almost trivial but required) checking.
Since empty_iterator never points anything, so it always points null
(it returns nullptr by operator->). but dereferencing null causes UB.
Just calling std::terminate is of course better.
table = {key = "value"} # comment.
a value named "table" ({key = "value"}) has the above comment.
but a value named "key" ("value") does not have any comment.
I found that in a user-code (I'm also one of the users of this library),
this new feature sometimes causes an error. Some of my code won't
compile because of this change. Since toml::table is convertible to
toml::value *implicitly*, if toml::find(table, key, tablename) was
called, the overload resolution becomes ambiguous with toml::find(
value, key1, key2). But dropping support for toml::find(toml::table,
key, tablename) is a breaking change. So I concluded that now is not
the right time yet.
- change value_t::typename from CamelCase to snake_case.
- drop CamelCase typename supports.
The changes are introduced to make the interfaces uniform. For some
(historical) reasons, toml11 has both CamelCase names and snake_case
names for types. Additionally, since `float` is a keyword, snake_case
names uses `floating` to avoid collision and CamelCase name uses `Float`
because toml official calls it `Float`. This is too confusing.
Since it is a major upgrade, I think it is a big chance to make them
uniform.
Although the error value from combinators currently does not have any
information, it can have an information because it is a char value. It
is better to use no-information-type explicitly to make it clear that
it does not have any information. So I added none_t in toml::detai and
use it in combinators and parsers as an error value from combinators.
Generate error message in `parse_something()`, not in `lex_something`.
Since the error message generated by `lex_something` is too difficult to
read for humans, I've disabled the error message generation for the sake
of efficiency (it takes time to generate error message that will never
be read). I think now the error message generation itself safely can be
removed from combinators. At this stage, `lex_something` does not need
to return `result<T, E>` because all the error type would be discarded.
Now it is turned out that returing `optional<T>` from lex_* is enough.
Maybe later I would change the return type itself, but currently I
changed the error type from std::string to char because implementing
optional takes time and effort. It makes the parsing process a bit
faster.
so far, the error value of the lexer is just ignored because they are
not readable (results from all the nested combinator are concatenated,
so they are too redundant). those ones are replaced by a simple literal.
use as_something() instead of it. To realize this, the implementation of
as_something() is also changed. Now as_something does not depends on
`cast`. This reduces complexity around casting toml::value to other types.
Actually, since `floating` is used for toml::types, `as_floating`
seems to be clearer. But currently `is_*` functions uses `float`,
not `floating`, so `as_float` is chosen for the consistency.
In a future release, possibly v3, those names may need to be
re-considered for clarity.
- comment_before(): get comments just before a value.
- comment_inline(): get a comment in the same line as a value.
- comment(): get comment_before() + comment_inline().
in some cases, `region` contains several lines and `region::size`
returns the whole size that is a sum of lengthes of all the lines.
To avoid too long underlines, restrict the length of underline by
the length of the line that is shown in the message.
when ""_toml literal is used with C++11 raw-string literal,
it normally starts with newline like the following.
```cpp
const auto v = u8R"(
[table]
key = "value"
)"_toml;
```
With this, the error message shows the first empty line that starts just
after `u8R"(` and thus the error message shows nothing. To avoid this,
skip the first empty lines and whitespaces in literal.
to avoid unncessary error message generation, check the first some
characters before parsing it. It makes parsing process faster and
is also helpful to generate more accurate error messages.
all the parsers generate error messages and error message generation is
not a lightweight task. It concatenates a lot of strings, it formats
many values, etc. To avoid useless error-message generation, first check
which prefix is used and then parse special integers. Additionally, by
checking that, the quality of the error message can be improved (later).
At the earlier stage of the development, I thought that it is useful if
lexer-combinators generate error messages, because by doing this,
parser would not need to generate an error message. But now it turned
out that to show an appropriate error message, parser need to generate
according to the context. And almost all the messages from lexer are
discarded. So I added another parameter to lexer-combinator to suppress
error message generation. In the future, we may want to remove messages
completely from lexers, but currently I will keep it. Removing those
unused message generation makes the parsing process faster.
The literal like this `"[[table]]"_toml` caused a syntax error. It is
because the literal parser first check that it might be a bare value
without a key, and parse_array directory throws syntax_error. This
change makes the parser first check a literal is a name of table, and
then parse the content.
`location::line_num()` function used to be implemented by using
`std::count`, so each time the parser encounters a type mismatch,
`std::count` was called with almost whole file. It decelerates the
parsing process too much, so I decided to add `line_number_` member
variable to `location` and add `advance/retrace/reset` to `location`
in order to modify the position that is pointed.
the following code was okay in the last release
```
toml::format_error("[test]", v, "test", {"hint1", "hint2"})
```
but was not okay in the current master. This commit fixes this.
cons: By this, the number of values to show is limited upto 3.
To allow the following toml file, we need to replace the region after
the more precise region is found.
```toml
[a.b.c]
d = 42
[a]
e = 2.71
```
If the precise region (here, [a]) is found, the region of `a` should be
`[a]`, not `[a.b.c]`. After `[a]` is defined, toml does not allow to
write `[a]` twice. To check it, we need to replace the region of values
to the precise one.
in 2.0.x and 2.1.0, README says "it shows warning" for invalid unicode
codepoints. So far, this library just show an error message in stderr
for this case. It is not good to change the behavior fatal in the next
minor release, 2.1.1, that includes patches and improved error msgs.
I will make it throw syntax_error after 2.2.0 for invalid unicode
codepoints. For now, I will keep it to be "warning".
The function was needed to copy region information from value to value,
for a useful error message. Because of the last few commits, the region
information about keys are passed to insert_nested_keys that requires
the function which is removed. And it turned out that the function is no
longer required. It is originally a workaround, so I removed it.