Although the error value from combinators currently does not have any
information, it can have an information because it is a char value. It
is better to use no-information-type explicitly to make it clear that
it does not have any information. So I added none_t in toml::detai and
use it in combinators and parsers as an error value from combinators.
Generate error message in `parse_something()`, not in `lex_something`.
Since the error message generated by `lex_something` is too difficult to
read for humans, I've disabled the error message generation for the sake
of efficiency (it takes time to generate error message that will never
be read). I think now the error message generation itself safely can be
removed from combinators. At this stage, `lex_something` does not need
to return `result<T, E>` because all the error type would be discarded.
Now it is turned out that returing `optional<T>` from lex_* is enough.
Maybe later I would change the return type itself, but currently I
changed the error type from std::string to char because implementing
optional takes time and effort. It makes the parsing process a bit
faster.
so far, the error value of the lexer is just ignored because they are
not readable (results from all the nested combinator are concatenated,
so they are too redundant). those ones are replaced by a simple literal.
to avoid unncessary error message generation, check the first some
characters before parsing it. It makes parsing process faster and
is also helpful to generate more accurate error messages.
all the parsers generate error messages and error message generation is
not a lightweight task. It concatenates a lot of strings, it formats
many values, etc. To avoid useless error-message generation, first check
which prefix is used and then parse special integers. Additionally, by
checking that, the quality of the error message can be improved (later).
`location::line_num()` function used to be implemented by using
`std::count`, so each time the parser encounters a type mismatch,
`std::count` was called with almost whole file. It decelerates the
parsing process too much, so I decided to add `line_number_` member
variable to `location` and add `advance/retrace/reset` to `location`
in order to modify the position that is pointed.
in 2.0.x and 2.1.0, README says "it shows warning" for invalid unicode
codepoints. So far, this library just show an error message in stderr
for this case. It is not good to change the behavior fatal in the next
minor release, 2.1.1, that includes patches and improved error msgs.
I will make it throw syntax_error after 2.2.0 for invalid unicode
codepoints. For now, I will keep it to be "warning".
Error messages related to dotted keys looks weird. like:
1 | a.b.c = 42
| ~~ in this table
The underlined token is not a table. This should be like the following.
1 | a.b.c = 42
| ~~~ in this table
To implement this, the region information is needed when the keys are
read. This commit add this functionality, though currently the region
information is not used yet.