This is more of the same refactoring that has been happening. This
extraction also helps emphasize the relationship between imported objects,
and isolates the growing number of test cases. This parser will only grow.
DEV-13708
This delegates expression parsing to `AirExprAggregate`, in an effort to
both begin to simplify the understanding and maintenance of `AirAggregate`;
and allow for parser composition for template parsing.
This utilizes the prior changes for token sum types to precisely define the
subset of AIR tokens supported by the expression parser. This differs from
prior approaches which delegated until a dead state, relying on runtime
information to determine if a parser has finished. This allows us to
determine that statically.
I do want to be able to eliminate the dead state from the parser so we can
get rid of the `unreachable!`, but I need to move on; that's something I had
tried to do in the past too, which ended up adding a bit of complexity, and
I'll have to consider my options in the future, including whether the dead
state transition can be entirely eliminated in favor of the combination of
these sum types and recovery; the parsing framework decisions were made
while recovery was still an open question, at least in practice.
DEV-13708
This replaces the stub `derive_xmli` with the same result (well, minus a
space before the '/' in the output) using what will become the lowering
pipeline. Once again, this is quite verbose, and the lowering pipeline in
general needs to be further abstracted away.
Unlike the rest of the pipeline, an error during the derivation process will
immediately terminate with an unrecoverable error, because we do not want to
write partial files. This does not remove the garbage file, because the
build system ought to do that itself (e.g. `make`)...but that is certainly
open for debate.
DEV-13708
The reader previously yielded a `ParsedResult`, presumably to simplify
lowering operations. But the reader is not a `ParseState`, and does not
otherwise use the parsing API, so this was an inappropriate and confusing
coupling.
This resolves that, introducing a new `lowerable` which will translate an
iterator into something that can be placed in a lowering pipeline.
See the previous commit for more information.
DEV-13708
This parser does exactly what it says it does. Its implementation is
simple, but I added a test anyway just to prove that it works, and the test
seems more complicated than the implementation itself, given the types
involved.
DEV-13708
This does not yet yield the produces ASG, but does set up the lowering
pipeline to prepare to produce it. It's also currently a no-op, with
`NirToAsg` just yielding `Incomplete`.
The goal is to begin to move toward vertical slices for TAMER as I start to
return to the previous approach of a handoff with the old compiler. Now
that I've gained clarity from my previous failed approach (which I
documented in previous commits), I feel that this is the best way forward
that will allow me to incrementally introduce more fine-grained performance
improvements, at the cost of some throwaway work as this progresses. But
the cost of delay with these build times is far greater.
DEV-13429
This parser really just allows me to continue developing the NIR
interpolation system using `Expansion` terminology, and avoid having to use
dead states in tests. This allows for the appropriate level of abstraction
to be used in isolation, and then only be stripped when stitching is
necessary.
Future commits will show how this is actually integrated and may introduce
additional abstraction to help.
DEV-13156
This is a shift in approach.
My original idea was to try to keep NIR parsing the way it was, since it's
already hard enough to reason about with the `ele_parse!` parser-generator
macro mess. The idea was to produce an IR that would explicitly be denoted
as "maybe sugared", and have a desugaring operation as part of the lowering
pipeline that would perform interpolation and lower the symbol into a plain
version.
The problem with that is:
1. The use of the type was going to introduce a lot of mapping for all the
NIR token variants there are going to be; and
2. _The types weren't even utilized for interpolation._
Instead, if we interpolated _as attributes are encountered_ while parsing
NIR, then we'd be able to expand directly into that NIR token stream and
handle _all_ symbols in a generic way, without any mapping beyond the
definition of NIR's grammar using `ele_parse!`.
This is a step in that direction---it removes `NirSymbolTy` and introduces a
generic abstraction for the concept of expansion, which will be utilized
soon by the attribute parser to allow replacing `TryFrom` with something
akin to `ParseFrom`, or something like that, which is able to produce a
token stream before finally yielding the value of the attribute (which will
be either the original symbol or the replacement metavariable, in the case
of interpolation).
(Note that interpolation isn't yet finished---errors still need to be
implemented. But I want a working vertical slice first.)
DEV-13156
Not sure why I didn't add a prelude sooner, considering all the import
boilerplate. This will evolve as needed and I'll go back and replace other
imports when I'm not in the middle of something.
DEV-13156
This helps to clarify the situations under which these errors can occur, and
the generality also helps to show why the inner types are as they
are (e.g. use of `String`).
But more importantly, this allows for an error type in `finalize` that is
detached from the `ParseState`, which will be able to be utilized in the
lowering pipeline as a more general error distinguishable from other
lowering errors. At the moment I'm maintaining BC, but a following commit
will demonstrate the use case to introduce recoverable vs. non-recoverable
errors.
DEV-13158
This newtype allows a caller to prove (using types) that a parser of a given
type (`ParseState`) has been finalized.
This will be used by the lowering pipeline to ensure that all parsers in the
pipeline end up getting finalized (as you can see from a TODO added in the
code, one of them is missing). The lack of such a type was an oversight
during the (rather stressed) development of the parsing system, and I
shouldn't need to resort to unit tests to verify that parsers have been
finalized.
DEV-13158
This was accepting an early EOF when the active child `ParseState` was in an
accepting state, because it was not ensuring that anything on the stack was
also accepting.
Ideally, there should be nothing on the stack, and hopefully in the future
that's what happens. But with how things are today, it's important that, if
anything is on the stack, it is accepting.
Since `is_accepting` on the superstate is only called during finalization,
and because the check terminates early, and because the stack practically
speaking will only have a couple things on it max (unless we're in tail
position in a deeply nested tree, without TCO [yet]), this shouldn't be an
expensive check.
Implementing this did require that we expose `Context` to `is_accepting`,
which I had hoped to avoid having to do, but here we are.
DEV-7145
This properly integrates the trampoline into `ele_parse!`. The
implementation leaves some TODOs, most notably broken mixed text handling
since we can no longer intercept those tokens before passing to the
child. That is temporarily marked as incomplete; see a future commit.
The introduced test `ParseState`s were to help me reason about the system
intuitively as I struggled to track down some type errors in the monstrosity
that is `ele_parse!`. It will fail to compile if those invariants are
violated. (In the end, the problems were pretty simple to resolve, and the
struggle was the type system doing its job in telling me that I needed to
step back and try to reason about the problem again until it was intuitive.)
This keeps around the NT states for now, which are quickly used to
transition to the next NT state, like a couple of bounces on a trampoline:
NT -> Dead -> Parent -> Next NT
This could be optimized in the future, if it's worth doing.
This also makes no attempt to implement tail calls; that would have to come
after fixing mixed content and really isn't worth the added complexity
now. I (desperately) need to move on, and still have a bunch of cleanup to
do.
I had hoped for a smaller commit, but that was too difficult to do with all
the types involved.
DEV-7145
I'm disappointed that I keep having to implement features that I had hoped
to avoid implementing.
This introduces a "superstate" feature, which is intended really just to be
a sum type that is able to delegate to stitched `ParseState`s. This then
allows a `ParseState` to transition directly to another `ParseState` and
have the parent `ParseState` handle the delegation---a trampoline.
This issue naturally arises out of the recursive nature of parsing a TAME
XML document, where certain statements can be nested (like `<section>`), and
where expressions can be nested. I had gotten away with composition-based
delegation for now because `xmlo` headers do not have such nesting.
The composition-based approach falls flat for recursive structures. The
typical naive solution is boxing, which I cannot do, because not only is
this on an extremely hot code path, but I require that Rust be able to
deeply introspect and optimize away the lowering pipeline as much as
possible.
Many months ago, I figured that such a solution would require a trampoline,
as it typically does in stack-based languages, but I was hoping to avoid
it. Well, no longer; let's just get on with it.
This intends to implement trampolining in a `ParseState` that serves as that
sum type, rather than introducing it as yet another feature to `Parser`; the
latter would provide a more convenient API, but it would continue to bloat
`Parser` itself. Right now, only the element parser generator will require
use of this, so if it's needed beyond that, then I'll debate whether it's
worth providing a better abstraction. For now, the intent will be to use
the `Context` to store a stack that it can pop off of to restore the
previous `ParseState` before delegation.
DEV-7145
Various DUMMY_SPAN-derived spans are used by many test cases, so this
finally extracts them---something I've been meaning to do for some time.
This also places DUMMY_SPAN behind a `cfg(test)` directive to ensure that it
is _only_ used in tests; UNKNOWN_SPAN should be used when a span is actually
unknown, which may also be the case during development.
DEV-7145
This has gotten large and was cluttering `feed_tok`. This also provides the
ability to more easily expand into other types of tracing in the future.
DEV-7145
Because of recovery, the trace otherwise paints a really confusing-looking
picture when given unexpected input.
This is large enough now that it really ought to be extracted from
`feed_tok`, but I'll wait to see how this evolves further. I considered
adding color too, but it's not yet clear to me that the visual noise will be
all that helpful.
DEV-7145
This produces useful parse traces that are output as part of a failing test
case. The parser generator macros can be a bit confusing to deal with when
things go wrong, so this helps to clarify matters.
This is _not_ intended to be machine-readable, but it does show that it
would be possible to generate machine-readable output to visualize the
entire lowering pipeline. Perhaps something for the future.
I left these inline in Parser::feed_tok because they help to elucidate what
is going on, just by reading what the trace would output---that is, it helps
to make the method more self-documenting, albeit a tad bit more
verbose. But with that said, it should probably be extracted at some point;
I don't want this to set a precedent where composition is feasible.
Here's an example from test cases:
[Parser::feed_tok] (input IR: XIRF)
| ==> Parser before tok is parsing attributes for `package`.
| | Attrs_(SutAttrsState_ { ___ctx: (QName(None, LocalPart(NCName(SymbolId(46 "package")))), OpenSpan(Span { len: 0, offset: 0, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10)), ___done: false })
|
| ==> XIRF tok: `<unexpected>`
| | Open(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1))
|
| ==> Parser after tok is expecting opening tag `<classify>`.
| | ChildA(Expecting_)
| | Lookahead: Some(Lookahead(Open(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1))))
= note: this trace was output as a debugging aid because `cfg(test)`.
[Parser::feed_tok] (input IR: XIRF)
| ==> Parser before tok is expecting opening tag `<classify>`.
| | ChildA(Expecting_)
|
| ==> XIRF tok: `<unexpected>`
| | Open(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1))
|
| ==> Parser after tok is attempting to recover by ignoring element with unexpected name `unexpected` (expected `classify`).
| | ChildA(RecoverEleIgnore_(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1)))
| | Lookahead: None
= note: this trace was output as a debugging aid because `cfg(test)`.
DEV-7145
Oh what a tortured journey. I had originally tried to avoid formalizing
lookahead for all parsers by pretending that it was only needed for dead
state transitions (that is---states that have no transitions for a given
input token), but then I needed to yield information for aggregation. So I
added the ability to override the token for `Dead` to yield that, in
addition to the token. But then I also needed to yield lookahead for error
conditions. It was a mess that didn't make sense.
This eliminates `ParseStatus::Dead` entirely and fully integrates the
lookahead token in `Parser` that was previously implemented.
Notably, the lookahead token is encapsulated in `TransitionResult` and
unavailable to `ParseState` implementations, forcing them to rely on
`Parser` for recursion. This not only prevents `ParseState` from recursing,
but also simplifies delegation by removing the need to manually handle
tokens of lookahead.
The awkward case here is XIRT, which does not follow the streaming parsing
convention, because it was conceived before the parsing framework. It needs
to go away, but doing so right now would be a lot of work, so it has to
stick around for a little bit longer until the new parser generators can be
used instead. It is a persistent thorn in my side, going against the grain.
`Parser` will immediately recurse if it sees a token of lookahead with an
incomplete parse. This is because stitched parsers will frequently yield a
dead state indication when they're done parsing, and there's no use in
propagating an `Incomplete` status down the entire lowering pipeline. But,
that does mean that the toplevel is not the only thing recursing. _But_,
the behavior doesn't really change, in the sense that it would infinitely
recurse down the entire lowering stack (though there'd be an opportunity to
detect that). This should never happen with a correct parser, but it's not
worth the effort right now to try to force such a thing with Rust's type
system. Something like TLA+ is better suited here as an aid, but it
shouldn't be necessary with clear implementations and proper test
cases. Parser generators will also ensure such a thing cannot occur.
I had hoped to remove ParseStatus entirely in favor of Parsed, but there's a
lot of type inference that happens based on the fact that `ParseStatus` has
a `ParseState` type parameter; `Parsed` has only `Object`. It is desirable
for a public-facing `Parsed` to not be tied to `ParseState`, since consumers
need not be concerned with such a heavy type; however, we _do_ want that
heavy type internally, as it carries a lot of useful information that allows
for significant and powerful type inference, which in turn creates
expressive and convenient APIs.
DEV-7145
Previously, `ParseStatus::Dead` always yielded
`ParseState::Token`. However, I'm working on introducing parsers that
aggregate (parsing XML attributes into structs), and those parsers do not
know that they have completed aggregation until they reach a dead state;
given that, I need to yield additional information at that time.
I played around with a number of alternative ideas, but this ended up being
the cleanest, relative to the effort involved. For example, introducing
another parameter to `ParseStatus::Dead` was too burdensome on APIs that
ought not concern themselves with the possibility of receiving an object in
addition to a lookahead token, since many parsers are not capable of doing
so (given that they map M:(N<=M)).
Another option that I abandoned fairly quickly was having
`is_accepting` (potentially renamed) return an aggregate object, since
that's on the side and didn't feel like it was part of the parsing pipeline.
The intent is to abstract this some in a new `ParseState` method for
delegation + aggregation.
DEV-7145
This allows `XmlXirReader` to be used in a `Lower` operation, just as
everything else, bringing me one step closer to a pipeline that can be
concisely represented; this is finally beginning to unify in a clear way,
though it is still a bit of a mess.
This causes `XmlXirReader` to _act_ like a `parse::Parser` in that it yields
a `ParsedResult`, but it does not use `parse::Parser` itself; that was the
_original_ plan: convert it into a `ParseState` where `XmlXirReader` became
a context, and force `Parser` to yield by feeding it a stream of tokens with
`repeat`, but that ended up performing poorly relative to this change. I
did some investigation, which I might write about in the future, but for
now, this solution works just fine.
DEV-7145
This abstraction has grown quite a bit, and it's time to start formalizing
it a bit. This split doesn't change any behavior, but it does start to make
it easier to reason about by clearly stating the broad components and how
they interact with one-another.
This doesn't yet move the tests; those will come next, but they are very
few. The reason I gave previously for this was because (a) they're tested
indirectly via the systems that utilize them and (b) because the abstraction
was not yet settled on the process was already very expensive. No test
coverage was lost---it's only that failures were potentially harder to debug
on test failures, but in practice not even this was true, because the deeply
expressive types all but ensured that, if it compiles, it will function in a
way that is expected. Unit tests and documentation for this system will be
added once I'm sure that this abstraction is in a proper state.
DEV-7145
This also modifies `poc` such that `Lower` is invoked as an associated
function rather than a method to emphasize the pattern that is forming, so
that it can be later abstracted away.
DEV-11864
The `while_ok` can just be implied with a lowering operation, and that
reduces the name complexity so that we can maybe introduce even more
specialized methods without resulting in a huge sentence as a name.
DEV-11864
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
This is intended to describe, to the user, the state that the parser is
in. This will be used to convey additional information for general parser
errors, but it should also probably be integrated into parsers' individual
errors as well when appropriate.
This is something I expected to add at some point, but I wanted to add them
because, when dealing with lowering errors, it can be difficult to tell
what parser the error originated from.
DEV-11864
The `*_iter_while_ok` functions now compose like monads, flattening `Result`
at each step and drastically simplifying handling of error types. This also
removes the bunch of `?`s at the end of the expression, and allows me to use
`?` within the callback itself.
I had originally not used `Result` as the return type of the callback
because I was not entirely sure how I was going to use them, but it's now
clear that I _always_ use `Result` as the return type, and so there's no use
in trying to be too accommodating; it can always change in the future.
This is desirable not just for cleanup, but because trying to refactor
`asg_builder` into a pair of `Parser`s is really messy to chain without
flattening, especially given some state that has to leak temporarily to the
caller. More on that in a future commit.
DEV-11864
This allows retrieving and providing a context to a `Parser`. This is
intended for use with an aggregating parser, in particular to construct the
ASG and return it.
This is a component of a change that replaces `asg_builder` with a
`Parser`-based lowering into the ASG, but there are still changes that need
to be made to simplify things and complete its integration.
DEV-11864
`PartialEq` remains, and is all that is needed. See previous commit
regarding the removal of this same bound from `Context`.
This can be re-added if it ends up actually being necessary. But Tokens are
ephemeral and used only in lowering pipelines, using pattern matching.
DEV-11864
This is unnecessarily restrictive, since we do not require anything further
than `PartialEq` for the situations where we care about equality (tests).
DEV-11864
This is too restrictive, especially for parsers that fold into something,
like the ASG, which may exist prior to invoking the parser.
This moves the trait bound to the functions that actually need it. Those
obviously cannot be used if the Context does not implement `Default`, but
I'll provide alternative conveniences.
DEV-11864
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
There's no use in complicating the error handling here when we'd just
default to `UNKNOWN_SPAN` anyway when trying to render it. `UNKNOWN_SPAN`
didn't exist at the time of writing.
DEV-10935
This resolves the performance issues caused by Rust's failure to elide the
ElementStack (ArrayVec) memcpys on move.
Since XIRF is invoked tens of millions of times in some cases for larger
systems, prior to this change, failure to optimize away moves for XIRF
resulted in tens of millions of memcpys. This resulted in linking of one
program going from 1s -> ~15s. This change reduces it to ~2.5s with the
wip-xmlo-xir-reader flag on, with the extra time coming from elsewhere (the
subject of future changes).
In particular, this change introduces a new mutable reference to
`ParseState::parse_token`, which is a reference to a `Context` owned by the
caller (e.g. `Parser`). In the case of XIRF, this means that
`Parser<flat::State, _>` will own the `ElementStack`/`ArrayVec` instead of
`flat::State`; this allows the latter to remain pure and benefit from Rust's
move optimizations, without sacrificing the otherwise-pure implementation.
ParseStates that do not need a mutable context can use `NoContext` and
remain pure.
DEV-12024
This makes the necessary tweaks to have the entire linker work end-to-end
and produce a compatible xmle file (that is, identical except for
nondeterministic topological ordering). That's good, and finally that can
get off of my plate.
What's disappointing, and what I'll have more information on in future
commits, is how slow it is.
The linking of our largest package goes from ~1s -> ~15s with this
change. The reason is because of tens of millions of `memcpy` calls. Why?
The ParseState abstraction is pure and passes an owned `self` around, and
Parser replaces its own reference using this:
let result;
TransitionResult(Transition(self.state), result) =
take(&mut self.state).parse_token(tok);
Naively, this would store a copy of the old state in `result`, allocate a
new ParseState for `self.state`, pass the original or a copy to
`parse_token`, and then overwrite `self.state` with the new ParseState that
is returned once it is all over.
Of course, that'd be devastating. What we want to happen is for Rust to
realize that it can just pass a reference to `self.state` and perform no
copying at all.
For certain parsers, this is exactly what happens. Great!
But for XIRF, it we have this:
/// Stack of element [`QName`] and [`Span`] pairs,
/// representing the current level of nesting.
///
/// This storage is statically allocated,
/// allowing XIRF's parser to avoid memory allocation entirely.
type ElementStack<const MAX_DEPTH: usize> = ArrayVec<(QName, Span), MAX_DEPTH>;
/// XIRF document parser state.
///
/// This parser is a pushdown automaton that parses a single XML document.
#[derive(Debug, Default, PartialEq, Eq)]
pub enum State<const MAX_DEPTH: usize, SA = AttrParseState>
where
SA: FlatAttrParseState,
{
/// Document parsing has not yet begun.
#[default]
PreRoot,
/// Parsing nodes.
NodeExpected(ElementStack<MAX_DEPTH>),
/// Delegating to attribute parser.
AttrExpected(ElementStack<MAX_DEPTH>, SA),
/// End of document has been reached.
Done,
}
ParseState contains an ArrayVec, and its implementation details are causes
LLVM _not_ to elide the `memcpy`. And there's a lot of them.
Considering that ParseState is supposed to use only statically allocated
memory and be zero-copy, this is rather ironic.
Now, this _could_ be potentially fixed by not using ArrayVec; removing
it (and the corresponding checks for balanced tags) gets us down to
2s (which still needs improvement), but we can't have a core abstraction in
our system resting on a house of cards. What if the optimization changes
between releases and suddenly linking / building becomes shit slow? That's
too much of a risk.
Further, having to limit what abstractions we use just to appease the
compiler to optimize away moves is very restrictive.
The better option seems like to go back to what I used to do: pass around
`&mut self`. I had moved to an owned `self` to force consideration of _all_
state transitions, but I can try to do the same thing in a different type of
way using mutable references, and then we avoid this problem. The
abstraction isn't pure (in the functional sense) anymore, but it's safe and
isn't relying on delicate inlining and optimizer implementation details to
have a performant system.
More information to come.
DEV-10863
This parses the symbol dependency list (adjacency list).
I'm noticing some glaring issues in error handling, particularly that the
token being parsed while an error occurs is not returned and so recovery is
impossible. I'll have to address that later on, after I get this parser
completed.
Another previous question that I had a hard time answering in prior months
was how I was going to compose boilerplate parsers, e.g. handling the
parsing of single-attribute elements and such. A pattern is clearly taking
shape, and with the composition of parsers more formalized, that'll be able
to be abstracted away. But again, that's going to wait until after this
parser is actually functioning. Too many delays so far.
DEV-10863
This simply removes boilerplate.
This will receive concrete examples once I come up with docs for the entire
module; there's boilerplate involved in testing and documenting this in
isolation and the time investment is not worth it yet until I'm certain that
this will not be changed.
DEV-10863
This introduces a new method similar to the previous `delegate`, but with
another closure that allows for handling lookahead tokens from the child
parser.
Admittedly, this isn't exactly what I was going for---a list of arguments
isn't exactly self-documenting, especially with the brevity when the
arguments line up---but this was easy to do and so I'll run with this for
now.
This also modified `delegate` to accept a context, even though it wasn't
necessary, both for consistency with its lookup counterpart and for brevity
with the `into` argument (allowing, in our case, to just pass the name of
the variant, rather than a closure).
I'm not going to handle the actual starting and accepting state stitching
abstraction for now; I'd like to observe future boilerplate more before I
consider the best way to handle it, though I do have some ideas.
DEV-10863
This is the delegation portion of what I've come to call "state
stitching"---wiring together two state machines that recognize the same
input tokens.
This handles the delegation of tokens once the parser has been entered, but
does not yet handle the actual stitching part of it: wiring the start and
accepting states of the child parser to the parent.
This is indirectly tested by the XmloReader, but it will receive its own
tests once I further finalize this concept. I'm playing around with some
ideas. With that said, a quick visual inspection together with the
guarantees provided by the type system should convince any familiar reader
of its correctness.
DEV-10863
This does some cleanup and adds `parse::Object` for use in disambiguating
`From` for `ParseStatus`, allowing the `Transition` API to be much more
flexible in the data it accepts and automatically converts. This allows us
to concisely provide raw output data to be wrapped, or provide `ParseStatus`
directly when more convenient.
There aren't yet examples in the docs; I'll do so once I make sure this API
is actually utilized as intended.
DEV-10863
This converts the tuple type alias into a newtype, so that we may provide
our own implementations.
This differs from a previous approach that I took, which involved making
this type `Result<(S, T), (S, E)>` so that the return values composed well
with other functions. But the reality is that this is used only by other
`ParseState`s and `Parser`, so it's unnecessary.
However, this is also an attempt to utilize the new Try and FromResidual
traits; note how the Try associated types match precisely what I was trying
to do before, though they're used as intermediate types. I'll see how this
evolves.
DEV-10863
This allows the Results to compose and, importantly, is compatible with
`?` without having to put in any extra effort.
This makes puts the caller in an awkward spot, so I introduced a utility
function `result_tup0_invert` for now; we'll see if that stays or evolves
differently.
DEV-10863