This is intended to describe, to the user, the state that the parser is
in. This will be used to convey additional information for general parser
errors, but it should also probably be integrated into parsers' individual
errors as well when appropriate.
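To give a sense of what such a description might look like, here is a
minimal sketch. The `State` enum, its variants, and `unexpected_eof` are
hypothetical names made up for this example; they are not the actual parser
types.

```rust
use std::fmt;

/// Hypothetical parser states; the names are for illustration only.
#[derive(Debug, PartialEq)]
pub enum State {
    Ready,
    ExpectingName,
    Done,
}

impl fmt::Display for State {
    /// Describe the state in user-facing terms, suitable for embedding
    /// in an error message.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            State::Ready => write!(f, "expecting the start of a document"),
            State::ExpectingName => write!(f, "expecting an element name"),
            State::Done => write!(f, "finished parsing"),
        }
    }
}

/// A general parser error that conveys the state the parser was in.
pub fn unexpected_eof(state: &State) -> String {
    format!("unexpected end of input while {}", state)
}
```

A general error can then carry the state without each parser having to
word it separately.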
I expected to add this at some point, but I'm adding it now because, when
dealing with lowering errors, it can be difficult to tell which parser the
error originated from.
DEV-11864
The `*_iter_while_ok` functions now compose like monads, flattening `Result`
at each step and drastically simplifying handling of error types. This also
removes the bunch of `?`s at the end of the expression, and allows me to use
`?` within the callback itself.
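The flattening can be sketched roughly as follows. This is an illustration
only: the name `iter_while_ok`, its signature, and `sum_small` are
assumptions made for this example and may not match the real functions.

```rust
/// Hypothetical helper: feed the `Ok` items of `from` to `f` until the
/// first `Err` is encountered. Since `f` itself returns a `Result`, both
/// layers are flattened into a single `Result<U, E>`.
pub fn iter_while_ok<I, T, U, E, F>(from: I, f: F) -> Result<U, E>
where
    I: IntoIterator<Item = Result<T, E>>,
    F: FnOnce(&mut dyn Iterator<Item = T>) -> Result<U, E>,
{
    let mut first_err: Option<E> = None;

    // Yield items while they are `Ok`; stash the first error and stop.
    let mut oks = from
        .into_iter()
        .map_while(|item| match item {
            Ok(x) => Some(x),
            Err(e) => {
                first_err = Some(e);
                None
            }
        })
        .fuse();

    let result = f(&mut oks);
    drop(oks); // end the mutable borrow of `first_err`

    // An error from the source iterator takes precedence over `f`'s result.
    match first_err {
        Some(e) => Err(e),
        None => result,
    }
}

/// Example caller: `?` is usable within the callback, and there is no
/// trailing chain of `?`s to unwrap nested `Result`s.
pub fn sum_small(input: &[&str]) -> Result<u32, String> {
    iter_while_ok(
        input.iter().map(|s| s.parse::<u32>().map_err(|e| e.to_string())),
        |nums| {
            let mut total = 0u32;
            for n in nums {
                total = total
                    .checked_add(n)
                    .ok_or_else(|| "overflow".to_string())?;
            }
            Ok(total)
        },
    )
}
```

In this sketch a source error wins over whatever the callback returned;
that precedence is a choice made for the example.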
I had originally not used `Result` as the return type of the callback
because I was not entirely sure how I was going to use them, but it's now
clear that I _always_ use `Result` as the return type, and so there's no use
in trying to be too accommodating; it can always change in the future.
This is desirable not just for cleanup, but because trying to refactor
`asg_builder` into a pair of `Parser`s is really messy to chain without
flattening, especially given some state that has to leak temporarily to the
caller. More on that in a future commit.
DEV-11864
This was always the intent, but I didn't have a higher-level object
yet. This removes all the awkwardness that existed with working the root
in as an identifier.
DEV-11864
This wraps `Ident` in a new `Object` variant and modifies `Asg` so that its
nodes are of type `Object`.
This unfortunately requires runtime type checking. Whether or not that's
worth alleviating in the future depends on a lot of different things, since
it'll require my own graph implementation, and I have to focus on other
things right now. Maybe it'll be worth it in the future.
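For context, the runtime check amounts to something like the sketch below.
The `Ident` fields, the `Root` variant, and `as_ident_ref` are assumptions
for illustration rather than the actual definitions.

```rust
/// Stand-in for the real identifier type; the field is illustrative only.
#[derive(Debug, PartialEq)]
pub struct Ident {
    pub name: String,
}

/// Hypothetical node type: every node on the graph is an `Object`.
#[derive(Debug, PartialEq)]
pub enum Object {
    /// Assumed second variant, so that the check below is meaningful.
    Root,
    Ident(Ident),
}

impl Object {
    /// Runtime type check: narrow an `Object` to an `Ident`, if it is one.
    pub fn as_ident_ref(&self) -> Option<&Ident> {
        match self {
            Object::Ident(ident) => Some(ident),
            Object::Root => None,
        }
    }
}
```

Callers that expect an identifier have to handle the `None` case, which is
the cost being described.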
Note that this also gets rid of some doc examples that simply aren't worth
maintaining as the API evolves.
DEV-11864
A previous commit mentioned that there's not a place for `Dim`, and
duplicated it between `asg` and `xmlo`. Well, `Dtype` is also needed in
both, and so here's a home for now.
`Dtype` has always been an inappropriate detail for the system and will one
day be removed entirely in favor of higher-level types; the machine
representation is up to the compiler to decide.
DEV-11864
asg_builder is about to be replaced, but in the process of simplifying the
destination IR (the ASG), I'm moving things into the proper place. This
never belonged here---it belongs with the actual lowering operation.
Previously, this was not reasoned about in terms of a lowering operation,
and was written when I was first introducing myself to Rust and trying to
get a proof-of-concept linker working.
DEV-11864
This matches xmlo::Dim, and could be the same thing, if we can find a home
for it in the future; it's not worth creating such a home right now when I'm
not yet sure what else ought to live there; the duplication may be fine.
The conversion from xmlo needs to be moved, and `Dim` is going to be used
for more than just identifiers (expressions will have type inference
performed).
DEV-11864
This allows retrieving and providing a context to a `Parser`. This is
intended for use with an aggregating parser, in particular to construct the
ASG and return it.
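The idea can be sketched as below, with a `Vec<String>` standing in for the
ASG. `Aggregator` and its methods are hypothetical names for this example
and are not the real `Parser` API.

```rust
/// Hypothetical aggregating parser: tokens are folded into a
/// caller-provided context, which is handed back once parsing ends.
pub struct Aggregator {
    ctx: Vec<String>,
}

impl Aggregator {
    /// Provide an existing context to parse into.
    pub fn with_context(ctx: Vec<String>) -> Self {
        Aggregator { ctx }
    }

    /// Fold a single token into the context.
    pub fn parse_token(&mut self, tok: &str) {
        self.ctx.push(tok.to_string());
    }

    /// Retrieve the context, consuming the parser.
    pub fn finalize(self) -> Vec<String> {
        self.ctx
    }
}
```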
This is a component of a change that replaces `asg_builder` with a
`Parser`-based lowering into the ASG, but there are still changes that need
to be made to simplify things and complete its integration.
DEV-11864
Previously, since the graph contained only identifiers, discovered roots
were stored in a separate vector and exposed to the caller. This not only
leaked details, but added complexity; this was left over from the
refactoring of the proof-of-concept linker some time ago.
This moves the root management into the ASG itself, mostly, with one item
being left over for now in the asg_builder (eligibility classifications).
There are two roots that were added automatically:
- __yield
- __worksheet
The former has been removed and is now expected to be explicitly mapped in
the return map, which is now enforced with an extern in `core/base`. This
is still special, in the sense that it is explicitly referenced by the
generated code, but there's nothing inherently special about it and I'll
continue to generalize it into oblivion in the future, such that the final
yield is just a convention.
`__worksheet` is the only symbol of type `IdentKind::Worksheet`, and so that
was generalized just as the meta and map entries were.
The goal in the future will be to have this more under the control of the
source language, and to consolidate individual roots under packages, so that
the _actual_ roots are few.
As far as the actual ASG goes: this introduces a single root node that is
used as the sole reference for reachability analysis and topological
sorting. The edges of that root node replace the vector that was removed.
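A minimal sketch of the single-root idea, using a plain adjacency list
rather than the real graph implementation. The `Graph` type and all of its
methods are assumptions for illustration, and the sort assumes that there
are no cycles.

```rust
/// Minimal adjacency-list graph; node 0 is the single root node.
pub struct Graph {
    edges: Vec<Vec<usize>>,
}

impl Graph {
    pub const ROOT: usize = 0;

    /// Create a graph containing only the root node.
    pub fn new() -> Self {
        Graph { edges: vec![Vec::new()] }
    }

    pub fn add_node(&mut self) -> usize {
        self.edges.push(Vec::new());
        self.edges.len() - 1
    }

    pub fn add_dep(&mut self, from: usize, to: usize) {
        self.edges[from].push(to);
    }

    /// Rooting a node is just an edge from the root node; there is no
    /// separate vector of roots to expose to the caller.
    pub fn add_root(&mut self, node: usize) {
        self.add_dep(Self::ROOT, node);
    }

    /// Post-order DFS from the root: dependencies come before their
    /// dependents, and unreachable nodes are never visited.
    pub fn sorted_reachable(&self) -> Vec<usize> {
        let mut seen = vec![false; self.edges.len()];
        let mut out = Vec::new();
        self.visit(Self::ROOT, &mut seen, &mut out);
        out.pop(); // the root itself is always visited last; drop it
        out
    }

    fn visit(&self, node: usize, seen: &mut Vec<bool>, out: &mut Vec<usize>) {
        if seen[node] {
            return;
        }
        seen[node] = true;
        for &dep in &self.edges[node] {
            self.visit(dep, seen, out);
        }
        out.push(node);
    }
}
```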
DEV-11864
Rather than having the linker add this symbol opaquely, let's remove the
special case and generalize it. There's nothing special about yield, except
historical precedent.
Systems can explicitly add it as a root in a common return map.
DEV-11864
In the actual implementation (outside of tests), this is always looking up
before adding the symbol. This will simplify the API, while still retaining
errors, since the identifier will fail the state transition if the
identifier did not exist before attempting to set a fragment. So while this
is slower in microbenchmarks, this has no effect on real-world performance.
Further, I'm refactoring toward a streaming ASG aggregation, which is a lot
easier if we do not need to perform lookups in a separate step from the
ASG's primitives.
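Roughly, the combined lookup and transition behaves like the sketch below.
The `IdentState` variants and the `Idents` container are hypothetical
stand-ins for this example; the real identifier states differ.

```rust
use std::collections::HashMap;

/// Hypothetical identifier states for illustration.
#[derive(Debug, PartialEq)]
pub enum IdentState {
    /// Referenced by name, but never declared.
    Missing,
    Declared,
    WithFragment(String),
}

#[derive(Default)]
pub struct Idents {
    by_name: HashMap<String, IdentState>,
}

impl Idents {
    pub fn declare(&mut self, name: &str) {
        self.by_name.insert(name.to_string(), IdentState::Declared);
    }

    /// Look up `name` (adding a `Missing` placeholder if it is absent)
    /// and attempt the transition in one step. An identifier that did
    /// not previously exist fails the transition, so the error is kept
    /// without a separate lookup by the caller.
    pub fn set_fragment(&mut self, name: &str, text: &str) -> Result<(), String> {
        let state = self
            .by_name
            .entry(name.to_string())
            .or_insert(IdentState::Missing);

        match *state {
            IdentState::Declared => {
                *state = IdentState::WithFragment(text.to_string());
                Ok(())
            }
            ref other => Err(format!(
                "cannot set fragment on `{}` in state {:?}",
                name, other
            )),
        }
    }
}
```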
DEV-11864
`PartialEq` remains, and is all that is needed. See previous commit
regarding the removal of this same bound from `Context`.
This can be re-added if it ends up actually being necessary. But Tokens are
ephemeral and used only in lowering pipelines, using pattern matching.
DEV-11864
These traits are no longer necessary now that I'm using concrete types; they
just add unnecessary noise and confusion as I attempt to further refactor.
Don't abstract prematurely.
DEV-11864
This removes the generic on the Asg (which was formerly BaseAsg),
hard-coding `IdentObject`, which will further evolve. This makes the IR an
actual concrete IR rather than an abstract data structure.
These tests bring me back a bit, since they were written as I was still
becoming familiar with Rust.
DEV-11864
This is the beginning of an incremental refactoring to remove generics, to
simplify the ASG. When I initially wrote the linker, I wasn't sure what
direction I was going in, but I was also negatively influenced by more
traditional approaches to both design and unit testing.
If we're going to call the ASG an IR, then it needs to be one---if the core
of the IR is generic, then it's more like an abstract data structure than
anything. We can abstract around the IR to slice it up into components that
are a little easier to reason about, and that make clear how
responsibilities are segregated.
DEV-11864
This is unnecessarily restrictive, since we do not require anything further
than `PartialEq` for the situations where we care about equality (tests).
DEV-11864
This is too restrictive, especially for parsers that fold into something,
like the ASG, which may exist prior to invoking the parser.
This moves the trait bound to the functions that actually need it. Those
obviously cannot be used if the Context does not implement `Default`, but
I'll provide alternative conveniences.
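As a sketch of what moving the bound looks like, under the assumption of a
heavily reduced trait: `from_context`, `new`, and the two example parsers
below are hypothetical and only illustrate the placement of the `where`
clause.

```rust
/// Hypothetical, heavily reduced parser trait for illustration.
pub trait Parser: Sized {
    /// No `Default` bound here, so any context type is permitted.
    type Context;

    fn from_context(ctx: Self::Context) -> Self;

    /// Convenience constructor; the bound lives on the function that
    /// needs it, so it is simply unavailable for other contexts.
    fn new() -> Self
    where
        Self::Context: Default,
    {
        Self::from_context(Default::default())
    }
}

/// A context with no sensible default, such as a graph that exists
/// before the parser is invoked.
pub struct ExistingGraph {
    pub node_count: usize,
}

pub struct GraphParser {
    pub ctx: ExistingGraph,
}

impl Parser for GraphParser {
    type Context = ExistingGraph;

    fn from_context(ctx: ExistingGraph) -> Self {
        GraphParser { ctx }
    }
}

/// A parser whose context does implement `Default`.
pub struct CountingParser {
    pub ctx: usize,
}

impl Parser for CountingParser {
    type Context = usize;

    fn from_context(ctx: usize) -> Self {
        CountingParser { ctx }
    }
}
```

Here `CountingParser::new()` compiles, while `GraphParser::new()` would be
rejected because `ExistingGraph` does not implement `Default`.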
DEV-11864
I attempted to resolve an error previously, and I thought I had, but
apparently some symbols acquire a @dtype at some point in the process, or
lose it. Regardless, I have no interest in debugging or resolving this
mess, since it's going away.
The linker ensures that externs match, so while this could potentially allow
conflicting imports within a package (unlikely, given that extern templates
are recommended), it still will not resolve with a conflicting concrete
implementation. I'm not worried.
DEV-1036
Extern resolution has apparently been failing for quite some time, resulting
in `preproc:error` nodes in the _symbol table_ of return maps. This was
caught by the new xmlo parser, which does not ignore nodes it does not care
about.
The failure was caused by missing `@dtype`---the externs did in fact match,
and if they did not, then the linker would have failed.
This doesn't modify the map compiler to properly detect these, because
this compiler is going away in the hopefully-near future, and the problems
will now be caught, though in a far-from-ideal way (as a parse error during
xmlo reading).
DEV-10936
preproc:sym/preproc:from is used for generating `knownFields` using the
_input_ map, so this has no use for return map values; the map still
produces edges to its dependencies.
The issue is that there are return map entries in some of our systems that
are producing multiple `preproc:from`, but I somewhat-recently modified the
system to support only a single map, to remove dynamic allocation. This
resolves that problem.
With that said, `knownFields` was created for Liza to know when the
classifier ought to be invoked, to save time. Back when it was first
introduced ~10y ago, this provided significant savings, but the structure
of our system is now such that nearly every single field invokes the
classifier.
Furthermore, these details should remain encapsulated; if we wanted to make
that determination, we should be provided with a delta, which we could also
use to do incremental classification in the future, if there's an ROI there
after other improvements have been made.
So, eventually, preproc:sym/preproc:from will go away entirely.
DEV-10936
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
These are no longer TODOs---they represent invalid tokens.
I'm not going to put effort into providing further context with the
diagnostic system [right now] because these are internal errors caused by
either miscompilation or an incomplete reader.
DEV-10936
The new xmlo parser was failing on a worksheet xmlo file because fragments
were not properly placed within the header.
This was a change made when tameld was introduced so that we could stop
reading xmlo files early.
DEV-10936
This was missed when removing it from other Display impls when the new
diagnostic system was introduced. Raw `Span`s display byte offsets and the
context, which is no longer desirable as part of an error message.
DEV-10936
TAMER rejects this, because we shouldn't be using anything but UTF-8. My
use of this encoding is ancient, from over a decade ago, and was apparently
just copied around ever since.
DEV-10936
I had waited to provide more documentation until I was sure that the
abstraction was not going to change significantly; there was a lot of
refactoring in prior commits.
DEV-12151
This moves construction out of `From` and into separate associated
functions, which can be further simplified in a bit.
We also need unit tests for this, since this still relies on integration
tests due to the cost of the aggressive and tight refactoring iterations.
DEV-12151
Previously, when adjacent duplicate spans were both resolved, if one failed,
the other certainly would too, which would result in duplicate labels with
each squash. Elided spans do not have syslabels, and so this is no longer a
concern.
DEV-12151
This was removed in a previous commit while working on simplifying the
implementation, with the hope of returning to it once things were in a
better place. They are, so let's bring it back.
DEV-12151
`SpanLabel` was created during a very early refactoring of this system, and
I've been fighting with it ever since. This removes it, and simplifies
some things in the process.
It also makes clear that `Level` is never optional and removes the awkward
`Level::default` that was there previously; the default is now the lowest
level, which will always be able to be escalated.
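A minimal sketch of a level that starts at the lowest value and can only
be escalated. The variant names, the `LOWEST` constant, and `escalate` are
assumptions for this example and may not match the real `Level`.

```rust
/// Hypothetical severity levels, declared from lowest to highest so
/// that the derived ordering can be used for escalation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Help,
    Note,
    Error,
    InternalError,
}

impl Level {
    /// Starting point: the lowest level, which can always be escalated.
    pub const LOWEST: Level = Level::Help;

    /// Keep whichever of the two levels is more severe.
    pub fn escalate(self, other: Level) -> Level {
        self.max(other)
    }
}
```

Since there is always a level, nothing needs to be wrapped in `Option`.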
DEV-12151
This does what the original proof-of-concept implementation did---skip a
span that was just processed, since it'll be squashed into the previous
anyway. These duplicate spans originate from the diagnostic system when
producing supplemental help information.
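The skipping itself can be sketched as below. `Span` here is a simplified
stand-in and `skip_adjacent_dups` is a hypothetical name; the real
implementation operates on the diagnostic system's own types.

```rust
/// Simplified stand-in for a span.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub offset: u32,
    pub len: u16,
}

/// Skip any span that is identical to the one processed just before
/// it. Only adjacent duplicates are dropped; a span that recurs later
/// is kept.
pub fn skip_adjacent_dups<I>(spans: I) -> impl Iterator<Item = Span>
where
    I: IntoIterator<Item = Span>,
{
    let mut prev: Option<Span> = None;

    spans.into_iter().filter(move |span| {
        let dup = prev == Some(*span);
        prev = Some(*span);
        !dup
    })
}
```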
DEV-12151
Tests are large and will be getting larger. The source will also grow as
it's better documented and cleaned up. It's getting more difficult to
navigate efficiently and concurrently modify implementation and tests, and
parsing via LSP is getting slower with certain types of changes.
DEV-12151
Alright, starting to settle on an abstraction now, and things are coming
together. This gives us line numbers in the previously-empty gutter, and
widens the gutter to accommodate. Gutters are normalized across
sections. Sections are not yet collapsed for sequential line numbers in the
same context.
Exciting!
Here's an example, on an xmlo file:
  error: expected closing tag for `preproc:symtable`
    --> /home/.../foo.xmlo:16:4
        |
     16 | <preproc:symtable xmlns:map="http://www.w3.org/2005/xpath-functions/map">
        | ----------------- note: element `preproc:symtable` is opened here
    --> /home/.../foo.xmlo:11326:4
        |
  11326 | </preproc:wrong>
        | ^^^^^^^^^^^^^^^^ error: expected `</preproc:symtable>`
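The gutter normalization can be sketched as follows, assuming that the
width is simply the digit count of the largest line number across all
sections. `gutter_width` and `gutter` are hypothetical helpers for this
example.

```rust
/// Number of decimal digits needed to print `n`.
fn digits(n: usize) -> usize {
    n.to_string().len()
}

/// One gutter width shared by every section: wide enough for the
/// largest line number that will be displayed.
pub fn gutter_width(line_numbers: &[usize]) -> usize {
    line_numbers.iter().copied().map(digits).max().unwrap_or(1)
}

/// Render the gutter for a line, right-aligning the line number.
/// `None` produces an empty gutter of the same width.
pub fn gutter(line: Option<usize>, width: usize) -> String {
    match line {
        Some(n) => format!("{:>width$} |", n, width = width),
        None => format!("{:>width$} |", "", width = width),
    }
}
```

With lines 16 and 11326, both sections get a five-column gutter, so the
`|` separators line up.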
DEV-12151