This matches xmlo::Dim, and could be the same thing, if we can find a home
for it in the future; it's not worth creating such a home right now when I'm
not yet sure what else ought to live there; the duplication may be fine.
The conversion from xmlo needs to be moved, and `Dim` is going to be used
for more than just identifiers (expressions will have type inference
performed).
DEV-11864
This allows retrieving and providing a context to a `Parser`. This is
intended for use with an aggregating parser, in particular to construct the
ASG and return it.
This is a component of a change that replaces `asg_builder` with a
`Parser`-based lowering into the ASG, but there are still changes that need
to be made to simplify things and complete its integration.
DEV-11864
Previously, since the graph contained only identifiers, discovered roots
were stored in a separate vector and exposed to the caller. This not only
leaked details, but added complexity; this was left over from the
refactoring of the proof-of-concept linker some time ago.
This moves the root management into the ASG itself, mostly, with one item
being left over for now in the asg_builder (eligibility classifications).
There are two roots that were added automatically:
- __yield
- __worksheet
The former has been removed and is now expected to be explicitly mapped in
the return map, which is now enforced with an extern in `core/base`. This
is still special, in the sense that it is explicitly referenced by the
generated code, but there's nothing inherently special about it and I'll
continue to generalize it into oblivion in the future, such that the final
yield is just a convention.
`__worksheet` is the only symbol of type `IdentKind::Worksheet`, and so that
was generalized just as the meta and map entries were.
The goal in the future will be to have this more under the control of the
source language, and to consolodate individual roots under packages, so that
the _actual_ roots are few.
As far as the actual ASG goes: this introduces a single root node that is
used as the sole reference for reachability analysis and topological
sorting. The edges of that root node replace the vector that was removed.
DEV-11864
In the actual implementation (outside of tests), this is always looking up
before adding the symbol. This will simplify the API, while still retaining
errors, since the identifier will fail the state transition if the
identifier did not exist before attempting to set a fragment. So while this
is slower in microbenchmarks, this has no effect on real-world performance.
Further, I'm refactoring toward a streaming ASG aggregation, which is a lot
easier if we do not need to perform lookups in a separate step from the
ASG's primitives.
DEV-11864
`PartialEq` remains, and is all that is needed. See previous commit
regarding the removal of this same bound from `Context`.
This can be re-added if it ends up actually being necessary. But Tokens are
ephemeral and used only in lowering pipelines, using pattern matching.
DEV-11864
These traits are no longer necessary now that I'm using concrete types; they
just add unnecessary noise and confusion as I attempt to further refactor.
Don't abstract prematurely.
DEV-11864
This removes the generic on the Asg (which was formerly BaseAsg),
hard-coding `IdentObject`, which will further evolve. This makes the IR an
actual concrete IR rather than an abstract data structure.
These tests bring me back a bit, since they were written as I was still
becoming familiar with Rust.
DEV-11864
This is the beginning of an incremental refactoring to remove generics, to
simplify the ASG. When I initially wrote the linker, I wasn't sure what
direction I was going in, but I was also negatively influenced by more
traditional approaches to both design and unit testing.
If we're going to call the ASG an IR, then it needs to be one---if the core
of the IR is generic, then it's more like an abstract data structure than
anything. We can abstract around the IR to slice it up into components that
are a little easier to reason about and understand how responsibilities are
segregated.
DEV-11864
This is unnecessarily restrictive, since we do not require anything further
than `PartialEq` for the situations where we care about equality (tests).
DEV-11864
This is too restrictive, especially for parsers that fold into something,
like the ASG, which may exist prior to invoking the parser.
This moves the trait bound to the functions that actually need it. Those
obviously cannot be used if the Context does not implement `Default`, but
I'll provide alternative conveniences.
DEV-11864
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
These are no longer TODOs---they represent invalid tokens.
I'm going to put effort into providing further context with the diagnostic
system [right now] because these are internal errors caused by either
miscompilation or an incomplete reader.
DEV-10936
This was missed when removing it from other Display impls when the new
diagnostic system was introduced. Raw `Span`s display byte offsets and the
context, which is no longer desirable as part of an error message.
DEV-10936
I had waited to provide more documentation until I was sure that the
abstraction was not going to change significantly; there was a lot of
refactoring in prior commits.
DEV-12151
This moves construction out of `From` and into separate associated
functions, which can be further simplified in a bit.
We also need unit tests for this, since this still relies on integration
tests due to the cost of the aggressive and tight refactoring iterations.
DEV-12151
Previously, when adjacent duplicate spans were both resolved, if one failed,
the other certainly would, which would result in duplicate labels each
squash. Elided spans do not have syslabels, and so this is no longer a
concern.
DEV-12151
This was removed in a previous commit while working on simplifying the
implementation, with the hope of returning to it once things were in a
better place. They are, so let's bring it back.
DEV-12151
`SpanLabel` was created during a very early refactoring of this system, and
I've just been fighting with it sense. This removes it, and simplifies
some things in the process.
It also makes clear that `Level` is never optional and removes the awkward
`Level::default` that was there previously; the default is now the lowest
level, which will always be able to be escalated.
DEV-12151
This does what the original proof-of-concept implementation did---skip a
span that was just processed, since it'll be squashed into the previous
anyway. These duplicate spans originate from the diagnostic system when
producing supplemental help information.
DEV-12151
Tests are large and will be getting larger. The source will also grow as
it's better documented and cleaned up. It's getting more difficult to
navigate efficiently and concurrently modify implementation and tests, and
parsing via LSP is getting slower with certain types of changes.
DEV-12151
Alright, starting to settle on an abstraction now, and things are coming
together. This gives us line numbers in the previously-empty gutter, and
widens the gutter to accommodate. Gutters are normalized across
sections. Sections are not yet collapsed for sequential line numbers in the
same context.
Exciting!
Here's an example, on an xmlo file:
error: expected closing tag for `preproc:symtable`
--> /home/.../foo.xmlo:16:4
|
16 | <preproc:symtable xmlns:map="http://www.w3.org/2005/xpath-functions/map">
| ----------------- note: element `preproc:symtable` is opened here
--> /home/.../foo.xmlo:11326:4
|
11326 | </preproc:wrong>
| ^^^^^^^^^^^^^^^^ error: expected `</preproc:symtable>`
DEV-12151
The `Section` itself is now responsible for outputting the gutter, which
puts us in a position to be able to apply consistent formatting without
having to propagate width data to every line variant.
Now `SourceLine` _does_ actually correspond to a line of output, which will
allow for better formatting (e.g. collapsing padding) and, importantly,
proper management of gutters.
Note that the seemingly unnecessary `SectionSourceLine` allows for a subtle
consistent formatting for all variants' gutters in `SectionLine`, which will
allow us to hoist that rendering out in the next commit. The other option
was to include a trailing space for padding and marks, but that is not only
sloppy and undesirable, but asking for confusion, especially in editors (like
mine) that trim trailing whitespace.
DEV-12151
If a column isn't present, it degrades to displaying labels like footnotes
anyway, so this simplifies the system rather than catering to a rare
case. With that said, this does lose functionality, since it does not
render the source line at all, even though we _could_ do so.
I may re-introduce that rendering after some further refactoring,
specifically for gutters.
DEV-12151
Using a byte vector just makes life more difficult with regard to preparing
the diagnostic reports. We're already validating UTF-8 data for column
generation, which is necessary for a robust report, so let's just store it
as a String to begin with.
DEV-12151
Note that, if a span is first encountered with a mark but with _no_ label,
the first label (if collapsed) will be on the next line. This allows a span
to be marked without extra visual noise if it's not necessary, and to be
able to trust that it'll stay that way.
Until coloring is introduced, this may or may not be easier to read
depending on context.
This is also not yet taking into account where on the line it begins, and so
may render poorly if the span is at the end of a line. That will be fixed
later on.
DEV-12151
This is now visible in the diagnostic output. Example at this point in
time, on an xmlo file for one of our smallest systems:
error: expected closing tag for `preproc:symtable`
--> /home/.../foo.xmlo:16:4
|
| <preproc:symtable xmlns:map="http://www.w3.org/2005/xpath-functions/map">
| -----------------
= note: element `preproc:symtable` is opened here
--> /home/.../foo.xmlo:11326:4
|
| </preproc:wrong>
| ^^^^^^^^^^^^^^^^
= error: expected `</preproc:symtable>`
DEV-12151
Looking more and more Rust-like. Shameless copy.
TBH I forget what character it uses for help, but it's easy enough to
change.
Also, to be clear: this is modeled after Rust, but it's not a requirement of
mine that it look exactly like it. I just like the general style; I'll
surely deviate over time, as appropriate (or as I feel like it).
DEV-12151
This has the effect of highlighting the columns of the source lines using
'^' as an underline.
The next step will be to have the underline character depend on the
`Level`.
If this commit message doesn't sound all that exciting, given what it
finally achieved after all this time, it's because I'm exhausted, and my
prototype has already taken my excitement. But this is significant, given
all the work leading up to it.
There is some code cleanup needed and some unit tests that ought to be
written rather than relying on integration, but considering how much this is
being refactored, I don't want to add to that refactoring cost just yet
before gutters are introduced and I know things are settled for now.
DEV-12151
This has been a lot of refactoring for something that I prototyped a week
ago, and the prototype is still further along in its output formatting (it
has line numbering in gutters and span markings).
But, this has come a long way, and I'm happy with it overall, though I'm not
happy with my slow pace and struggle to maintain focus. But those are
personal issues.
This leaves a lot to be desired, but at the same time is still really
helpful. There's a couple notable TODOs regarding pointless allocation and
UTF8 re-checking, but otherwise, the feature-related steps are:
- Gutters with line numbers; and
- Marking columns associated with the span.
DEV-12151
Rather than squashing as a separate operation, and explicitly denoting when
it occurred, we'll just always squash, as was done before these changes. It
doesn't really make sense to make this optional and there's not any value in
keeping the decision around.
This also sets us up favorably for future changes: it creates a vector of
labels, which can be analyzed later to determine how to best lay out marks
and labels.
DEV-12151
Just renames the lifetime to refer to the `Diagnostic`, rather than a
`Label` returned by it, which was all `'l` was previously used for.
Note that many labels have a `'static` lifetime; this doesn't change that or
somehow cause it to reallocate; the label must life _for at least `'d`_.
DEV-12151
Rather than rendering the diagnostic `Display` message to a string only to
copy it to yet another buffer later on, this simply stores a reference to
the `Diagnostic` that was provided. This also adds a type to the `Report`
associating it with the provided `Diagnostic`, which does seem appropriate,
given that the report was produced for it.
I should probably rename '{l=>d} now.
DEV-12151
Rather than writing to the provided `Write` object, this produces a `Report`
object. While a lifetime still exists for the diagnostic data (labels,
specifically), I was able to remove the other lifetime resulting from
`ResolvedSpan` by transferring ownership of the data to the `Report`
itself. Once actual source lines are integrated shortly, `Report` will
include those as well.
This has been a tedious process, but it's coming together. Hopefully these
commits documenting the progressive and ugly refactoring are found useful by
some reader in the future.
DEV-12151
The line number was getting special treatment that is simply not worth the
cost (with regards to how burdensome it is on the type definitions). This
simplifies things quite a bit.
If we want header customization in the future, we can worry about that in a
different way, or allow the header as a whole to be swapped out, rather than
its constituents.
DEV-12151
`HeadingColNum` is no longer constructed by `HeadingLineNum`. This both
narrows the types and required data (e.g. removing dummy values in test
cases), and reduces the coupling (by favoring composition, but still coupled
with the concrete type).
DEV-12151
I'm unhappy with the current state of this, which is why I haven't settled
on docs or unit tests for these changes yet (though note that the
integration tests do cover these changes)---this is still a prototype
refactoring.
In particular, this needs to do more lowering---the `ResolvedSpan` and
`MaybeResolvedSpan` need to be eliminated and lowered into exactly what is
needed so that we can stop reasoning about them and propagating them.
Further, having lines and columns lazily evaluate themselves for
display---based on `MaybeResolvedSpan`---adds extra generics that shouldn't
be necessary; they should be pre-computed and store the concrete data they
need in variants. Display shouldn't involve computation beyond formatting
of pre-computed data.
That was always the plan, but this refactoring has been incremental.
Anyway: this is in a working and integration-tested state, but it's going to
change.
DEV-12151