I noticed this while working on a graph traversal. The unit test used the
same span for both the reference _and_ the binding, so I didn't notice. -_-
The problem with this, though, is that we do not have a separate span
representing the source location of the identifier reference. The reason is
that we decided to re-use an existing node rather than creating another one,
which would add another inconvenient layer of indirection (and complexity).
So, I may have to add (optional?) spans to edges.
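A minimal sketch of what an optional span on an edge could look like. Every name here (`Span`, `EdgeWeight`, `ref_span`, `diagnostic_span`) is hypothetical and chosen for illustration; none of it reflects the actual graph types.

```rust
// Hypothetical sketch; these are not the project's real definitions.

/// A source location, simplified to an offset and a length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    offset: u32,
    len: u16,
}

/// Edge weight that may carry the span of the reference site.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EdgeWeight {
    /// Where the reference appears in the source, if this edge is a reference.
    ref_span: Option<Span>,
}

/// Span to report when following an edge: prefer the reference site and
/// fall back to the span of the target (the binding) otherwise.
fn diagnostic_span(edge: EdgeWeight, target_span: Span) -> Span {
    edge.ref_span.unwrap_or(target_span)
}
```

With something like this, a traversal could tell the reference site apart from the binding without introducing another node.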
DEV-13708
This introduces the concept of ontological cross edges.
The term "cross edge" is most often seen in the context of graph traversals,
e.g. the trees formed by a depth-first search. This, however, refers to the
trees that are inherent in the ontology of the graph.
For example, an `ExprRef` will produce a cross edge to the referenced
`Ident`, since that is a different tree than the current expression. (Well,
I suppose technically it _could_ be a back edge, but then that'd be a cycle
which would fail the process once we get to preventing it. So let's ignore
that for now.)
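To illustrate the idea, here is a rough sketch that classifies an edge purely from the kinds of its endpoints. The `Kind` and `EdgeClass` enums and the specific set of permitted pairs are assumptions made for the example, not the real ontology.

```rust
// Hypothetical sketch; the permitted pairs below are illustrative only.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Root,
    Pkg,
    Ident,
    Expr,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EdgeClass {
    /// The edge stays within the tree implied by the ontology.
    Tree,
    /// The edge jumps to a different ontological tree.
    Cross,
}

/// Classify an edge by endpoint kinds; `None` means the ontology (as
/// assumed here) does not permit the edge at all.
fn classify(from: Kind, to: Kind) -> Option<EdgeClass> {
    use Kind::*;
    match (from, to) {
        (Root, Pkg) | (Pkg, Ident) | (Ident, Expr) | (Expr, Expr) => Some(EdgeClass::Tree),
        // A reference from an expression to an identifier leaves the
        // current expression's tree.
        (Expr, Ident) => Some(EdgeClass::Cross),
        _ => None,
    }
}
```

Note that this classification depends only on the ontology, not on the order in which a traversal happens to visit nodes.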
DEV-13708
This causes a package definition to be rooted (so that it can be easily
accessed for a graph walk). This stays consistent with the new
`ObjectIndex`-based API by introducing a unit `Root` `ObjectKind` and the
boilerplate that goes with it.
This boilerplate, now glaringly obvious, will be refactored at some point,
since its repetition is onerous and distracting.
DEV-13159
Included in this diff are the corresponding changes to the graph to support
the change. Adding the edge was easy, but we also need a way to get the
package for an identifier. The easiest way to do that is to modify the edge
weight to include not just the target node type, but also the source.
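A rough sketch of why carrying the source type on the edge helps: the package for an identifier can be found by filtering incoming edges on the source type. The names (`ObjTy`, `EdgeWeight`, `Edge`, `owning_pkg`) and the flat edge list are assumptions for illustration; the real graph is not stored this way.

```rust
// Hypothetical sketch using a flat edge list in place of the real graph.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ObjTy {
    Pkg,
    Ident,
    Expr,
}

/// Edge weight recording the type of both endpoints.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EdgeWeight {
    source: ObjTy,
    target: ObjTy,
}

struct Edge {
    from: usize,
    to: usize,
    weight: EdgeWeight,
}

/// Find the package that owns `ident` by looking only at incoming edges
/// whose source is a package; node weights are never consulted.
fn owning_pkg(edges: &[Edge], ident: usize) -> Option<usize> {
    edges
        .iter()
        .find(|e| {
            e.to == ident
                && e.weight.source == ObjTy::Pkg
                && e.weight.target == ObjTy::Ident
        })
        .map(|e| e.from)
}
```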
DEV-13159
This does not yet create edges from identifiers to the package; just getting
this introduced was quite a bit of work, so I want to get this committed.
Note that this also includes a change to NIR so that `Close` contains the
entity so that we can pattern-match for AIR transformations rather than
retaining yet another stack with checks that are already going to be done by
AIR. This makes NIR stand less on its own from a self-validation standpoint, but
that's okay, given that it's the language that the user entered and,
conceptually, they could enter invalid NIR the same as they enter invalid
XML (e.g. from a REPL).
In _practice_, of course, NIR is lowered from XML and the schema is enforced
during that lowering and so the validation does exist as part of that
parsing.
These concessions speak more to the verbosity of the language (Rust) than
anything.
DEV-13159
This adds support for identifier references, adding `Ident` as a valid edge
type for `Expr`.
There is nothing in the system yet to enforce ontology through levels of
indirection; that will come later on.
I'm testing these changes with a very minimal NIR parse, which I'll commit
shortly.
DEV-13597
This allows for edges to be multiple types, and gives us two important
benefits:
(a) Compiler-verified correctness to ensure that we don't generate graphs
that do not adhere to the ontology; and
(b) Runtime verification of types, so that bugs are still memory safe.
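The two benefits can be sketched roughly as follows. This is not the code from the patch: the marker trait `EdgeTo`, the `Kind` trait, the `Ty` tag, and the `Graph` type are all hypothetical stand-ins, and the set of permitted edges is assumed.

```rust
// Hypothetical sketch; names and permitted edges are illustrative only.
use std::marker::PhantomData;

/// Runtime tag stored alongside each edge.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    Ident,
    Expr,
}

trait Kind {
    const TY: Ty;
}

struct Ident;
struct Expr;
impl Kind for Ident { const TY: Ty = Ty::Ident; }
impl Kind for Expr { const TY: Ty = Ty::Expr; }

/// Marker trait: `Self` may have an edge to `T` (the assumed ontology).
trait EdgeTo<T: Kind>: Kind {}
impl EdgeTo<Expr> for Ident {}
impl EdgeTo<Expr> for Expr {}
impl EdgeTo<Ident> for Expr {}

/// Index tagged at compile time with the kind of object it refers to.
struct Index<T> {
    id: usize,
    _kind: PhantomData<T>,
}

impl<T> Index<T> {
    fn new(id: usize) -> Self {
        Index { id, _kind: PhantomData }
    }
}

#[derive(Default)]
struct Graph {
    edges: Vec<(usize, usize, Ty)>,
}

impl Graph {
    /// (a) Only compiles when the ontology permits an edge from `A` to `B`;
    /// e.g. an `Ident` to `Ident` edge is rejected by the compiler.
    fn add_edge<A: EdgeTo<B>, B: Kind>(&mut self, from: &Index<A>, to: &Index<B>) {
        self.edges.push((from.id, to.id, B::TY));
    }

    /// (b) Runtime check: only edges whose stored tag matches `B` are
    /// narrowed into an `Index<B>`.
    fn edges_to<B: Kind>(&self, from: usize) -> Vec<Index<B>> {
        self.edges
            .iter()
            .filter(|(f, _, ty)| *f == from && *ty == B::TY)
            .map(|(_, t, _)| Index::new(*t))
            .collect()
    }
}
```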
There is a lot more information in the documentation within the patch.
This took a lot of iterating to get something that was tolerable. There's
quite a bit of boilerplate here, and maybe that'll be abstracted away better
in the future as the graph grows.
In particular, it was challenging to determine how I wanted to actually go
about narrowing and looking up edges. Initially I had hoped to represent
the subsets as `ObjectKind`s as well so that you could use them anywhere
`ObjectKind` was expected, but that proved to be far too difficult because I
cannot return a reference to a subset of `Object` (the value would be owned
on generation). And while in a language like C maybe I'd pad structures and
cast between them safely, since they _do_ overlap, I can't confidently do
that here since Rust's discriminant and layout are not under my control.
I tried playing around with `std::mem::Discriminant` as well, but
`discriminant` (the function) requires a _value_, meaning I couldn't get the
discriminant of a static `Object` variant without some dummy value; wasn't
worth it over `ObjectRelTy`. We further can't assign values to enum
variants unless they hold no data. Rust a decade from now may be different,
and it will be interesting to look back on this struggle.
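A small sketch of the trade-off described above. `std::mem::discriminant` is the real standard library function; the `Object` variants and their payloads, and the shape of `ObjectRelTy`, are simplified assumptions for illustration.

```rust
// Simplified, hypothetical `Object` and `ObjectRelTy`; only
// `std::mem::discriminant` is a real API here.
use std::mem::{discriminant, Discriminant};

#[derive(Debug)]
enum Object {
    Ident(String),
    Expr(u32),
}

/// Fieldless mirror of `Object`'s variants, usable without any value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ObjectRelTy {
    Ident,
    Expr,
}

impl Object {
    /// Map a concrete object to its fieldless type tag.
    fn rel_ty(&self) -> ObjectRelTy {
        match self {
            Object::Ident(_) => ObjectRelTy::Ident,
            Object::Expr(_) => ObjectRelTy::Expr,
        }
    }
}

/// `discriminant` takes a reference to a value, so naming the `Expr`
/// variant requires constructing a dummy `Object::Expr` first.
fn expr_discriminant() -> Discriminant<Object> {
    discriminant(&Object::Expr(0))
}
```

The fieldless enum can be named, compared, and stored on an edge directly, which is what makes it more convenient than conjuring dummy values.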
DEV-13597
The ASG delegates certain operations to Objects so that they may enforce
their own invariants and ontology. It is therefore important that only
objects have access to certain methods on `Asg`, otherwise those invariants
could be circumvented.
It should be noted that the nesting of this module is such that AIR should
_not_ have privileged access to the ASG---it too must utilize objects to
ensure those invariants are enforced in a single place.
DEV-13597
This provides the initial implementation allowing an identifier to be
defined (bound to an object and made transparent).
I'm not yet entirely sure whether I'll stick with the "transparent" and
"opaque" terminology when there's also "declare" and "define", but a
`Missing` state is a type of declaration and so the distinction does still
seem to be important.
There is still work to be done on `ObjectIndex::<Ident>::bind_definition`,
which will follow. I'm going to be balancing work to provide type-level
guarantees, since I don't have the time to go as far as I'd like.
DEV-13597
This seems to have been an oversight from when I recently introduced SPairs
to ASG; I noticed it while working on another change and receiving back a
`DUMMY_SPAN`.
DEV-13597
This introduces a number of abstractions, whose concepts are not fully
documented yet since I want to see how it evolves in practice first.
This introduces the concept of edge ontology (similar to a schema) using the
type system. Even though we are not able to determine what the graph will
look like statically---since that's determined by data fed to us at
runtime---we _can_ ensure that the code _producing_ the graph from those
data will produce a graph that adheres to its ontology.
Because of the typed `ObjectIndex`, we're also able to implement operations
that are specific to the type of object that we're operating on. Though,
since the type is not (yet?) stored on the edge itself, it is possible to
walk the graph without looking at node weights (the `ObjectContainer`) and
therefore avoid panics for invalid type assumptions, which is bad, but I
don't think that'll happen in practice, since we'll want to be resolving
nodes at some point. But I'll address that more in the future.
Another thing to note is that walking edges is only done in tests right now,
and so there's no filtering or anything; once there are nodes (if there are
nodes) that allow for different outgoing edge types, we'll almost certainly
want filtering as well, rather than panicking. We'll also want to be able to
query for any object type, but filter only to what's permitted by the
ontology.
DEV-13160
This addresses the two outstanding `todo!` match arms representing errors in
lowering expressions into the graph. As noted in the comments, these errors
are unlikely to be hit when using TAME in the traditional way, since
e.g. XIR and NIR are going to catch the equivalent problems within their own
contexts (unbalanced tags and a valid expression grammar respectively).
_But_, the IR does need to stand on its own, and I further hope that some
tooling maybe can interact more directly with AIR in the future.
DEV-13160
This introduces a number of concepts together, again to demonstrate that
they were derived.
This introduces support for nested expressions, extending the previous
work. It also supports error recovery for dangling expressions.
The parser states are a mess; there is a lot of duplicate code here that
needs refactoring, but I wanted to commit this first at a known-good state
so that the diff will demonstrate the need for the change that will
follow; the opportunities for abstraction are plainly visible.
The immutable stack introduced here could be generalized, if needed, in the
future.
Another important note is that Rust optimizes away the `memcpy`s for the
stack that was introduced here. The initial Parser Context was introduced
because of `ArrayVec` inhibiting that elision, but `Vec` never had that
problem. In the future, I may choose to go back and remove `ArrayVec`, but I
had wanted to keep memory allocation out of the picture as much as possible
to make the disassembly and call graph easier to reason about and to have
confidence that optimizations were being performed as intended.
With that said---it _should_ be eliding in tamec, since we're not doing
anything meaningful yet with the graph. It does also elide in tameld, but
it's possible that Rust recognizes that those code paths are never taken
because tameld does nothing with expressions. So I'll have to monitor this
as I progress and adjust accordingly; it's possible a future commit will
call BS on everything I just said.
Of course, the counter-point to that is that Rust is optimizing them away
anyway, but `Vec` _does_ still require allocation; I was hoping to keep such
allocation at the fringes. But another counter-point is that it _still_ is
allocated at the fringe, when the context is initialized for the parser as
part of the lowering pipeline. But I didn't know how that would all come
together back then.
...alright, enough rambling.
DEV-13160
This uses `ObjectIndex` to automatically narrow the type to what is
expected.
Given that `ObjectIndex` is supposed to mean that there must be an object
with that index, perhaps the next step is to remove the `Option` from `get`
as well.
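Roughly, the narrowing might look like the sketch below. The `Narrow` trait, the payload types, the `Asg` storage, and `expect_obj` are all hypothetical; only the names `ObjectIndex`, `Asg`, and `get` come from the text above, and their real signatures may differ.

```rust
// Hypothetical sketch; the real `Asg` and `ObjectIndex` APIs may differ.
use std::marker::PhantomData;

#[derive(Debug, PartialEq)]
struct Ident(String);
#[derive(Debug, PartialEq)]
struct Expr(u32);

#[derive(Debug)]
enum Object {
    Ident(Ident),
    Expr(Expr),
}

impl From<Ident> for Object { fn from(i: Ident) -> Self { Object::Ident(i) } }
impl From<Expr> for Object { fn from(e: Expr) -> Self { Object::Expr(e) } }

/// Narrow the sum type to one of its inner types.
trait Narrow: Sized {
    fn narrow(obj: &Object) -> Option<&Self>;
}
impl Narrow for Ident {
    fn narrow(obj: &Object) -> Option<&Self> {
        match obj { Object::Ident(i) => Some(i), _ => None }
    }
}
impl Narrow for Expr {
    fn narrow(obj: &Object) -> Option<&Self> {
        match obj { Object::Expr(e) => Some(e), _ => None }
    }
}

/// Index that remembers, at compile time, what kind of object it names.
struct ObjectIndex<O> {
    idx: usize,
    _kind: PhantomData<O>,
}

struct Asg {
    objects: Vec<Object>,
}

impl Asg {
    fn create<O: Into<Object>>(&mut self, obj: O) -> ObjectIndex<O> {
        self.objects.push(obj.into());
        ObjectIndex { idx: self.objects.len() - 1, _kind: PhantomData }
    }

    /// The caller never names the type; it is narrowed from the index.
    fn get<O: Narrow>(&self, oi: &ObjectIndex<O>) -> Option<&O> {
        self.objects.get(oi.idx).and_then(O::narrow)
    }

    /// What dropping the `Option` could look like: a violated invariant
    /// becomes a panic rather than a `None` the caller must handle.
    fn expect_obj<O: Narrow>(&self, oi: &ObjectIndex<O>) -> &O {
        self.get(oi).expect("ObjectIndex must reference an object of its kind")
    }
}
```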
DEV-13160
This begins to place expressions on the graph---something that I've been
thinking about for a couple of years now, so it's interesting to finally be
doing it.
This is going to evolve; I want to get some things committed so that it's
clear how I'm moving forward. The ASG makes things a bit awkward for a
number of reasons:
1. I'm dealing with older code where I had a different model of doing
things;
2. It's mutable, rather than the mostly-functional lowering pipeline;
3. We're dealing with an aggregate ever-evolving blob of data (the graph)
rather than a stream of tokens; and
4. We don't have as many type guarantees.
I've shown with the lowering pipeline that I'm able to take a mutable
reference and convert it into something that's both functional and
performant, where I remove it from its container (an `Option`), create a new
version of it, and place it back. Rust is able to optimize away the memcpys
and such and just directly manipulate the underlying value, which is often a
register with all of the inlining.
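The take-and-replace pattern described above can be sketched like this. `Option::take` is the real standard library method; `State`, `Holder`, `step`, and `feed` are hypothetical names for the example and are not the lowering pipeline's types.

```rust
// Hypothetical sketch of the pattern; only `Option::take` is a real API.

#[derive(Debug, PartialEq)]
enum State {
    Idle,
    Building(u32),
}

impl State {
    /// Pure transition: consumes the old state and returns a new one.
    fn step(self, input: u32) -> State {
        match self {
            State::Idle => State::Building(input),
            State::Building(acc) => State::Building(acc + input),
        }
    }
}

/// Mutable container holding the state behind an `Option`.
struct Holder {
    state: Option<State>,
}

impl Holder {
    /// Remove the state from its container, produce a new version of it
    /// functionally, and place the result back.
    fn feed(&mut self, input: u32) {
        let old = self.state.take().expect("state is always put back");
        self.state = Some(old.step(input));
    }
}
```

The transition itself never sees a mutable reference; only `feed` touches the container.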
_But_ this is a different scenario now. The lowering pipeline has a narrow
context. The graph has to keep hitting memory. So we'll see how this
goes. But it's most important to get this working and measure how it
performs; I'm not trying to prematurely optimize. My attempts right now are
for the way that I wish to develop.
Speaking to #4 above, it also sucks that I'm not able to type the
relationships between nodes on the graph. Rather, it's not that I _can't_,
but a project to create a typed graph library is beyond the scope of this
work and would take far too much time. I'll leave that to a personal,
non-work project. Instead, I'm going to have to narrow the type any time
the graph is accessed. And while that sucks, I'm going to do my best to
encapsulate those details to make it as seamless as possible API-wise. I'm
hoping that the performance hit of the narrowing will be very small
relative to all the business logic going on (a single cache miss is bound to
be far more expensive than many narrowings, which are just integer
comparisons and branching)...but we'll see. Introducing branching sucks,
but branch prediction is pretty damn good in modern CPUs.
DEV-13160
This ASG implementation is a refactored form of original code from the
proof-of-concept linker, which was well before the span and diagnostic
implementations, and well before I knew for certain how I was going to solve
that problem.
This was quite the pain in the ass, but introduces spans to the AIR tokens
and graph so that we always have useful diagnostic information. With that
said, there are some important things to note:
1. Linker spans will originate from the `xmlo` files until we persist
spans to those object files during `tamec`'s compilation. But it's
better than nothing.
2. Some additional refactoring is still needed for consistency, e.g. use
of `SPair`.
3. This is just a preliminary introduction. More refactoring will come as
tamec is continued.
DEV-13041