Commit Graph

30 Commits (d889aca13a78b5909e192ad17f64929ef0650c3d)

Author SHA1 Message Date
Mike Gerwitz 38c0161257 tamer: f::{Functor=>Map}: It's not really a functor
At least not how most people expect functors to be.  I'm really just using
this as a map with powerful inference properties that make writing code more
pleasent.

And I need fallible methods now too.

DEV-13163
2023-07-26 16:43:09 -04:00
Mike Gerwitz 507669cb30 tamer: asg::graph::object::ObjectIndexRefined: New narrowing type
The provided documentation provides rationale, and the use case is the
ontree change.  I was uncomfortable without the exhaustive match, and I was
further annoyed by the lack of easy `ObjectIndex` narrowing.

DEV-13163
2023-07-18 10:31:33 -04:00
Mike Gerwitz 5a301c1548 tamer: asg::graph::visit::ontree: Source ordering of ontological tree
This introduces the ability to specify an edge ordering for the ontological
tree traversal.  `tree_reconstruction` will now use a
`SourceCompatibleTreeEdgeOrder`, which will traverse the graph in an order
that will result in a properly ordered source reconstruction.  This is
needed for template headers, because interpolation causes
metavariables (exposed as template params) to be mixed into the body.

There's a lot of information here, including some TODOs on possible
improvements.  I used the unstable `is_sorted` to output how many template
were already sorted, based on one of our very large packages internally that
uses templates extensively, and found that none of the desugared shorthand
template expansions were already ordered.  If I tweak that a bit, then
nearly all templates will already be ordered, reducing the work that needs
to be done, leaving only template definitions with interpolation to be
concerned about, which is infrequent relative to everything else.

DEV-13163
2023-07-18 10:31:31 -04:00
Mike Gerwitz b4b85a5e85 tamer: asg::air: Support Meta::ConcatList with lexemes and refs
This handles the common cases for meta, which includes what interpolation
desugars into.  Most of this work was in testing and reasoning about the
issue; `asg::graph::visit:ontree::test` has a good summary of the structure
of the graph that results.

The last remaining steps to make this work end-to-end is for NIR->AIR to
lower `Nir::Ref` into `Air::BindIdent`, and then for `asg::graph::xmli` to
reconstruct concatenation lists.  I'll then be able to commit the xmli test
case I've been sitting on, whose errors have been guiding my development.

DEV-13163
2023-07-13 10:48:45 -04:00
Mike Gerwitz e8335c57d4 tamer: asg::air::ir::AirMeta: Remove `Tpl` prefix from tokens
Cleanup from the previous commit.

DEV-13162
2023-05-23 14:44:16 -04:00
Mike Gerwitz 7857460c1d tamer: Re-use prior AirAggreagteCtx for subsequent parsers
A new AirAggregate parser is utilized for each package import.  This
prevents us from moving the index from `Asg` onto `AirAggregateCtx` because
the index would be dropped between each import.

This allows re-using that context and solves for problems that result from
attempting to do so, as explained in the new
`resume_previous_parsing_context` test case.

But, it's now clear that there's a missing abstraction, and that reasoning
about this problem at the topmost level of the compiler/linker in terms of
internal parsing details like "context" is not appropriate.  What we're
doing is suspending parsing and resuming it later on for another package,
aggregating into the same destination (ASG + index).  An abstraction ought
to be formed in terms of that.

DEV-13162
2023-05-19 13:38:15 -04:00
Mike Gerwitz 716e217c9f tamer: asg: Restrict index-related operations to AIR
This is in the same spirit as previous commits modifying (or removing)
tests and benchmarks related to accessing the ASG and its indexes directly.

With this change, only `asg::air` uses the indexing and lookup methods on
`Asg`.  This will allow me to extract the index from `Asg` entirely and have
`Air` solely responsible for lookup; the graph will be responsible only for,
well, being a graph.  Indexing is an optimization strategy.

More information in the commit to follow.  But notice how this moving
environment-related concerns away from `Asg` and into AIR, and how the
remaining environment concerns are index-related.

But there is one remaining barrier: to fully move the indexing away from
`Asg`, we have to use an alternative (and complete)
abstraction---AirAggregateCtx with its ability to resolve and introduce
scope based on the stack.  The `AirIdent` token subset doesn't yet do that,
and all the work up to this point was in prepartion for doing that.  Since
introducing indexing at Root a few commits ago, it's now possible to
proceed.

DEV-13162
2023-05-17 11:37:03 -04:00
Mike Gerwitz dd6a6dd196 tamer: asg::air::ir::AirPkg::PkgStart: Require name
This requires the name as part of the package definition, which in turn
removes a state (and all the combinations resulting from it) from
AirAggregate, which results in significant complexity reduction for a very
complex part of the system.

Pushing this complexity outward results in a reduction of overall
complexity, and obviates the question of where NIR will receive a generated
name.

DEV-13162
2023-05-10 13:57:45 -04:00
Mike Gerwitz ebdae9ac38 tamer: ld::xmle::lower: Sort only rooted Idents
This is one of many changes that have been lingering that I need to start to
break apart in an attempt to commit the confusing and disappointing
conclusion to this package loading madness.

More information to come.

DEV-13162
2023-05-09 15:20:39 -04:00
Mike Gerwitz 4ec4857360 Revert "tamer: asg::air::ir::AirBind::RefIdent: New optional canonical name"
This reverts commit da7fe96254e425bc7b75f8cf454465b71e27e372.

I'm a fool---this would be pursuant to a future plan that removes AirIdent
opaque tokens.  But for now, I need it on IdentDecl and others, which
currently has a `Source` (that I want to go away, as just mentioned), which
contains the same information.

So maybe more to come on this...

DEV-13162
2023-05-09 12:35:06 -04:00
Mike Gerwitz 572337505c tamer: asg::air::ir::AirBind::RefIdent: New optional canonical name
This allows for a canonical package name to be optionally provided to
explicitly resolve a reference against, avoiding a lexical lookup.

This change doesn't actually utilize this new value yet; it just
retains BC.  The new argument will be used for the linker, since it already
knows the package that defined an identifier while reading the object file's
symbol table.  It will also be used by tamec for the same purposes while
processing package imports.

DEV-13162

-- squashed with --

tamer: asg::air::ir::RefIdent: CanonicalName=SPair

The use of CanonicalName created an asymmetry between RefIdent and
BindIdent.  The hope was to move CanonicalName instantiation outside of AIR
and into NIR, but doing so would be confusing and awkward without doing
something with BindIdent.

I don't have the time to deal with that for now, so let's observe how the
system continues to evolve and see whether hoisting it out makes sense in the
end.  For now, this works just fine and I need to move on with the actual
goal of finishing package imports so that I can expand templates.

DEV-13162
2023-05-09 12:35:06 -04:00
Mike Gerwitz 48bcb0cdab tamer: asg: Integrate package CanonicalName
This change requires every package to have a canonical name, and performs
namespec canonicalization on imports.

Since all package names are canonicalized, this opens the door to being able
to index package names at import, allowing the object to be shared on the
graph and properly reference a package after it has been resolved.

Note that the system tests' canonicalization is relative to the hard-coded
`/TODO` presently; that will change in the near future once `tamec`
generates names from the provided path.

DEV-13162
2023-05-05 10:26:58 -04:00
Mike Gerwitz 670c5d3a5d tamer: asg::graph: Require name for non-imports
NOTE: This temporarily breaks `tameld`.  It is fixed in a future commit when
names are bound.  This was an oversight when breaking apart changes into
separate commits, because the linker does not yet have system tests like
tamec does.

This is preparing for a full transition to requiring a canonical package
name.  The previous `Unnamed` variant has been removed and `AirAggregate`
will provide a default `WS_EMPTY` name, as `Pkg` had done before.

The intent of this change is to allow for consulting the index before a
new `Pkg` object is created on the graph, but we're not quite ready for that
yet.

Well, that's not entirely true---the linker can be ready for that.  But the
compiler needs to canonicalize import paths relative to the active package
canonical name, which it can't even do yet because tamec isn't generating a
name.

So maybe the linker will be first; it's useful to have that in a separate
commit anyway to emphasize the change.

DEV-13162
2023-05-05 10:24:47 -04:00
Mike Gerwitz 9b53a5e176 tamer: asg::graph::visit::topo: Cut cycles
This commit includes plenty of documentation, so you should look there.

It's desirable to describe the sorting that TAME performs as a topological
sort, since that's the end result we want.  This uses the ontology to
determine what to do to the graph when a cycle is encountered.  So
technically we're sorting a graph with cycles, but you can equivalently view
this as first transforming the graph to cut all cycles and then sorting it.

For the sake of trivia, the term "cut" is used for two reasons: (1) it's an
intuitive visualization, and (2) the term "cut" has precedence in logic
programming (e.g. Prolog), where it (`!`) is used to prevent
backtracking.  We're also preventing backtracking, via a back edge, which
would produce a cycle.

DEV-13162
2023-04-28 14:33:48 -04:00
Mike Gerwitz c2c1434afe tamer: asg::graph::visit::topo: Cycle detection
This introduces cycle detection, but it does not yet filter ontologically
permitted cycles, which will be needed prior to utilizing this in `tameld`.

There's a considerable amount of documentation here.  While the
implementation is fairly simple, there are important algorithmic decisions,
both in the DFS construction and the derivation of the cycle path from data
that already exists.

This also supports recovery (by ignoring cycles), which can then be utilized
to find more cycles and other errors in the system.

DEV-13162
2023-04-27 16:28:57 -04:00
Mike Gerwitz e3094e0bad tamer: asg::graph::visit::topo: Introduce topological sort
This is an initial implementation that does not yet produce errors on
cycles.  Documentation is not yet complete.

The implementation is fairly basic, and similar to Petgraph's DFS.

A terminology note: the DFS will be ontology-aware (or at least aware of
edge metadata) to avoid traversing edges that would introduce cycles in
situations where they are permitted, which effectively performs a
topological sort on an implicitly _filtered_ graph.

This will end up replacing ld::xmle::lower::sort.

DEV-13162
2023-04-26 09:51:45 -04:00
Mike Gerwitz be05fbb833 tamer: asg::graph::visit{=>::ontree}: Move into submodule
This reorganization makes way for more traversals.

DEV-13162
2023-04-24 13:51:04 -04:00
Mike Gerwitz daa8c6967b tamer: asg: Initial nested template supported
I had hoped this would be considerably easier to implement, but there are
some confounding factors.

First of all: this accomplishes the initial task of getting nested template
applications and definitions re-output in the `xmli` file.  But to do so
successfully, some assumptions had to be made.

The primary issue is that of scope.  The old (XSLT-based) TAME relied on the
output JS to handle lexical scope for it at runtime in most situations.  In
the case of the template system, when scoping/shadowing were needed, complex
and buggy XPaths were used to make a best effort.  The equivalent here would
be a graph traversal, which is not ideal.

I had begun going down the rabbit hole of formalizing lexical scope for
TAMER with environments, but I want to get this committed and working first;
I've been holding onto this and breaking off changes for some time now.

DEV-13708
2023-04-05 15:46:44 -04:00
Mike Gerwitz 9c0e20e58c tamer: asg: Shorthand and long-form template arguments
This applies to template application only; there's still some work to do for
template parameters in definitions (well, for deriving them in `xmli` at
least).  And, as you can see, there's still a lot of TODO items here.

I ended up backtracking on tree edges to Meta, and even on cross edges to
Meta, because it complicated xmli derivation with no benefit right now;
maybe a cross edge will be re-added in the future, but I need to move on and
see where this takes me.

But, it works.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 893da0ed20 tamer: asg: Dynamically determined cross edges
Previous to this commit, ontological cross edges were declared
statically.  But this doesn't fare well with the decided implementation for
template application.

The documentation details it, but we have Tpl->Ident which could mean "I
define this Ident once expanded", or it could mean "this is a reference to a
template I will be applying".  The former is a tree edge, the latter is a
cross edge, and that determination can only be made by inspecting edge data
at runtime.

It could have been resolved by introducing new Object types, but that is a
lot of work for little benefit, especially given that only (right now) the
visitor uses this information.

DEV-13708
2023-03-29 12:58:34 -04:00
Mike Gerwitz 9e5958d89e tamer: asg::air::ir::Air: Open/Close => Start/End in token names
See the Air docblock for more information.  I'm introducing new tokens for
the template system, which uses the terms "free" and "closed".  I prefer
open/close for delimiters, as I've expressed elsewhere, but unfortunately it
conflicts too much (and too confusingly) with other standard terminology as
we get more into the formal side of the language.

DEV-13708
2023-03-15 10:59:25 -04:00
Mike Gerwitz 343f5b34b3 tamer: asg::air: Template support for dangling expressions
The intent was to have a very simple implementation of `hold_dangling` and
have everything work.  But, I had a nasty surprise when the system tests
caught bug caused by some interesting depth interactions as it relates to
`xmli` and auto-closing.

I added an extra test/example in `asg::graph::visit::test` to illustrate the
situation; it was difficult to derive from the traces, but trivially obvious
once I wrote it out as an example.

With that, templates can now aggregate tokens for dangling expressions.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz d99a8efbaf tamer: asg::air::ir: {ExprRef=>RefIdent}
This generalizes the IR, and relates the duals: identifying and referencing.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz b6d0569b99 tamer: asg::air: Expression parser
This delegates expression parsing to `AirExprAggregate`, in an effort to
both begin to simplify the understanding and maintenance of `AirAggregate`;
and allow for parser composition for template parsing.

This utilizes the prior changes for token sum types to precisely define the
subset of AIR tokens supported by the expression parser.  This differs from
prior approaches which delegated until a dead state, relying on runtime
information to determine if a parser has finished.  This allows us to
determine that statically.

I do want to be able to eliminate the dead state from the parser so we can
get rid of the `unreachable!`, but I need to move on; that's something I had
tried to do in the past too, which ended up adding a bit of complexity, and
I'll have to consider my options in the future, including whether the dead
state transition can be entirely eliminated in favor of the combination of
these sum types and recovery; the parsing framework decisions were made
while recovery was still an open question, at least in practice.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 08278bc867 tamer: asg::air::Air::{ExprIdent=>BindIdent}: Rename
I wasn't initially sure whether I'd want separate tokens for different types
of identifying operations, but now that I see that it is clear from the
current state of the parser, there's no need.

This matches the name of the token in NIR.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 3587d032c3 tamer: asg::graph::object::rel::DynObjectRel: Store source data
This is generic over the source, just as the target, defaulting just the
same to `ObjectIndex`.

This allows us to use only the edge information provided rather than having
to perform another lookup on the graph and then assert that we found the
correct edge.  In this case, we're dealing with an `Ident->Expr` edge, of
which there is only one, but in other cases, there may be many such edges,
and it wouldn't be possible to know _which_ was referred to without also
keeping context of the previous edge in the walk.

So, in addition to avoiding more indirection and being more immune to logic
bugs, this also allows us to avoid states in `AsgTreeToXirf` for the purpose
of tracking previous edges in the current path.  And it means that the tree
walk can seed further traversals in conjunction with it, if that is so
needed for deriving sources.

More cleanup will be needed, but this does well to set us up for moving
forward; I was too uncomfortable with having to do the separate
lookup.  This is also a more intuitive API.

But it does have the awkward effect that now I don't need the pair---I just
need the `Object`---but I'm not going to remove it because I suspect I may
need it in the future.  We'll see.

The TODO references the fact that I'm using a convenient `resolve_oi_pairs`
instead of resolving only the target first and then the source only in the
code path that needs it.  I'll want to verify that Rust will properly
optimize to avoid the source resolution in branches that do not need it.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz a5a5a99dbd tamer: asg::graph::visit::TreeWalkRel: New token type
This introduces a `Token` in place of the original tuple for
`TreePreOrderDfs` so that it can be used as input to a parser that will
lower into XIRF.

This requires that various things be describable (using `Display`), which
this also adds.  This is an example of where the parsing framework itself
enforces system observability by ensuring that every part of the system can
describe its state.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 7f3ce44481 tamer: asg::graph: Formalize dynamic relationships (edges)
The `TreePreOrderDfs` iterator needed to expose additional edge context to
the caller (specifically, the `Span`).  This was getting a bit messy, so
this consolodates everything into a new `DynObjectRel`, which also
emphasizes that it is in need of narrowing.

Packing everything up like that also allows us to return more information to
the caller without complicating the API, since the caller does not need to
be concerned with all of those values individually.

Depth is kept separate, since that is a property of the traversal and is not
stored on the graph.  (Rather, it _is_ a property of the graph, but it's not
calculated until traversal.  But, depth will also vary for a given node
because of cross edges, and so we cannot store any concrete depth on the
graph for a given node.  Not even a canonical one, because once we start
doing inlining and common subexpression elimination, there will be shared
edges that are _not_ cross edges (the node is conceptually part of _both_
trees).  Okay, enough of this rambling parenthetical.)

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 065dca88fc tamer: asg::graph::vist::tree_reconstruction: Include Depth
This information is necessary to be able to reconstruct the tree, since
the `ObjectIndex` alone does not give you enough information.  Even if you
inspected the graph, it _still_ wouldn't give you enough information, since
you don't know the current path of the traversal for nodes that may have
multiple incoming edges.  (Any assumptions you could make today won't
always be valid in the future.)

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz e6f736298b tamer: asg::graph::visit::tree_reconstruction: New graph traversal
This begins to introduce a graph traversal useful for a source
reconstruction from the current state of the ASG.  The idea is to, after
having parsed and ingested the source through the lowering pipeline, to
re-output it to (a) prove that we have parsed correctly and (b) allow
progressively moving things from the XSLT-based compiler into TAMER.

There's quite a bit of documentation here; see that for more
information.  Generalizing this in an appropriate way took some time, but I
think this makes sense (that work began with the introduction of cross edges
in terms of the tree described by the graph's ontology).  But I do need to
come up with an illustration to include in the documentation.

DEV-13708
2023-03-10 14:27:57 -05:00