Commit Graph

136 Commits (fc569f7551c949c822549f92df07279d54fdcea1)

Author SHA1 Message Date
Mike Gerwitz 1c7df894ea tamer: asg::graph: *lookup{=>_global}*
Identifier lookups, as done using the graph methods today, look up from a
cache representing the global environment.

Templates must not contribute to this environment until expansion.  Further,
metavariables will not be present in this environment.  To avoid confusion
and help obviate accidental contributions to this environment, the methods
have been renamed.  This will also allow for the creation of more general
methods down the line.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz be81878dd7 tamer: src::asg: Scaffolding for metasyntactic variables
Also known as metavariables or template parameters.

This is a bit of a tortured excursion, trying to figure out how I want to
best represent this.  I have a number of pages of hand-written notes that
I'd like to distill over time, but the rendered graph ontology (via
`asg-ontviz`) demonstrates the broad idea.

`AirTpl::TplApply` highlights some remaining questions.  What I had _wanted_
to do is to separate the concepts of application and expansion, and support
partial application and such.  But it's going to be too much work for now,
when it isn't needed---partial application can be worked around by simply
creating new templates and duplicating params, as we do today, although that
sucks and is a maintenance issue.  But I'd rather address that head-on in
the future.

So it's looking like Option B is going to be the approach for now, with
templates being closed (as in, no free metavariables) and expanded at the
same time.  This simplifies the parser and error conditions significantly
and makes it easier to utilize anonymous templates, since it'll still be the
active context.

My intent is to get at least the graph construction sorted out---not the
actual expansion and binding yet---enough that I can use templates to
represent parts of NIR that do not have proper graph representations or
desugaring yet, so that I can spit them back out again in the `xmli` file
and incrementally handle them.  That was an option I had considered some
months ago, but didn't want to entertain it at the time because I wasn't
sure what doing so would look like; while it was an attractive approach
since it pushes existing primitives into the template system (something I've
wanted to do for years), I didn't want to potentially tank performance or
compromise the design for it after I had spent so much effort on all of this
so far.

But my efforts have yielded a system that significantly exceeds my initial
performance expectations, with a decent abstractions, and so this seems
viable.

DEV-13708
2023-03-15 16:40:07 -04:00
Mike Gerwitz 454b91dfce tamer: asg::graph::object: New Tpl object
There's quite a bit of boilerplate here that'll eventually need factoring
out.  But it's also clear that it is somewhat onerous to add new object
types.

Note that a good chunk of this burden is _intentional_, via exhaustiveness
checks---adding a new type of object is an exceptional occurrence (well, in
principle, but we haven't added them all yet, so it'll be more common
initially), and we'd rather be safe to ensure that everything is properly
considering how that new type of object interacts with it.

Let's not confuse coupling with safety---the latter causes a burden because
of the former, not because of itself; it provides a service to us.

But, nonetheless, we'll want to reduce this burden somewhat since there are
a number more to add.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 6db70385d0 tamer: xir::flat: Introduce configurable acceptors
Technically, an "acceptor" in the context of state machines is actually a
state machine; the terminology here is more describing the configuration of
the state machine (`XirToXirf`) as an acceptor.

This change comes with significant documentation of the rationale and why
this is important; see that for more information.

This change is necessary so that we can enforce finalization on all parsers
in the lowering pipeline, which is not currently being done.  If we were to
do that now, then `tameld` would fail because it halts parsing of the tokens
stream at the end of the `xmlo` header.

This is also quite the type soup, but I'm not going to refine this further
right now, since my focus is elsewhere (XMLI lowering).

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 29178f2360 tamer: xir::reader: Divorce from `parse`
The reader previously yielded a `ParsedResult`, presumably to simplify
lowering operations.  But the reader is not a `ParseState`, and does not
otherwise use the parsing API, so this was an inappropriate and confusing
coupling.

This resolves that, introducing a new `lowerable` which will translate an
iterator into something that can be placed in a lowering pipeline.

See the previous commit for more information.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 963688f889 tamer: parse::lower::ParsedObject: Include Token type parameter
The token type was previously hard-coded to `UnknownToken`, since the use
case was the beginning of the lowering pipeline at the start of the program,
where there was no token type because the first parser (`XirReader`,
currently) is responsible for producing the first token type.

But when we're lowering from the graph (so, the other side of the lowering
pipeline), we _do_ have token types to deal with.

This also emphasizes the inappropriate coupling of `<XirReader as
Iterator>::Item` with `ParsedResult`; I'd like to follow the same approach
that I'm about to introduce with `tamec`, so see a future commit.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 2d3b27ac01 tamer: asg: Root package definition
This causes a package definition to be rooted (so that it can be easily
accessed for a graph walk).  This keeps consistent with the new
`ObjectIndex`-based API by introducing a unit `Root` `ObjectKind` and the
boilerplate that goes with it.

This boilerplate, now glaringly obvious, will be refactored at some point,
since its repetition is onerous and distracting.

DEV-13159
2023-02-01 10:34:17 -05:00
Mike Gerwitz 39d093525c tamer: nir, asg: Introduce package to ASG
This does not yet create edges from identifiers to the package; just getting
this introduced was quite a bit of work, so I want to get this committed.

Note that this also includes a change to NIR so that `Close` contains the
entity so that we can pattern-match for AIR transformations rather than
retaining yet another stack with checks that are already going to be done by
AIR.  This makes NIR stand less on its own from a self-validation point, but
that's okay, given that it's the language that the user entered and,
conceptually, they could enter invalid NIR the same as they enter invalid
XML (e.g. from a REPL).

In _practice_, of course, NIR is lowered from XML and the schema is enforced
during that lowering and so the validation does exist as part of that
parsing.

These concessions speak more to the verbosity of the language (Rust) than
anything.

DEV-13159
2023-02-01 10:34:16 -05:00
Mike Gerwitz 055ff4a9d9 tamer: Remove graphml target
This was originally created to populate Neo4J for querying, but it has not
been utilized.  It's become a maintenance burden as I try to change the API
of and encapsulate the graph, which is important for upholding its
invariants.

This feature, or one like it, will return in the future.  I have other
related plans; we'll see if they materialize.

The graph can't be encapsulated fully just yet because of the linker; those
commits will come in the following days.

DEV-13597
2023-01-26 14:45:17 -05:00
Mike Gerwitz 954b5a2795 Copyright year and name update
Ryan Specialty Group (RSG) rebranded to Ryan Specialty after its IPO.
2023-01-20 23:37:30 -05:00
Mike Gerwitz 378fe3db66 tamer: asg::Asg::lookup: SymbolId=>SPair
This seems to have been an oversight from when I recently introduced SPairs
to ASG; I noticed it while working on another change and receiving back a
`DUMMY_SPAN`.

DEV-13597
2023-01-20 23:37:29 -05:00
Mike Gerwitz 554bb81a63 tamer: asg::ident: Introduce distinction between opaque and transparent
`Ident` is now `Opaque`, but the new `Transparent` state isn't actually used
yet in any transitions; that'll come next.

The original (now "opaque") identifiers were added for the linker, which
does not need (at present) the associated expressions, since they've already
been compiled.  In the future I'd like to do LTO (link-time optimization),
and then the graph will need more information.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz e6640c0019 tamer: Integrate clippy
This invokes clippy as part of `make check` now, which I had previously
avoided doing (I'll elaborate on that below).

This commit represents the changes needed to resolve all the warnings
presented by clippy.  Many changes have been made where I find the lints to
be useful and agreeable, but there are a number of lints, rationalized in
`src/lib.rs`, where I found the lints to be disagreeable.  I have provided
rationale, primarily for those wondering why I desire to deviate from the
default lints, though it does feel backward to rationalize why certain lints
ought to be applied (the reverse should be true).

With that said, this did catch some legitimage issues, and it was also
helpful in getting some older code up-to-date with new language additions
that perhaps I used in new code but hadn't gone back and updated old code
for.  My goal was to get clippy working without errors so that, in the
future, when others get into TAMER and are still getting used to Rust,
clippy is able to help guide them in the right direction.

One of the reasons I went without clippy for so long (though I admittedly
forgot I wasn't using it for a period of time) was because there were a
number of suggestions that I found disagreeable, and I didn't take the time
to go through them and determine what I wanted to follow.  Furthermore, it
was hard to make that judgment when I was new to the language and lacked
the necessary experience to do so.

One thing I would like to comment further on is the use of `format!` with
`expect`, which is also what the diagnostic system convenience methods
do (which clippy does not cover).  Because of all the work I've done trying
to understand Rust and looking at disassemblies and seeing what it
optimizes, I falsely assumed that Rust would convert such things into
conditionals in my otherwise-pure code...but apparently that's not the case,
when `format!` is involved.

I noticed that, after making the suggested fix with `get_ident`, Rust
proceeded to then inline it into each call site and then apply further
optimizations.  It was also previously invoking the thread lock (for the
interner) unconditionally and invoking the `Display` implementation.  That
is not at all what I intended for, despite knowing the eager semantics of
function calls in Rust.

Anyway, possibly more to come on that, I'm just tired of typing and need to
move on.  I'll be returning to investigate further diagnostic messages soon.
2023-01-20 23:37:29 -05:00
Mike Gerwitz 5e13c93a8f tamer: asg: New ObjectContainer for Node type
Working with the graph can be confusing with all of the layers
involved.  This begins to provide a better layer of abstraction that can
encapsulate the concept and enforce invariants.

Since I'm better able to enforce invariants now, this also removes the span
from the diagnostic message, since the invariant is now always enforced with
certainty.  I'm not removing the runtime panic, though; we can revisit that
if future profiling shows that it makes a negative impact.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz 0863536149 tamer: asg::Asg::get: Narrow object type
This uses `ObjectIndex` to automatically narrow the type to what is
expected.

Given that `ObjectIndex` is supposed to mean that there must be an object
with that index, perhaps the next step is to remove the `Option` from `get`
as well.

DEV-13160
2022-12-22 16:32:21 -05:00
Mike Gerwitz 6e90867212 tamer: asg::object::Object{Ref=>Index}: Associate object type
This makes the system a bit more ergonomic and introduces additional type
safety by associating the narrowed object type with the
`ObjectIndex` (previously `ObjectRef`).  Not only does this allow us to
explicitly state the type of object wherever those indices are stored, but
it also allows the API to automatically narrow to that type when operating
on it again without the caller having to worry about it.

DEV-13160
2022-12-22 15:18:08 -05:00
Mike Gerwitz 646633883f tamer: Initial concept for AIR/ASG Expr
This begins to place expressions on the graph---something that I've been
thinking about for a couple of years now, so it's interesting to finally be
doing it.

This is going to evolve; I want to get some things committed so that it's
clear how I'm moving forward.  The ASG makes things a bit awkward for a
number of reasons:

  1. I'm dealing with older code where I had a different model of doing
       things;
  2. It's mutable, rather than the mostly-functional lowering pipeline;
  3. We're dealing with an aggregate ever-evolving blob of data (the graph)
       rather than a stream of tokens; and
  4. We don't have as many type guarantees.

I've shown with the lowering pipeline that I'm able to take a mutable
reference and convert it into something that's both functional and
performant, where I remove it from its container (an `Option`), create a new
version of it, and place it back.  Rust is able to optimize away the memcpys
and such and just directly manipulate the underlying value, which is often a
register with all of the inlining.

_But_ this is a different scenario now.  The lowering pipeline has a narrow
context.  The graph has to keep hitting memory.  So we'll see how this
goes.  But it's most important to get this working and measure how it
performs; I'm not trying to prematurely optimize.  My attempts right now are
for the way that I wish to develop.

Speaking to #4 above, it also sucks that I'm not able to type the
relationships between nodes on the graph.  Rather, it's not that I _can't_,
but a project to created a typed graph library is beyond the scope of this
work and would take far too much time.  I'll leave that to a personal,
non-work project.  Instead, I'm going to have to narrow the type any time
the graph is accessed.  And while that sucks, I'm going to do my best to
encapsulate those details to make it as seamless as possible API-wise.  The
performance hit of performing the narrowing I'm hoping will be very small
relative to all the business logic going on (a single cache miss is bound to
be far more expensive than many narrowings which are just integer
comparisons and branching)...but we'll see.  Introducing branching sucks,
but branch prediction is pretty damn good in modern CPUs.

DEV-13160
2022-12-22 14:33:28 -05:00
Mike Gerwitz 8c4923274a tamer: ld::xmle::lower: Diagnostic message for cycles
This moves the special handling of circular dependencies out of
`poc.rs`---and to be clear, everything needs to be moved out of there---and
into the source of the error.  The diagnostic system did not exist at the
time.

This is one example of how easy it will be to create robust diagnostics once
we have the spans on the graph.  Once the spans resolve to the proper source
locations rather than the `xmlo` file, it'll Just Work.

It is worth noting, though, that this detection and error will ultimately
need to be moved so that it can occur when performing other operation on the
graph during compilation, such as type inference and unification.  I don't
expect to go out of my way to detect cycles, though, since the linker will.

DEV-13430
2022-12-16 15:09:05 -05:00
Mike Gerwitz 0b2e563cdb tamer: asg: Associate spans with identifiers and introduce diagnostics
This ASG implementation is a refactored form of original code from the
proof-of-concept linker, which was well before the span and diagnostic
implementations, and well before I knew for certain how I was going to solve
that problem.

This was quite the pain in the ass, but introduces spans to the AIR tokens
and graph so that we always have useful diagnostic information.  With that
said, there are some important things to note:

  1. Linker spans will originate from the `xmlo` files until we persist
     spans to those object files during `tamec`'s compilation.  But it's
     better than nothing.
  2. Some additional refactoring is still needed for consistency, e.g. use
     of `SPair`.
  3. This is just a preliminary introduction.  More refactoring will come as
     tamec is continued.

DEV-13041
2022-12-16 14:44:38 -05:00
Mike Gerwitz 56d1ecf0a3 tamer: Air{Token=>}
Consistency with `Nir` et al.

DEV-13430
2022-12-13 14:36:38 -05:00
Mike Gerwitz 7c4c0ebdda tamer: parse::lower: Separate error types for lowering and return
Lowering errors in tamec end up utilizing recovery and reporting, so there
is a distinction between recoverable and unrecoverable errors.

tameld aborts on the first error, since recovery is not currently
supported (we'll want to add it, since tameld should output e.g. lists of
unresolved externs).

Note that tamec does not yet handle `FinalizeError` like tameld because it
uses `Lower::lower`, which does not yet finalize (though it does in practice
when it reaches the end of the stream and auto-finalizes, but that is
widened into a `ParseError`).

DEV-13158
2022-10-26 12:44:20 -04:00
Mike Gerwitz 1c181fe546 tamer: parse::lower: Propagate widened errors to terminal parser
The term "terminal parser" isn't formalized yet in the system, but is meant
to refer to the innermost parser that is responsible for pulling tokens
through the lowering pipeline.

This approach is more of what one would expect when dealing with
`Result`-like monads---we are effectively chaining the inner operation while
propagating errors to short-circuit lowering and let the caller decide
whether recovery ought to be permitted with diagnostic messages.  This will
become more clear as it is further refactored.

This also means that the previous changes for introducing interior
mutability for a shared mutable `Reporter` can be reverted, which is great,
since that approach was antithetical to how the streaming pipeline
operates (and introduces awkward mutable state into an
otherwise-mostly-immutable system).

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz 65b42022f0 tamer: xir::st: Prefix all preproc-namespaced constants with `QN_P_`
I had previously avoided this to keep names more concise, but now it's
ambiguous with parsing actual TAME sources.

DEV-7145
2022-08-15 13:00:10 -04:00
Mike Gerwitz 7a5f731cac tamer: tameld: XIRF nesting 64=>4
Since we'll never be reading past the header, this is all that is needed.

If in the future this is violated, XIRF will cause a nice diagnostic error
displaying precisely what opening tag caused the increased level of nesting,
which will aid in debugging and allow us to determine if it ought to be
increased.  Here's an example, if I set the max to `3`:

  error: maximum XML element nesting depth of `3` exceeded
     --> /home/.../foo.xmlo:261:10
      |
  261 |          <preproc:sym-ref name=":_vproduct:vector_a"/>
      |          ^^^^^^^^^^^^^^^^ error: this opening tag increases the level of nesting past the limit of 3

Of course, the longer-term goal is to do away with `xmlo` entirely.

This had no (perceivable via `/usr/bin/time -v`, at least) impact on memory
or CPU time.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 41b41e02c1 tamer: Xirf::Text refinement
This teaches XIRF to optionally refine Text into RefinedText, which
determines whether the given SymbolId represents entirely whitespace.

This is something I've been putting off for some time, but now that I'm
parsing source language for NIR, it is necessary, in that we can only permit
whitespace Text nodes in certain contexts.

The idea is to capture the most common whitespace as preinterned
symbols.  Note that this heuristic ought to be determined from scanning a
codebase, which I haven't done yet; this is just an initial list.

The fallback is to look up the string associated with the SymbolId and
perform a linear scan, aborting on the first non-whitespace character.  This
combination of checks should be sufficiently performant for now considering
that this is only being run on source files, which really are not all that
large.  (They become large when template-expanded.)  I'll optimize further
if I notice it show up during profiling.

This also frees XIR itself from being concerned by Whitespace.  Initially I
had used quick-xml's whitespace trimming, but it messed up my span
calculations, and those were a pain in the ass to implement to begin with,
since I had to resort to pointer arithmetic.  I'd rather avoid tweaking it.

tameld will not check for whitespace, since it's not important---xmlo files,
if malformed, are the fault of the compiler; we can ignore text nodes except
in the context of code fragments, where they are never whitespace (unless
that's also a compiler bug).

Onward and yonward.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz c671bf6a9c tamer: xir: Introduce {Ele,Open,Close}Span
This isn't conceptally all that significant of a change, but there was a lot
of modify to get it working.  I would generally separate this into a commit
for the implementation and another commit for the integration, but I decided
to keep things together.

This serves a role similar to AttrSpan---this allows deriving a span
representing the element name from a span representing the entire XIR
token.  This will provide more useful context for errors---including the tag
delimiter(s) means that we care about the fact that an element is in that
position (as opposed to some other type of node) within the context of an
error.  However, if we are expecting an element but take issue with the
element name itself, we want to place emphasis on that instead.

This also starts to consider the issue of span contexts---a blob of detached
data that is `Span` is useful for error context, but it's not useful for
manipulation or deriving additional information.  For that, we need to
encode additional context, and this is an attempt at that.

I am interested in the concept of providing Spans that are guaranteed to
actually make sense---that are instantiated and manipulated with APIs that
ensure consistency.  But such a thing buys us very little, practically
speaking, over what I have now for TAMER, and so I don't expect to actually
implement that for this project; I'll leave that for a personal
project.  TAMER's already take a lot of my personal interests and it can
cause me a lot of grief sometimes (with regards to letting my aspirations
cause me more work).

DEV-7145
2022-06-24 14:16:29 -04:00
Mike Gerwitz 2b8e7e6031 tamer: xir::st::qname: New module
This moves and deduplicates the static `QName`s into a common area.

DEV-7145
2022-06-06 11:31:27 -04:00
Mike Gerwitz 3da82b351e tamer: xir::flat::{State=>XirToXirf}: Rename
Like the previous two commits, this states the intent of this parser, which
results in more clear pipeline composition.

DEV-7145
2022-06-02 13:48:54 -04:00
Mike Gerwitz 91b55999e2 tamer: asg::air::{AirState=>AirAggregate}: Rename
Like the previous commit, this emphasizes what is happening.

DEV-7145
2022-06-02 13:26:46 -04:00
Mike Gerwitz 45bbf3879e tamer: obj::xmlo::{lower=>air}: Rename {LowerState=>XmloToAir}
This provides much more clarity as to what is going on.  Further, it's less
ambiguous, since I'm about to introduce a new type of xmlo lowering into XIR
for writing the actual xmlo files.

DEV-7145
2022-06-02 13:23:41 -04:00
Mike Gerwitz 8d92667388 tamer: Integrate xir::reader as a parser in the lowering pipeline
This allows `XmlXirReader` to be used in a `Lower` operation, just as
everything else, bringing me one step closer to a pipeline that can be
concisely represented; this is finally beginning to unify in a clear way,
though it is still a bit of a mess.

This causes `XmlXirReader` to _act_ like a `parse::Parser` in that it yields
a `ParsedResult`, but it does not use `parse::Parser` itself; that was the
_original_ plan: convert it into a `ParseState` where `XmlXirReader` became
a context, and force `Parser` to yield by feeding it a stream of tokens with
`repeat`, but that ended up performing poorly relative to this change.  I
did some investigation, which I might write about in the future, but for
now, this solution works just fine.

DEV-7145
2022-06-02 10:30:44 -04:00
Mike Gerwitz 63aa452197 tamer: parse: Move parse::lower into Lower
This also modifies `poc` such that `Lower` is invoked as an associated
function rather than a method to emphasize the pattern that is forming, so
that it can be later abstracted away.

DEV-11864
2022-06-01 11:15:43 -04:00
Mike Gerwitz f40f8bbafc tamer: parse: Rename {lower_*_while_ok=>lower_*}
The `while_ok` can just be implied with a lowering operation, and that
reduces the name complexity so that we can maybe introduce even more
specialized methods without resulting in a huge sentence as a name.

DEV-11864
2022-05-27 14:10:55 -04:00
Mike Gerwitz b084e23497 tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`.  This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate.  I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore.  I don't like making large changes such
as this one.

There is still work to do here.  First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.

Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`.  The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.

Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well.  Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.

That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together.  Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.

DEV-11864
2022-05-27 13:51:29 -04:00
Mike Gerwitz f218c452b9 tamer: iter::trip: Flatten Result
The `*_iter_while_ok` functions now compose like monads, flattening `Result`
at each step and drastically simplifying handling of error types.  This also
removes the bunch of `?`s at the end of the expression, and allows me to use
`?` within the callback itself.

I had originally not used `Result` as the return type of the callback
because I was not entirely sure how I was going to use them, but it's now
clear that I _always_ use `Result` as the return type, and so there's no use
in trying to be too accommodating; it can always change in the future.

This is desirable not just for cleanup, but because trying to refactor
`asg_builder` into a pair of `Parser`s is really messy to chain without
flattening, especially given some state that has to leak temporarily to the
caller.  More on that in a future commit.

DEV-11864
2022-05-20 16:08:16 -04:00
Mike Gerwitz 958a707e02 tamer: asg: Hoist Root from Ident into Object
This was always the intent, but I didn't have a higher-level object
yet.  This removes all the awkwardness that existed with working the root
in as an identifier.

DEV-11864
2022-05-19 12:48:43 -04:00
Mike Gerwitz 6252758730 tamer: asg::Object: Introduce Object::Ident
This wraps `Ident` in a new `Object` variant and modifies `Asg` so that its
nodes are of type `Object`.

This unfortunately requires runtime type checking.  Whether or not that's
worth alleviating in the future depends on a lot of different things, since
it'll require my own graph implementation, and I have to focus on other
things right now.  Maybe it'll be worth it in the future.

Note that this also gets rid of some doc examples that simply aren't worth
maintaining as the API evolves.

DEV-11864
2022-05-19 12:33:59 -04:00
Mike Gerwitz ebf1de5a60 tamer: asg::Ident{Object=>}: Rename
I think this may have been renamed _from_ `Ident` some time ago, but I'm too
lazy to check.  In any case, the name is redundant.

DEV-11864
2022-05-19 11:17:04 -04:00
Mike Gerwitz 07d2ec1ffb tamer: Move Dim and {Sym=>}Dtype into num module
A previous commit mentioned that there's not a place for `Dim`, and
duplicated it between `asg` and `xmlo`.  Well, `Dtype` is also needed in
both, and so here's a home for now.

`Dtype` has always been an inappropriate detail for the system and will one
day be removed entirely in favor of higher-level types; the machine
representation is up to the compiler to decide.

DEV-11864
2022-05-19 10:39:21 -04:00
Mike Gerwitz 8948452b71 tamer: asg::ident::Dim: Narrow type
This matches xmlo::Dim, and could be the same thing, if we can find a home
for it in the future; it's not worth creating such a home right now when I'm
not yet sure what else ought to live there; the duplication may be fine.

The conversion from xmlo needs to be moved, and `Dim` is going to be used
for more than just identifiers (expressions will have type inference
performed).

DEV-11864
2022-05-19 09:32:43 -04:00
Mike Gerwitz 3e277270a7 tamer: asg: Track roots on graph
Previously, since the graph contained only identifiers, discovered roots
were stored in a separate vector and exposed to the caller.  This not only
leaked details, but added complexity; this was left over from the
refactoring of the proof-of-concept linker some time ago.

This moves the root management into the ASG itself, mostly, with one item
being left over for now in the asg_builder (eligibility classifications).

There are two roots that were added automatically:

  - __yield
  - __worksheet

The former has been removed and is now expected to be explicitly mapped in
the return map, which is now enforced with an extern in `core/base`.  This
is still special, in the sense that it is explicitly referenced by the
generated code, but there's nothing inherently special about it and I'll
continue to generalize it into oblivion in the future, such that the final
yield is just a convention.

`__worksheet` is the only symbol of type `IdentKind::Worksheet`, and so that
was generalized just as the meta and map entries were.

The goal in the future will be to have this more under the control of the
source language, and to consolodate individual roots under packages, so that
the _actual_ roots are few.

As far as the actual ASG goes: this introduces a single root node that is
used as the sole reference for reachability analysis and topological
sorting.  The edges of that root node replace the vector that was removed.

DEV-11864
2022-05-17 10:42:05 -04:00
Mike Gerwitz 34eb994a0d tamer: asg::Asg::set_fragment: {ObjectRef=>SymbolId}
In the actual implementation (outside of tests), this is always looking up
before adding the symbol.  This will simplify the API, while still retaining
errors, since the identifier will fail the state transition if the
identifier did not exist before attempting to set a fragment.  So while this
is slower in microbenchmarks, this has no effect on real-world performance.

Further, I'm refactoring toward a streaming ASG aggregation, which is a lot
easier if we do not need to perform lookups in a separate step from the
ASG's primitives.

DEV-11864
2022-05-16 13:14:27 -04:00
Mike Gerwitz d87006391e tamer: asg::object: Remove IdentObjectState, IdentObjectData
These traits are no longer necessary now that I'm using concrete types; they
just add unnecessary noise and confusion as I attempt to further refactor.

Don't abstract prematurely.

DEV-11864
2022-05-12 16:31:36 -04:00
Mike Gerwitz 3748762d31 tamer: asg::graph::Asg: Remove type parameter O
This removes the generic on the Asg (which was formerly BaseAsg),
hard-coding `IdentObject`, which will further evolve.  This makes the IR an
actual concrete IR rather than an abstract data structure.

These tests bring me back a bit, since they were written as I was still
becoming familiar with Rust.

DEV-11864
2022-05-12 15:46:17 -04:00
Mike Gerwitz f2c5443176 tamer: asg: Remove generic Asg, rename {Base=>}Asg
This is the beginning of an incremental refactoring to remove generics, to
simplify the ASG.  When I initially wrote the linker, I wasn't sure what
direction I was going in, but I was also negatively influenced by more
traditional approaches to both design and unit testing.

If we're going to call the ASG an IR, then it needs to be one---if the core
of the IR is generic, then it's more like an abstract data structure than
anything.  We can abstract around the IR to slice it up into components that
are a little easier to reason about and understand how responsibilities are
segregated.

DEV-11864
2022-05-11 16:47:13 -04:00
Mike Gerwitz 1ad2fb1dc8 Copyright year update 2022
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
2022-05-03 14:14:29 -04:00
Mike Gerwitz eaa8133d21 tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve.  I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.

This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does.  The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them.  I need to
balance this work with everything else I have going on.

This is a large commit, but it converts the existing Error Display impls
into Diagnostic.  This separation is a bit verbose, so I'll see how this
ends up evolving.

Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.

Output is integrated into tameld only in this commit; I'll add tamec
next.  Examples of what this outputs are available in the test cases in this
commit.

DEV-10935
2022-04-13 15:22:46 -04:00
Mike Gerwitz cfc7f45bc4 tamer: Remove wip-xmlo-xir-reader
This entirely removes the old XmloReader that has since been replaced with a
XIR-based reader.

I had been holding off on this because the new reader is slower, pending
performance optimizations (which I'll do a little later on), however the
performance loss is of no practical consideration and only affects the
linker, which is still fast.

Therefore, it's better to get this old code out of the way to simplify
refactoring going forward.  In particular, I'm working on the diagnostic
system.

This is a little sad, in a way---this is some of my first Rust code that I'm
deleting.

DEV-10935
2022-04-11 16:11:49 -04:00
Mike Gerwitz f07c0e75be tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary.  I've been wanting to do this for a long time,
so it's nice seeing this come together.  This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them.  This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.

This just maintains compatibility with the dynamic dispatch that was
previous occurring.  This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.

The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.

More to come.

DEV-10935
2022-04-11 15:15:04 -04:00
Mike Gerwitz a1a4ad3e8e tamer: Introduce context into XirReader
tamec and tameld will now both introduce a `Context` to XIR, which will use
it to create spans.

Here's an example of an error, now that it's all working well together:

  $ target/release/tameld --emit xmle -o /dev/null path/to/package.xmlo
  error: invalid preproc:sym/@dim `9` at [/../path/to/package.xmlo offset 1175451-1175452]

A future task will make this human-readable by producing line and column
numbers, and perhaps even a snippet (if not now, then eventually).

It's exciting to see this coming together finally.

DEV-10934
2022-04-08 16:16:23 -04:00