Oh boy. What a mess of a change.
This demonstrates some significant issues we have with Symbol. I had
originally modelled the system a bit after Rustc's, but deviated in certain
regards:
1. This has a confurable base type to enable better packing without bit
twiddling and potentially unsafe tricks I'd rather avoid unless
necessary; and
2. The lifetime is not static, and there is no global, singleton interner;
and
3. I pass around references to a Symbol rather than passing around an
index into an interner.
For #3---this is done because there's no singleton interner and therefore
resolving a symbol requires a direct reference to an available interner. It
also wasn't clear to me (and still isn't, in fact) whether more than one
interner may be used for different contexts.
But, that doesn't preclude removing lifetimes and just passing around
indexes; in fact, I plan to do this in the frontend where the parser and
such will have direct interner access and can therefore just look up based
on a symbol index. We could reserve references for situations where
exposing an interner would be undesirable.
Anyway, more to come...
As mentioned in the previous commit, this flips the types such that the base
type if the primitive and the associated type is the `NonZero*` type; this
is much more natural, concise, and allows Rust to infer the proper type in
most every situation.
The next step will be to stop defaulting the index type for SymbolIndex and
related, since we are about to care very much what size it is (compiler
vs. linker).
This was previously a NonZeroU32, but it was intended to support NonZeroU16
as well for packages, so that we can fit symbols into smaller spaces. In
particular, the upcoming Span wants to fit within 8 bytes, and so requires a
smaller SymbolIndex type.
I'm unhappy with this current implementation, and so comments are unfinished
and there are a couple ignores for dead code warnings. I want to flip the
`SupportedSymbolIndex` trait so that users can specify the primitive rather
than the NonZero* type, which is really awkward-looking and verbose,
especially if you have to do `SymbolIndex::<NonZeroU32>::from_int` or
something. It also prevents (at least in the cases I've observed) Rust from
inferring the proper type for you based on the argument you provide.
So, the goal will be `SymbolIndex::<u32>::from_int(n)`, for example.
The first step in the process is to emit the raw XML events that can then be
immediately output again to echo the results into another file. This will
then allow us to begin parsing the input incrementally, and begin to morph
the output into a real `xmlo` file.
This introduces the beginnings of frontends for TAMER, gated behind a
`wip-features` flag.
This will be introduced in stages:
1. Replace the existing copy with a parser-based copy (echo back out the
tokens), when the flag is on.
2. Begin to parse portions of the source, augmenting the output xmlo (xmli
at the moment). The XSLT-based compiler will be modified to skip
compilation steps as necessary.
As portions of the compilation are implemented in TAMER, they'll be placed
behind their own feature flags and stabalized, which will incrementally
remove the compilation steps from the XSLT-based system. The result should
be substantial incremental performance improvements.
Short-term, the priorities are for loading identifiers into an IR
are (though the order may change):
1. Echo
2. Imports
3. Extern declarations.
4. Simple identifiers (e.g. param, const, template, etc).
5. Classifications.
6. Documentation expressions.
7. Calculation expressions.
8. Template applications.
9. Template definitions.
10. Inline templates.
After each of those are done, the resulting xmlo (xmli) will have fully
reconstructed the source document from the IR produced during parsing.
This was incorrect to begin with---it does not make sense that an input
mapping should depend upon the identifier that it maps to, in the sense that
we make use of these dependencies. If we add weak symbol references in the
future, then this can be reintroduced.
By removing this, we free tameld from having to perform the check itself.
.rev-xmlo bumped to force rebuilding of object files since the linker now
expects that no such dependencies will exist within them.
This is something that changed when the TAMER POC was initially created, as
I was learning Rust. I don't recall the original reason why this was moved,
but it could have been moved back long ago.
In our systems, constants can hold tables (as matrices) with tens or
hundreds of thousands of rows, and there are a number of them in certain
projects. As an example, the YAML-based test cases for one of our systems
went from ~2m30s to ~45s after this change was made. Much of the cost
savings comes from saving GC.
A previous commit used a rustdoc tool lint, but that support wasn't added
until 1.52.0 (2021-05-06).
Note that this represents the minimum _required_ version to build TAMER; you
can use a later version.
This checks explicitly for unresolved objects while sorting and provides an
explicit error for them. For example, this will catch externs that have no
concrete resolution.
This previously fell all the way through to the unreachable! block. The old
POC implementation was catching unresolved objects, albeit with a debug
error.
This will be used for the next commit, but this change has been isolated
both because it distracts from the implementation change in the next commit,
and because it cleans up the code by removing the need for a type parameter
on `AsgError`.
Note that the sort test cases now use `unwrap` instead of having
`{,Sortable}AsgError` support one or the other---this is because that does
not currently happen in practice, and there is not supposed to be a
hierarchy; they are siblings (though perhaps their name may imply otherwise).
The only reason this function was a method of `BaseAsg` was because of
`self.graph`, which is accessible within the scope of this
module. `check_cycles` is logically associated with `SortableAsg`, and so
should exist alongside it (though it can't exist as an associated function
of that trait).
We want to be able to build a representation of the dependency graph so
we can easily inspect it.
We do not want to make GraphML by default. It is better to use a tool.
We use "petgraph-graphml".
This was originally omitted because there wasn't a use case for it. Now
that we're adding context to errors, however, an owned value is highly
desirable.
This adds almost no measurable overhead to the internment system in
benchmarks (largely within the margin of error).
This is a union (sum type) of three other errors types, plus errors specific
to this builder.
This commit does a good job demonstrating the boilerplate, as well as a need
for additional context (in the case of `IdentKindError`), that we'll want to
work on abstracting away.
The `Debug` bound is inconvenient and requires propagation to any types that
use it. Further, it's really awkward having `Display` depend on `Debug`; if
we want to render a useful display here, we can write one.
To be clear: IndexType implements Debug.
For now, this is pretty-printed by another part of the code, which we don't
want to implement in `Display` because it requires looking things up from
the graph.
This flips the API from using XmloWriter as the context to using Asg and
consuming anything that can produce XmloResults. This not only makes more
sense, but avoids having to create a trait for XmloReader, and simplifies
the trait bounds we have to concern ourselves with.
This just tidies things up a little bit before I get into some further
refactoring. I wrote the original code when I was just learning Rust not
too long ago, so it's interesting to see how my understanding has changed
over that relatively short period of time.
This abstracts away the canonicalizer and solves the problem whereby
canonicalization was not being performed prior to recording whether a path
has been visited. This ensures that multiple relative paths to the same
file will be properly recognized as visited.
This will be entirely replaced in an upcoming commit. See that for
details. I don't feel like dealing with the conflicts for rearranging and
squashing these commits.
This also includes an implementation to visit paths only once. Note that it
does not yet canonicalize the path before visiting, so relative paths to the
same file can slip through, and relative paths to _different_ files could be
erroneously considered to have been visited.
This will be fixed in an upcoming commit.
This serves as a constructor for the time being, decoupling from POC. We
may do something better once we have a better idea of how the various
abstractions around this will evolve.
Add a stub executable that will eventually become a full-featured TAME
compiler. The first implementation will only copy the source file to an
intermediary file that will be compiled by the XSLT compiler.