tame/tamer
Mike Gerwitz 6d39474127 tamer: NIR re-simplification
Alright, this has been a rather tortured experience.  The previous commit
began to state what is going on.

This is reversing a lot of prior work, with the benefit of
hindsight.  Little bit of history, for the people who will probably never
read this, but who knows:

As noted at the top of NIR, I've long wanted a very simple set of general
primitives where all desugaring is done by the template system---TAME is a
metalanguage after all.  Therefore, I never intended on having any explicit
desugaring operations.

But I didn't have time to augment the template system to support parsing on
attribute strings (nor am I sure if I want to do such a thing), so it became
clear that interpolation would be a pass in the compiler.  Which led me to
the idea of a desugaring pass.

That in turn spiraled into representing the status of whether NIR was
desugared, and separating primitives, etc., which led to a lot of additional
complexity.  The idea was to have a Sugared and Plain NIR, and further within
them have symbols that have latent types---if they require interpolation,
then those types would be deferred until after template expansion.

The obvious problem there is that now:

  1. NIR has the complexity of various types; and
  2. Types were tightly coupled with NIR and how it was defined in terms of
     XML destructuring.

The first attempt at this didn't go well: it was clear that the symbol types
would make mapping from Sugared to Plain NIR very complicated.  Further,
since NIR had any number of symbols per Sugared NIR token, interpolation was
a pain in the ass.

So that led to the idea of interpolating at the _attribute_ level.  That
seemed to be going well at first, until I realized that the token stream of
the attribute parser does not match that of the element parser, and so that
general solution fell apart.  It wouldn't have been great anyway, since then
interpolation was _also_ coupled to the destructuring of the document.

Another goal of mine has been to decouple TAME from XML.  Not because I want
to move away from XML (if I did, I'd want S-expressions, not YAML, but I
don't think the team would go for that).  This decoupling would allow the
use of a subset of the syntax of TAME in other places, like CSVMs and YAML
test cases, for example, if appropriate.

This approach makes sense: the grammar of TAME isn't XML, it's _embedded
within_ XML.  The XML layer has to be stripped to expose it.

And so that's what NIR is now evolving into---the stripped, bare
representation of TAME's language.  That also has other benefits down the
line, like a REPL where you can use any number of syntaxes.  I intend for
NIR to be stack-based, which I'd find to be intuitive for manipulating and
querying packages, but it could have any number of grammars, including
Prolog-like for expressing Horn clauses and querying with a
Prolog/Datalog-like syntax.  But that's for the future...

The next issue is that of attribute types.  If we have a better language for
NIR, then the types can be associated with the NIR tokens, rather than
having to associate each symbol with raw type data, which doesn't make a
whole lot of sense.  That also allows for AIR to better infer types and
determine what they ought to be, and further makes checking types after
template application natural, since it's not part of NIR at all.  It also
means the template system can naturally apply to any sources.

Now, if we take that final step further, and make attributes streaming
instead of aggregating, we're back to a streaming pipeline where all
aggregation takes place on the ASG (which also resolves the memcpy concerns
worked around previously, also further simplifying `ele_parse` again, though
it sucks that I wasted that time).  And, without the symbol types getting
in the way, since now NIR has types more fundamentally associated with
tokens, we're able to interpolate on a token stream using simple SPairs,
like I always hoped (and reverted back to in the previous commit).
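
To make the idea of interpolating on a token stream concrete, here is a
minimal sketch.  Everything in it is hypothetical and is not TAMER's actual
code: the `Tok` type, its variants, and the `fragments` and `interpolate`
functions are invented for illustration, and the `{...}` delimiter is an
assumption about what an interpolated attribute string looks like.  It only
shows that a flat stream of key/value pairs can be expanded in place,
without aggregating attributes first.

```rust
/// Hypothetical stand-in for a NIR-like token; not TAMER's real type.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Tok {
    /// A streamed key/value pair (loosely analogous to an SPair).
    Pair(String, String),
    /// Marks the start of an expanded (interpolated) value for a key.
    InterpOpen(String),
    /// A literal fragment of an interpolated value.
    Text(String),
    /// A reference found between `{` and `}` (assumed delimiters).
    Ref(String),
    /// Marks the end of an expanded value.
    InterpClose,
}

/// Split a value into literal and reference fragments.
/// An unterminated `{` is kept as literal text.
fn fragments(value: &str) -> Vec<Tok> {
    let mut out = Vec::new();
    let mut rest = value;

    while let Some(open) = rest.find('{') {
        if open > 0 {
            out.push(Tok::Text(rest[..open].to_string()));
        }

        let after = &rest[open + 1..];
        match after.find('}') {
            Some(close) => {
                out.push(Tok::Ref(after[..close].to_string()));
                rest = &after[close + 1..];
            }
            None => {
                out.push(Tok::Text(rest[open..].to_string()));
                rest = "";
            }
        }
    }

    if !rest.is_empty() {
        out.push(Tok::Text(rest.to_string()));
    }

    out
}

/// Expand pairs that need interpolation; pass every other token through.
/// Only one token is inspected at a time, so nothing is aggregated.
fn interpolate<I>(toks: I) -> impl Iterator<Item = Tok>
where
    I: Iterator<Item = Tok>,
{
    toks.flat_map(|tok| match tok {
        Tok::Pair(key, value) if value.contains('{') => {
            let mut out = vec![Tok::InterpOpen(key)];
            out.extend(fragments(&value));
            out.push(Tok::InterpClose);
            out
        }
        other => vec![other],
    })
}
```

Feeding `Pair("name", "foo{@bar@}baz")` through `interpolate` yields
`InterpOpen("name")`, `Text("foo")`, `Ref("@bar@")`, `Text("baz")`, and
`InterpClose`, while a pair with no `{` in its value passes through
unchanged.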

Oh, and what about that desugaring pass?  There's the issue of how to
represent such a thing in the type system---ideally we'd know statically
that desugaring always lowers into a more primitive NIR that reduces the
mapping that needs to be done to AIR.  But that adds complexity, as
mentioned above.  The alternative is to just use the template system, as I
originally wanted to, and resolve shortcomings by augmenting the template
system to be able to handle it.  That not only keeps NIR and the compiler
much simpler, but exposes more powerful tools to developers via TAME's
metalanguage, if such a thing is appropriate.

Anyway, this creates a system that's far more intuitive, and far
simpler.  It does kick the can to AIR, but that's okay, since it's also
better positioned to deal with it.

Everything I wrote above is a thought dump and has not been proof-read, so
good luck!  And let's hope this finally works out...it's actually feeling
good this time.  The journey was necessary to discover and justify what came
out of it---everything I'm stripping away was like a cocoon, and within it
is a more beautiful and more elegant TAME.

DEV-13346
2022-12-01 11:09:25 -05:00
benches tamer: Xirf::Text refinement 2022-08-01 15:01:37 -04:00
build-aux Copyright year update 2022 2022-05-03 14:14:29 -04:00
src tamer: NIR re-simplification 2022-12-01 11:09:25 -05:00
.gitignore TAMER: Initial commit 2019-11-18 14:05:47 -05:00
Cargo.lock tamer: Cargo.toml: Remove lazy_static 2022-06-24 14:18:04 -04:00
Cargo.toml tamer: Cargo.toml: Sort dependencies 2022-10-18 14:48:14 -04:00
Makefile.am tamer: Add `--quiet` flag to `make check` (`cargo test`) 2022-08-12 00:47:14 -04:00
README.md Copyright year update 2022 2022-05-03 14:14:29 -04:00
autogen.sh Copyright year update 2022 2022-05-03 14:14:29 -04:00
bootstrap Copyright year update 2022 2022-05-03 14:14:29 -04:00
configure.ac tamer: (explicit_generic_args_with_impl_trait): Remove unstable feature flag 2022-08-12 16:42:30 -04:00
rustfmt.toml tamer/rustfmt (max_width): Set to 80 2019-11-27 09:15:15 -05:00

README.md

TAME in Rust (TAMER)

TAME was written to help tame the complexity of developing comparative insurance rating systems. This project aims to tame the complexity and performance issues of TAME itself. TAMER is therefore more tame than TAME.

TAME was originally written in XSLT. For more information about the project, see the parent README.md.

Building

To bootstrap from the source repository, run ./bootstrap.

To configure the build for your system, run ./configure. To build, run make. To run tests, run make check.

You may also invoke cargo directly, which make will do for you using options provided to configure.

Note that the default development build results in terrible runtime performance! See Build Flags below for instructions on how to generate a release binary.

Build Flags

The environment variable CARGO_BUILD_FLAGS can be used to provide additional arguments to cargo build when invoked via make. This can be provided optionally during configure and can be overridden when invoking make. For example:

# release build
$ ./configure && make CARGO_BUILD_FLAGS=--release
$ ./configure CARGO_BUILD_FLAGS=--release && make

# dev build
$ ./configure && make
$ ./configure CARGO_BUILD_FLAGS=--release && make CARGO_BUILD_FLAGS=

Hacking

This section contains advice for those developing TAMER.

Running Tests

Developers should be using test-driven development (TDD). make check will run all necessary tests.

Code Format

Rust provides rustfmt that can automatically format code for you. This project mandates its use and therefore eliminates personal preference in code style (for better or worse).

Formatting checks are run during make check and, on failure, will output the diff that would be applied if you ran make fmt (or make fix); this will run cargo fmt for you (and will use the binaries configured via configure).

Since developers should be doing test-driven development (TDD) and therefore should be running make check frequently, the hope is that frequent feedback on formatting issues will allow developers to quickly adjust their habits to avoid triggering formatting errors at all.

If you want to automatically fix formatting errors and then run tests:

$ make fmt check

Benchmarking

Benchmarks serve two purposes: external integration tests (which are subject to module visibility constraints) and actual benchmarking. To run benchmarks, invoke make bench.

Note that link-time optimizations (LTO) are performed on the binary for benchmarking so that its performance reflects release builds that will be used in production.

The configure script will automatically detect whether the test feature is unstable (as it was as of the time of writing) and, if so, will automatically fall back to invoking nightly (by running cargo +nightly bench).

If you do not have nightly, you can install it via rustup install nightly.