tame/tamer
Mike Gerwitz 1a04d99f15 tamer: obj::xmlo::reader: Working xmlo reader
This makes the necessary tweaks to have the entire linker work end-to-end
and produce a compatible xmle file (that is, identical except for
nondeterministic topological ordering).  That's good, and finally that can
get off of my plate.

What's disappointing, and what I'll have more information on in future
commits, is how slow it is.

The linking of our largest package goes from ~1s -> ~15s with this
change.  The reason is tens of millions of `memcpy` calls.  Why?

The ParseState abstraction is pure and passes an owned `self` around, and
Parser replaces its own reference using this:

        let result;
        TransitionResult(Transition(self.state), result) =
            take(&mut self.state).parse_token(tok);

Naively, this would store a copy of the old state in `result`, allocate a
new ParseState for `self.state`, pass the original or a copy to
`parse_token`, and then overwrite `self.state` with the new ParseState that
is returned once it is all over.

Of course, that'd be devastating.  What we want to happen is for Rust to
realize that it can just pass a reference to `self.state` and perform no
copying at all.
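
To make the pattern concrete, here is a minimal sketch of an owned-`self`
transition driven through `take`.  The two-variant `State`, the `feed`
method, and `count_dots` are hypothetical stand-ins for illustration; they
are not TAMER's actual `ParseState` or `Parser`.

```rust
use std::mem::take;

/// Hypothetical parser state (an assumption, not TAMER's `ParseState`).
#[derive(Debug, Default, PartialEq, Eq)]
enum State {
    #[default]
    Idle,
    Counting(u32),
}

impl State {
    /// Pure transition: consumes the old state and returns the new state
    /// together with an optional emitted value.
    fn parse_token(self, tok: char) -> (State, Option<u32>) {
        match (self, tok) {
            (State::Idle, '(') => (State::Counting(0), None),
            (State::Counting(n), '.') => (State::Counting(n + 1), None),
            (State::Counting(n), ')') => (State::Idle, Some(n)),
            // Any other token leaves the state unchanged.
            (st, _) => (st, None),
        }
    }
}

#[derive(Default)]
struct Parser {
    state: State,
}

impl Parser {
    fn feed(&mut self, tok: char) -> Option<u32> {
        let result;
        // `take` moves the old state out and leaves `State::default()`
        // behind; the returned state is then moved back into place.
        (self.state, result) = take(&mut self.state).parse_token(tok);
        result
    }
}

/// Counts the dots inside each parenthesized group.
fn count_dots(input: &str) -> Vec<u32> {
    let mut parser = Parser::default();
    input.chars().filter_map(|c| parser.feed(c)).collect()
}
```

Every call to `feed` moves a `State` out of the parser and another back in.
Whether those moves compile down to nothing is up to the optimizer.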

For certain parsers, this is exactly what happens.  Great!

But for XIRF, we have this:

  /// Stack of element [`QName`] and [`Span`] pairs,
  ///   representing the current level of nesting.
  ///
  /// This storage is statically allocated,
  ///   allowing XIRF's parser to avoid memory allocation entirely.
  type ElementStack<const MAX_DEPTH: usize> = ArrayVec<(QName, Span), MAX_DEPTH>;

  /// XIRF document parser state.
  ///
  /// This parser is a pushdown automaton that parses a single XML document.
  #[derive(Debug, Default, PartialEq, Eq)]
  pub enum State<const MAX_DEPTH: usize, SA = AttrParseState>
  where
      SA: FlatAttrParseState,
  {
      /// Document parsing has not yet begun.
      #[default]
      PreRoot,

      /// Parsing nodes.
      NodeExpected(ElementStack<MAX_DEPTH>),

      /// Delegating to attribute parser.
      AttrExpected(ElementStack<MAX_DEPTH>, SA),

      /// End of document has been reached.
      Done,
  }

ParseState contains an ArrayVec, and its implementation details cause
LLVM _not_ to elide the `memcpy`.  And there are a lot of them.

Considering that ParseState is supposed to use only statically allocated
memory and be zero-copy, this is rather ironic.
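
One way to see why a move that is not elided is costly here: inline storage
makes the state itself large.  The sketch below uses a hypothetical
`InlineStack` in place of the real `ArrayVec`, with `(u32, u64)` standing in
for `(QName, Span)`; both are assumptions for illustration only.

```rust
use std::mem::size_of;

/// Hypothetical fixed-capacity stack (not the real `ArrayVec`): the
/// storage lives inline in the value, so nothing is heap-allocated.
#[allow(dead_code)]
struct InlineStack<const MAX_DEPTH: usize> {
    items: [(u32, u64); MAX_DEPTH],
    len: usize,
}

/// Hypothetical state enum that holds the stack by value.
#[allow(dead_code)]
enum State<const MAX_DEPTH: usize> {
    PreRoot,
    NodeExpected(InlineStack<MAX_DEPTH>),
    Done,
}

/// Size in bytes of one state value, and so of each copy that a move
/// performs when it is not optimized away.
fn state_size<const MAX_DEPTH: usize>() -> usize {
    size_of::<State<MAX_DEPTH>>()
}
```

The size grows with `MAX_DEPTH` whether or not the stack holds any
elements, so every move that survives optimization copies the entire
buffer.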

Now, this _could_ be potentially fixed by not using ArrayVec; removing
it (and the corresponding checks for balanced tags) gets us down to
2s (which still needs improvement), but we can't have a core abstraction in
our system resting on a house of cards.  What if the optimization changes
between releases and suddenly linking / building becomes shit slow?  That's
too much of a risk.

Further, having to limit what abstractions we use just to coax the
compiler into optimizing away moves is very restrictive.

The better option seems to be to go back to what I used to do: pass around
`&mut self`.  I had moved to an owned `self` to force consideration of _all_
state transitions, but I can try to do the same thing in a different way
using mutable references, and then we avoid this problem.  The
abstraction isn't pure (in the functional sense) anymore, but it's safe and
isn't relying on delicate inlining and optimizer implementation details to
have a performant system.
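
As a rough sketch of what the `&mut self` direction might look like (the
`State` type and `count_dots` are again hypothetical, not the design that
will actually land):

```rust
/// Hypothetical parser state (an assumption, not TAMER's `ParseState`).
#[derive(Debug, PartialEq, Eq)]
enum State {
    Idle,
    Counting(u32),
}

impl State {
    /// Transition in place through a mutable reference; the state value
    /// is never moved out of where it is stored.
    fn parse_token(&mut self, tok: char) -> Option<u32> {
        match self {
            State::Idle => {
                if tok == '(' {
                    *self = State::Counting(0);
                }
                None
            }
            State::Counting(n) => match tok {
                '.' => {
                    // Update the payload without replacing the variant.
                    *n += 1;
                    None
                }
                ')' => {
                    let total = *n;
                    *self = State::Idle;
                    Some(total)
                }
                _ => None,
            },
        }
    }
}

/// Counts the dots inside each parenthesized group.
fn count_dots(input: &str) -> Vec<u32> {
    let mut state = State::Idle;
    input.chars().filter_map(|c| state.parse_token(c)).collect()
}
```

Note that nothing in this sketch forces each arm to decide on a next state,
which is the property the owned `self` provided; recovering that guarantee
is the part that still needs design work.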

More information to come.

DEV-10863
2022-04-01 16:31:14 -04:00
benches tamer: xir: Remove Text enum 2021-11-15 23:47:14 -05:00
build-aux Copyright year update 2021 2021-07-22 15:00:15 -04:00
src tamer: obj::xmlo::reader: Working xmlo reader 2022-04-01 16:31:14 -04:00
.gitignore TAMER: Initial commit 2019-11-18 14:05:47 -05:00
Cargo.lock tamer: Update dependencies 2022-03-11 10:51:51 -05:00
Cargo.toml tamer: Update dependencies 2022-03-11 10:51:51 -05:00
Makefile.am tamer: cargo --frozen --offline 2021-12-02 11:49:51 -05:00
README.md Copyright year update 2021 2021-07-22 15:00:15 -04:00
autogen.sh Copyright year update 2021 2021-07-22 15:00:15 -04:00
bootstrap tamer: cargo --frozen --offline 2021-12-02 11:49:51 -05:00
configure.ac tamer: cargo --frozen --offline 2021-12-02 11:49:51 -05:00
rustfmt.toml tamer/rustfmt (max_width): Set to 80 2019-11-27 09:15:15 -05:00

README.md

TAME in Rust (TAMER)

TAME was written to help tame the complexity of developing comparative insurance rating systems. This project aims to tame the complexity and performance issues of TAME itself. TAMER is therefore more tame than TAME.

TAME was originally written in XSLT. For more information about the project, see the parent README.md.

Building

To bootstrap from the source repository, run ./bootstrap.

To configure the build for your system, run ./configure. To build, run make. To run tests, run make check.

You may also invoke cargo directly, which make will do for you using options provided to configure.

Note that the default development build results in terrible runtime performance! See Build Flags below for instructions on how to generate a release binary.

Build Flags

The environment variable CARGO_BUILD_FLAGS can be used to provide additional arguments to cargo build when invoked via make. This can be provided optionally during configure and can be overridden when invoking make. For example:

# release build
$ ./configure && make CARGO_BUILD_FLAGS=--release
$ ./configure CARGO_BUILD_FLAGS=--release && make

# dev build
$ ./configure && make
$ ./configure CARGO_BUILD_FLAGS=--release && make CARGO_BUILD_FLAGS=

Hacking

This section contains advice for those developing TAMER.

Running Tests

Developers should be using test-driven development (TDD). make check will run all necessary tests.

Code Format

Rust provides rustfmt, which can automatically format code for you. This project mandates its use and therefore eliminates personal preference in code style (for better or worse).

Formatting checks are run during make check and, on failure, will output the diff that would be applied if you ran make fmt (or make fix); this will run cargo fmt for you (and will use the binaries configured via configure).

Since developers should be doing test-driven development (TDD) and therefore should be running make check frequently, the hope is that frequent feedback on formatting issues will allow developers to quickly adjust their habits to avoid triggering formatting errors at all.

If you want to automatically fix formatting errors and then run tests:

$ make fmt check

Benchmarking

Benchmarks serve two purposes: external integration tests (which are subject to module visibility constraints) and actual benchmarking. To run benchmarks, invoke make bench.

Note that link-time optimizations (LTO) are performed on the binary for benchmarking so that its performance reflects release builds that will be used in production.

The configure script will automatically detect whether the test feature is unstable (as it was as of the time of writing) and, if so, will automatically fall back to invoking nightly (by running cargo +nightly bench).

If you do not have nightly, you can install it via rustup install nightly.