149 Commits (a23bae5e4d310133a5d6c2940880f7f54ddfee27)

Author SHA1 Message Date
Mike Gerwitz a23bae5e4d tamer: XIR: Working concept
This is a working streaming IR for XML.  I want to get this committed before
I go further cleaning it up and integrating it into the xmle writer.

This is lacking detailed documentation, and the names of things may end up
changing.

Initial benchmarks do show that it has a ~2x performance improvement over
quick-xml when dealing with two attributes on a node, and I suspect that
improvement will increase with the number of attributes.  We will see how it
compares in real-world benchmarks once the linker has been modified to use
it.

The goal isn't to _avoid_ quick-xml---it'll be used in the future for things
like escaping that would be a huge waste to implement ourselves.  It just so
happened that quick-xml was not beneficial for these changes; indeed, its
own writer is fairly simple for the portions that were implemented here, so
there's no use in fighting with its API, particularly around attributes and
our need to explicitly control whitespace (with the intent of handling code
formatters in the future).

To put this into perspective: the reason this work is being done isn't to
refactor the linker, or to speed it up, but to generalize XML writing and
provide a suitable IR for use in the compiler.  The first step of the
frontend is to essentially echo the XML token stream back out so we can
incrementally parse it and do something useful, to incrementally rewrite the
compiler in Rust.
2021-08-20 10:16:36 -04:00
Mike Gerwitz c211ada89b tamer: benches (memchr): Add missing bench attr
This benchmark was not being run.
2021-08-19 23:14:33 -04:00
Mike Gerwitz e217478a46 tamer: Makefile.am (CARGO_BENCH_FLAGS): New env var 2021-08-19 16:43:14 -04:00
Mike Gerwitz fc235b7ecc tamer: memchr benches
This adds benchmarking for the memchr crate.  It is used primarily by
quick-xml at the moment, but the question is whether to rely on it for
certain operations for XIR.

The benchmarking on an Intel Xeon system shows that memchr and Rust's
contains() perform very similarly on small inputs, matching against a single
character, and so Rust's built-in should be preferred in that case so that
we're using APIs that are familiar to most people.

When larger inputs are compared against, there's a greater benefit (a little
under ~2x).

When comparing against two characters, they are again very close.  But look
at when we compare two characters against _multiple_ inputs:

  running 24 tests
  test large_str::one::memchr_early_match                 ... bench:       4,938 ns/iter (+/- 124)
  test large_str::one::memchr_late_match                  ... bench:      81,807 ns/iter (+/- 1,153)
  test large_str::one::memchr_non_match                   ... bench:      82,074 ns/iter (+/- 1,062)
  test large_str::one::rust_contains_one_byte_early_match ... bench:       9,425 ns/iter (+/- 167)
  test large_str::one::rust_contains_one_byte_late_match  ... bench:     123,685 ns/iter (+/- 3,728)
  test large_str::one::rust_contains_one_byte_non_match   ... bench:     123,117 ns/iter (+/- 2,200)
  test large_str::one::rust_contains_one_char_early_match ... bench:       9,561 ns/iter (+/- 507)
  test large_str::one::rust_contains_one_char_late_match  ... bench:     123,929 ns/iter (+/- 2,377)
  test large_str::one::rust_contains_one_char_non_match   ... bench:     122,989 ns/iter (+/- 2,788)
  test large_str::two::memchr2_early_match                ... bench:       5,704 ns/iter (+/- 91)
  test large_str::two::memchr2_late_match                 ... bench:      89,194 ns/iter (+/- 8,546)
  test large_str::two::memchr2_non_match                  ... bench:      85,649 ns/iter (+/- 3,879)
  test large_str::two::rust_contains_two_char_early_match ... bench:      66,785 ns/iter (+/- 3,385)
  test large_str::two::rust_contains_two_char_late_match  ... bench:   2,148,064 ns/iter (+/- 21,812)
  test large_str::two::rust_contains_two_char_non_match   ... bench:   2,322,082 ns/iter (+/- 22,947)
  test small_str::one::memchr_mid_match                   ... bench:       4,737 ns/iter (+/- 842)
  test small_str::one::memchr_non_match                   ... bench:       5,160 ns/iter (+/- 62)
  test small_str::one::rust_contains_one_byte_non_match   ... bench:       3,930 ns/iter (+/- 35)
  test small_str::one::rust_contains_one_char_mid_match   ... bench:       3,677 ns/iter (+/- 618)
  test small_str::one::rust_contains_one_char_non_match   ... bench:       5,415 ns/iter (+/- 221)
  test small_str::two::memchr2_mid_match                  ... bench:       5,488 ns/iter (+/- 888)
  test small_str::two::memchr2_non_match                  ... bench:       6,788 ns/iter (+/- 134)
  test small_str::two::rust_contains_two_char_mid_match   ... bench:       6,203 ns/iter (+/- 170)
  test small_str::two::rust_contains_two_char_non_match   ... bench:       7,853 ns/iter (+/- 713)

Yikes.

With that said, we won't be comparing against such large inputs
short-term.  The larger strings (fragments) are copied verbatim, and not
compared against---but they _were_ prior to the previous commit that stopped
unencoding and re-encoding.

So: Rust built-ins for inputs that are expected to be small.
2021-08-18 14:23:03 -04:00
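Editorial sketch for fc235b7ecc: the kinds of searches that benchmark compares, written against the standard library only. The function names are hypothetical, and `find_either` is a naive stand-in for a two-byte search such as `memchr2`, not the crate's implementation. Using a char-slice pattern for the two-character case is also an assumption about how the benchmark was written.

```rust
/// Single-character search using Rust's built-in pattern API.
pub fn has_char(haystack: &str, needle: char) -> bool {
    haystack.contains(needle)
}

/// Two-character search using a char-slice pattern (assumed to be the
/// form that was slow on large inputs).
pub fn has_either_char(haystack: &str, a: char, b: char) -> bool {
    haystack.contains(&[a, b][..])
}

/// Naive byte scan returning the offset of the first byte equal to
/// `a` or `b`; a hypothetical stand-in for a two-byte search.
pub fn find_either(haystack: &[u8], a: u8, b: u8) -> Option<usize> {
    haystack.iter().position(|&x| x == a || x == b)
}
```

For small inputs, the commit's conclusion is that the built-in form is close enough in speed to be preferred for familiarity.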
Mike Gerwitz 1cdb3fbbc5 tamer: tameld: Skip fragment unescaping only to re-escape on write
Fragment text was unescaped on reading, producing an owned String and
spending time parsing the text to unescape.  We were then copying that into
an internment pool (so, copying twice, effectively).

Further, we were then _re-escaping_ on write.

This was all wasteful, since we do not do any manipulation of the fragment
before outputting to the xmle file; we know that Saxon produced properly
escaped XML to begin with, and can trust to propagate it.

This also introduces a new global `clone_uninterned_utf8_unchecked` method.

In profiling this change, I tested (a) before this change, (b) after writing
without escaping, and (c) after both reading escaped and writing without
escaping.

     (a)              (b)              (c)
  sec   mem (B)    sec     B        sec     B
0:00.95 47896 -> 0:00.91 47988 -> 0:00.87 48288
0:00.40 30176 -> 0:00.37 25656 -> 0:00.36 25788
0:00.39 45672 -> 0:00.37 45756 -> 0:00.35 34952
0:00.39 20716 -> 0:00.38 19604 -> 0:00.36 19956
0:00.33 16836 -> 0:00.32 16988 -> 0:00.31 16892
0:00.23 15268 -> 0:00.23 15236 -> 0:00.22 15312
0:00.44 20780 -> 0:00.44 20048 -> 0:00.41 20148
0:00.54 44516 -> 0:00.50 36964 -> 0:00.49 36728
0:00.62 55976 -> 0:00.57 46204 -> 0:00.54 41468
0:00.31 28016 -> 0:00.30 27308 -> 0:00.28 23844
0:00.23 15388 -> 0:00.22 15316 -> 0:00.21 15304
0:00.05 4888  -> 0:00.05 4760  -> 0:00.05 4948
0:00.41 19756 -> 0:00.41 19852 -> 0:00.40 19992
0:00.47 20828 -> 0:00.46 20844 -> 0:00.44 20968
0:00.27 18152 -> 0:00.26 18184 -> 0:00.25 18312

Interestingly, the peak memory usage increases very slightly between the
second and third steps (though decreases from the first), likely because the
raw (encoded) text is larger than the unencoded text (e.g. `&gt;` takes more
space than `>`).
2021-08-18 11:39:06 -04:00
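Editorial sketch for 1cdb3fbbc5: why unescaping on read and re-escaping on write is redundant when the text is not modified in between. The `escape` and `unescape` functions are simplified, hypothetical helpers that handle only three entities; they are not the functions used by tameld or quick-xml.

```rust
/// Escape a few XML special characters (simplified).
pub fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    out
}

/// Unescape only the entities that `escape` produces (simplified).
/// `&amp;` is handled last so that `&amp;lt;` becomes `&lt;`, not `<`.
pub fn unescape(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&amp;", "&")
}
```

For already-escaped input, `escape(&unescape(raw))` gives back `raw`, so copying `raw` through verbatim produces the same output without the two passes. The escaped form is also the longer of the two, which fits the small peak-memory increase noted in the commit.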
Mike Gerwitz f97141f5c5 tamer: tameld: Use uninterned symbols for reader
Fragments were previously represented by `String` to avoid the cost of
interning (hashing and copying).  This change modifies it to use uninterned
symbols, which does still have a copy overhead but it does not hash.

Initial tests showed a small performance decrease of about 15% and a small
memory increase of similar proportion.  However, once I realized that I was
not clearing buffers from quick_xml events and implemented that change in a
previous commit, this change ended up being approximately on par with
`String`, despite the copying of some pretty large fragments.

YMMV, though, and perhaps on less powerful systems time may increase
slightly.

The upcoming XIR (XML IR) was originally going to support both owned strings
and symbols, but now we'll just use uninterned symbols; I can't rationalize
complicating the API at this time when it will provide an almost
imperceptible performance benefit.  If ever that changes in the future,
that change will be entertained.

The end result is that the fate of a fragment's underlying memory is
determined by whatever is processing the data, _not_ by the API itself---the
API was previously forcing use of a String, whereas now it's up to the
caller to determine whether we want comparable interns.  For fragments,
that's not likely ever to be the case, especially considering that the
representation will change so drastically in the future.
2021-08-16 14:05:32 -04:00
Mike Gerwitz d96dcad7d8 tamer: tameld: Reduce peak memory usage
This clears the buffers used by quick_xml, which was apparently forgotten
during initial development (I think I expected it to re-use the previously
allocated space automatically).

This has significant effects in some cases.  For example, one of our UI
builds drops from ~9KiB to ~5KiB peak memory usage.  Other builds for larger
suppliers are only slightly affected because of some of their massive
fragments.
2021-08-16 13:38:14 -04:00
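Editorial sketch for d96dcad7d8: how an append-style read buffer grows when it is not cleared between reads. This uses `BufRead::read_until` from the standard library as a stand-in for the quick_xml reader; `read_records` and its `clear` flag are hypothetical.

```rust
use std::io::{BufRead, Cursor};

/// Read `;`-delimited records and return (record count, peak buffer length).
/// `clear` toggles whether the buffer is reset before each read.
pub fn read_records(input: &[u8], clear: bool) -> (usize, usize) {
    let mut reader = Cursor::new(input);
    let mut buf: Vec<u8> = Vec::new();
    let mut count = 0;
    let mut peak = 0usize;

    loop {
        if clear {
            // Resets the length to zero but keeps the allocation for reuse.
            buf.clear();
        }
        // `read_until` appends to `buf`; it does not overwrite it.
        let n = reader.read_until(b';', &mut buf).expect("in-memory read");
        if n == 0 {
            break;
        }
        count += 1;
        peak = peak.max(buf.len());
    }

    (count, peak)
}
```

With clearing, the peak length is that of the largest single record; without it, the buffer ends up holding the whole input.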
Mike Gerwitz ce233ac01d tamer: sym: Uninterned symbols
This adds support for uninterned symbols.  This came about as I was creating
Xir (not yet committed) where I had to decide if I wanted `SymbolId` for all
values, even though some values (e.g. large text blocks like compiled code
fragments for xmle files) will never be compared, and so would be wastefully
hashed.

Previous IRs used `String`, but that was clumsy; see documentation in this
commit for rationale.
2021-08-13 22:54:04 -04:00
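Editorial sketch for ce233ac01d: the difference between interning a string and storing it uninterned in the same pool. The `Interner` type and its method names here are hypothetical and much simpler than TAMER's; the point is only that the uninterned path copies the string but skips hashing and deduplication.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SymbolId(u32);

#[derive(Default)]
pub struct Interner {
    strings: Vec<String>,
    map: HashMap<String, SymbolId>,
}

impl Interner {
    /// Interned: hash the string so that equal strings share one id.
    pub fn intern(&mut self, s: &str) -> SymbolId {
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        let id = self.push(s);
        self.map.insert(s.to_owned(), id);
        id
    }

    /// Uninterned: copy into the pool, but never hash or deduplicate.
    pub fn clone_uninterned(&mut self, s: &str) -> SymbolId {
        self.push(s)
    }

    /// Resolve an id back to its string.
    pub fn lookup(&self, id: SymbolId) -> &str {
        &self.strings[id.0 as usize]
    }

    fn push(&mut self, s: &str) -> SymbolId {
        let id = SymbolId(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        id
    }
}
```

Two uninterned copies of the same text get different ids, which is acceptable for values such as large fragments that are never compared.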
Mike Gerwitz 0ff0f88e5f tamer: Introduce span
This is an initial implementation optimized for expected use
cases.  Hopefully that pans out and doesn't come back to bite me.

Regarding the context: it only allows for interned paths atm, which are
strings (and so must be valid UTF-8, which is fine for us, but sucks for
something more general-purpose).  I'll be curious if the context needs
extension later on, or if different contexts will be stored in IRs (e.g. to
store a template application site as well as the location of the expansion
within the template body).
2021-08-13 15:16:39 -04:00
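Editorial sketch for 0ff0f88e5f: other messages in this log say the span needs to fit within 8 bytes and therefore wants a 16-bit symbol for its context. The field names and widths below are assumptions chosen only to illustrate that packing; they are not taken from TAMER's `Span`.

```rust
/// Hypothetical 8-byte span; the layout is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    /// Byte offset into the source.
    offset: u32,
    /// Length in bytes.
    len: u16,
    /// 16-bit symbol id of the interned path (the context).
    ctx: u16,
}

impl Span {
    pub fn new(offset: u32, len: u16, ctx: u16) -> Self {
        Span { offset, len, ctx }
    }

    /// Offset one past the last byte covered by the span.
    pub fn end(&self) -> u32 {
        self.offset + u32::from(self.len)
    }

    pub fn ctx(&self) -> u16 {
        self.ctx
    }
}
```

Because the struct is `Copy` and 8 bytes wide, it can be passed by value without lifetimes. A 32-bit context id would leave room for only a 32-bit offset and no length.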
Mike Gerwitz 29ab4b9bfc tamer: sym: Disallow SymbolId construction outside of module
SymbolIds must only be constructed by interners, otherwise we lose
confidence in the type.

This offers an associated function to construct raw SymbolIds from integers
for testing purposes.
2021-08-13 11:54:11 -04:00
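Editorial sketch for 29ab4b9bfc: restricting construction through module privacy. The names `test_from_int`, `Interner`, and `fresh` are hypothetical; the commit only states that construction is limited to the module and that an associated function exists for building raw ids in tests.

```rust
pub mod sym {
    /// The tuple field is private, so code outside `sym` cannot write
    /// `SymbolId(3)` directly.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct SymbolId(u32);

    impl SymbolId {
        /// Hypothetical helper for tests: build an id from a raw integer.
        pub fn test_from_int(n: u32) -> Self {
            SymbolId(n)
        }
    }

    /// Toy interner that hands out sequential ids starting at 1.
    #[derive(Default)]
    pub struct Interner {
        next: u32,
    }

    impl Interner {
        pub fn fresh(&mut self) -> SymbolId {
            self.next += 1;
            SymbolId(self.next)
        }
    }
}
```

Outside `sym`, the only ways to obtain a `SymbolId` are the interner and the explicitly named test helper, so a stray id is easy to spot in review.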
Mike Gerwitz d11b4220b2 Revert "tamer: Cargo.toml (dependencies)[lazy_static]: Remove (now used)"
This reverts commit 4fd6313cd2.

...and now I need it for tests.
2021-08-12 16:08:34 -04:00
Mike Gerwitz 4fd6313cd2 tamer: Cargo.toml (dependencies)[lazy_static]: Remove (now used)
The previous commit removed all uses.
2021-08-11 16:26:36 -04:00
Mike Gerwitz 9deb393bfd tamer: Global interners
This is a major change, and I apologize for it all being in one commit.  I
had wanted to break it up, but doing so would have required a significant
amount of temporary work that was not worth doing while I'm the only one
working on this project at the moment.

This accomplishes a number of important things, now that I'm preparing to
write the first compiler frontend for TAMER:

  1. `Symbol` has been removed; `SymbolId` is used in its place.
  2. Consequently, symbols use 16 or 32 bits, rather than a 64-bit pointer.
  3. Using symbols no longer requires dereferencing.
  4. **Lifetimes no longer pollute the entire system! (`'i`)**
  5. Two global interners are offered to produce `SymbolStr` with `'static`
     lifetimes, simplifying lifetime management and borrowing where strings
     are still needed.
  6. A nice API is provided for interning and lookups (e.g. "foo".intern())
     which makes this look like a core feature of Rust.

Unfortunately, making this change required modifications to...virtually
everything.  And that serves to emphasize why this change was needed:
_everything_ used symbols, and so there's no use in not providing globals.

I implemented this in a way that still provides for loose coupling through
Rust's trait system.  Indeed, Rustc offers a global interner, and I decided
not to go that route initially because it wasn't clear to me that such a
thing was desirable.  It didn't become apparent to me, in fact, until the
recent commit where I introduced `SymbolIndexSize` and saw how many things
had to be touched; the linker evolved so rapidly as I was trying to learn
Rust that I lost track of how bad it got.

Further, this shows how the design of the internment system was a bit
naive---I assumed certain requirements that never panned out.  In
particular, everything using symbols stored `&'i Symbol<'i>`---that is, a
reference (usize) to an object containing an index (32-bit) and a string
slice (128-bit).  So it was a reference to a pretty large value, which was
allocated in the arena alongside the interned string itself.

But, that was assuming that something would need both the symbol index _and_
a readily available string.  That's not the case.  In fact, it's pretty
clear that interning happens at the beginning of execution, that `SymbolId`
is all that's needed during processing (unless an error occurs; more on that
below); and it's not until _the very end_ that we need to retrieve interned
strings from the pool to write either to a file or to display to the
user.  It was horribly wasteful!

So `SymbolId` solves the lifetime issue in itself for most systems, but it
still requires that an interner be available for anything that needs to
create or resolve symbols, which, as it turns out, is still a lot of
things.  Therefore, I decided to implement them as thread-local static
variables, which is very similar to what Rustc does itself (Rustc's are
scoped).  TAMER does not use threads, so the resulting `'static` lifetime
should be just fine for now.  Eventually I'd like to implement `!Send` and
`!Sync`, though, to prevent references from escaping the thread (as noted in
the patch); I can't do that yet, since the feature has not yet been
stabilized.

In the end, this leaves us with a system that's much easier to use and
maintain; hopefully easier for newcomers to get into without having to deal
with so many complex lifetimes; and a nice API that makes it a pleasure to
work with symbols.

Admittedly, the `SymbolIndexSize` adds some complexity, and we'll see if I
end up regretting that down the line, but it exists for an important reason:
the `Span` and other structures that'll be introduced need to pack a lot of
data into 64 bits so they can be freely copied around to keep lifetimes
simple without wreaking havoc in other ways, but a 32-bit symbol size needed
by the linker is too large for that.  (Actually, the linker doesn't yet need
32 bits for our systems, but it's going to in the somewhat near future
unless we optimize away a bunch of symbols...but I'd really rather not have
the linker hit a limit that requires a lot of code changes to resolve).

Rustc uses interned spans when they exceed 8 bytes, but I'd prefer to avoid
that for now.  Most systems can just use one of the `PkgSymbolId` or
`ProgSymbolId` type aliases and not have to worry about it.  Systems that
are actually shared between the compiler and the linker do, though, but it's
not like we don't already have a bunch of trait bounds.

Of course, as we implement link-time optimizations (LTO) in the future, it's
possible most things will need the size and I'll grow frustrated with that
and possibly revisit this.  We shall see.

Anyway, this was exhausting...and...onward to the first frontend!
2021-08-11 14:24:55 -04:00
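Editorial sketch for 9deb393bfd: a thread-local global interner with a `"foo".intern()` style API. This is a minimal illustration under stated assumptions, not TAMER's implementation: the `GlobalIntern` trait, the `Pool` type, and `lookup_str` are hypothetical names, and leaking each string with `Box::leak` stands in for arena allocation.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// A 32-bit handle; no lifetime parameter and no dereferencing needed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SymbolId(u32);

#[derive(Default)]
struct Pool {
    strings: Vec<&'static str>,
    map: HashMap<&'static str, SymbolId>,
}

thread_local! {
    // One interner per thread, reachable from anywhere.
    static INTERNER: RefCell<Pool> = RefCell::new(Pool::default());
}

/// Extension trait providing `"foo".intern()`.
pub trait GlobalIntern {
    fn intern(self) -> SymbolId;
}

impl GlobalIntern for &str {
    fn intern(self) -> SymbolId {
        INTERNER.with(|cell| {
            let mut pool = cell.borrow_mut();
            if let Some(&id) = pool.map.get(self) {
                return id;
            }
            // Leaking yields a `'static` string; an arena would be used
            // in a real interner.
            let stored: &'static str = Box::leak(self.to_owned().into_boxed_str());
            let id = SymbolId(pool.strings.len() as u32);
            pool.strings.push(stored);
            pool.map.insert(stored, id);
            id
        })
    }
}

impl SymbolId {
    /// Resolve the id back to its string, e.g. for output or errors.
    pub fn lookup_str(self) -> &'static str {
        INTERNER.with(|cell| cell.borrow().strings[self.0 as usize])
    }
}
```

Callers intern at the start, pass the small `Copy` id around during processing, and resolve strings only at the end, which matches the usage pattern the commit message describes.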
Mike Gerwitz 71011f5724 tamer: sym: Split into multiple modules
This helps to organize a bit better as I prepare to introduce singleton
interners.
2021-08-02 23:54:37 -04:00
Mike Gerwitz 01722c9c3b tamer: Symbol{Index=>Id}
The former was a misnomer (it represents an index _entry_).  This name is
also shorter, which is nice, considering how often it'll be used.
2021-07-30 13:32:32 -04:00
Mike Gerwitz 0fc8a1a4df tamer: Remove default SymbolIndex (et al) index type
Oh boy.  What a mess of a change.

This demonstrates some significant issues we have with Symbol.  I had
originally modelled the system a bit after Rustc's, but deviated in certain
regards:

  1. This has a configurable base type to enable better packing without bit
     twiddling and potentially unsafe tricks I'd rather avoid unless
     necessary; and
  2. The lifetime is not static, and there is no global, singleton interner;
     and
  3. I pass around references to a Symbol rather than passing around an
     index into an interner.

For #3---this is done because there's no singleton interner and therefore
resolving a symbol requires a direct reference to an available interner.  It
also wasn't clear to me (and still isn't, in fact) whether more than one
interner may be used for different contexts.

But, that doesn't preclude removing lifetimes and just passing around
indexes; in fact, I plan to do this in the frontend where the parser and
such will have direct interner access and can therefore just look up based
on a symbol index.  We could reserve references for situations where
exposing an interner would be undesirable.

Anyway, more to come...
2021-07-29 14:26:40 -04:00
Mike Gerwitz e6ad2be5b9 tamer: sym: Primitive-based SupportedSymbolIndex
As mentioned in the previous commit, this flips the types such that the base
type is the primitive and the associated type is the `NonZero*` type; this
is much more natural, concise, and allows Rust to infer the proper type in
most every situation.

The next step will be to stop defaulting the index type for SymbolIndex and
related, since we are about to care very much what size it is (compiler
vs. linker).
2021-07-28 15:21:24 -04:00
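Editorial sketch for e6ad2be5b9: a trait keyed on the primitive type, with the `NonZero*` type as an associated type, so that callers can write `SymbolIndex::<u32>::from_int(n)`. The associated type name, the `to_non_zero` method, and the `Option` return type are assumptions made for illustration.

```rust
use std::num::{NonZeroU16, NonZeroU32};

/// Primitive base types usable as a symbol index.
pub trait SupportedSymbolIndex: Copy {
    /// Storage type that excludes zero (hypothetical name).
    type NonZero: Copy;

    fn to_non_zero(self) -> Option<Self::NonZero>;
}

impl SupportedSymbolIndex for u16 {
    type NonZero = NonZeroU16;

    fn to_non_zero(self) -> Option<NonZeroU16> {
        NonZeroU16::new(self)
    }
}

impl SupportedSymbolIndex for u32 {
    type NonZero = NonZeroU32;

    fn to_non_zero(self) -> Option<NonZeroU32> {
        NonZeroU32::new(self)
    }
}

/// Index parameterized by the primitive, stored as the `NonZero*` type.
#[repr(transparent)]
pub struct SymbolIndex<Ix: SupportedSymbolIndex>(Ix::NonZero);

impl<Ix: SupportedSymbolIndex> SymbolIndex<Ix> {
    /// Returns `None` for zero, which is not a valid index here.
    pub fn from_int(n: Ix) -> Option<Self> {
        n.to_non_zero().map(SymbolIndex)
    }
}
```

With the primitive as the type parameter, `SymbolIndex::from_int(5u16)` lets Rust infer `Ix` from the argument, which is the ergonomic gain the commit message mentions.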
Mike Gerwitz e562d7fcc8 tamer: sym: Begin SymbolIndex base data generalization
This was previously a NonZeroU32, but it was intended to support NonZeroU16
as well for packages, so that we can fit symbols into smaller spaces.  In
particular, the upcoming Span wants to fit within 8 bytes, and so requires a
smaller SymbolIndex type.

I'm unhappy with this current implementation, and so comments are unfinished
and there are a couple ignores for dead code warnings.  I want to flip the
`SupportedSymbolIndex` trait so that users can specify the primitive rather
than the NonZero* type, which is really awkward-looking and verbose,
especially if you have to do `SymbolIndex::<NonZeroU32>::from_int` or
something.  It also prevents (at least in the cases I've observed) Rust from
inferring the proper type for you based on the argument you provide.

So, the goal will be `SymbolIndex::<u32>::from_int(n)`, for example.
2021-07-28 15:21:15 -04:00
Mike Gerwitz ca6ef3ed36 tamer: frontend: Begin basic XML parsing
The first step in the process is to emit the raw XML events that can then be
immediately output again to echo the results into another file.  This will
then allow us to begin parsing the input incrementally, and begin to morph
the output into a real `xmlo` file.
2021-07-27 00:37:13 -04:00
Mike Gerwitz d9dcfe8777 tamer: Introduce tpwrap module to contain quick_xml::Error adapter
This adapter exists to implement PartialEq so that it can be derived on
Error objects.  This is used primarily (well, exclusively atm) for tests.
2021-07-23 23:23:55 -04:00
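Editorial sketch for d9dcfe8777: wrapping an error type that lacks `PartialEq` so that an enclosing error enum can derive it. `std::io::Error` stands in for `quick_xml::Error` here, and comparing by error kind is an assumption; the commit does not say how the real adapter compares values.

```rust
use std::io;

/// Adapter around a foreign error that does not implement `PartialEq`.
#[derive(Debug)]
pub struct XmlError(pub io::Error);

impl PartialEq for XmlError {
    /// Compare by kind only, which is enough for test assertions.
    fn eq(&self, other: &Self) -> bool {
        self.0.kind() == other.0.kind()
    }
}

/// With the adapter in place, `PartialEq` can be derived as usual.
#[derive(Debug, PartialEq)]
pub enum FrontendError {
    Xml(XmlError),
    UnexpectedEof,
}
```

Tests can then use `assert_eq!` on whole `Result` values instead of matching on variants by hand.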
Mike Gerwitz fb8422d670 tamer: Initial frontend concept
This introduces the beginnings of frontends for TAMER, gated behind a
`wip-features` flag.

This will be introduced in stages:

  1. Replace the existing copy with a parser-based copy (echo back out the
     tokens), when the flag is on.
  2. Begin to parse portions of the source, augmenting the output xmlo (xmli
     at the moment).  The XSLT-based compiler will be modified to skip
     compilation steps as necessary.

As portions of the compilation are implemented in TAMER, they'll be placed
behind their own feature flags and stabalized, which will incrementally
remove the compilation steps from the XSLT-based system.  The result should
be substantial incremental performance improvements.

Short-term, the priorities for loading identifiers into an IR
are (though the order may change):

  1. Echo
  2. Imports
  3. Extern declarations.
  4. Simple identifiers (e.g. param, const, template, etc).
  5. Classifications.
  6. Documentation expressions.
  7. Calculation expressions.
  8. Template applications.
  9. Template definitions.
  10. Inline templates.

After each of those are done, the resulting xmlo (xmli) will have fully
reconstructed the source document from the IR produced during parsing.
2021-07-23 22:24:08 -04:00
Mike Gerwitz 60372d2960 tamer: Makefile.am (all): Binaries and doc
`all` was previously the target for binaries only.
2021-07-23 22:23:10 -04:00
Mike Gerwitz 6ec1a49506 tamer: Makefile.am: Include feature flags for doc generation and tests
This was forgotten in the previous commit.
2021-07-23 15:56:33 -04:00
Mike Gerwitz f1a3273ee3 tamer: configure.ac: Configure-time feature flags (via Cargo) 2021-07-23 10:16:44 -04:00
Mike Gerwitz 5aaa1106cb tamer: obj::xmlo::reader::mock: Extract into crate::test::quick_xml
Other mocks exist here, and here it can be re-used for the upcoming XML
frontend.
2021-07-22 15:32:30 -04:00
Mike Gerwitz 2e50af1220 Copyright year update 2021 2021-07-22 15:00:15 -04:00
Mike Gerwitz e5bbd49166 tamer: obj::xmlo::reader: Extract tests separate file
The file's getting a bit large and the tests are rather complex.  Further,
LSP does better on smaller, less complex files.
2021-07-22 14:39:06 -04:00
Mike Gerwitz 1f24cfdf25 Remove :map: sym-dep generation
This was incorrect to begin with---it does not make sense that an input
mapping should depend upon the identifier that it maps to, in the sense that
we make use of these dependencies.  If we add weak symbol references in the
future, then this can be reintroduced.

By removing this, we free tameld from having to perform the check itself.

.rev-xmlo bumped to force rebuilding of object files since the linker now
expects that no such dependencies will exist within them.
2021-07-22 14:27:15 -04:00
Mike Gerwitz 90c6b51fd5 tamer: tameld: Place constants into static section in executable
This is something that changed when the TAMER POC was initially created, as
I was learning Rust.  I don't recall the original reason why this was moved,
but it could have been moved back long ago.

In our systems, constants can hold tables (as matrices) with tens or
hundreds of thousands of rows, and there are a number of them in certain
projects.  As an example, the YAML-based test cases for one of our systems
went from ~2m30s to ~45s after this change was made.  Much of the cost
savings comes from saving GC.
2021-07-21 14:53:15 -04:00
Mike Gerwitz 93fb1f1bdd tamer: Rust v1.{48=>53}.0 for rustdoc tool lints
A previous commit used a rustdoc tool lint, but that support wasn't added
until 1.52.0 (2021-05-06).

Note that this represents the minimum _required_ version to build TAMER; you
can use a later version.
2021-06-22 09:07:53 -04:00
Mike Gerwitz 716556c39f tamer: Rust 1.{42=>48}.0 for stable intra-doc links without nightly 2021-06-21 13:10:00 -04:00
Mike Gerwitz 96ea0302cc tamer: Cargo.lock: Dependency updates
This project has been on pause for over a year.
2021-06-21 12:46:38 -04:00
Mike Gerwitz 96ffd5f6e5 [DEV-8000] ir::asg: Error types for unresolved identifiers during sorting
This checks explicitly for unresolved objects while sorting and provides an
explicit error for them.  For example, this will catch externs that have no
concrete resolution.

This previously fell all the way through to the unreachable! block.  The old
POC implementation was catching unresolved objects, albeit with a debug
error.
2020-07-02 01:38:32 -04:00
Mike Gerwitz a2415c8c6f [DEV-8000] ir::asg::base: Replace Symbol::new_dummy
Use symbol_dummy!.
2020-07-01 15:53:56 -04:00
Mike Gerwitz 0d4bbe5e4e [DEV-8000] ir::asg: Introduce SortableAsgError
This will be used for the next commit, but this change has been isolated
both because it distracts from the implementation change in the next commit,
and because it cleans up the code by removing the need for a type parameter
on `AsgError`.

Note that the sort test cases now use `unwrap` instead of having
`{,Sortable}AsgError` support one or the other---this is because that does
not currently happen in practice, and there is not supposed to be a
hierarchy; they are siblings (though perhaps their name may imply otherwise).
2020-07-01 13:42:14 -04:00
Mike Gerwitz f832feb3fa [DEV-8000] ir::asg::base::BaseAsg::check_cycles: Extract into function
The only reason this function was a method of `BaseAsg` was because of
`self.graph`, which is accessible within the scope of this
module.  `check_cycles` is logically associated with `SortableAsg`, and so
should exist alongside it (though it can't exist as an associated function
of that trait).
2020-07-01 11:02:20 -04:00
Joseph Frazer 43d00a8268 [DEV-7504] Add GraphML generation
We want to be able to build a representation of the dependency graph so
we can easily inspect it.

We do not want to make GraphML by default. It is better to use a tool.
We use "petgraph-graphml".
2020-05-13 08:04:48 -04:00
Mike Gerwitz 0127d4b698 TAMER: sym::Interner::index_lookup
This was originally omitted because there wasn't a use case for it.  Now
that we're adding context to errors, however, an owned value is highly
desirable.

This adds almost no measurable overhead to the internment system in
benchmarks (largely within the margin of error).
2020-04-29 11:33:41 -04:00
Mike Gerwitz 4b643385c8 TAMER: Update Cargo dependencies 2020-04-29 11:33:38 -04:00
Mike Gerwitz bcca5f7c49 [DEV-7084] TAMER: AsgBuilder and IR lowering docs 2020-04-28 13:39:55 -04:00
Mike Gerwitz 0f4b2d75f8 [DEV-7084] TAMER: obj::xmlo: Private inner modules 2020-04-28 11:08:05 -04:00
Mike Gerwitz 549e9ca23b [DEV-7084] TAMER: AsgBuilderState::new: New constructor 2020-04-28 09:06:25 -04:00
Mike Gerwitz 9893d56775 [DEV-7084] TAMER: Finalize AsgBuilder 2020-04-28 09:06:25 -04:00
Mike Gerwitz 32abc7dce2 [DEV-7084] TAMER: impl PartialEq for XmloError
This cannot be derived because XmlError does not implement PartialEq,
which is quite the annoyance in tests.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 21a0bdcce1 [DEV-7084] TAMER: AsgBuilderError: Introduce proper error variants
This is a union (sum type) of three other error types, plus errors specific
to this builder.

This commit does a good job demonstrating the boilerplate, as well as a need
for additional context (in the case of `IdentKindError`), that we'll want to
work on abstracting away.
2020-04-28 09:06:25 -04:00
Mike Gerwitz ef79a763ac [DEV-7084] TAMER: Correct Ix trait bound for AsgError
The `Debug` bound is inconvenient and requires propagation to any types that
use it.  Further, it's really awkward having `Display` depend on `Debug`; if
we want to render a useful display here, we can write one.

To be clear: IndexType implements Debug.

For now, this is pretty-printed by another part of the code, which we don't
want to implement in `Display` because it requires looking things up from
the graph.
2020-04-28 09:06:25 -04:00
Mike Gerwitz cfc13f9016 [DEV-7084] TAMER: ir::asg::IdentKindError: Replace string with enum 2020-04-28 09:06:25 -04:00
Mike Gerwitz 0a9a3214b7 [DEV-7084] TAMER: ir::asg::BaseAsg::new: New associated function
Profiling showed that creating an initial capacity of 0 did not have a
notable effect on performance.
2020-04-28 09:06:25 -04:00
Mike Gerwitz ecc2e33ba7 [DEV-7084] TAMER: xmlo::AsgBuilder: Accept XmloResult iterator
This flips the API from using XmloWriter as the context to using Asg and
consuming anything that can produce XmloResults.  This not only makes more
sense, but avoids having to create a trait for XmloReader, and simplifies
the trait bounds we have to concern ourselves with.
2020-04-28 09:06:25 -04:00
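Editorial sketch for ecc2e33ba7: a builder method on the graph that consumes anything yielding `Result` items, rather than depending on a concrete reader type. `Event`, `Graph`, and `import` are hypothetical stand-ins for the xmlo events, the ASG, and the builder method.

```rust
/// Hypothetical reader events.
#[derive(Debug, PartialEq)]
pub enum Event {
    SymDecl(String),
    /// End of header: stop reading.
    Eoh,
}

/// Hypothetical stand-in for the graph being built.
#[derive(Debug, Default, PartialEq)]
pub struct Graph {
    pub idents: Vec<String>,
}

impl Graph {
    /// Consume any iterator of `Result<Event, E>`, stopping at the first
    /// error or at `Eoh`.
    pub fn import<I, E>(mut self, events: I) -> Result<Self, E>
    where
        I: IntoIterator<Item = Result<Event, E>>,
    {
        for ev in events {
            match ev? {
                Event::SymDecl(name) => self.idents.push(name),
                Event::Eoh => break,
            }
        }
        Ok(self)
    }
}
```

Because the only bound is `IntoIterator`, tests can pass a plain `Vec` of results and no reader trait is needed.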
Mike Gerwitz 323ea79bf8 [DEV-7084] TAMER: Basic AsgBuilder cleanup
This just tidies things up a little bit before I get into some further
refactoring.  I wrote the original code when I was just learning Rust not
too long ago, so it's interesting to see how my understanding has changed
over that relatively short period of time.
2020-04-28 09:06:25 -04:00