793 Commits (1cf54887565cd46dccc97b98ee0b9696235e1a52)

Author SHA1 Message Date
Mike Gerwitz 400d5b25a1 ir::asg::Object::Empty: Remove variant
This variant is unnecessary, as it was used only by the indexer to represent
the absence of a node, for which we can simply use `None` in the containing
`Option`.

* tamer/Cargo.toml: Add `lazy_static`.
* tamer/Cargo.lock: Update.
* tamer/src/ir/asg/base.rs (with_capacity): Use `None` in place of
    `Some(Object::Empty)`.
* tamer/src/ir/asg/object.rs: Adjust state machine graphic.
  (Empty): Remove variant.
  (Missing): Remove reference to variance.
* tamer/src/lib.rs: Import `lazy_static` for test builds.
* tamer/obj/xmle/writer/writer.rs (Section::iter): Remove `Object::Empty`
    from documentation.
  (test::): Remove references to `Object::Missing`.  `lazy_static!` used
    here.
* tamer/obj/xmle/writer/xmle.rs (test::write_section_catch_missing): Replace
    reference to `Object::Missing`.
2020-03-19 15:42:06 -04:00
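For the `Object::Empty` removal above (400d5b25a1), a minimal sketch of the idea: an index whose unused slots are `None` rather than a dedicated "empty" variant. The `Object` variants, the `Index` type, and its methods are simplified, hypothetical stand-ins and are not taken from `tamer/src/ir/asg/base.rs`.

```rust
/// Hypothetical, simplified stand-in for the ASG object type.
/// There is deliberately no `Empty` variant.
#[derive(Debug, PartialEq)]
pub enum Object {
    Missing(String),
    Ident(String),
}

/// Hypothetical index in which an absent node is represented by `None`.
pub struct Index {
    slots: Vec<Option<Object>>,
}

impl Index {
    /// Pre-fill `n` slots with `None` (the role `Some(Object::Empty)`
    /// would otherwise have played).
    pub fn with_capacity(n: usize) -> Self {
        let mut slots = Vec::with_capacity(n);
        slots.resize_with(n, || None);
        Index { slots }
    }

    /// Store an object at slot `i`, growing the index if needed.
    pub fn set(&mut self, i: usize, obj: Object) {
        if i >= self.slots.len() {
            self.slots.resize_with(i + 1, || None);
        }
        self.slots[i] = Some(obj);
    }

    /// Look up slot `i`; out-of-range and unset slots both yield `None`.
    pub fn get(&self, i: usize) -> Option<&Object> {
        self.slots.get(i).and_then(|slot| slot.as_ref())
    }
}
```

With this shape, callers handle absence through the usual `Option` combinators instead of matching on a sentinel variant.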
Mike Gerwitz 0a135ad707 TAMER: Tidy up graph_sort test
This still isn't comprehensive.  Further, it won't be able to be, because
we'd have to rely on Petgraph implementation details: there are potentially
many acceptable orderings for a given graph.
2020-03-13 11:51:59 -04:00
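On the `graph_sort` note above (0a135ad707): because a graph can have many acceptable orderings, one way a test could avoid depending on Petgraph's traversal details is to check the ordering's properties instead of comparing against a fixed expected list. The following is only a sketch of that alternative, not the project's test; `is_topological_order` is a hypothetical helper, and it assumes an edge `(a, b)` means `a` must appear before `b`.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns true if `order` lists every node in
/// `0..node_count` exactly once and, for every edge `(from, to)`,
/// `from` appears before `to`.
pub fn is_topological_order(
    node_count: usize,
    edges: &[(usize, usize)],
    order: &[usize],
) -> bool {
    if order.len() != node_count {
        return false;
    }

    // Record each node's position, rejecting unknown or repeated nodes.
    let mut position = HashMap::with_capacity(node_count);
    for (pos, &node) in order.iter().enumerate() {
        if node >= node_count || position.insert(node, pos).is_some() {
            return false;
        }
    }

    // Every edge must point "forward" in the ordering.
    edges.iter().all(|(from, to)| {
        match (position.get(from), position.get(to)) {
            (Some(a), Some(b)) => a < b,
            _ => false,
        }
    })
}
```

For a diamond-shaped graph with edges `(0, 1)`, `(0, 2)`, `(1, 3)`, `(2, 3)`, both `[0, 1, 2, 3]` and `[0, 2, 1, 3]` pass this check, which is the situation the commit message describes.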
Joseph Frazer 7e95394076 [DEV-7085] Create `SortableAsg` trait
Create a trait that sorts a graph into `Sections` that can then be used
as an IR. The `BaseAsg` should implement the trait using what was
originally in the POC.
2020-03-13 11:51:59 -04:00
Joseph Frazer bc760387f6 [DEV-7085] Implement `PartialEq` for `Sections`
We want to be able to easily compare `Sections` in tests, so
implementing `PartialEq` (and `Debug`) for both `Sections` and `Section`
is required.
2020-03-13 11:51:59 -04:00
Joseph Frazer 59a0c382af [DEV-7085] Move sections to IR module
We need to use `Sections` in both the writer and the ASG so it needs to
be in a place that makes sense.
2020-03-13 11:51:59 -04:00
Joseph Frazer b5f6a082dd [DEV-7134] Remove unnecessary node replacement
The node was being replaced before we were catching errors properly. Now
that they are propagated, we should not need the replacement.
2020-03-09 11:41:11 -04:00
Joseph Frazer 01e7d3e560 [DEV-7134] Propagate errors from the writer
When an error occurs during XML writing, it should be shown to the
user.
2020-03-09 08:23:13 -04:00
Joseph Frazer f373a00a80 [DEV-7134] Propagate sorting errors
If a node is found while sorting that is not expected, we should show
the error to the user.
2020-03-09 08:23:13 -04:00
Joseph Frazer 2a5551a04a [DEV-7134] Propagate errors setting fragments
If we cannot set a fragment, we need to display the error to the user.

We are currently ignoring "___head", "___tail", and objects that are
both virtual and overridden. Those will be corrected in future
changes.
2020-03-09 08:23:13 -04:00
Joseph Frazer 06bc89a9ce [DEV-7134] Pass read event errors up the stack 2020-03-06 14:08:55 -05:00
Joseph Frazer 246a40a047 [DEV-7134] Return error for XmloEvent::SymDecl
We want more than warnings when an XmloEvent::SymDecl symbol has an
unknown "kind".
2020-03-06 13:41:32 -05:00
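For the `XmloEvent::SymDecl` change above (246a40a047), a minimal sketch of returning an error, rather than warning, when a symbol's "kind" is not recognized. `SymKind`, `KindError`, `parse_kind`, and the listed kind strings are hypothetical and chosen only for illustration; they are not the reader's actual types.

```rust
use std::fmt;

/// Hypothetical subset of symbol kinds.
#[derive(Debug, PartialEq)]
pub enum SymKind {
    Class,
    Rate,
    Func,
}

/// Hypothetical error carrying the unrecognized kind string.
#[derive(Debug, PartialEq)]
pub enum KindError {
    Unknown(String),
}

impl fmt::Display for KindError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            KindError::Unknown(kind) => {
                write!(f, "unknown symbol kind `{}`", kind)
            }
        }
    }
}

impl std::error::Error for KindError {}

/// Map a raw kind string to a `SymKind`, failing on anything unknown
/// so that the caller can propagate the error with `?`.
pub fn parse_kind(kind: &str) -> Result<SymKind, KindError> {
    match kind {
        "class" => Ok(SymKind::Class),
        "rate" => Ok(SymKind::Rate),
        "func" => Ok(SymKind::Func),
        other => Err(KindError::Unknown(other.to_string())),
    }
}
```

Because the failure is an ordinary `Result`, it can travel up the stack with the other errors this series of commits propagates.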
Joseph Frazer 2228a6158a [DEV-7134] Add alias for LoadResult
It looks better and was recommended by Rust's linter.
2020-03-06 12:44:22 -05:00
Joseph Frazer 4810e7a099 [DEV-7134] Remove unwrap so we can bubble up error messages 2020-03-06 12:32:42 -05:00
Joseph Frazer 590245e191 [DEV-7134] Escalate the error from finding the absolute path
We do not want to have a panic here. The error should be displayed
properly.
2020-03-06 12:24:45 -05:00
Mike Gerwitz bfea768f89 Copyright year 2020 update 2020-03-06 11:05:18 -05:00
Joseph Frazer e613bd8a8c [DEV-7081] Add options to tameld
We want to add an option to set the output file to the linker so we do
not need to redirect output to awk any longer.

This also adds integration tests for tameld.
2020-03-06 09:41:55 -05:00
Joseph Frazer 6ac7641087 [DEV-7083] TAMER: xmle writer
This introduces the writer for xmle files.
2020-03-03 11:21:18 -05:00
Mike Gerwitz c2e6efc0b5 TAMER: Additional crate::ld documentation 2020-03-02 15:54:36 -05:00
Mike Gerwitz b89408e5bb TAMER: Extract quick_xml event-related mocks 2020-02-26 10:49:01 -05:00
Mike Gerwitz 19a6d67dc4 TAMER: Separate static xmle section 2020-02-26 10:49:01 -05:00
Mike Gerwitz 7c60b53de8 TAMER: Virtual symbol override 2020-02-26 10:49:01 -05:00
Mike Gerwitz ab3aec980d TAMER: POC: Use FxHash to remove nondeterminism
The default SipHash-based hasher is randomly seeded, which causes ordering to
change between runs.
2020-02-26 10:49:00 -05:00
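Regarding the FxHash change above (ab3aec980d), a minimal sketch of why a fixed-state hasher gives repeatable ordering. It does not use FxHash itself: so that the example needs no external crate, the standard library's `BuildHasherDefault<DefaultHasher>` stands in as a hasher without random seeding. `FixedState` and `iteration_order` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::BuildHasherDefault;

/// Stand-in for an Fx-style build hasher (assumption: any hasher that is
/// constructed the same way every time serves the illustration).
pub type FixedState = BuildHasherDefault<DefaultHasher>;

/// Insert `names` into a set that uses the fixed-state hasher and return
/// them in the set's iteration order.
pub fn iteration_order(names: &[&str]) -> Vec<String> {
    let mut set: HashSet<String, FixedState> = HashSet::default();
    for &name in names {
        set.insert(name.to_owned());
    }
    set.into_iter().collect()
}
```

Two sets built from the same insertions iterate in the same order here, whereas a `HashSet` using the default `RandomState` carries no such guarantee.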
Mike Gerwitz 645908e258 TAMER: xmle output changes to support Summary Page
Co-Authored-By: Joseph Frazer <joseph.frazer@ryansg.com>
2020-02-26 10:49:00 -05:00
Mike Gerwitz 6939753ca0 TAMER: POC: Output xmle
This is a working proof-of-concept that will be finalized in future commits.
2020-02-26 10:49:00 -05:00
Mike Gerwitz 85a4934db5 TAMER: Symbol source data and metadata 2020-02-26 10:49:00 -05:00
Mike Gerwitz bcc2ab1221 TAMER: Initial abstract semantic graph (ASG)
This begins to introduce the ASG, backed by Petgraph.  The API will continue
to evolve, and Petgraph will likely be encapsulated so that our
implementation can vary independently from it (or even remove it in the
future).
2020-02-26 10:48:59 -05:00
Mike Gerwitz 10b9caa7ad TAMER: Fail on empty fragment ids (and fix underlying problem) 2020-02-25 16:46:28 -05:00
Mike Gerwitz a0893da577 TAMER: xmlo: Add Package event 2020-02-25 16:46:27 -05:00
Mike Gerwitz a8726918f7 TAMER: poc: Use xmlo reader
TODO: More information
2020-02-25 16:46:27 -05:00
Mike Gerwitz a929c8cae4 TAMER: xmlo reader
This introduces the reader for xmlo files produced by the XSLT-based
compiler.  It is an initial implementation but is not complete; see future
commits.
2020-02-25 16:46:25 -05:00
Mike Gerwitz 6aae741162 TAMER (sym::Interner::intern_utf8_unchecked): New function
This removes boilerplate for reading xmlo files.  See next commit.
2020-02-25 16:10:55 -05:00
Mike Gerwitz e8cd378d59 TAMER: Display for Symbol
One of the benefits of storing a reference to the interned string on the
symbol itself is that we can get its underlying value essentially for
free.
2020-02-24 14:56:28 -05:00
Mike Gerwitz ff0c8bb34f Order symtable, sym-dep, fragments
This ordering will simplify streaming processing of xmlo files in
TAMER.  Specifically, we know that symbols will have been declared by the
time dependencies are added to the graph (and so we should only be creating
edges to existing nodes); and we can halt reading as soon as the closing
fragments tag is encountered, avoiding parsing the entirety of these massive
XML files.

On one particularly large program, this cuts time down from ~0.333s to
~0.300s in the POC linker.
2020-02-24 14:56:28 -05:00
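For the ordering change above (ff0c8bb34f), a rough sketch of how a streaming consumer could stop as soon as the closing fragments tag is seen. The `Event` enum and `load` function are hypothetical simplifications and do not reflect the xmlo reader's actual API.

```rust
/// Hypothetical, simplified event stream for an xmlo file.
#[derive(Debug, PartialEq)]
pub enum Event {
    SymDecl(String),
    SymDep(String, String),
    Fragment(String),
    /// Closing fragments tag: nothing after this is needed.
    FragmentsEnd,
    /// Anything that follows the fragments in the file.
    Trailing(String),
}

/// Consume events up to and including `FragmentsEnd`, returning the number
/// of symbol declarations, dependencies, and fragments seen. Events after
/// `FragmentsEnd` are left unread in the iterator.
pub fn load<I>(events: &mut I) -> (usize, usize, usize)
where
    I: Iterator<Item = Event>,
{
    let (mut syms, mut deps, mut frags) = (0, 0, 0);
    for event in events {
        match event {
            Event::SymDecl(_) => syms += 1,
            Event::SymDep(_, _) => deps += 1,
            Event::Fragment(_) => frags += 1,
            Event::FragmentsEnd => break,
            Event::Trailing(_) => {}
        }
    }
    (syms, deps, frags)
}
```

Since the iterator is borrowed rather than consumed, a caller can confirm that trailing events were never pulled, which mirrors avoiding a parse of the rest of the file.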
Mike Gerwitz 1f4db84f24 TAMER: Arena-based string interner
Contrary to what I said previously, this replaces the previous
implementation with an arena-backed internment system.  The motivation for
this change was investigating how Rustc performed its string interning, and
why they chose to associate integer identifiers with symbols.

The intent was originally to use Rustc's arena allocator directly, but that
crate pulled in far too many dependencies and depended on nightly
Rust.  Bumpalo provides a very similar implementation to Rustc's
DroplessArena, so I went with that instead.

Rustc also relies on a global, singleton interner.  I do not do that
here.  Instead, the returned Symbol carries a lifetime of the underlying
arena, as well as a pointer to the interned string.

Now that this is put to rest, it's time to move on.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 176d099fb6 tamer::sym: FNV => Fx Hash
For strings of any notable length, Fx Hash outperforms FNV.  Rustc also
moved to this hash function and noticed performance
improvements.  Fortunately, as was accounted for in the design, this was a
trivial switch.

Here are some benchmarks to back up that claim:

test hash_set::fnv::with_all_new_1000                 ... bench:     133,096 ns/iter (+/- 1,430)
test hash_set::fnv::with_all_new_1000_with_capacity   ... bench:      82,591 ns/iter (+/- 592)
test hash_set::fnv::with_all_new_rc_str_1000_baseline ... bench:     162,073 ns/iter (+/- 1,277)
test hash_set::fnv::with_one_new_1000                 ... bench:      37,334 ns/iter (+/- 256)
test hash_set::fnv::with_one_new_rc_str_1000_baseline ... bench:      18,263 ns/iter (+/- 261)
test hash_set::fx::with_all_new_1000                  ... bench:      85,217 ns/iter (+/- 1,111)
test hash_set::fx::with_all_new_1000_with_capacity    ... bench:      59,383 ns/iter (+/- 752)
test hash_set::fx::with_all_new_rc_str_1000_baseline  ... bench:      98,802 ns/iter (+/- 1,117)
test hash_set::fx::with_one_new_1000                  ... bench:      42,484 ns/iter (+/- 1,239)
test hash_set::fx::with_one_new_rc_str_1000_baseline  ... bench:      15,000 ns/iter (+/- 233)
test hash_set::with_all_new_1000                      ... bench:     137,645 ns/iter (+/- 1,186)
test hash_set::with_all_new_rc_str_1000_baseline      ... bench:     163,129 ns/iter (+/- 1,725)
test hash_set::with_one_new_1000                      ... bench:      59,051 ns/iter (+/- 1,202)
test hash_set::with_one_new_rc_str_1000_baseline      ... bench:      37,986 ns/iter (+/- 771)
2020-02-24 14:56:28 -05:00
Mike Gerwitz 541fbffc2e tameld: Move documentation to tamer::ld 2020-02-24 14:56:28 -05:00
Mike Gerwitz f2b24e6505 HashMapInterner: New interner, docs, and benchmarks
This interner will be suitable for providing an index to look up nodes in
the ASG.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 9a98644213 TAMER: sym::tests: Generate with macro
This will be used for generating the common tests between HashSet and
HashMap implementations.

This is my first macro in Rust.  There does not seem to be a way to
concatenate identifiers (!), so I'm placing them within modules
instead.  That ended up working out just fine, since then I can use a type
to provide the SUT.
2020-02-24 14:56:28 -05:00
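On the test macro above (9a98644213), a minimal sketch of the module-per-implementation approach: `macro_rules!` cannot build new identifiers by concatenation, so the same function name is generated inside differently named modules, and a type alias supplies the system under test. The macro `map_tests`, the module names, and the check itself are hypothetical; in a real test suite the generated function would carry `#[test]` and assert directly rather than return a `bool`.

```rust
/// Hypothetical macro: for each `module: Type` pair, generate a module
/// containing the same check, run against `Type` as the SUT.
macro_rules! map_tests {
    ($($module:ident: $sut:ty),* $(,)?) => {
        $(
            pub mod $module {
                // The type alias is what provides the SUT.
                type Sut = $sut;

                // Same name in every module; the module path keeps the
                // generated functions distinct.
                pub fn keeps_one_entry_per_key() -> bool {
                    let mut sut = Sut::default();
                    sut.insert("a", 1);
                    sut.insert("a", 2);
                    sut.len() == 1 && sut.get("a") == Some(&2)
                }
            }
        )*
    };
}

// Standard-library maps stand in for the HashSet- and HashMap-based
// interners mentioned in the commit message.
map_tests! {
    hash_map_sut: ::std::collections::HashMap<&'static str, u32>,
    btree_map_sut: ::std::collections::BTreeMap<&'static str, u32>,
}
```

The generated checks are then reachable as `hash_map_sut::keeps_one_entry_per_key()` and `btree_map_sut::keeps_one_entry_per_key()`.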
Mike Gerwitz e4e0089815 TAMER: Initial string interning abstraction
This is missing two key things that I'll add shortly: a HashMap-based one
for use in the ASG for node mapping, and an entry-based system for
manipulations.

This has been a nice start for exploring various aspects of Rust
development, as well as conventions that I'd like to implement.  In
particular:

  - Robust documentation intended to guide people through learning the
    necessary material about the compiler, as well as related work to
    rationalize design decisions;
  - Benchmarks;
  - TDD;
  - And just getting used to Rust in general.

I've beat this one to death, so I'll commit this and make smaller changes
going forward to show how easily it can evolve.

(This module was originally named `intern` but this commit and those that
follow rewrote it to `sym`.)
2020-02-24 14:56:28 -05:00
Mike Gerwitz 8455a38a1d Graph-based POC
This makes use of Petgraph for representing the dependency graph and uses a
separate data structure for both string interning and indexing by symbol
name.
2019-12-02 10:05:48 -05:00
Mike Gerwitz 8374541965 tamer: Initial basic POC with no XML output
This is garbage code.  Do not use it.  It is intentionally throwaway.

While I've researched Rust, I haven't actually _used_ it for a project, so
this is a combination of me exploring various ways of accomplishing the
problem and forcing myself to learn certain aspects of the language.

I'll likely be using petgraph, and this also currently lacks symbol
abstractions.  This commit also performs far too much heap allocation
copying strings around.  But it _does_ perform the topological sort.

Since this only stores the symbol name, it lacks enough information about
the symbol to perform a proper linking.
2019-12-02 10:00:53 -05:00
Mike Gerwitz 7412a8934c tameld: Placeholder binary 2019-11-20 10:11:00 -05:00
Mike Gerwitz fd1a5837ba TAMER: Initial commit 2019-11-18 14:05:47 -05:00