This is essential to clarify what exactly the different object types
represent with the new generic abstractions. For example, we will have
expressions as an object type.
There's a lot here to make the object stored on the `Asg` generic. This
introduces `ObjectState` for state transitions and `ObjectData` for pure
data retrieval. This will allow not only for mocking, but will be useful to
enforce compile-time restrictions on the type of objects expected by the
linker vs. the compiler (e.g. the linker will not have expressions).
This commit intentionally leaves the corresponding tests in their original
location to prove that the functionality has not changed; they'll be moved
in a future commit.
This also leaves the names as "Object" to reduce the number the cognative
overhead of this commit. It will be renamed to something like "IdentObject"
in the near future to clarify the intent of the current object type and to
open the way for expressions and a type that marries both of them in the
future.
Once all of this is done, we'll finally be able to make changes to the
compatibility logic in state transitions to implement extern compatibility
checks during resolution.
DEV-7087
The next commit will generalize this further. This moves logic out of
BaseAsg so that we can implement more sophisticated transitions for
compatability checks.
The logic is still tested as part of BaseAsg; the next commit will change
that as it's generalized further.
* tamer/src/ir/asg/base.rs: Extract object transitions.
* tamer/src/ir/asg/graph.rs (AsgError)[IncompatibleIdent]: New variant.
(From<TransitionError> for AsgError): Basic type translation.
* tamer/src/ir/asg/object.rs (TransitionResult): New type.
(impl Object): Transition methods.
(TransitionError): New enum.
This variant is unnecessary, as it was used only by the indexer to represent
the absence of a node, for which was can simply use `None` in the containing
`Option`.
* tamer/Cargo.toml: Add `lazy_static`.
* tamer/Cargo.lock: Update.
* tamer/src/ir/asg/base.rs (with_capacity): Use `None` in place of
`Some(Object::Empty)`.
* tamer/src/ir/asg/object.rs: Adjust state machine graphic.
(Empty): Remove variant.
(Missing): Remove reference to variance.
* tamer/src/lib.rs: Import `lazy_static` for test builds.
* tamer/obj/xmle/writer/writer.rs (Section::iter): Remove `Object::Empty`
from documentation.
(test::): Remove references to `Object::Missing`. `lazy_static!` used
here.
* tamer/obj/xmle/writer/xmle.rs (test::write_section_catch_missing): Replace
reference to `Object::Missing`.
Merge branch 'jira-7085'
* jira-7085:
TAMER: Tidy up graph_sort test
[DEV-7085] Create `SortableAsg` trait
[DEV-7085] Implement `PartialEq` for `Sections`
[DEV-7085] Move sections to IR module
This still isn't comprehensive. Further, it won't be able to be, because
we'd have to rely on Petgraph implementation details: there are potentially
many acceptable orderings for a given graph.
Create a trait that sorts a graph into `Sections` that can then be used
as an IR. The `BaseAsg` should implement the trait using what was
originally in the POC.
Merge branch 'jira-7134'
* jira-7134:
[DEV-7134] Remove unnecessary node replacement
[DEV-7134] Propagate errors from the writer
[DEV-7134] Propagate sorting errors
[DEV-7134] Propagate errors setting fragments
[DEV-7134] Pass read event errors up the stack
[DEV-7134] Return error for XmloEvent::SymDecl
[DEV-7134] Add alias for LoadResult
[DEV-7134] Remove unwrap so we can bubble up error messages
[DEV-7134] Escalate the error from finding the absolute path
If we cannot set a fragment, we need to display the error to the user.
We are currently ignoring "___head", "___tail", and objects that are
both virtual and overridden. Those will be corrected in with future
changes.
We want to add an option to set the output file to the linker so we do
not need to redirect output to awk any longer.
This also adds integration tests for tameld.
We will continue to finalize this as we go. It is currently used in
production, both for performance and because it fixes a bug in the
XSLT-based linker.
All systems should be using the provided Makefile, so this shouldn't be
invoked anymore. The new linker is still considered a proof-of-concept, but
bugs have been encountered in the old one that are not worth investing the
time into fixing.
The new linker has been used in production for nearly a couple months and is
functioning properly.
This begins to introduce the ASG, backed by Petgraph. The API will continue
to evolve, and Petgraph will likely be encapsulated so that our
implementation can vary independently from it (or even remove it in the
future).
This introduces the reader for xmlo files produced by the XSLT-based
compiler. It is an initial implementation but is not complete; see future
commits.
One of the benefits of storing a reference to the interned string on the
symbol itself is that we get to get its underlying value essentially for
free.
This ordering will simplify streaming processing of xmlo files in
TAMER. Specifically, we know that symbols will have been declared by the
time dependencies are added to the graph (and so we should only be creating
edges to existing nodes); and we can halt reading as soon as the closing
fragments tag is encountered, avoiding parsing the entirety of these massive
XML files.
On one particularly large program, this cuts time down from ~0.333s to
~0.300 in the POC linker.
Contrary to what I said previously, this replaces the previous
implementation with an arena-backed internment system. The motivation for
this change was investigating how Rustc performed its string interning, and
why they chose to associate integer identifiers with symbols.
The intent was originally to use Rustc's arena allocator directly, but that
create pulled in far too many dependencies and depended on nightly
Rust. Bumpalo provides a very similar implementation to Rustc's
DroplessArena, so I went with that instead.
Rustc also relies on a global, singleton interner. I do not do that
here. Instead, the returned Symbol carries a lifetime of the underlying
arena, as well as a pointer to the interned string.
Now that this is put to rest, it's time to move on.
For strings of any notable length, Fx Hash outperforms FNV. Rustc also
moved to this hash function and noticed performance
improvements. Fortunately, as was accounted for in the design, this was a
trivial switch.
Here are some benchmarks to back up that claim:
test hash_set::fnv::with_all_new_1000 ... bench: 133,096 ns/iter (+/- 1,430)
test hash_set::fnv::with_all_new_1000_with_capacity ... bench: 82,591 ns/iter (+/- 592)
test hash_set::fnv::with_all_new_rc_str_1000_baseline ... bench: 162,073 ns/iter (+/- 1,277)
test hash_set::fnv::with_one_new_1000 ... bench: 37,334 ns/iter (+/- 256)
test hash_set::fnv::with_one_new_rc_str_1000_baseline ... bench: 18,263 ns/iter (+/- 261)
test hash_set::fx::with_all_new_1000 ... bench: 85,217 ns/iter (+/- 1,111)
test hash_set::fx::with_all_new_1000_with_capacity ... bench: 59,383 ns/iter (+/- 752)
test hash_set::fx::with_all_new_rc_str_1000_baseline ... bench: 98,802 ns/iter (+/- 1,117)
test hash_set::fx::with_one_new_1000 ... bench: 42,484 ns/iter (+/- 1,239)
test hash_set::fx::with_one_new_rc_str_1000_baseline ... bench: 15,000 ns/iter (+/- 233)
test hash_set::with_all_new_1000 ... bench: 137,645 ns/iter (+/- 1,186)
test hash_set::with_all_new_rc_str_1000_baseline ... bench: 163,129 ns/iter (+/- 1,725)
test hash_set::with_one_new_1000 ... bench: 59,051 ns/iter (+/- 1,202)
test hash_set::with_one_new_rc_str_1000_baseline ... bench: 37,986 ns/iter (+/- 771)