employer/tame - tame - Mike Gerwitz's Forge

Author	SHA1	Message	Date
Mike Gerwitz	28b83ad6a3	tamer: asg::graph::AsgObjectMut: Allow objects to assert ownership over relationships There's a lot to say about this; it's been a bit of a struggle figuring out what I wanted to do here. First: this allows objects to use `AsgObjectMut` to control whether an edge is permitted to be added, or to cache information about an edge that is about to be added. But no object does that yet; it just uses the default trait implementation, and so this _does not change any current behavior_. It also is approximately equivalent cycle-count-wise, according to Valgrind (within ~100 cycles out of hundreds of millions on large package tests). Adding edges to the graph is still infallible _after having received permission_ from an `ObjectIndexRelTo`, but the object is free to reject the edge with an `AsgError`. As an example of where this will be useful: the template system needs to keep track of what is in the body of a template as it is defined. But the `TplAirAggregate` parser is sidelined while expressions in the body are parsed, and edges are added to a dynamic source using `ObjectIndexRelTo`. Consequently, we cannot rely on a static API to cache information; we have to be able to react dynamically. This will allow `Tpl` objects to know any time edges are added and, therefore, determine their shape as the graph is being built, rather than having to traverse the tree after encountering a close. (I _could_ change this, but `ObjectIndexRelTo` removes a significant amount of complexity for the caller, so I'd rather not.) I did explore other options. I rejected the first one, then rejected this one, then rejected the first one again before returning back to this one after having previously sidelined the entire thing, because of the above example. 
The core point is: I need confidence that the graph isn't being changed in ways that I forgot about, and because of the complexity of the system and the heavy refactoring that I do, I need the compiler's help; otherwise I risk introducing subtle bugs as objects get out of sync with the actual state of the graph. (I wish the graph supported these things directly, but that's a project well outside the scope of my TAMER work. So I have to make do, as I have been all this time, by layering atop of Petgraph.) (...I'm beginning to ramble.) (...beginning?) Anyway: my other rejected idea was to provide attestation via the `ObjectIndex` APIs to force callers to go through those APIs to add an edge to the graph; it would use sealed objects that are inaccessible to any modules other than the objects, and assert that the caller is able to provide a zero-sized object of that sealed type. The problem with this is...exactly what was mentioned above: `ObjectIndexRelTo` is dynamic. We don't always know the source object type statically, and so we cannot make those static assertions. I could have tried the same tricks to store attestation at some other time, but what a confusing mess it would be. And so here we are. Most of this work is cleaning up the callers---adding edges is now fallible, from the `ObjectIndex` API standpoint, and so AIR needed to be set up to handle those failures. There _aren't_ any failures yet, but again, since things are dynamic, they could appear at any moment. Furthermore, since ref/def is commutative (things can be defined and referenced in any order), there could be surprise errors on edge additions in places that might not otherwise expect it in the future. We're now ready for that, and I'll be able to e.g. traverse incoming edges on a `Missing->Transparent` definition to notify dependents. This project is going to be the end of me. As interesting as it is. I can see why Rust just chose to require macro definitions _before_ use. So much less work. 
DEV-13163	2023-07-24 16:41:32 -04:00
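
The permission step described in the commit above can be sketched roughly as follows. This is only an illustration, not TAMER's code: `EdgeOwner`, `EdgeError`, `Plain`, `Counting`, and `Graph` are all hypothetical names standing in for the roles that the message assigns to `AsgObjectMut`, `AsgError`, and the graph. The source object is asked before the edge is added; the default implementation permits everything, so existing behavior does not change until an object opts in.

```rust
/// Hypothetical error type, standing in for the role of `AsgError`.
#[derive(Debug, PartialEq)]
pub enum EdgeError {
    Rejected(&'static str),
}

/// Hypothetical hook (the role the commit gives to `AsgObjectMut`):
/// the source object is consulted before an outgoing edge is added.
pub trait EdgeOwner {
    /// Default: permit every edge and cache nothing.
    fn pre_add_edge(&mut self, _target: usize) -> Result<(), EdgeError> {
        Ok(())
    }
}

/// An object that relies entirely on the default behavior.
pub struct Plain;
impl EdgeOwner for Plain {}

/// An object that caches how many edges it owns and caps them.
pub struct Counting {
    pub seen: usize,
    pub max: usize,
}

impl EdgeOwner for Counting {
    fn pre_add_edge(&mut self, _target: usize) -> Result<(), EdgeError> {
        if self.seen == self.max {
            return Err(EdgeError::Rejected("too many edges"));
        }
        self.seen += 1; // cache updated in lockstep with the graph
        Ok(())
    }
}

/// Toy graph: nodes are boxed objects, edges are index pairs.
pub struct Graph {
    nodes: Vec<Box<dyn EdgeOwner>>,
    edges: Vec<(usize, usize)>,
}

impl Graph {
    pub fn new() -> Self {
        Graph { nodes: Vec::new(), edges: Vec::new() }
    }

    pub fn add_node(&mut self, obj: Box<dyn EdgeOwner>) -> usize {
        self.nodes.push(obj);
        self.nodes.len() - 1
    }

    /// Fallible from the caller's view; once the source object has
    /// granted permission, the insertion itself cannot fail.
    pub fn add_edge(&mut self, from: usize, to: usize) -> Result<(), EdgeError> {
        self.nodes[from].pre_add_edge(to)?;
        self.edges.push((from, to));
        Ok(())
    }

    pub fn edge_count(&self) -> usize {
        self.edges.len()
    }
}
```

Because the check is a method on the object rather than a static type-level assertion, it still works when the source kind is only known at runtime, which is the situation the message describes for `ObjectIndexRelTo`.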
Mike Gerwitz	e414782def	tamer: asg::graph: Encapsulate edge additions AIR is no longer able to explicitly add edges without going through an object-specific `ObjectIndex` API. `Asg::add_edge` was already private, but `ObjectIndex::add_edge_{to,from}` was not. The problem is that I want to augment the graph with other invariants, such as caches. I'd normally have this built into the graph system itself, but I don't have the time for the engineering effort to extend or replace Petgraph, so I'm going to build atop of it. To have confidence in any sort of caching, I need assurances that the graph can't change out from underneath an object. This gets _close_ to accomplishing that, but I'm still uncomfortable: - We're one `pub` addition away from breaking these invariants; and - Other `Object` types can still manipulate one another's edges. So this is a first step that at least proves encapsulation within `asg::graph`, but ideally we'd have the system enforce, statically, that `Objects` own their _outgoing_ edges, and no other `Object` is able to manipulate them. This would ensure that any accidental future changes, or bugs, will cause compilation failures rather than e.g. allowing caches to get out of sync with the graph. DEV-13163	2023-07-21 10:21:57 -04:00
Mike Gerwitz	19a5ec1e0f	tamer: asg: Reduce Debug output of `Asg` and `AirAggregateCtx` The ASG had its output reduced previously but I had apparently stashed it; I found it while trying to clean up after so many failed or partial attempts and the various scoping changes. The most fundamental issue is that there's too much information: it's very difficult to interrogate so I seldom look at it, and it slows down Parser trace output to the point where it's useless on even one of our smallest systems, generating 1.5GiB of output for a graph of ~10k objects (via tameld). DEV-13162	2023-05-23 16:15:38 -04:00
Mike Gerwitz	e940fc5aa0	tamer: asg: Move index from Asg to AirAggregateCtx This finally removes the awkward index from the ASG. This will need much more documentation and a better organized abstraction, but in the meantime, previous commits dive into some of the rationale. In essence: it only really makes sense to have indexing on the ASG itself if it is used to cache queries or other expensive operations. But that is not what we were using it for---it was used for caching _lexical_ properties, which are useful only during parsing for the sake of forming relationships on the graph. Once those relationships have formed, different types of indexes will be useful in different lowering, optimization, or querying contexts. This formalizes that, and in doing so, ensures that the index will always be accurate relative to the content of the ASG. Once the index becomes separated from it---through the `AirAggregateCtx::finish` operation---then it is discarded and the ASG exposed. This is also important because the index is incomplete---it contains only the information necessary for the parser to carry out its task. This change was a long time coming, and has reduced ASG to its essence. DEV-13162	2023-05-19 13:38:17 -04:00
Mike Gerwitz	94bbc2d725	tamer: asg::air: Root AirIdent operations using AirAggregateCtx This is the culmination of a great deal of work over the past few weeks. Indeed, this change has been prototyped a number of different ways and has lived in a stash of mine, in one form or another, for a few weeks. This is not done just yet---I have to finish moving the index out of Asg, and then clean up a little bit more---but this is a significant simplification of the system. It was very difficult to reason about prior approaches, and this finally moves toward doing something that I wasn't sure if I'd be able to do successfully: formalize scope using AirAggregate's stack and encapsulate indexing as something that is _supplemental_ to the graph, rather than an integral component of it. This _does not yet_ index the AirIdent operation on the package itself because the active state is not part of the stack; that is one of the remaining changes I still have stashed. It will be needed shortly for package imports. This rationale will have to appear in docs, which I intend to write soon, but: this means that `Asg` contains _resolved_ data and itself has no concept of scope. The state of the ASG immediately after parsing _can_ be used to derive what the scope _must_ be (and indeed that's what `asg::air::test::scope::derive_scopes_from_asg` does), but once we start performing optimizations, that will no longer be true in all cases. This means that lexical scope is a property of parsing, which, well, seems kind of obvious from its name. But the awkwardness was that, if we consider scope to be purely a parse-time thing---used only to construct the relationships on the graph and then be discarded---then how do we query for information on the graph? We'd have to walk the graph in search of an identifier, which is slow. But when do we need to do such a thing? For tests, it doesn't matter if it's a little bit slow, and the graphs aren't all that large. 
And for operations like template expansion and optimizations, if they need access to a particular index, then we'll be sure to generate or provide the appropriate one. If we need a central database of identifiers for tooling in the future, we'll create one then. No general-purpose identifier lookup _is_ actually needed. And with that, `Asg::lookup_or_missing` is removed. It has been around since the beginning of the ASG, when the linker was just a prototype, so it's the end of TAMER's early era as I was trying to discover exactly what I wanted the ASG to represent. DEV-13162	2023-05-17 12:23:36 -04:00
Mike Gerwitz	33f34bf244	tamer: asg: Initial identifier scoping Okay, this is finally distilling into something fairly simple and reasonable, but I'm not quite there yet. In particular, the responsibility is simply between `Asg` (as the owner of the index) and `AirAggregateCtx` (as the owner of the stack frames from which environments and scope are derived). This was inevitable and I was waiting for it, but now I have a good idea of how to clean it up and proceed. This also doesn't index in root yet (`active_rooting_oi` is still `None` for `Root`), and I think I may remove `Pool` and just make it `Visible` at that point, since it won't be going any further anyway. I don't think the distinction is meaningful and will just complicate implementations. The tests also need some more cleanup---the assertions ideally would live in independent tests, and the assertion failure is in a function call rather than the test (function) itself, so requires a Rust backtrace to locate the line number of (unless you look at the failure data). So I suppose this is more of a mental synchronization point than anything. Nothing's broken, though. DEV-13162	2023-05-16 14:58:21 -04:00
Mike Gerwitz	9fb2169a06	tamer: asg::air: Begin to introduce explicit scope testing There's a lot of documentation on this in the commit itself, but this stems from a) frustration with trying to understand how the system needs to operate with all of the objects involved; and b) recognizing that if I'm having difficulty, then others reading the system later on (including myself) and possibly looking to improve upon it are going to have a whole lot of trouble. Identifier scope is something I've been mulling over for years, and more formally for the past couple of months. This finally begins to formalize that, out of frustration with package imports. But it will be a weight lifted off of me as well, with issues of scope always looming. This demonstrates a declarative means of testing for scope by scanning the entire graph in tests to determine where an identifier has been scoped. Since no such scoping has been implemented yet, the tests demonstrate how they will look, but otherwise just test for current behavior. There is more existing behavior to check, and further there will be _references_ to check, as they'll also leave a trail of scope indexing behind as part of the resolution process. See the documentation introduced by this commit for more information on that part of this commit. Introducing the graph scanning, with the ASG's static assurances, required more lowering of dynamic types into the static types required by the API. This was itself a confusing challenge that, while not all that bad in retrospect, was something that I initially had some trouble with. The documentation includes clarifying remarks that hopefully make it all understandable. DEV-13162	2023-05-12 14:07:29 -04:00
Mike Gerwitz	7cfe6a6f8d	tamer: asg::graph: Index Root->Pkg with canonical names The previous commit introduced canonical names, and this uses them to index. The next step will be to utilize those names to look up packages on definition rather than creating a new package node, so that references to yet-to-be-defined (or yet-to-be-imported) packages can be resolved on the graph. DEV-13162	2023-05-02 16:15:07 -04:00
Mike Gerwitz	77ada079e1	tamer: asg::graph::Asg.graph: Finally encapsulate With the previous commit using a visitor implemented within the `asg` module, we can now finally encapsulate the graph. This is a wonderfully liberating, long-awaited change, since I have been fighting with the lack of encapsulation for some time; it has made certain changes challenging and has made the system more difficult to reason about. It also made it impossible to assert that invariants were _actually_ properly enforced, if things could just peer into and modify the graph directly, out from underneath the API that provides those assurances. This also removes our dependency on Petgraph outside of the `asg` module. There are no plans to migrate away from it currently; we'll see how the graph continues to evolve over time and what redundancies are introduced with our data structures. It may render petgraph unnecessary. Interestingly, because my DFS implementation is so similar to Petgraph's, the emitted ordering is _identical_ between this commit and the previous. DEV-13162	2023-04-28 15:36:07 -04:00
Mike Gerwitz	e3094e0bad	tamer: asg::graph::visit::topo: Introduce topological sort This is an initial implementation that does not yet produce errors on cycles. Documentation is not yet complete. The implementation is fairly basic, and similar to Petgraph's DFS. A terminology note: the DFS will be ontology-aware (or at least aware of edge metadata) to avoid traversing edges that would introduce cycles in situations where they are permitted, which effectively performs a topological sort on an implicitly _filtered_ graph. This will end up replacing ld::xmle::lower::sort. DEV-13162	2023-04-26 09:51:45 -04:00
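
For readers unfamiliar with the technique named in the commit above, a DFS-based topological sort can be sketched as a post-order walk. This is a generic illustration under stated assumptions, not the `asg::graph::visit::topo` implementation: the adjacency-list representation and the name `topo_sort` are hypothetical, and, like the initial implementation described, it does not report cycles (the `seen` marks only keep it from looping).

```rust
/// Post-order DFS over an adjacency list, where `edges[n]` lists the
/// nodes that node `n` depends on. Each node is emitted only after all
/// of its dependencies, which yields a topological order for acyclic
/// input. Cycles are silently tolerated, not reported.
pub fn topo_sort(edges: &[Vec<usize>], roots: &[usize]) -> Vec<usize> {
    fn visit(n: usize, edges: &[Vec<usize>], seen: &mut [bool], out: &mut Vec<usize>) {
        if seen[n] {
            return; // already emitted, or currently on the stack
        }
        seen[n] = true;
        for &dep in &edges[n] {
            visit(dep, edges, seen, out);
        }
        out.push(n); // post-order: dependencies first
    }

    let mut seen = vec![false; edges.len()];
    let mut out = Vec::with_capacity(edges.len());
    for &root in roots {
        visit(root, edges, &mut seen, &mut out);
    }
    out
}
```

An ontology-aware variant, as the message describes, would additionally skip certain edges inside the `for` loop based on edge metadata, so that the sort runs over an implicitly filtered graph.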
Mike Gerwitz	42aa5bd407	tamer: asg::graph: Root->Ident {tree=>cross} edge tameld isn't yet adding edges to Idents from their associated Pkg (see previous commit), but this formalizes how the ontology will interpret such a relationship. The idea is that Idents are always owned by Pkgs, but they may be optionally explicitly rooted, which will be used by a particular type of DFS walk that is about to be written, which can ignore Root->Pkg and focus instead on cross edges to Idents. Though it's not lost on me that now that I'll be introducing a DFS for the linker, the terms "cross" and "tree" edge now become ambiguous; I used to call them "ontological X edge", but I had fallen out of that habit; perhaps I need to reintroduce that rigor. DEV-13162	2023-04-24 09:44:02 -04:00
Mike Gerwitz	48d9bca3b7	tamer: obj::xmlo: Add Pkg nodes for identifiers This modifies the xmlo reader, xmlo->AIR lowering, and AIR->ASG to introduce a package for identifiers. It does not yet, however, add edges from the package to the identifier. Once edges are added, the DFS will change in undesirable ways, which will require a new implementation. This is desirable to decouple from Petgraph anyway, and then will be able to restore the prior single-pass sort+cycle check. That will also encapsulate visiting behavior within the `asg::graph` module and, in turn, allow encapsulating `Asg.graph` finally. DEV-13162	2023-04-21 16:24:11 -04:00
Mike Gerwitz	6f68292df5	tamer: asg::graph::{index_identifier=>index}: Generalize This may now index _any_ type of object, in preparation for indexing package import paths. In practice, this only makes sense (at least currently) for `Pkg` and `Ident`. This generalization also applies to `Asg::lookup_or_missing`. DEV-13162	2023-04-20 16:46:30 -04:00
Mike Gerwitz	f183600c3a	tamer: asg: Move Ident-specific methods off of Asg Historically, the ASG was better described as a "dependency graph", containing only identifiers (which are simply called "symbols" in the XSLT-based compiler). Consequently, it was appropriate for the graph to have operations specific to identifiers. (Indeed, that's the only type of object the graph supported.) Much has changed since then. This cleans things up, and makes parenting identifiers to root an _explicit_ operation. This will make it easier to move forward with handling of scope, and importing identifiers into packages, and removing `Source`, and so on. DEV-13162	2023-04-19 12:40:35 -04:00
Mike Gerwitz	778e90c81d	tamer: asg::air: Index package identifiers on `Pkg` rather than `Root` I've been torturing myself trying to figure out how I want to generalize indexing, lookups, and value numbering in a way that is appropriate for this project (that is, not over-engineered relative to my needs). Before I can do much of anything, though, I need to stop having indexing only as a `Root` thing (previously it wasn't even tied to `Root`). This makes that change for tamec, but temporarily removes scoping concerns until I can add more specific types of indexing. Not only does this allow cleaning up some `Ident`-specific stuff from `Asg`, but the cleanup also helps to show that portions of the system aren't still using Root-based globals. The linker (`tameld`) still uses the old `global` methods for now; those will eventually go away, but this needs to change to unify both tamec and tameld once we get to imports as part of the compiler. DEV-13162	2023-04-19 12:40:34 -04:00
Mike Gerwitz	a738a05461	tamer: asg::graph::object::rel: Hash impls for ObjectIndexTo{,Tree} All ObjectIndex-like objects hash using only the underlying identifier, which ultimately boils down to a `NodeIndex` (petgraph), which is just a u32. And so in that sense, the only purpose we have for hashing it is to (a) reduce the space required to store mappings, and (b) compose with other `Hash`es. DEV-13708	2023-04-05 15:46:42 -04:00
Mike Gerwitz	02dba0d63a	tamer: asg::graph::Asg: Index by (SymbolId, NodeIndex) pair The prior commit begins to explain the end goal of being able to index identifiers outside of the global environment. This change continues to index things as before, but introduces a new key based on the pair of the symbol id together with a node that is _part of_ its target environment. The only environment utilized at the moment (in this commit) is that of the root node (which is the global scope), in both indexing and lookup. Future commits will extend this, and contain more information about and rationale for the implementation. The new general index methods are restricted to `pub(super)` until an abstraction can be put in place that is responsible for environment indexing; that's a responsibility that is currently handled by `AirAggregateCtx` for tamec, and the linker has no scoping requirements since all of that has already been dealt with. DEV-13708	2023-04-03 16:14:30 -04:00
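
The pair-keyed index described in the commit above can be pictured with a small sketch. Everything here is an assumption made for illustration: `SymbolId` and `NodeIndex` are plain `u32` aliases rather than the real types, `EnvIndex` is a hypothetical name, and the standard library's `HashMap` is swapped in for `FxHashMap` so that the example needs no external crate.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real types are richer than bare integers.
pub type SymbolId = u32;
pub type NodeIndex = u32;

/// Index keyed by an identifier's name together with a node that is
/// part of its target environment.
#[derive(Default)]
pub struct EnvIndex {
    map: HashMap<(SymbolId, NodeIndex), NodeIndex>,
}

impl EnvIndex {
    /// Record `obj` under `name` in environment `env`, returning any
    /// object previously indexed under the same pair.
    pub fn index(&mut self, name: SymbolId, env: NodeIndex, obj: NodeIndex) -> Option<NodeIndex> {
        self.map.insert((name, env), obj)
    }

    /// Look up `name` in environment `env` only.
    pub fn lookup(&self, name: SymbolId, env: NodeIndex) -> Option<NodeIndex> {
        self.map.get(&(name, env)).copied()
    }
}
```

With a key like this, the same name can resolve to different objects in different environments; using the root node as `env` everywhere reproduces the single global scope that the message says is the only one in use at this commit.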
Mike Gerwitz	5b0a4561a2	Revert "Revert "tamer: asg::graph::index: Use FxHashMap in place of Vec"" This reverts commit 1b7eac337cd5909c01ede3a5b3fba577898d5961. This is a revert of the previous revert, just so that I (and you) have references to prior rationale. This was previously reverted because it wasn't worth doing, but now we have a situation where we need to begin implementing lexical scoping rules for nested containers (packages and templates). In particular, as you'll see in the commits that follow, we need to be able to look up an identifier that may have been created as Missing at one level of scope (certain types of blocks), but then define it at another level. Or, even more simply at this point, since I'm not yet doing anything sophisticated with scope: we're only indexing in the global environment, and we need to be able to index elsewhere too. The next commit will go into more information, but suffice it to say for now that indexing is going to get more complicated than a SymbolId. Sticking with FxHash for now; we don't need a stable hash now. DEV-13708	2023-04-03 15:15:54 -04:00
Mike Gerwitz	e3d60750a9	tamer: asg::air: Errors for rooting_ci() TODOs This eliminates the TODOs that existed when looking for an OI for rooting an identifier. The change to `rooting_ci` is ridiculous, but I want to get other things done before I jump down the rabbit hole of generalizing that (indexing local identifiers). Though I have an approach in mind. DEV-13708	2023-03-31 13:57:11 -04:00
Mike Gerwitz	2ae33a1dfa	tamer: asg::graph::object: ObjectIndexTo and ObjectIndexRelTo The graph's ontology is defined in the direction of the edge: from OA to OB. This is enforced by the type system to ensure that no code path is able to generate an invalid graph. But that also makes it very difficult to work with a generic source to a specific target. This introduces a `ObjectIndexRelTo` trait that says whether `Self` is able to be related to some `ObjectKind` `OB`, implements it for `ObjectIndex where ObjectRelTo<OB>`, and introduces a new semi-opaque type `ObjectIndexTo` that allows for the source `ObjectIndex` to be generic. This then redefines some existing graph primitives in terms of `ObjectIndexRelTo`, in particular creating edges, so that `ObjectIndex` can be used as today, and the new `ObjectIndexTo` can be used in the same way with the same API, without violating the graph ontology. This will be used by `AirAggregate` to create dynamic targets for rooting and splicing/expansion. DEV-13708	2023-03-29 12:58:35 -04:00
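
A rough sketch of the idea in the commit above: the ontology is encoded as trait implementations, a typed index proves its source kind statically, and a source-erased index remembers only which target kind it may point to. The names `RelTo`, `Index`, `IndexTo`, `IndexRelTo`, and `add_edge` are simplified, hypothetical stand-ins and do not reproduce TAMER's actual `ObjectRelTo`, `ObjectIndex`, `ObjectIndexTo`, or `ObjectIndexRelTo` definitions.

```rust
use std::marker::PhantomData;

// Hypothetical object kinds.
pub struct Pkg;
pub struct Ident;
pub struct Expr;

/// Ontology: `OA: RelTo<OB>` means an edge OA -> OB is permitted.
pub trait RelTo<OB> {}
impl RelTo<Ident> for Pkg {}
impl RelTo<Expr> for Ident {}

/// Index that statically knows the kind of object it refers to.
pub struct Index<O> {
    id: u32,
    _kind: PhantomData<O>,
}

impl<O> Index<O> {
    pub fn new(id: u32) -> Self {
        Index { id, _kind: PhantomData }
    }
    pub fn id(&self) -> u32 {
        self.id
    }
}

/// Source-erased index: forgets the source kind but remembers that it
/// was permitted to relate to `OB` when it was created.
pub struct IndexTo<OB> {
    src: u32,
    _to: PhantomData<OB>,
}

impl<OB> IndexTo<OB> {
    /// Construction is the only place the ontology is checked.
    pub fn from_source<OA: RelTo<OB>>(src: &Index<OA>) -> Self {
        IndexTo { src: src.id, _to: PhantomData }
    }
}

/// Anything that may act as the source of an edge to `OB`.
pub trait IndexRelTo<OB> {
    fn source_id(&self) -> u32;
}

impl<OA: RelTo<OB>, OB> IndexRelTo<OB> for Index<OA> {
    fn source_id(&self) -> u32 {
        self.id
    }
}

impl<OB> IndexRelTo<OB> for IndexTo<OB> {
    fn source_id(&self) -> u32 {
        self.src
    }
}

/// One edge primitive serves both typed and source-erased callers.
pub fn add_edge<OB>(edges: &mut Vec<(u32, u32)>, from: &impl IndexRelTo<OB>, to: &Index<OB>) {
    edges.push((from.source_id(), to.id()));
}
```

In this sketch, attempting `add_edge` from an `Index<Expr>` to an `Index<Ident>` fails to compile because no `RelTo<Ident>` implementation exists for `Expr`, while a caller holding only an `IndexTo<Ident>` can still add edges without knowing what the source is.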
Mike Gerwitz	9c0e20e58c	tamer: asg: Shorthand and long-form template arguments This applies to template application only; there's still some work to do for template parameters in definitions (well, for deriving them in `xmli` at least). And, as you can see, there's still a lot of TODO items here. I ended up backtracking on tree edges to Meta, and even on cross edges to Meta, because it complicated xmli derivation with no benefit right now; maybe a cross edge will be re-added in the future, but I need to move on and see where this takes me. But, it works. DEV-13708	2023-03-29 12:58:35 -04:00
Mike Gerwitz	fcd25d581c	tamer: asg::air::expr: Do not cache (globally) identifiers created with StoreDangling I'm not happy with this implementation. The linear search is undesirable, but not too bad (and maybe wouldn't even be worth caching, if this were the whole story), but we _also_ need to prevent duplicate identifiers. We are not going to want to perform a linear search of a linked list (effectively) every time we add an identifier to check for uniqueness, so I think the caching is going to have to be generalized very shortly anyway. As it stands now, a duplicate identifier would cause an error at expansion time. That's not what we want, but it's not terrible, because you can have that same problem in normal circumstances without local conflicts. But this'll be used for metavariables as well, where we absolutely _do_ want to fail at template definition time. DEV-13708	2023-03-29 12:58:35 -04:00
Mike Gerwitz	1c7df894ea	tamer: asg::graph: lookup{=>_global} Identifier lookups, as done using the graph methods today, look up from a cache representing the global environment. Templates must not contribute to this environment until expansion. Further, metavariables will not be present in this environment. To avoid confusion and help obviate accidental contributions to this environment, the methods have been renamed. This will also allow for the creation of more general methods down the line. DEV-13708	2023-03-29 12:58:35 -04:00
Mike Gerwitz	a5b03e8790	tamer: Embed ASG ontology visualization in rustdoc-generated docs There, in-your-face and not hidden in some tools directory. DEV-13708	2023-03-10 14:28:00 -05:00
Mike Gerwitz	3587d032c3	tamer: asg::graph::object::rel::DynObjectRel: Store source data This is generic over the source, just as the target, defaulting just the same to `ObjectIndex`. This allows us to use only the edge information provided rather than having to perform another lookup on the graph and then assert that we found the correct edge. In this case, we're dealing with an `Ident->Expr` edge, of which there is only one, but in other cases, there may be many such edges, and it wouldn't be possible to know _which_ was referred to without also keeping context of the previous edge in the walk. So, in addition to avoiding more indirection and being more immune to logic bugs, this also allows us to avoid states in `AsgTreeToXirf` for the purpose of tracking previous edges in the current path. And it means that the tree walk can seed further traversals in conjunction with it, if that is so needed for deriving sources. More cleanup will be needed, but this does well to set us up for moving forward; I was too uncomfortable with having to do the separate lookup. This is also a more intuitive API. But it does have the awkward effect that now I don't need the pair---I just need the `Object`---but I'm not going to remove it because I suspect I may need it in the future. We'll see. The TODO references the fact that I'm using a convenient `resolve_oi_pairs` instead of resolving only the target first and then the source only in the code path that needs it. I'll want to verify that Rust will properly optimize to avoid the source resolution in branches that do not need it. DEV-13708	2023-03-10 14:27:58 -05:00
Mike Gerwitz	ee9128fbe0	tamer: asg::graph::{object::xir=>xmli}: Rename module This better reflects what is being done and makes it easier for someone to find. DEV-13708	2023-03-10 14:27:58 -05:00
Mike Gerwitz	7f3ce44481	tamer: asg::graph: Formalize dynamic relationships (edges) The `TreePreOrderDfs` iterator needed to expose additional edge context to the caller (specifically, the `Span`). This was getting a bit messy, so this consolidates everything into a new `DynObjectRel`, which also emphasizes that it is in need of narrowing. Packing everything up like that also allows us to return more information to the caller without complicating the API, since the caller does not need to be concerned with all of those values individually. Depth is kept separate, since that is a property of the traversal and is not stored on the graph. (Rather, it _is_ a property of the graph, but it's not calculated until traversal. But, depth will also vary for a given node because of cross edges, and so we cannot store any concrete depth on the graph for a given node. Not even a canonical one, because once we start doing inlining and common subexpression elimination, there will be shared edges that are _not_ cross edges (the node is conceptually part of _both_ trees). Okay, enough of this rambling parenthetical.) DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	e6f736298b	tamer: asg::graph::visit::tree_reconstruction: New graph traversal This begins to introduce a graph traversal useful for a source reconstruction from the current state of the ASG. The idea is, after having parsed and ingested the source through the lowering pipeline, to re-output it to (a) prove that we have parsed correctly and (b) allow progressively moving things from the XSLT-based compiler into TAMER. There's quite a bit of documentation here; see that for more information. Generalizing this in an appropriate way took some time, but I think this makes sense (that work began with the introduction of cross edges in terms of the tree described by the graph's ontology). But I do need to come up with an illustration to include in the documentation. DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	2d3b27ac01	tamer: asg: Root package definition This causes a package definition to be rooted (so that it can be easily accessed for a graph walk). This keeps consistent with the new `ObjectIndex`-based API by introducing a unit `Root` `ObjectKind` and the boilerplate that goes with it. This boilerplate, now glaringly obvious, will be refactored at some point, since its repetition is onerous and distracting. DEV-13159	2023-02-01 10:34:17 -05:00
Mike Gerwitz	f753a23bad	tamer: asg: Introduce edge from Package to Ident Included in this diff are the corresponding changes to the graph to support the change. Adding the edge was easy, but we also need a way to get the package for an identifier. The easiest way to do that is to modify the edge weight to include not just the target node type, but also the source. DEV-13159	2023-02-01 10:34:17 -05:00
Mike Gerwitz	2f08985111	tamer: asg::graph::object::new_rel_dyn: Use Option Rather than panicking at this level, let's panic at the caller, simplifying impls and keeping them total. This can't occur now, but an upcoming change introducing a package type will allow for such a thing. DEV-13159	2023-02-01 10:34:16 -05:00
Mike Gerwitz	e6abd996b7	tamer: asg::graph::Asg: Non-exhaustive Debug impl This hides information that's taking up a lot of space in the parser traces and is not useful information. In particular, the `index` contains a lot of empty space due to pre-interned symbols. The index was going to be converted into a HashMap, but that was reverted because the tradeoff did not make sense, and so this problem remains; see the previous commit for more information. DEV-13159	2023-02-01 10:34:16 -05:00
Mike Gerwitz	d066bb370f	Revert "tamer: asg::graph::index: Use FxHashMap in place of Vec" This reverts commit 1b7eac337cd5909c01ede3a5b3fba577898d5961. I don't actually think this ends up being worth it in the end. Sure, the implementation is simpler at a glance, but it is more complex at runtime, adding more cycles for little benefit. There are ~220 pre-interned symbols at the time of writing, so ~880 bytes (4 bytes per symbol) are potentially wasted if _none_ of the pre-interned symbols end up serving as identifiers in the graph. The reality is that some of them _will_, but using HashMap also introduces overhead, so in practice, the savings is much less. On a fairly small package, it was <100 bytes memory saving in `tamec`. For `tameld`, it actually uses _more_ memory, especially on larger packages, because there are 10s of thousands of symbols involved. And we're incurring a rehashing cost on resize, unlike this original plain `Vec` implementation. So, I'm leaving this in the history to reference in the future or return to it if others ask; maybe it'll be worth it in the future.	2023-02-01 10:34:16 -05:00
Mike Gerwitz	417df548cf	tamer: asg::graph::index: Use FxHashMap in place of Vec This was originally written before there were a bunch of preinterned symbols. Now the index vector is very sparse. This simplifies things a bit. If this ends up manifesting as a bottleneck in the future, we can revisit the implementation. While this does result in more cycles, it's negligible relative to the total cycle count.	2023-02-01 10:34:16 -05:00
Mike Gerwitz	055ff4a9d9	tamer: Remove graphml target This was originally created to populate Neo4J for querying, but it has not been utilized. It's become a maintenance burden as I try to change the API of and encapsulate the graph, which is important for upholding its invariants. This feature, or one like it, will return in the future. I have other related plans; we'll see if they materialize. The graph can't be encapsulated fully just yet because of the linker; those commits will come in the following days. DEV-13597	2023-01-26 14:45:17 -05:00
Mike Gerwitz	8735c2fca3	tamer: asg::graph: Static- and runtime-enforced multi-kind edge ontology This allows for edges to be multiple types, and gives us two important benefits: (a) Compiler-verified correctness to ensure that we don't generate graphs that do not adhere to the ontology; and (b) Runtime verification of types, so that bugs are still memory safe. There is a lot more information in the documentation within the patch. This took a lot of iterating to get something that was tolerable. There's quite a bit of boilerplate here, and maybe that'll be abstracted away better in the future as the graph grows. In particular, it was challenging to determine how I wanted to actually go about narrowing and looking up edges. Initially I had hoped to represent the subsets as `ObjectKind`s as well so that you could use them anywhere `ObjectKind` was expected, but that proved to be far too difficult because I cannot return a reference to a subset of `Object` (the value would be owned on generation). And while in a language like C maybe I'd pad structures and cast between them safely, since they _do_ overlap, I can't confidently do that here since Rust's discriminant and layout are not under my control. I tried playing around with `std::mem::Discriminant` as well, but `discriminant` (the function) requires a _value_, meaning I couldn't get the discriminant of a static `Object` variant without some dummy value; wasn't worth it over `ObjectRelTy`. We further can't assign values to enum variants unless they hold no data. Rust a decade from now may be different and it will be interesting to look back on this struggle. DEV-13597	2023-01-26 14:45:14 -05:00
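
The `discriminant` limitation mentioned in the commit above can be shown with a small example. The enums here are hypothetical and much simpler than TAMER's: `Object` stands in for a data-carrying enum and `ObjectTy` for a fieldless companion playing the role the message gives to `ObjectRelTy`.

```rust
use std::mem::{discriminant, Discriminant};

/// Hypothetical data-carrying enum.
pub enum Object {
    Ident(String),
    Expr(u32),
}

/// Hypothetical fieldless companion enum; its variants can be named
/// statically without constructing any `Object`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObjectTy {
    Ident,
    Expr,
}

impl Object {
    /// Map a value to its statically nameable kind.
    pub fn ty(&self) -> ObjectTy {
        match self {
            Object::Ident(_) => ObjectTy::Ident,
            Object::Expr(_) => ObjectTy::Expr,
        }
    }
}

/// `std::mem::discriminant` takes a reference to a value, so getting
/// the discriminant "of a variant" requires building a dummy one.
pub fn expr_discriminant() -> Discriminant<Object> {
    discriminant(&Object::Expr(0)) // the 0 is a throwaway payload
}
```

The dummy payload is cheap for a `u32`, but becomes awkward for variants whose data is expensive or impossible to fabricate, which is consistent with the message's conclusion that a separate fieldless type was the simpler route.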
Mike Gerwitz	954b5a2795	Copyright year and name update Ryan Specialty Group (RSG) rebranded to Ryan Specialty after its IPO.	2023-01-20 23:37:30 -05:00
Mike Gerwitz	1be0f2fe70	tamer: asg::object: Move into graph module The ASG delegates certain operations to Objects so that they may enforce their own invariants and ontology. It is therefore important that only objects have access to certain methods on `Asg`, otherwise those invariants could be circumvented. It should be noted that the nesting of this module is such that AIR should _not_ have privileged access to the ASG---it too must utilize objects to ensure those invariants are enforced in a single place. DEV-13597	2023-01-20 23:37:30 -05:00
Mike Gerwitz	c9746230ef	tamer: asg::graph::test: Extract into own file DEV-13597	2023-01-20 23:37:29 -05:00
Mike Gerwitz	4e3a81d7f5	tamer: asg: Bind transparent ident This provides the initial implementation allowing an identifier to be defined (bound to an object and made transparent). I'm not yet entirely sure whether I'll stick with the "transparent" and "opaque" terminology when there's also "declare" and "define", but a `Missing` state is a type of declaration and so the distinction does still seem to be important. There is still work to be done on `ObjectIndex::<Ident>::bind_definition`, which will follow. I'm going to be balancing work to provide type-level guarantees, since I don't have the time to go as far as I'd like. DEV-13597	2023-01-20 23:37:29 -05:00
Mike Gerwitz	378fe3db66	tamer: asg::Asg::lookup: SymbolId=>SPair This seems to have been an oversight from when I recently introduced SPairs to ASG; I noticed it while working on another change and receiving back a `DUMMY_SPAN`. DEV-13597	2023-01-20 23:37:29 -05:00
Mike Gerwitz	a9e65300fb	tamer: diagnose::panic: Require thunk or static ref for diagnostic data Some investigation into the disassembly of TAMER's binaries showed that Rust was not able to conditionalize `expect`-like expressions as I was hoping, due to the language's eager evaluation semantics in combination with the use of `format!`. This solves the problem for the diagnostic system by creating types that prevent this situation from occurring statically, without the need for a lint.	2023-01-20 23:37:29 -05:00
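The eager-evaluation problem described in the commit above can be illustrated with a minimal sketch. The helper `expect_with`, the function `describe`, and the call counter are all hypothetical names made up for this illustration; they are not TAMER's diagnostic API. The sketch only shows why a thunk (closure) defers the cost of building a message while an argument built up front does not.

```rust
use std::cell::Cell;

/// Hypothetical helper (not TAMER's API): the message is supplied as a
/// thunk, so it is only built if the value is actually missing.
fn expect_with<T>(opt: Option<T>, msg: impl FnOnce() -> String) -> T {
    match opt {
        Some(x) => x,
        None => panic!("{}", msg()),
    }
}

/// Stand-in for an expensive diagnostic message; counts its invocations.
fn describe(calls: &Cell<u32>, name: &str) -> String {
    calls.set(calls.get() + 1);
    format!("missing identifier `{name}`")
}

/// Eager form: the argument to `expect` is evaluated before the call,
/// even though the `Option` is `Some` and the message is never used.
fn eager_calls() -> u32 {
    let calls = Cell::new(0);
    let _ = Some(1).expect(&describe(&calls, "foo"));
    calls.get()
}

/// Lazy form: the closure is never invoked on the success path.
fn lazy_calls() -> u32 {
    let calls = Cell::new(0);
    let _ = expect_with(Some(1), || describe(&calls, "foo"));
    calls.get()
}
```

Here `eager_calls()` reports one invocation of `describe` and `lazy_calls()` reports none, which is the difference that requiring a thunk (or a static reference) enforces at the type level.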
Mike Gerwitz	e6640c0019	tamer: Integrate clippy This invokes clippy as part of `make check` now, which I had previously avoided doing (I'll elaborate on that below). This commit represents the changes needed to resolve all the warnings presented by clippy. Many changes have been made where I find the lints to be useful and agreeable, but there are a number of lints, rationalized in `src/lib.rs`, where I found the lints to be disagreeable. I have provided rationale, primarily for those wondering why I desire to deviate from the default lints, though it does feel backward to rationalize why certain lints ought to be applied (the reverse should be true). With that said, this did catch some legitimate issues, and it was also helpful in getting some older code up-to-date with new language additions that perhaps I used in new code but hadn't gone back and updated old code for. My goal was to get clippy working without errors so that, in the future, when others get into TAMER and are still getting used to Rust, clippy is able to help guide them in the right direction. One of the reasons I went without clippy for so long (though I admittedly forgot I wasn't using it for a period of time) was because there were a number of suggestions that I found disagreeable, and I didn't take the time to go through them and determine what I wanted to follow. Furthermore, it was hard to make that judgment when I was new to the language and lacked the necessary experience to do so. One thing I would like to comment further on is the use of `format!` with `expect`, which is also what the diagnostic system convenience methods do (which clippy does not cover). Because of all the work I've done trying to understand Rust and looking at disassemblies and seeing what it optimizes, I falsely assumed that Rust would convert such things into conditionals in my otherwise-pure code...but apparently that's not the case, when `format!` is involved. 
I noticed that, after making the suggested fix with `get_ident`, Rust proceeded to then inline it into each call site and then apply further optimizations. It was also previously invoking the thread lock (for the interner) unconditionally and invoking the `Display` implementation. That is not at all what I intended for, despite knowing the eager semantics of function calls in Rust. Anyway, possibly more to come on that, I'm just tired of typing and need to move on. I'll be returning to investigate further diagnostic messages soon.	2023-01-20 23:37:29 -05:00
Mike Gerwitz	f1cf35f499	tamer: asg: Add expression edges This introduces a number of abstractions, whose concepts are not fully documented yet since I want to see how it evolves in practice first. This introduces the concept of edge ontology (similar to a schema) using the type system. Even though we are not able to determine what the graph will look like statically---since that's determined by data fed to us at runtime---we _can_ ensure that the code _producing_ the graph from those data will produce a graph that adheres to its ontology. Because of the typed `ObjectIndex`, we're also able to implement operations that are specific to the type of object that we're operating on. Though, since the type is not (yet?) stored on the edge itself, it is possible to walk the graph without looking at node weights (the `ObjectContainer`) and therefore avoid panics for invalid type assumptions, which is bad, but I don't think that'll happen in practice, since we'll want to be resolving nodes at some point. But I'll address that more in the future. Another thing to note is that walking edges is only done in tests right now, and so there's no filtering or anything; once there are nodes (if there are nodes) that allow for different outgoing edge types, we'll almost certainly want filtering as well, rather than panicking. We'll also want to be able to query for any object type, but filter only to what's permitted by the ontology. DEV-13160	2023-01-20 23:37:29 -05:00
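As a rough illustration of an edge ontology expressed through the type system, the following sketch uses a marker trait to state which object types may point to which. All names here (`RelTo`, `Idx`, `Graph`) and the particular relationships chosen are assumptions made for this example; the actual ontology and API in TAMER may differ substantially.

```rust
use std::marker::PhantomData;

// Hypothetical object types; the real objects carry data.
struct Ident;
struct Expr;

/// Hypothetical marker trait: `A: RelTo<B>` means the ontology permits
/// an edge from an `A` node to a `B` node.
trait RelTo<Target> {}
impl RelTo<Expr> for Ident {} // assumed: an identifier may bind an expression
impl RelTo<Expr> for Expr {} // assumed: expressions may nest

/// Node index tagged with the type of object it refers to.
struct Idx<O>(usize, PhantomData<O>);

#[derive(Default)]
struct Graph {
    nodes: usize,
    edges: Vec<(usize, usize)>,
}

impl Graph {
    fn add_node<O>(&mut self) -> Idx<O> {
        let i = self.nodes;
        self.nodes += 1;
        Idx(i, PhantomData)
    }

    /// Only compiles when the ontology allows an `A -> B` edge.
    fn add_edge<A: RelTo<B>, B>(&mut self, from: &Idx<A>, to: &Idx<B>) {
        self.edges.push((from.0, to.0));
    }
}

fn build() -> Vec<(usize, usize)> {
    let mut g = Graph::default();
    let ident = g.add_node::<Ident>();
    let outer = g.add_node::<Expr>();
    let inner = g.add_node::<Expr>();

    g.add_edge(&ident, &outer);
    g.add_edge(&outer, &inner);
    // g.add_edge(&outer, &ident); // rejected: `Expr: RelTo<Ident>` is not implemented

    g.edges
}
```

The shape of the graph is still decided by runtime data, but any code path that tries to add an edge outside the declared relationships fails to compile.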
Mike Gerwitz	5e13c93a8f	tamer: asg: New ObjectContainer for Node type Working with the graph can be confusing with all of the layers involved. This begins to provide a better layer of abstraction that can encapsulate the concept and enforce invariants. Since I'm better able to enforce invariants now, this also removes the span from the diagnostic message, since the invariant is now always enforced with certainty. I'm not removing the runtime panic, though; we can revisit that if future profiling shows that it makes a negative impact. DEV-13160	2023-01-20 23:37:29 -05:00
Mike Gerwitz	8786ee74fa	tamer: asg::air: Expression building error cases This addresses the two outstanding `todo!` match arms representing errors in lowering expressions into the graph. As noted in the comments, these errors are unlikely to be hit when using TAME in the traditional way, since e.g. XIR and NIR are going to catch the equivalent problems within their own contexts (unbalanced tags and a valid expression grammar respectively). _But_, the IR does need to stand on its own, and I further hope that some tooling maybe can interact more directly with AIR in the future. DEV-13160	2023-01-20 23:37:29 -05:00
Mike Gerwitz	40c941d348	tamer: asg::air::AirAggregate: Initial impl of nested exprs This introduces a number of concepts together, again to demonstrate that they were derived. This introduces support for nested expressions, extending the previous work. It also supports error recovery for dangling expressions. The parser states are a mess; there is a lot of duplicate code here that needs refactoring, but I wanted to commit this first at a known-good state so that the diff will demonstrate the need for the change that will follow; the opportunities for abstraction are plainly visible. The immutable stack introduced here could be generalized, if needed, in the future. Another important note is that Rust optimizes away the `memcpy`s for the stack that was introduced here. The initial Parser Context was introduced because of `ArrayVec` inhibiting that elision, but Vec never had that problem. In the future, I may choose to go back and remove ArrayVec, but I had wanted to keep memory allocation out of the picture as much as possible to make the disassembly and call graph easier to reason about and to have confidence that optimizations were being performed as intended. With that said---it _should_ be eliding in tamec, since we're not doing anything meaningful yet with the graph. It does also elide in tameld, but it's possible that Rust recognizes that those code paths are never taken because tameld does nothing with expressions. So I'll have to monitor this as I progress and adjust accordingly; it's possible a future commit will call BS on everything I just said. Of course, the counter-point to that is that Rust is optimizing them away anyway, but Vec _does_ still require allocation; I was hoping to keep such allocation at the fringes. But another counter-point is that it _still_ is allocated at the fringe, when the context is initialized for the parser as part of the lowering pipeline. But I didn't know how that would all come together back then. 
...alright, enough rambling. DEV-13160	2023-01-20 23:37:29 -05:00
Mike Gerwitz	edbfc87a54	tamer: f::Functor: New trait This commit is purposefully coupled with changes that utilize it to demonstrate that the need for this abstraction has been _derived_, not forced; TAMER doesn't aim to be functional for the sake of it, since idiomatic Rust achieves many of its benefits without the formalisms. But, the formalisms do occasionally help, and this is one such example. There is other existing code that can be refactored to take advantage of this style as well. I do _not_ wish to pull an existing functional dependency into TAMER; I want to keep these abstractions light, and eliminate them as necessary, as Rust continues to integrate new features into its core. I also want to be able to modify the abstractions to suit our particular needs. (This is _not_ a general recommendation; it's particular to TAMER and to my experience.) This implementation of `Functor` is one such example. While it is modeled after Haskell in that it provides `fmap`, the primitive here is instead `map`, with `fmap` derived from it, since `map` allows for better use of Rust idioms. Furthermore, it's polymorphic over _trait_ type parameters, not method, allowing for separate trait impls for different container types, which can in turn be inferred by Rust and allow for some very concise mapping; this is particularly important for TAMER because of the disciplined use of newtypes. For example, `foo.overwrite(span)` and `foo.overwrite(name)` are both self-documenting, and better alternatives than, say, `foo.map_span(|_| span)` and `foo.map_symbol(|_| name)`; the latter are perfectly clear in what they do, but lack a layer of abstraction, and are verbose. But the clarity of the _new_ form does rely on either good naming conventions of arguments, or explicit type annotations using turbofish notation if necessary. This will be implemented on core Rust types as appropriate and as possible. 
At the time of writing, we do not yet have trait specialization, and there are too many soundness issues for me to be comfortable enabling it, so that limits what we can do with something like, say, a generic `Result`, while also allowing for specialized implementations based on newtypes. DEV-13160	2023-01-20 23:37:27 -05:00
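The idea of a `Functor` that is polymorphic over trait type parameters can be sketched as follows. This is a simplified guess at the shape of such a trait, not the definition from the commit: the trait signature, the `Target` associated type, and the `Ident`, `Span`, and `Name` newtypes are all assumptions made for illustration.

```rust
/// Simplified sketch of a functor polymorphic over *trait* type
/// parameters; the real `f::Functor` may be defined differently.
trait Functor<T, U = T>: Sized {
    type Target;

    /// The primitive operation.
    fn map(self, f: impl FnOnce(T) -> U) -> Self::Target;

    /// Replace the inner value, ignoring the previous one.
    fn overwrite(self, value: U) -> Self::Target {
        self.map(|_| value)
    }
}

// Hypothetical newtypes standing in for TAMER's own.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span(u32);
#[derive(Debug, Clone, Copy, PartialEq)]
struct Name(&'static str);

#[derive(Debug, PartialEq)]
struct Ident {
    name: Name,
    span: Span,
}

// One impl per inner type; Rust picks the impl from the argument type.
impl Functor<Span> for Ident {
    type Target = Self;
    fn map(self, f: impl FnOnce(Span) -> Span) -> Self {
        Ident { span: f(self.span), ..self }
    }
}

impl Functor<Name> for Ident {
    type Target = Self;
    fn map(self, f: impl FnOnce(Name) -> Name) -> Self {
        Ident { name: f(self.name), ..self }
    }
}

fn demo() -> Ident {
    let foo = Ident { name: Name("x"), span: Span(1) };

    // The newtype of each argument selects the corresponding impl.
    foo.overwrite(Span(7)).overwrite(Name("y"))
}
```

Because each newtype has its own impl, `overwrite(Span(7))` and `overwrite(Name("y"))` need no `map_span` or `map_symbol` variants; the trade-off noted in the commit is that readability now depends on argument names or explicit type annotations.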
Mike Gerwitz	0863536149	tamer: asg::Asg::get: Narrow object type This uses `ObjectIndex` to automatically narrow the type to what is expected. Given that `ObjectIndex` is supposed to mean that there must be an object with that index, perhaps the next step is to remove the `Option` from `get` as well. DEV-13160	2022-12-22 16:32:21 -05:00
Mike Gerwitz	6e90867212	tamer: asg::object::Object{Ref=>Index}: Associate object type This makes the system a bit more ergonomic and introduces additional type safety by associating the narrowed object type with the `ObjectIndex` (previously `ObjectRef`). Not only does this allow us to explicitly state the type of object wherever those indices are stored, but it also allows the API to automatically narrow to that type when operating on it again without the caller having to worry about it. DEV-13160	2022-12-22 15:18:08 -05:00
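The two commits above describe an index that remembers which object type it refers to, so that lookups can narrow automatically. A minimal sketch of that pattern, using `PhantomData`, might look like the following. The type and method names echo those in the commit messages, but the bodies, trait bounds, and the `create` method are assumptions made for illustration and are not taken from TAMER's source.

```rust
use std::marker::PhantomData;

#[derive(Debug, PartialEq)]
struct Ident(&'static str);
#[derive(Debug, PartialEq)]
struct Expr(u8);

/// Sum type stored on the graph (illustrative subset).
#[derive(Debug, PartialEq)]
enum Object {
    Ident(Ident),
    Expr(Expr),
}

impl From<Ident> for Object {
    fn from(x: Ident) -> Self {
        Object::Ident(x)
    }
}
impl From<Expr> for Object {
    fn from(x: Expr) -> Self {
        Object::Expr(x)
    }
}

/// Narrowing from the sum type to one concrete object type.
trait ObjectKind: Sized + Into<Object> {
    fn narrow(obj: &Object) -> Option<&Self>;
}
impl ObjectKind for Ident {
    fn narrow(obj: &Object) -> Option<&Self> {
        match obj {
            Object::Ident(x) => Some(x),
            _ => None,
        }
    }
}
impl ObjectKind for Expr {
    fn narrow(obj: &Object) -> Option<&Self> {
        match obj {
            Object::Expr(x) => Some(x),
            _ => None,
        }
    }
}

/// Index carrying the expected object type at the type level only.
struct ObjectIndex<O: ObjectKind>(usize, PhantomData<O>);

#[derive(Default)]
struct Asg {
    objects: Vec<Object>,
}

impl Asg {
    /// Hypothetical constructor: store an object and return a typed index.
    fn create<O: ObjectKind>(&mut self, obj: O) -> ObjectIndex<O> {
        self.objects.push(obj.into());
        ObjectIndex(self.objects.len() - 1, PhantomData)
    }

    /// The caller gets `&O` back without matching on `Object`.
    fn get<O: ObjectKind>(&self, oi: &ObjectIndex<O>) -> Option<&O> {
        self.objects.get(oi.0).and_then(O::narrow)
    }
}

fn demo() -> (Option<Ident>, Option<Expr>) {
    let mut asg = Asg::default();
    let oi_ident = asg.create(Ident("foo"));
    let oi_expr = asg.create(Expr(3));

    (
        asg.get(&oi_ident).map(|i| Ident(i.0)),
        asg.get(&oi_expr).map(|e| Expr(e.0)),
    )
}
```

In this sketch `get` still returns an `Option`, matching the state described in the later of the two commits, which suggests removing it as a possible next step.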

67 Commits (28b83ad6a3d7ef3c60ad2b8895b031758c8ee3d1)