tame/tamer/src/asg/air.rs

627 lines
21 KiB
Rust

tamer: Refactor asg_builder into obj::xmlo::lower and asg::air This finally uses `parse` all the way up to aggregation into the ASG, as can be seen by the mess in `poc`. This will be further simplified---I just need to get this committed so that I can mentally get it off my plate. I've been separating this commit into smaller commits, but there's a point where it's just not worth the effort anymore. I don't like making large changes such as this one. There is still work to do here. First, it's worth re-mentioning that `poc` means "proof-of-concept", and represents things that still need a proper home/abstraction. Secondly, `poc` is retrieving the context of two parsers---`LowerContext` and `Asg`. The latter is desirable, since it's the final aggregation point, but the former needs to be eliminated; in particular, packages need to be worked into the ASG so that `found` can be removed. Recursively loading `xmlo` files still happens in `poc`, but the compiler will need this as well. Once packages are on the ASG, along with their state, that responsibility can be generalized as well. That will then simplify lowering even further, to the point where hopefully everything has the same shape (once final aggregation has an abstraction), after which we can then create a final abstraction to concisely stitch everything together. Right now, Rust isn't able to infer `S` for `Lower<S, LS>`, which is unfortunate, but we'll be able to help it along with a more explicit abstraction. DEV-11864
2022-05-27 13:51:29 -04:00
// ASG IR
//
// Copyright (C) 2014-2023 Ryan Specialty, LLC.
//
// This file is part of TAME.
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.
tamer: asg::air: AIR as a sum IR This introduces a new macro `sum_ir!` to help with a long-standing problem of not being able to easily narrow types in Rust without a whole lot of boilerplate. This patch includes a bit of documentation, so see that for more information. This was not a welcome change---I jumped down this rabbit hole trying to decompose `AirAggregate` so that I can share portions of parsing with the current parser and a template parser. I can now proceed with that. This is not the only implementation that I had tried. I previously inverted the approach, as I've been doing manually for some time: manually create types to hold the sets of variants, and then create a sum type to hold those types. That works, but it resulted in a mess for systems that have to use the IR, since now you have two enums to contend with. I didn't find that to be appropriate, because we shouldn't complicate the external API for implementation details. The enum for IRs is supposed to be like a bytecode---a list of operations that can be performed with the IR. They can be grouped if it makes sense for a public API, but in my case, I only wanted subsets for the sake of delegating responsibilities to smaller subsystems, while retaining the context that `match` provides via its exhaustiveness checking but does not expose as something concrete (which is deeply frustrating!). Anyway, here we are; this'll be refined over time, hopefully, and portions of it can be generalized for removing boilerplate from other IRs. Another thing to note is that this syntax is really a compromise---I had to move on, and I was spending too much time trying to get creative with `macro_rules!`. It isn't the best, and it doesn't seem very Rust-like in some places and is therefore not necessarily all that intuitive. This can be refined further in the future. But the end result, all things considered, isn't too bad. DEV-13708
2023-03-02 15:15:28 -05:00
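As a rough illustration of the narrowing problem that the commit note above describes, the following sketch shows the manual approach it mentions: separate enums hold subsets of variants, and a sum type wraps them. Every name here (`ExprOp`, `TplOp`, `Op`, `describe_expr`, `dispatch`) is hypothetical and is not taken from TAMER or from `sum_ir!`.

```rust
/// Hypothetical subset of operations handled by an expression subsystem.
#[derive(Debug, PartialEq)]
enum ExprOp {
    Start,
    End,
}

/// Hypothetical subset of operations handled by a template subsystem.
#[derive(Debug, PartialEq)]
enum TplOp {
    Start,
    End,
}

/// Sum type over both subsets.
/// Callers now have to contend with two layers of enums,
/// which is the drawback the commit note points out.
#[derive(Debug, PartialEq)]
enum Op {
    Expr(ExprOp),
    Tpl(TplOp),
}

impl From<ExprOp> for Op {
    fn from(op: ExprOp) -> Self {
        Op::Expr(op)
    }
}

impl From<TplOp> for Op {
    fn from(op: TplOp) -> Self {
        Op::Tpl(op)
    }
}

/// A subsystem that accepts only the narrowed type;
/// the `match` is exhaustive over `ExprOp` alone.
fn describe_expr(op: &ExprOp) -> &'static str {
    match op {
        ExprOp::Start => "open expression",
        ExprOp::End => "close expression",
    }
}

/// Delegates each operation to the part of the system responsible for it.
fn dispatch(op: &Op) -> &'static str {
    match op {
        Op::Expr(expr_op) => describe_expr(expr_op),
        Op::Tpl(TplOp::Start) => "open template",
        Op::Tpl(TplOp::End) => "close template",
    }
}

fn main() {
    let op: Op = ExprOp::Start.into();
    println!("{}", dispatch(&op));
}
```

The boilerplate grows with every subset added, which appears to be the motivation given for generating it with a macro instead.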
//! Intermediate representation for construction of the
//! [abstract semantic graph (ASG)](super) (AIR).
//!
//! AIR serves as an abstraction layer between higher-level parsers and the
//! aggregate ASG.
//! It allows parsers to operate as a raw stream of data without having to
//! worry about ownership of or references to the ASG,
//! and allows for multiple such parsers to be joined.
//!
//! AIR is _not_ intended to replace the API of the ASG---it
//! is intended as a termination point for the parsing pipeline,
//! and as such implements a subset of the ASG's API that is suitable
//! for aggregating raw data from source and object files.
//! Given that it does so little and is so close to the [`Asg`] API,
//! one might say that the abstraction is as light as air,
//! but that would surely result in face-palming and so we're not going
//! to air such cringeworthy dad jokes here.
use super::{
    graph::object::{ObjectIndexTo, ObjectIndexToTree, Pkg, Tpl},
    Asg, AsgError, Expr, Ident, ObjectIndex,
};
use crate::{
    diagnose::Annotate,
    diagnostic_todo,
    parse::{prelude::*, StateStack},
    span::{Span, UNKNOWN_SPAN},
    sym::SymbolId,
};
use std::fmt::{Debug, Display};
#[macro_use]
mod ir;
pub use ir::Air;
mod expr;
mod tpl;
use expr::AirExprAggregate;
use tpl::AirTplAggregate;
pub type IdentSym = SymbolId;
pub type DepSym = SymbolId;
tamer: asg::air::AirAggregate: Initial impl of nested exprs This introduces a number of concepts together, again to demonstrate that they were derived. This introduces support for nested expressions, extending the previous work. It also supports error recovery for dangling expressions. The parser states are a mess; there is a lot of duplicate code here that needs refactoring, but I wanted to commit this first at a known-good state so that the diff will demonstrate the need for the change that will follow; the opportunities for abstraction are plainly visible. The immutable stack introduced here could be generalized, if needed, in the future. Another important note is that Rust optimizes away the `memcpy`s for the stack that was introduced here. The initial Parser Context was introduced because of `ArrayVec` inhibiting that elision, but Vec never had that problem. In the future, I may choose to go back and remove ArrayVec, but I had wanted to keep memory allocation out of the picture as much as possible to make the disassembly and call graph easier to reason about and to have confidence that optimizations were being performed as intended. With that said---it _should_ be eliding in tamec, since we're not doing anything meaningful yet with the graph. It does also elide in tameld, but it's possible that Rust recognizes that those code paths are never taken because tameld does nothing with expressions. So I'll have to monitor this as I progress and adjust accordingly; it's possible a future commit will call BS on everything I just said. Of course, the counter-point to that is that Rust is optimizing them away anyway, but Vec _does_ still require allocation; I was hoping to keep such allocation at the fringes. But another counter-point is that it _still_ is allocated at the fringe, when the context is initialized for the parser as part of the lowering pipeline. But I didn't know how that would all come together back then. ...alright, enough rambling. DEV-13160
2023-01-05 15:57:06 -05:00
/// AIR parser state.
#[derive(Debug, PartialEq, Default)]
pub enum AirAggregate {
    /// Parser is not currently performing any work.
    #[default]
    Empty,
tamer: Initial concept for AIR/ASG Expr This begins to place expressions on the graph---something that I've been thinking about for a couple of years now, so it's interesting to finally be doing it. This is going to evolve; I want to get some things committed so that it's clear how I'm moving forward. The ASG makes things a bit awkward for a number of reasons: 1. I'm dealing with older code where I had a different model of doing things; 2. It's mutable, rather than the mostly-functional lowering pipeline; 3. We're dealing with an aggregate ever-evolving blob of data (the graph) rather than a stream of tokens; and 4. We don't have as many type guarantees. I've shown with the lowering pipeline that I'm able to take a mutable reference and convert it into something that's both functional and performant, where I remove it from its container (an `Option`), create a new version of it, and place it back. Rust is able to optimize away the memcpys and such and just directly manipulate the underlying value, which is often a register with all of the inlining. _But_ this is a different scenario now. The lowering pipeline has a narrow context. The graph has to keep hitting memory. So we'll see how this goes. But it's most important to get this working and measure how it performs; I'm not trying to prematurely optimize. My attempts right now are for the way that I wish to develop. Speaking to #4 above, it also sucks that I'm not able to type the relationships between nodes on the graph. Rather, it's not that I _can't_, but a project to create a typed graph library is beyond the scope of this work and would take far too much time. I'll leave that to a personal, non-work project. Instead, I'm going to have to narrow the type any time the graph is accessed. And while that sucks, I'm going to do my best to encapsulate those details to make it as seamless as possible API-wise.
The performance hit of performing the narrowing I'm hoping will be very small relative to all the business logic going on (a single cache miss is bound to be far more expensive than many narrowings which are just integer comparisons and branching)...but we'll see. Introducing branching sucks, but branch prediction is pretty damn good in modern CPUs. DEV-13160
2022-12-21 16:47:04 -05:00
    /// Expecting a package-level token.
    Toplevel(ObjectIndex<Pkg>),

    /// Parsing an expression.
    ///
    /// This expects to inherit an [`AirExprAggregate`] from the prior state
    ///   so that we are not continuously re-allocating its stack for each
    ///   new expression root.
    PkgExpr(AirExprAggregate),

    /// Parser is in template parsing mode.
    ///
    /// All objects encountered until the closing [`Air::TplEnd`] will be
    ///   parented to this template rather than the parent [`Pkg`].
    /// See [`Air::TplStart`] for more information.
    PkgTpl(AirTplAggregate),
}
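
The "Initial concept for AIR/ASG Expr" commit note describes removing a state from its container (an `Option`), creating a new version of it, and placing it back. A minimal sketch of that pattern follows; `State`, `Holder`, `step`, and `advance` are hypothetical names chosen for illustration and are not part of TAMER's parsing framework.

```rust
/// Hypothetical parser state.
/// Transitions consume the old state and return a new one.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    Counting(u32),
}

impl State {
    /// Functional transition: takes ownership of `self` and yields the next state.
    fn step(self) -> State {
        match self {
            State::Idle => State::Counting(1),
            State::Counting(n) => State::Counting(n + 1),
        }
    }
}

/// Hypothetical owner that keeps the state in an `Option` so that it can be
/// moved out through a mutable reference.
struct Holder {
    state: Option<State>,
}

impl Holder {
    fn advance(&mut self) {
        // `Option::take` moves the value out and leaves `None` behind,
        // which lets an owned value be passed to `step`.
        let old = self
            .state
            .take()
            .expect("state is always restored before returning");

        // Place the new version back into the container.
        self.state = Some(old.step());
    }
}

fn main() {
    let mut holder = Holder {
        state: Some(State::Idle),
    };

    holder.advance();
    holder.advance();

    println!("{:?}", holder.state);
}
```

Whether the compiler elides the intermediate moves depends on optimization level and inlining, so the performance claims in the commit note would need to be checked against the generated code for any particular case.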
impl Display for AirAggregate {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        use AirAggregate::*;

        match self {
            Empty => write!(f, "awaiting AIR input for ASG"),
            Toplevel(_) => {
                write!(f, "expecting package header or an expression")
            }
            PkgExpr(expr) => {
                write!(f, "defining a package expression: {expr}")
            }
            PkgTpl(tpl) => {
                write!(f, "building a template: {tpl}",)
            }
        }
    }
}
impl From<AirExprAggregate> for AirAggregate {
fn from(st: AirExprAggregate) -> Self {
Self::PkgExpr(st)
}
}
impl From<AirTplAggregate> for AirAggregate {
fn from(st: AirTplAggregate) -> Self {
Self::PkgTpl(st)
}
}
impl ParseState for AirAggregate {
type Token = Air;
type Object = ();
type Error = AsgError;
type Context = AirAggregateCtx;
/// Destination [`Asg`] that this parser lowers into.
///
/// This ASG will be yielded by [`crate::parse::Parser::finalize`].
type PubContext = Asg;
fn parse_token(
self,
tok: Self::Token,
ctx: &mut Self::Context,
) -> crate::parse::TransitionResult<Self> {
use ir::{
AirBind::*, AirDoc::*, AirIdent::*, AirPkg::*, AirSubsets::*,
AirTodo::*,
};
use AirAggregate::*;
// TODO: Seems to be about time for refactoring this...
match (self, tok.into()) {
(st, AirTodo(Todo(_))) => Transition(st).incomplete(),
(Empty, AirPkg(PkgStart(span))) => {
let oi_pkg = ctx.begin_pkg(span);
Transition(Toplevel(oi_pkg)).incomplete()
}
(
st @ (Toplevel(_) | PkgExpr(_) | PkgTpl(_)),
AirPkg(PkgStart(span)),
) => {
// This should always be available in this context.
let first_span =
ctx.pkg_oi().map(|oi| oi.span()).unwrap_or(UNKNOWN_SPAN);
Transition(st).err(AsgError::NestedPkgStart(span, first_span))
}
// No expression was started.
(Toplevel(oi_pkg), AirPkg(PkgEnd(span))) => {
oi_pkg.close(ctx.asg_mut(), span);
Transition(Empty).incomplete()
}
// Packages are identified by their paths.
(st @ Toplevel(..), AirBind(BindIdent(id))) => {
Transition(st).err(AsgError::InvalidBindContext(id))
}
(Toplevel(oi_pkg), tok @ AirDoc(DocIndepClause(..))) => {
diagnostic_todo!(
vec![
oi_pkg.note("for this package"),
tok.internal_error(
"this package description is not yet supported"
)
],
"package-level short description is not yet supported by TAMER",
)
}
(Toplevel(oi_pkg), AirDoc(DocText(text))) => {
oi_pkg.append_doc_text(ctx.asg_mut(), text);
Transition(Toplevel(oi_pkg)).incomplete()
}
// Package import
(Toplevel(oi_pkg), AirBind(RefIdent(pathspec))) => {
oi_pkg.import(ctx.asg_mut(), pathspec);
Transition(Toplevel(oi_pkg)).incomplete()
}
// Note: We unfortunately can't match on `AirExpr | AirBind | ...`
// and delegate in the same block
// (without having to duplicate type checks and then handle
// unreachable paths)
// because of the different inner types.
(st @ (Toplevel(_) | PkgTpl(_)), tok @ AirExpr(..)) => {
ctx.ret_or_transfer(st, tok, AirExprAggregate::new())
}
(PkgExpr(expr), AirExpr(etok)) => ctx.proxy(expr, etok),
(PkgExpr(expr), AirBind(etok)) => ctx.proxy(expr, etok),
(PkgExpr(expr), AirDoc(etok)) => ctx.proxy(expr, etok),
// Template parsing.
(st @ (Toplevel(_) | PkgExpr(_)), tok @ AirTpl(..)) => {
ctx.ret_or_transfer(st, tok, AirTplAggregate::new())
}
(PkgTpl(tplst), AirTpl(ttok)) => ctx.proxy(tplst, ttok),
(PkgTpl(tplst), AirBind(ttok)) => ctx.proxy(tplst, ttok),
(PkgTpl(tplst), AirDoc(ttok)) => ctx.proxy(tplst, ttok),
(Empty, AirPkg(PkgEnd(span))) => {
Transition(Empty).err(AsgError::InvalidPkgEndContext(span))
}
(st @ (PkgExpr(_) | PkgTpl(_)), AirPkg(PkgEnd(span))) => {
match st.active_is_accepting(ctx) {
true => {
ctx.stack().ret_or_dead(Empty, AirPkg(PkgEnd(span)))
}
false => {
Transition(st).err(AsgError::InvalidPkgEndContext(span))
}
}
}
(
Empty,
tok @ (AirExpr(..) | AirBind(..) | AirTpl(..) | AirDoc(..)),
) => Transition(Empty).err(AsgError::PkgExpected(tok.span())),
(Empty, AirIdent(IdentDecl(name, kind, src))) => {
let asg = ctx.asg_mut();
let oi_root = asg.root(name);
asg.lookup_or_missing(oi_root, name)
.declare(asg, name, kind, src)
.map(|_| ())
.transition(Empty)
}
(Empty, AirIdent(IdentExternDecl(name, kind, src))) => {
let asg = ctx.asg_mut();
let oi_root = asg.root(name);
asg.lookup_or_missing(oi_root, name)
.declare_extern(asg, name, kind, src)
.map(|_| ())
.transition(Empty)
}
(Empty, AirIdent(IdentDep(name, dep))) => {
let asg = ctx.asg_mut();
let oi_root = asg.root(dep);
let oi_from = asg.lookup_or_missing(oi_root, name);
let oi_to = asg.lookup_or_missing(oi_root, dep);
oi_from.add_opaque_dep(ctx.asg_mut(), oi_to);
Transition(Empty).incomplete()
}
(Empty, AirIdent(IdentFragment(name, text))) => {
let asg = ctx.asg_mut();
let oi_root = asg.root(name);
asg.lookup_or_missing(oi_root, name)
.set_fragment(asg, text)
.map(|_| ())
.transition(Empty)
}
(Empty, AirIdent(IdentRoot(name))) => {
let asg = ctx.asg_mut();
asg.root(name).root_ident(asg, name);
Transition(Empty).incomplete()
}
(st, tok @ AirIdent(..)) => todo!("{st:?}, {tok:?}"),
}
}
fn is_accepting(&self, _: &Self::Context) -> bool {
matches!(self, Self::Empty)
}
}
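/*
The package-boundary transitions handled by `parse_token` above can be read
as a small state machine.
What follows is a minimal sketch under stated assumptions,
not TAMER's implementation:
`State`, `Tok`, `Error`, and `step` are hypothetical stand-ins that omit
spans, the ASG, and child parsers,
and borrow only their error names from `AsgError`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Empty,
    Toplevel,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    PkgStart,
    PkgEnd,
}

#[derive(Debug, PartialEq)]
enum Error {
    NestedPkgStart,
    InvalidPkgEndContext,
}

// Yield the next state along with an error if the token was rejected.
// A rejected token leaves the state unchanged so that parsing may continue.
fn step(st: State, tok: Tok) -> (State, Result<(), Error>) {
    match (st, tok) {
        (State::Empty, Tok::PkgStart) => (State::Toplevel, Ok(())),
        (State::Toplevel, Tok::PkgEnd) => (State::Empty, Ok(())),
        (State::Toplevel, Tok::PkgStart) => {
            (st, Err(Error::NestedPkgStart))
        }
        (State::Empty, Tok::PkgEnd) => {
            (st, Err(Error::InvalidPkgEndContext))
        }
    }
}

// Only the empty state is accepting (hypothetical mirror of `is_accepting`).
fn is_accepting(st: State) -> bool {
    matches!(st, State::Empty)
}
```

In this sketch an error does not change state,
which is one simple way to allow recovery after a bad token.
*/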
impl AirAggregate {
/// Whether the active parser is in an accepting state.
///
/// If a child parser is active,
/// then its [`ParseState::is_accepting`] will be consulted.
fn active_is_accepting(&self, ctx: &<Self as ParseState>::Context) -> bool {
use AirAggregate::*;
match self {
Empty => true,
Toplevel(_) => self.is_accepting(ctx),
PkgExpr(st) => st.is_accepting(ctx),
PkgTpl(st) => st.is_accepting(ctx),
}
}
/// The rooting context for [`Ident`]s for the active parser.
///
/// A value of [`None`] indicates that the current parser does not
/// support direct bindings,
/// but a parent context may
/// (see [`AirAggregateCtx::rooting_oi`]).
fn active_rooting_oi(&self) -> Option<ObjectIndexToTree<Ident>> {
match self {
AirAggregate::Empty => None,
AirAggregate::Toplevel(pkg_oi) => Some((*pkg_oi).into()),
// Expressions never serve as roots for identifiers;
// this will always fall through to the parent context.
// Since the parent context is a package or a template,
// the next frame should succeed.
AirAggregate::PkgExpr(_) => None,
// Identifiers bound while within a template definition context
// must bind to the eventual _expansion_ site,
// as if the body were pasted there.
// Templates must therefore serve as containers for identifiers
// bound therein.
AirAggregate::PkgTpl(tplst) => {
tplst.active_tpl_oi().map(Into::into)
}
}
}
}
/// Additional parser context,
/// including the ASG and parser stack frames.
///
/// [`ObjectIndex`] lookups perform reverse linear searches beginning from
/// the last stack frame until a non-[`None`] value is found;
/// this creates an environment whereby inner contexts shadow outer.
/// Missing values create holes,
/// much like a prototype chain.
/// In practice,
/// this should only have to search the last two frames.
#[derive(Debug, Default)]
pub struct AirAggregateCtx(Asg, AirStack, Option<ObjectIndex<Pkg>>);
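/*
The reverse search described for `AirAggregateCtx` can be sketched in
isolation.
This is an illustration only:
`Frame` and `lookup` are hypothetical names that stand in for parser
frames and the `ObjectIndex` lookups,
and a plain `u32` stands in for an index.

```rust
// Hypothetical stand-in for a held parser frame.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Frame {
    // Frame that provides no value (a "hole").
    Hole,
    // Frame that provides a value for the lookup.
    Provides(u32),
}

impl Frame {
    fn value(&self) -> Option<u32> {
        match self {
            Frame::Hole => None,
            Frame::Provides(n) => Some(*n),
        }
    }
}

// Search from the last (innermost) frame toward the first,
// yielding the first non-`None` value found.
fn lookup(stack: &[Frame]) -> Option<u32> {
    stack.iter().rev().find_map(Frame::value)
}
```

An inner frame that provides a value shadows every outer frame,
while a hole defers to whichever frame precedes it.
*/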
/// Maximum number of held parser frames.
///
/// Note that this is the number of [`ParseState`]s held,
/// _not_ the depth of the graph at a given point.
/// The intent of this is to limit runaway recursion in the event of some
/// bug in the system;
/// while the input stream is certainly finite,
/// lookahead tokens cause recursion that does not provably
/// terminate.
///
/// This limit is arbitrarily large,
/// but hopefully such that no legitimate case will ever hit it.
const MAX_AIR_STACK_DEPTH: usize = 1024;
/// Held parser stack frames.
///
/// See [`AirAggregateCtx`] for more information.
pub type AirStack = StateStack<AirAggregate, MAX_AIR_STACK_DEPTH>;
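/*
How `StateStack` enforces its depth limit is not shown in this file.
As a rough sketch of the general idea only,
a stack bounded by a const generic might look like the following;
`BoundedStack` is a hypothetical type and is not TAMER's `StateStack`.

```rust
// Hypothetical bounded stack, backed by a `Vec` for simplicity.
#[derive(Debug)]
struct BoundedStack<T, const MAX: usize>(Vec<T>);

impl<T, const MAX: usize> BoundedStack<T, MAX> {
    fn new() -> Self {
        Self(Vec::new())
    }

    // Push a frame,
    // handing it back to the caller if the limit has been reached.
    fn push(&mut self, frame: T) -> Result<(), T> {
        if self.0.len() >= MAX {
            Err(frame)
        } else {
            self.0.push(frame);
            Ok(())
        }
    }

    fn pop(&mut self) -> Option<T> {
        self.0.pop()
    }
}
```

Returning the rejected frame lets a caller report the overflow as an error
rather than recursing without bound.
*/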
impl AirAggregateCtx {
fn asg_mut(&mut self) -> &mut Asg {
self.as_mut()
}
fn stack(&mut self) -> &mut AirStack {
let Self(_, stack, _) = self;
stack
}
/// Return control to the parser atop of the stack if `st` is an
/// accepting state,
/// otherwise transfer control to a new parser `to`.
///
/// This serves as a balance with the behavior of [`Self::proxy`].
/// Rather than checking for an accepting state after each proxy,
/// or having the child parsers return to the top stack frame once
/// they have completed,
/// we leave the child parser in place to potentially handle more
/// tokens of the same type.
/// For example,
/// adjacent expressions can re-use the same parser rather than having
/// to pop and push for each sibling.
///
/// Consequently,
/// this means that a parser may be complete when we need to push and
/// transfer control to another parser.
/// Before pushing,
/// we first check to see if the parser atop of the stack is in an
/// accepting state.
/// If so,
/// then we are a sibling,
/// and so instead of proceeding with instantiating a new parser,
/// we return to the one atop of the stack and delegate to it.
///
/// If `st` is _not_ in an accepting state,
/// that means that we are a _child_;
/// we then set aside the state `st` on the stack and transfer
/// control to the child `to`.
///
/// See also [`Self::proxy`].
fn ret_or_transfer<S: Into<AirAggregate>, SB: Into<AirAggregate>>(
&mut self,
st: S,
tok: impl Token + Into<Air>,
to: SB,
) -> TransitionResult<AirAggregate> {
let st_super = st.into();
if st_super.active_is_accepting(self) {
// TODO: dead state or error
self.stack().ret_or_dead(AirAggregate::Empty, tok)
} else {
self.stack().transfer_with_ret(
Transition(st_super),
Transition(to.into()).incomplete().with_lookahead(tok),
)
}
}
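/*
The sibling-versus-child decision documented for `ret_or_transfer` can be
modeled without any of the parsing machinery.
This is a heavily simplified sketch:
`St`, `Step`, and the free function below are hypothetical,
and the lookahead token that the real method carries along is omitted.

```rust
#[derive(Debug, PartialEq)]
enum St {
    // Package-level parser,
    // which is never accepting while a package is open.
    Pkg,
    // Child parser, along with whether it is in an accepting state.
    Child { accepting: bool },
}

#[derive(Debug, PartialEq)]
enum Step {
    // Sibling: control returned to the frame that was atop of the stack.
    Returned(St),
    // Child: `st` was set aside and control passed to the new parser.
    Transferred(St),
    // There was no frame to return to.
    Dead,
}

fn ret_or_transfer(stack: &mut Vec<St>, st: St, to: St) -> Step {
    if matches!(st, St::Child { accepting: true }) {
        // `st` is complete, so the incoming token belongs to a sibling;
        // hand control back to the parent rather than nesting further.
        match stack.pop() {
            Some(parent) => Step::Returned(parent),
            None => Step::Dead,
        }
    } else {
        // `st` is mid-parse, so the incoming token begins a child of it.
        stack.push(st);
        Step::Transferred(to)
    }
}
```

In this model a package always transfers to a child,
and a completed child always returns to whatever was set aside before it.
*/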
/// Proxy `tok` to `st`,
/// returning to the state atop of the stack if parsing reaches a dead
/// state.
///
/// See also [`Self::ret_or_transfer`].
fn proxy<S: ParseState<Super = AirAggregate, Context = Self>>(
&mut self,
st: S,
tok: impl Token + Into<S::Token>,
) -> TransitionResult<AirAggregate> {
st.delegate_child(tok.into(), self, |_deadst, tok, ctx| {
ctx.stack().ret_or_dead(AirAggregate::Empty, tok)
})
}
/// Create a new rooted package and record it as the active package.
fn begin_pkg(&mut self, span: Span) -> ObjectIndex<Pkg> {
let Self(asg, _, pkg) = self;
let oi_pkg = asg.create(Pkg::new(span)).root(asg);
pkg.replace(oi_pkg);
oi_pkg
}
/// The active package, if any.
fn pkg_oi(&self) -> Option<ObjectIndex<Pkg>> {
match self {
Self(_, _, oi) => *oi,
}
}
/// The active container (rooting context) for [`Ident`]s.
///
/// Frames are searched beginning from the last stack frame,
/// and the first frame to provide a rooting index is used.
///
/// A value of [`None`] indicates that no bindings are permitted in the
/// current context.
fn rooting_oi(&self) -> Option<ObjectIndexToTree<Ident>> {
let Self(_, stack, _) = self;
stack.iter().rev().find_map(|st| st.active_rooting_oi())
}
/// The active dangling expression context for [`Expr`]s.
///
/// A value of [`None`] indicates that expressions are not permitted to
/// dangle in the current context
/// (and so must be identified).
fn dangling_expr_oi(&self) -> Option<ObjectIndexTo<Expr>> {
let Self(_, stack, _) = self;
stack.iter().rev().find_map(|st| match st {
AirAggregate::Empty => None,
// A dangling expression in a package context would be
// unreachable.
// There should be no parent frame and so this will fail to find
// a value.
AirAggregate::Toplevel(_) => None,
// Expressions may always contain other expressions,
// and so this method should not be consulted in such a
// context.
// Nonetheless,
// fall through to the parent frame and give a correct answer.
AirAggregate::PkgExpr(_) => None,
// Templates serve as containers for dangling expressions,
// since they may expand into a context where they are not
// considered to be dangling.
AirAggregate::PkgTpl(tplst) => {
tplst.active_tpl_oi().map(Into::into)
}
})
}
/// The active expansion target (splicing context) for [`Tpl`]s.
///
/// A value of [`None`] indicates that template expansion is not
/// permitted in this current context.
fn expansion_oi(&self) -> Option<ObjectIndexTo<Tpl>> {
let Self(_, stack, _) = self;
stack.iter().rev().find_map(|st| match st {
AirAggregate::Empty => None,
AirAggregate::Toplevel(pkg_oi) => Some((*pkg_oi).into()),
AirAggregate::PkgExpr(exprst) => {
exprst.active_expr_oi().map(Into::into)
}
AirAggregate::PkgTpl(tplst) => {
tplst.active_tpl_oi().map(Into::into)
}
})
}
/// Root an identifier using the [`Self::rooting_oi`] atop of the stack.
fn defines(&mut self, name: SPair) -> Result<ObjectIndex<Ident>, AsgError> {
let oi_root = self
.rooting_oi()
.ok_or(AsgError::InvalidBindContext(name))?;
Ok(self.lookup_lexical_or_missing(name).add_edge_from(
self.asg_mut(),
oi_root,
None,
))
}
/// Attempt to locate a lexically scoped identifier,
/// or create a new one if missing.
///
/// Until [`Asg`] can be further generalized,
/// there are unfortunately two rooting strategies employed:
///
/// 1. If the stack has only a single held frame at a scope boundary,
/// then it is assumed to be the package representing the active
/// compilation unit and the identifier is indexed in the global
/// scope.
/// 2. Otherwise,
/// the identifier is defined locally and does not undergo
/// indexing.
///
/// TODO: This is very informal and just starts to get things working.
fn lookup_lexical_or_missing(&mut self, name: SPair) -> ObjectIndex<Ident> {
let Self(asg, stack, _) = self;
stack
.iter()
.rev()
.filter_map(|st| st.active_rooting_oi())
.find_map(|oi| asg.lookup(oi, name))
.unwrap_or_else(|| self.create_env_indexed_ident(name))
}
/// Index an identifier within its environment.
///
/// TODO: More information as this is formalized.
fn create_env_indexed_ident(&mut self, name: SPair) -> ObjectIndex<Ident> {
let oi_ident = self.asg_mut().create(Ident::declare(name));
// TODO: This currently only indexes for the top of the stack,
// but we'll want no-shadow records for the rest of the env.
if let Some(oi) = self.rooting_oi() {
self.asg_mut().index_identifier(oi, name, oi_ident);
}
oi_ident
}
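/*
Taken together,
`lookup_lexical_or_missing` and `create_env_indexed_ident` amount to
"find the name in the nearest enclosing scope,
otherwise create it and index it in the innermost rooting frame".
A minimal sketch of that idea follows,
assuming a `HashMap` per frame in place of the ASG;
`Scope`, `Env`, and `lookup_or_missing` are hypothetical names.

```rust
use std::collections::HashMap;

// Hypothetical scope: identifier name to identifier id.
type Scope = HashMap<&'static str, usize>;

// One optional scope per stack frame;
// `None` models a frame that cannot root identifiers.
struct Env {
    frames: Vec<Option<Scope>>,
    next_id: usize,
}

impl Env {
    fn lookup_or_missing(&mut self, name: &'static str) -> usize {
        // Innermost frames are consulted first and so shadow outer ones.
        let found = self
            .frames
            .iter()
            .rev()
            .flatten()
            .find_map(|scope| scope.get(name).copied());

        match found {
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;

                // Index only in the innermost frame that can root
                // identifiers; outer frames are left untouched.
                if let Some(scope) =
                    self.frames.iter_mut().rev().flatten().next()
                {
                    scope.insert(name, id);
                }

                id
            }
        }
    }
}
```

A second lookup of the same name then resolves to the identifier created
by the first.
*/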
}
impl AsMut<AirAggregateCtx> for AirAggregateCtx {
fn as_mut(&mut self) -> &mut AirAggregateCtx {
self
}
}
impl AsRef<Asg> for AirAggregateCtx {
fn as_ref(&self) -> &Asg {
match self {
Self(asg, _, _) => asg,
}
}
}
impl AsMut<Asg> for AirAggregateCtx {
fn as_mut(&mut self) -> &mut Asg {
match self {
Self(asg, _, _) => asg,
}
}
}
impl AsMut<AirStack> for AirAggregateCtx {
fn as_mut(&mut self) -> &mut AirStack {
match self {
Self(_, stack, _) => stack,
}
}
}
impl From<AirAggregateCtx> for Asg {
fn from(ctx: AirAggregateCtx) -> Self {
match ctx {
AirAggregateCtx(asg, _, _) => asg,
}
}
}
impl From<Asg> for AirAggregateCtx {
fn from(asg: Asg) -> Self {
Self(asg, Default::default(), None)
}
}
#[cfg(test)]
mod test;