2022-10-19 10:00:08 -04:00
|
|
|
|
// IR that is "near" the source code.
|
tamer: Introduce NIR (accepting only)

This introduces NIR, but only as an accepting grammar; it doesn't yet emit
the NIR IR, beyond TODOs.

This modifies `tamec` to, while copying XIR, also attempt to lower NIR to
produce parser errors, if any. It does not yet fail compilation, as I just
want to be cautious and observe that everything's working properly for a
little while as people use it, before I potentially break builds.

This is the culmination of months of supporting effort. The NIR grammar is
derived from our existing TAME sources internally, which I use for now as a
test case until I introduce test cases directly into TAMER later on (I'd do
it now, if I hadn't spent so much time on this; I'll start introducing tests
as I begin emitting NIR tokens). This is capable of fully parsing our
largest system with >900 packages, as well as `core`.

`tamec`'s lowering is a mess; that'll be cleaned up in future commits. The
same can be said about `tameld`.

NIR's grammar has some initial documentation, but this will improve over
time as well.

The generated docs still need some improvement, too, especially with
generated identifiers; I just want to get this out here for testing.

DEV-7145
2022-08-29 15:28:03 -04:00
|
|
|
|
//
|
|
|
|
|
// Copyright (C) 2014-2022 Ryan Specialty Group, LLC.
|
|
|
|
|
//
|
|
|
|
|
// This file is part of TAME.
|
|
|
|
|
//
|
|
|
|
|
// This program is free software: you can redistribute it and/or modify
|
|
|
|
|
// it under the terms of the GNU General Public License as published by
|
|
|
|
|
// the Free Software Foundation, either version 3 of the License, or
|
|
|
|
|
// (at your option) any later version.
|
|
|
|
|
//
|
|
|
|
|
// This program is distributed in the hope that it will be useful,
|
|
|
|
|
// but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
|
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
|
// GNU General Public License for more details.
|
|
|
|
|
//
|
|
|
|
|
// You should have received a copy of the GNU General Public License
|
|
|
|
|
// along with this program. If not, see <http://www.gnu.org/licenses/>.
|
|
|
|
|
|
2022-09-16 09:59:38 -04:00
|
|
|
|
//! An IR that is "near" the source code.
|
2022-08-29 15:28:03 -04:00
|
|
|
|
//!
|
|
|
|
|
//! This IR is "near" the source code written by the user,
|
2022-09-16 09:59:38 -04:00
|
|
|
|
//! performing only basic normalization tasks like desugaring.
|
|
|
|
|
//! It takes a verbose input language and translates it into a much more
|
|
|
|
|
//! concise internal representation.
|
|
|
|
|
//! The hope is that most desugaring will be done by templates in the future.
|
|
|
|
|
//!
|
|
|
|
|
//! NIR cannot completely normalize the source input because it does not
|
|
|
|
|
//! have enough information to do so---the
|
|
|
|
|
//! template system requires a compile-time interpreter that is beyond
|
|
|
|
|
//! the capabilities of NIR,
|
|
|
|
|
//! and so a final normalization pass must be done later on in the
|
|
|
|
|
//! lowering pipeline.
|
2022-08-29 15:28:03 -04:00
|
|
|
|
//!
|
|
|
|
|
//! This is a streaming IR,
|
|
|
|
|
//! meaning that the equivalent AST is not explicitly represented as a
|
|
|
|
|
//! tree structure in memory.
|
|
|
|
|
//!
|
2022-09-16 09:59:38 -04:00
|
|
|
|
//! NIR is lossy and does not retain enough information for code
|
2022-08-29 15:28:03 -04:00
|
|
|
|
//! formatting---that
|
|
|
|
|
//! type of operation will require a mapping between
|
|
|
|
|
//! XIRF and NIR,
|
|
|
|
|
//! where the latter is used to gather enough context for formatting
|
|
|
|
|
//! and the former is used as a concrete representation of what the user
|
|
|
|
|
//! actually typed.
|
2022-09-19 09:22:07 -04:00
|
|
|
|
//!
|
|
|
|
|
//! For more information on the parser,
|
|
|
|
|
//! see [`parse`].
|
|
|
|
|
//! The entry point for NIR in the lowering pipeline is exported as
|
|
|
|
|
//! [`XirfToNir`].
|
2022-08-29 15:28:03 -04:00
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
mod desugar;
|
2022-08-29 15:28:03 -04:00
|
|
|
|
mod parse;
|
|
|
|
|
|
|
|
|
|
use crate::{
|
|
|
|
|
diagnose::{Annotate, Diagnostic},
|
|
|
|
|
fmt::{DisplayWrapper, TtQuote},
|
|
|
|
|
parse::{Object, Token},
|
2022-09-19 16:21:41 -04:00
|
|
|
|
span::{Span, UNKNOWN_SPAN},
|
tamer: nir: Detect interpolated values

This simply detects whether a value will need to be further parsed for
interpolation; it does not yet perform the parsing itself, which will happen
during desugaring.

This introduces a performance regression, for an interesting reason. I
found that introducing a single new variant to `SugaredNir` (with a
`(SymbolId, Span)` pair), was causing the width of the `NirParseState` type
to increase just enough to cause Rust to be unable to optimize away a
significant number of memcpys related to `Parser` moves, and consequently
reducing performance by nearly 50% for `tamec`. Yikes.

I suspected this would be a problem, and indeed have tried in all other
cases to avoid aggregation until the ASG---the problem is that I had wanted
to aggregate attributes for NIR so that the IR could actually make some
progress toward simplifying the stream (and therefore working with the
data), and be able to validate against a grammar defined in a single
place. The problem is that the `NirParseState` type contains a sum type for
every attribute parser, and is therefore as wide as the largest one. That
is what Rust is having trouble optimizing memcpy away for.

Indeed, reducing the number of attributes improves the situation
drastically. However, it doesn't make it go away entirely.

If you look at a callgrind profile for `tameld` (or a disassembly), you'll
notice that I put quite a bit of effort into ensuring that the hot code path
for the lowering pipeline contains _no_ memcpys for the parsers. But that
is not the case with `tamec`---I had to move on. But I do still have the
same escape hatch that I introduced for `tameld`, which is the mutable
`Context`.

It seems that may be the solution there too, but I want to get a bit further
along first to see how these data end up propagating before I go through
that somewhat significant effort.

DEV-13156
2022-11-01 14:30:34 -04:00
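The width problem described in this commit message can be illustrated with a minimal sketch. `SmallState` and `WideState` below are hypothetical stand-ins, not TAMER types; the sketch only shows that a Rust enum must be at least as wide as its largest variant, so one wide variant widens every value of the type, and with it every move of that value.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for a parser state type; these are NOT TAMER's
// actual `NirParseState` or attribute parsers.
#[allow(dead_code)]
enum SmallState {
    Ready,
    Attr(u32),
}

#[allow(dead_code)]
enum WideState {
    Ready,
    Attr(u32),
    // A single wide variant determines the size of the whole enum,
    // even for values that only ever hold `Ready`.
    ManyAttrs([u64; 16]),
}

fn main() {
    // A sum type must be able to store its largest variant.
    assert!(size_of::<WideState>() >= size_of::<[u64; 16]>());
    // Adding the wide variant made every value of the type larger.
    assert!(size_of::<SmallState>() < size_of::<WideState>());

    println!("SmallState: {} bytes", size_of::<SmallState>());
    println!("WideState:  {} bytes", size_of::<WideState>());
}
```

Whether the compiler can elide the copies made when such a value is moved depends on the optimizer and the surrounding code; the sketch demonstrates only the size relationship, not the memcpy behavior itself.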
|
|
|
|
sym::{st::quick_contains_byte, GlobalSymbolResolve, SymbolId},
|
|
|
|
|
xir::{
|
|
|
|
|
attr::{Attr, AttrSpan},
|
|
|
|
|
fmt::TtXmlAttr,
|
|
|
|
|
QName,
|
|
|
|
|
},
|
|
|
|
|
};
|
|
|
|
|
use memchr::memchr;
|
|
|
|
|
use std::{
|
|
|
|
|
convert::Infallible,
|
|
|
|
|
error::Error,
|
|
|
|
|
fmt::{Debug, Display},
|
2022-08-29 15:28:03 -04:00
|
|
|
|
};
|
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
pub use desugar::{DesugarNir, DesugarNirError};
|
2022-08-29 15:28:03 -04:00
|
|
|
|
pub use parse::{
|
|
|
|
|
NirParseState as XirfToNir, NirParseStateError_ as XirfToNirError,
|
|
|
|
|
};
|
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
/// IR that is "near" the source code,
|
|
|
|
|
/// without its syntactic sugar.
|
|
|
|
|
///
|
|
|
|
|
/// This form contains only primitives that cannot be reasonably represented
|
|
|
|
|
/// by other primitives.
|
|
|
|
|
/// This is somewhat arbitrary and may change over time,
|
|
|
|
|
/// but represents a balance between the level of abstraction of the IR
|
|
|
|
|
/// and performance of lowering operations.
|
|
|
|
|
///
|
|
|
|
|
/// See [`SugaredNir`] for more information about the sugared form.
|
2022-08-29 15:28:03 -04:00
|
|
|
|
#[derive(Debug, PartialEq, Eq)]
|
2022-10-19 10:00:08 -04:00
|
|
|
|
pub enum PlainNir {
|
2022-08-29 15:28:03 -04:00
|
|
|
|
Todo,
|
|
|
|
|
}
|
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
impl Token for PlainNir {
|
2022-08-29 15:28:03 -04:00
|
|
|
|
fn ir_name() -> &'static str {
|
2022-10-19 10:00:08 -04:00
|
|
|
|
"Plain NIR"
|
2022-08-29 15:28:03 -04:00
|
|
|
|
}
|
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
fn span(&self) -> Span {
|
|
|
|
|
use PlainNir::*;
|
2022-09-19 16:21:41 -04:00
|
|
|
|
|
|
|
|
|
match self {
|
|
|
|
|
Todo => UNKNOWN_SPAN,
|
|
|
|
|
}
|
2022-08-29 15:28:03 -04:00
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
impl Object for PlainNir {}
|
2022-08-29 15:28:03 -04:00
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
impl Display for PlainNir {
|
2022-09-19 16:21:41 -04:00
|
|
|
|
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
|
2022-10-19 10:00:08 -04:00
|
|
|
|
use PlainNir::*;
|
2022-09-19 16:21:41 -04:00
|
|
|
|
|
|
|
|
|
match self {
|
|
|
|
|
Todo => write!(f, "TODO"),
|
|
|
|
|
}
|
2022-08-29 15:28:03 -04:00
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2022-10-19 10:00:08 -04:00
|
|
|
|
/// Syntactic sugar atop [`PlainNir`].
|
|
|
|
|
///
|
|
|
|
|
/// NIR contains various syntax features that serve as mere quality-of-life
|
|
|
|
|
/// conveniences for users
|
|
|
|
|
/// ("sugar" to sweeten the experience).
|
|
|
|
|
/// These features do not add any expressiveness to the language,
|
|
|
|
|
/// and are able to be lowered into other primitives without changing
|
|
|
|
|
/// their meaning.
|
|
|
|
|
///
|
|
|
|
|
/// The process of lowering syntactic sugar into primitives is called
|
|
|
|
|
/// "desugaring" and is carried out by the [`DesugarNir`] lowering
|
|
|
|
|
/// operation,
|
|
|
|
|
/// producing [`PlainNir`].
|
|
|
|
|
#[derive(Debug, PartialEq, Eq)]
|
|
|
|
|
pub enum SugaredNir {
|
2022-11-01 14:30:34 -04:00
|
|
|
|
/// A primitive token that may have sugared values.
|
2022-11-01 15:13:29 -04:00
|
|
|
|
Todo,
|
2022-10-19 10:00:08 -04:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
impl Token for SugaredNir {
|
|
|
|
|
fn ir_name() -> &'static str {
|
|
|
|
|
"Sugared NIR"
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
fn span(&self) -> Span {
|
|
|
|
|
use SugaredNir::*;
|
|
|
|
|
|
|
|
|
|
match self {
|
2022-11-01 15:13:29 -04:00
|
|
|
|
Todo => UNKNOWN_SPAN,
|
2022-10-19 10:00:08 -04:00
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
impl Object for SugaredNir {}
|
|
|
|
|
|
|
|
|
|
impl Display for SugaredNir {
|
|
|
|
|
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
|
|
|
|
|
use SugaredNir::*;
|
|
|
|
|
|
|
|
|
|
match self {
|
2022-11-01 15:13:29 -04:00
|
|
|
|
Todo => write!(f, "TODO"),
|
2022-10-19 10:00:08 -04:00
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2022-11-01 14:30:34 -04:00
|
|
|
|
/// Tag representing the type of a NIR value.
|
|
|
|
|
///
|
|
|
|
|
/// NIR values originate from attributes,
|
|
|
|
|
/// which are refined into types as enough information becomes available.
|
|
|
|
|
/// Value parsing must be deferred if a value requires desugaring or
|
|
|
|
|
/// metavalue expansion.
|
|
|
|
|
#[derive(Debug, PartialEq, Eq)]
|
|
|
|
|
#[repr(u8)]
|
|
|
|
|
pub enum NirSymbolTy {
|
|
|
|
|
AnyIdent,
|
|
|
|
|
BooleanLiteral,
|
|
|
|
|
ClassIdent,
|
|
|
|
|
ClassIdentList,
|
|
|
|
|
ConstIdent,
|
|
|
|
|
DescLiteral,
|
|
|
|
|
Dim,
|
|
|
|
|
DynNodeLiteral,
|
|
|
|
|
FuncIdent,
|
|
|
|
|
IdentDtype,
|
|
|
|
|
IdentType,
|
|
|
|
|
MapTransformLiteral,
|
|
|
|
|
NumLiteral,
|
|
|
|
|
ParamDefault,
|
|
|
|
|
ParamIdent,
|
|
|
|
|
ParamName,
|
|
|
|
|
ParamType,
|
|
|
|
|
PkgPath,
|
|
|
|
|
ShortDimNumLiteral,
|
|
|
|
|
StringLiteral,
|
|
|
|
|
SymbolTableKey,
|
|
|
|
|
TexMathLiteral,
|
|
|
|
|
Title,
|
|
|
|
|
TplMetaIdent,
|
2022-11-01 16:23:51 -04:00
|
|
|
|
TplIdent,
|
2022-11-01 14:30:34 -04:00
    TplParamIdent,
    TypeIdent,
    ValueIdent,
}

2022-11-01 16:23:51 -04:00
impl Display for NirSymbolTy {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        use NirSymbolTy::*;

        match self {
            AnyIdent => write!(f, "any identifier"),
            BooleanLiteral => write!(
                f,
                "boolean literal {fmt_true} or {fmt_false}",
                fmt_true = TtQuote::wrap("true"),
                fmt_false = TtQuote::wrap("false"),
            ),
            ClassIdent => write!(f, "classification identifier"),
            ClassIdentList => {
                write!(f, "space-delimited list of classification identifiers")
            }
            ConstIdent => write!(f, "constant identifier"),
            DescLiteral => write!(f, "description literal"),
            Dim => write!(f, "dimension declaration"),
            DynNodeLiteral => write!(f, "dynamic node literal"),
            FuncIdent => write!(f, "function identifier"),
            IdentDtype => write!(f, "identifier primitive datatype"),
            IdentType => write!(f, "identifier type"),
            MapTransformLiteral => write!(f, "map transformation literal"),
            NumLiteral => write!(f, "numeric literal"),
            ParamDefault => write!(f, "param default"),
            ParamIdent => write!(f, "param identifier"),
            ParamName => write!(f, "param name"),
            ParamType => write!(f, "param type"),
            PkgPath => write!(f, "package path"),
            ShortDimNumLiteral => {
                write!(f, "short-hand dimensionalized numeric literal")
            }
            StringLiteral => write!(f, "string literal"),
            SymbolTableKey => write!(f, "symbol table key name"),
            TexMathLiteral => write!(f, "TeX math literal"),
            Title => write!(f, "title"),
            TplMetaIdent => write!(f, "template metadata identifier"),
            TplIdent => write!(f, "template name"),
            TplParamIdent => write!(f, "template param identifier"),
            TypeIdent => write!(f, "type identifier"),
            ValueIdent => write!(f, "value identifier"),
        }
    }
}

/// A ([`SymbolId`], [`Span`]) pair in an attribute value context that may
/// require desugaring and interpretation within the context of a template
/// application.
///
/// Interpolated values require desugaring;
/// see [`DesugarNir`] for more information.
///
/// _This object must be kept small_,
/// since it is used in objects that aggregate portions of the token
/// stream,
/// which must persist in memory for a short period of time,
/// and therefore cannot be optimized away as other portions of the IR.
/// As such,
/// this does not nest enums.
#[derive(Debug, PartialEq, Eq)]
pub enum SugaredNirSymbol<const TY: NirSymbolTy> {
    /// The symbol contains an expression representing the concatenation of
    /// any number of literals and metavariables
    /// (referred to as "string interpolation" in many languages).
    Interpolate(SymbolId, Span),

    /// It's not ripe yet.
    ///
    /// No parsing has been performed.
    Todo(SymbolId, Span),
}

// Force developer to be conscious of any changes in size;
// see `SugaredNirSymbol` docs for more information.
assert_eq_size!(SugaredNirSymbol<{ NirSymbolTy::AnyIdent }>, u128);
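The size guard above can also be written with nothing but the standard library. The following is a minimal sketch under stated assumptions: `SymbolId`, `Span` and `SugaredSymbol` here are hypothetical stand-ins (a non-zero 32-bit index and an 8-byte span), not TAMER's real definitions, and the check is a `const` assertion rather than the `assert_eq_size!` macro used in the source.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Stand-in for an interned symbol (assumption: a non-zero 32-bit index).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SymbolId(pub NonZeroU32);

/// Stand-in for a source span (assumption: 8 bytes in total).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub offset: u32,
    pub len: u16,
    pub ctx: u16,
}

/// Flat enum: every variant carries the same small payload and no variant
/// nests another enum, so the only overhead is the discriminant.
#[derive(Debug, PartialEq, Eq)]
pub enum SugaredSymbol {
    Interpolate(SymbolId, Span),
    Todo(SymbolId, Span),
}

// Compile-time guard: the build fails if the type outgrows 16 bytes,
// which forces whoever adds a field or variant to notice the change.
const _: () = assert!(size_of::<SugaredSymbol>() <= size_of::<u128>());
```

The sketch uses an upper bound rather than exact equality because the exact layout of an enum is left to the compiler; the source's equality check is stricter and also flags the type getting smaller.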

/// Character whose presence in a string indicates that interpolation
/// parsing must occur.
pub const INTERPOLATE_CHAR: u8 = b'{';
#[derive(Debug, PartialEq, Eq)]
pub enum PkgType {
    /// Package is intended to produce an executable program.
    ///
    /// This is specified by the `rater` root node.
    Prog,
    /// Package is intended to be imported as a component of a larger
    /// program.
    Mod,
}

/// Whether a value represented by the provided [`SymbolId`] requires
/// interpolation.
///
/// _NB: This dereferences the provided [`SymbolId`] if it is dynamically
/// allocated._
///
/// The provided value requires interpolation if it contains,
/// anywhere in the string,
/// the character [`INTERPOLATE_CHAR`].
/// This does not know if the string will parse correctly;
/// that job is left for desugaring,
/// and so this will flag syntactically invalid interpolated strings
/// (which is expected).
#[inline]
fn needs_interpolation(val: SymbolId) -> bool {
    // We can skip pre-interned symbols that we know cannot include the
    // interpolation character.
    // TODO: Abstract into `sym::symbol` module.
    let ch = INTERPOLATE_CHAR;
    quick_contains_byte(val, ch)
        .or_else(|| memchr(ch, val.lookup_str().as_bytes()).map(|_| true))
        .unwrap_or(false)
}

impl<const TY: NirSymbolTy> TryFrom<(SymbolId, Span)> for SugaredNirSymbol<TY> {
    type Error = NirAttrParseError;

    fn try_from((val, span): (SymbolId, Span)) -> Result<Self, Self::Error> {
        match needs_interpolation(val) {
            true => Ok(SugaredNirSymbol::Interpolate(val, span)),
            false => Ok(SugaredNirSymbol::Todo(val, span)),
        }
    }
}
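The detect-then-classify step above can be illustrated on plain string slices. This is only a sketch under assumptions: `needs_interpolation_str`, `Classified` and `classify` are hypothetical names, the input is a `&str` rather than an interned `SymbolId`, and the standard library's slice `contains` stands in for the `quick_contains_byte` and `memchr` calls used in the source.

```rust
/// Byte whose presence marks a value as needing interpolation
/// (same value as `INTERPOLATE_CHAR` in the source).
pub const INTERPOLATE_CHAR: u8 = b'{';

/// Detection only: report whether `val` contains the marker byte anywhere.
/// Nothing checks that the braces are balanced or otherwise well formed.
pub fn needs_interpolation_str(val: &str) -> bool {
    val.as_bytes().contains(&INTERPOLATE_CHAR)
}

/// Hypothetical classification result, analogous to the two
/// `SugaredNirSymbol` variants.
#[derive(Debug, PartialEq, Eq)]
pub enum Classified<'a> {
    /// Value must be parsed for interpolation later.
    Interpolate(&'a str),
    /// Value has not been parsed at all yet.
    Todo(&'a str),
}

/// Tag the value without parsing it; parsing is deferred to a later pass.
pub fn classify<'a>(val: &'a str) -> Classified<'a> {
    if needs_interpolation_str(val) {
        Classified::Interpolate(val)
    } else {
        Classified::Todo(val)
    }
}
```

A value such as `"a{b"` is classified as `Interpolate` even though it is not a valid interpolated string, which matches the documented behavior: rejecting malformed input is the job of the later desugaring pass.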

impl<const TY: NirSymbolTy> TryFrom<Attr> for SugaredNirSymbol<TY> {
    type Error = NirAttrParseError;

    fn try_from(attr: Attr) -> Result<Self, Self::Error> {
        match attr {
            Attr(_, val, AttrSpan(_, vspan)) => (val, vspan).try_into(),
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct Literal<const S: SymbolId>;

impl<const S: SymbolId> TryFrom<Attr> for Literal<S> {
    type Error = NirAttrParseError;

    fn try_from(attr: Attr) -> Result<Self, Self::Error> {
        match attr {
            Attr(_, val, _) if val == S => Ok(Literal),
            Attr(name, _, aspan) => Err(NirAttrParseError::LiteralMismatch(
                name,
                aspan.value_span(),
                S,
            )),
        }
    }
}
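The `Literal` pattern above encodes the expected value in the type, so the parsed result carries no data at all. Below is a minimal sketch of the same idea on stable Rust, assuming symbols are modelled as plain `u32` values; the source's `const S: SymbolId` parameter appears to depend on const generics over a user-defined type, and the `LiteralMismatch` struct here is a hypothetical simplification of `NirAttrParseError`.

```rust
use std::convert::TryFrom;

/// Zero-sized marker that only parses from the value `S`.
/// Assumption: a symbol is a plain `u32` here.
#[derive(Debug, PartialEq, Eq)]
pub struct Literal<const S: u32>;

/// Hypothetical error recording what was expected and what was given.
#[derive(Debug, PartialEq, Eq)]
pub struct LiteralMismatch {
    pub expected: u32,
    pub given: u32,
}

impl<const S: u32> TryFrom<u32> for Literal<S> {
    type Error = LiteralMismatch;

    fn try_from(val: u32) -> Result<Self, Self::Error> {
        if val == S {
            // The matched value is implied by the type, so nothing is stored.
            Ok(Literal)
        } else {
            Err(LiteralMismatch { expected: S, given: val })
        }
    }
}
```

With this shape, `Literal::<7>::try_from(7)` succeeds and `Literal::<7>::try_from(8)` reports both the expected and the given value, much as `LiteralMismatch` in the source records the expected symbol.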

impl From<Infallible> for NirAttrParseError {
    fn from(x: Infallible) -> Self {
        match x {}
    }
}

type ExpectedSymbolId = SymbolId;

#[derive(Debug, PartialEq, Eq)]
pub enum NirAttrParseError {
    LiteralMismatch(QName, Span, ExpectedSymbolId),
}

impl Error for NirAttrParseError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        None
    }
}

impl Display for NirAttrParseError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        match self {
            Self::LiteralMismatch(name, _, _) => {
                write!(f, "unexpected value for {}", TtXmlAttr::wrap(name),)
            }
        }
    }
}

impl Diagnostic for NirAttrParseError {
    fn describe(&self) -> Vec<crate::diagnose::AnnotatedSpan> {
        match self {
            Self::LiteralMismatch(_, span, expected) => span
                .error(format!("expecting {}", TtQuote::wrap(expected)))
                .into(),
        }
    }
}

#[cfg(test)]
mod test;