tamer: nir::parse: Grammar summary docs

This is intended to provide just enough information to help elucidate how the system works and why. DEV-7145
2022-09-19 09:22:07 -04:00 · 2022-09-19 09:22:07 -04:00 · 9966b82b9d
parent dcb42b6e4b
commit 9966b82b9d
2 changed files with 88 additions and 14 deletions
--- a/tamer/src/nir.rs
+++ b/tamer/src/nir.rs
@ -43,6 +43,11 @@
 //!       where the latter is used to gather enough context for formatting
 //!       and the former is used as a concrete representation of what the user
 //!       actually typed.
+//!
+//! For more information on the parser,
+//!   see [`parse`].
+//! The entry point for NIR in the lowering pipeline is exported as
+//!   [`XirfToNir`].

 mod parse;

--- a/tamer/src/nir/parse.rs
+++ b/tamer/src/nir/parse.rs
@ -17,23 +17,90 @@
 //  You should have received a copy of the GNU General Public License
 //  along with this program.  If not, see <http://www.gnu.org/licenses/>.

-//! Normalized source IR.
+//! NIR parser.
 //!
-//! This IR is "near" the source code written by the user,
-//!   performing only normalization tasks like desugaring.
-//! The hope is that all desugaring will be done by templates in the future.
+//! For general information about NIR,
+//!   see the [parent module](super).
 //!
-//! This is a streaming IR,
-//!   meaning that the equivalent AST is not explicitly represented as a
-//!   tree structure in memory.
+//! The entry point for this parser in the lowering pipeline is
+//!   [`NirParseState`].
+//! The grammar is defined declaratively using the [`ele_parse!`]
+//!   parser-generator,
+//!     which yields a parser compatible with TAME's [`crate::parse`]
+//!     framework.
 //!
-//! This IR is lossy and does not retain enough information for code
-//!   formatting---that
-//!     type of operation will require a mapping between
-//!     XIRF and NIR,
-//!       where the latter is used to gather enough context for formatting
-//!       and the former is used as a concrete representation of what the user
-//!       actually typed.
+//! Grammar Definition
+//! ==================
+//! The grammar can be seen in the TAMER sources;
+//!   if you are viewing the generated documentation,
+//!     it can be viewed by clicking on "source" in the upper-right-hand
+//!     corner of this page,
+//!       or on each individual identifier.
+//!
+//! The grammar defines nonterminals (NTs) of two forms:
+//!
+//!   1. [XIR](crate::xir) elements with their attributes and child NTs; and
+//!   2. Sum NTs of the form `(NT₀ | NT₁ | … | NTₙ)` which match on any of
+//!      inner NTs.
+//!
+//! Terminals are specified in element name and attribute contexts as
+//!   [static QName](crate::xir::st::qname) constants of the form `QN_*`.
+//! These constants are defined in [`crate::xir::st::qname`] and allow the
+//!   to efficiently match on element and attribute names by comparing a
+//!   single 64-bit integer value,
+//!     which in turn may be optimized to compare many possible QName
+//!     values simultaneously.
+//!
+//! The style of the grammar is meant to be a combination of a BNF and Rust
+//!   syntax.
+//!
+//! Repetition and Templates
+//! ------------------------
+//! _All NTs are implicitly defined as zero-or-more_
+//!   (as with the Kleene star),
+//!   and this behavior cannot be overridden.
+//! The rationale for this is somewhat complex,
+//!   but the tradeoff greatly simplifies the [`ele_parse!`]
+//!   parser-generator in recognition of a simple fact about NIR:
+//!     it cannot determine statically whether a source file will conform to
+//!     TAME's grammar when all templates are expanded.
+//!
+//! Templates require an interpreter and are expanded later in the lowering
+//!   pipeline.
+//! NIR is unable to perform that expansion,
+//!   and so we do the best we can do in this situation:
+//!     verify that templates,
+//!       when expanded,
+//!       will expand into primitives known to NIR,
+//!         and validate those primitives when possible.
+//! This can only go so far,
+//!   given that templates can appear virtually anywhere in the source tree.
+//!
+//! Because templates are able to expand into anything that is known to
+//!   NIR's grammar,
+//!     NIR cannot know whether a required element has been provided or not.
+//! Consequently,
+//!   we cannot require that an element be present as part of NIR's grammar,
+//!     since it may have been hidden behind a template application.
+//! For the same reason,
+//!   we cannot place _any_ restrictions on the number of repetitions of a
+//!   particular element.
+//!
+//! The best we can do is therefore to merely validate that,
+//!   whatever _is_ present,
+//!   is conceivably valid at that position within the grammar.
+//! It is then the burden of a future lowering operation to validate the
+//!   grammar post-expansion.
+//!
+//! What NIR therefore provides is an IR that is _closed_ under template
+//!   application---this
+//!     means that,
+//!       when a template _is_ expanded into an application site,
+//!       it _will_ expand into a sequence of parsed NIR tokens and cannot
+//!       possibly expand into anything else.
+//! What the template system does with those tokens is beyond our concern.
+//!
+//! See [`TplKw`] for template tokens that are accepted anywhere.

 use super::*;

@ -82,6 +149,8 @@ ele_parse! {
    ///   beyond the scope of NIR and should be handled as part of another
    ///   lowering operation.
    ///
+    /// See the [parent module](super) for more information.
+    ///
    /// Superstate
    /// ----------
    pub enum NirParseState;