tamer: sym::prefill: Static symbol polymorphism

See the docs for a much deeper discussion.  In summary: traits do not
support constant functions (`const fn` methods), and this is the workaround,
which relies on unstable nightly constant function features.

This implementation is tested using `qname_const!`, and will be utilized
with a new static type in a following commit.
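
The workaround summarized above can be sketched roughly as follows.  This is
a simplified illustration, not the code from this commit: it drops the
`Ix: SymbolIndexSize` parameter, fixes the index type to `NonZeroU32`, adds
`#[repr(transparent)]`, and uses a raw pointer cast in place of
`std::mem::transmute_copy` so that it should build without feature flags on
a recent stable compiler.  The constants `EXAMPLE_SYM` and `EXAMPLE_ID` are
hypothetical names introduced only for the example.

```rust
use std::num::NonZeroU32;

/// Simplified stand-in for TAMER's `SymbolId` (assumption: no `Ix` parameter).
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SymbolId(NonZeroU32);

mod private {
    /// Public trait inside a private module: code outside this file's
    /// module tree cannot name it, so it cannot implement `StaticSymbolId`.
    pub trait Sealed {}
}

/// Implementors must have the same byte representation as `SymbolId`.
pub unsafe trait StaticSymbolId: private::Sealed {}

/// Simplified static symbol newtype (assumption: fixed to `NonZeroU32`).
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CIdentStaticSymbolId(NonZeroU32);

impl CIdentStaticSymbolId {
    pub const fn new(id: u32) -> Self {
        match NonZeroU32::new(id) {
            Some(n) => Self(n),
            None => panic!("static symbol ids must be non-zero"),
        }
    }
}

impl private::Sealed for CIdentStaticSymbolId {}
unsafe impl StaticSymbolId for CIdentStaticSymbolId {}

// Compile-time size check, standing in for `assert_eq_size!`.
const _: () = assert!(
    std::mem::size_of::<CIdentStaticSymbolId>() == std::mem::size_of::<SymbolId>()
);

/// Free constant function used in place of a constant trait method.
pub const fn st_as_sym<T: StaticSymbolId>(st: &T) -> SymbolId {
    // SAFETY: `StaticSymbolId` is sealed and every implementor is required
    // to share the byte representation of `SymbolId`.
    unsafe { *(st as *const T as *const SymbolId) }
}

// Hypothetical constants showing that the conversion works in const context.
pub const EXAMPLE_SYM: CIdentStaticSymbolId = CIdentStaticSymbolId::new(7);
pub const EXAMPLE_ID: SymbolId = st_as_sym(&EXAMPLE_SYM);
```

Because `st_as_sym` is a free `const fn` bounded on the sealed trait, any
static symbol type can be converted inside a `const` item, which is what
`qname_const!` relies on.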
Mike Gerwitz 2021-10-02 00:50:20 -04:00
parent 9d87962e96
commit f9c9c95516
3 changed files with 132 additions and 9 deletions

@ -31,7 +31,8 @@
use crate::span::Span;
use crate::sym::{
CIdentStaticSymbolId, GlobalSymbolIntern, SymbolId, UriStaticSymbolId,
CIdentStaticSymbolId, GlobalSymbolIntern, StaticSymbolId, SymbolId,
UriStaticSymbolId,
};
use std::convert::{TryFrom, TryInto};
use std::fmt::Display;
@ -41,13 +42,16 @@ pub mod pred;
pub mod tree;
pub mod writer;
pub trait QNameCompatibleStaticSymbolId: StaticSymbolId {}
impl QNameCompatibleStaticSymbolId for CIdentStaticSymbolId {}
macro_rules! qname_const_inner {
($name:ident = :$local:ident) => {
const $name: QName = QName::st_cid_local($local);
};
($name:ident = $prefix:ident:$local:ident) => {
const $name: QName = QName::st_cid($prefix, $local);
const $name: QName = QName::st_cid(&$prefix, &$local);
};
}
@ -280,13 +284,14 @@ impl QName {
}
/// Construct a constant QName from static C-style symbols.
pub const fn st_cid(
prefix_sym: CIdentStaticSymbolId,
local_sym: CIdentStaticSymbolId,
pub const fn st_cid<T: QNameCompatibleStaticSymbolId>(
prefix_sym: &T,
local_sym: &T,
) -> Self {
use crate::sym;
Self(
Some(Prefix(NCName(prefix_sym.as_sym()))),
LocalPart(NCName(local_sym.as_sym())),
Some(Prefix(NCName(sym::st_as_sym(prefix_sym)))),
LocalPart(NCName(sym::st_as_sym(local_sym))),
)
}

@ -19,6 +19,14 @@
//! An incremental rewrite of TAME in Rust.
// Constant functions are still in their infancy as of the time of writing
// (October 2021).
// These two features are used by [`sym::prefill::st_as_sym`] to provide
// polymorphic symbol types despite Rust's lack of support for constant
// trait methods.
// See that function for more information.
#![feature(const_fn_trait_bound)]
#![feature(const_transmute_copy)]
// We build docs for private items
#![allow(rustdoc::private_intra_doc_links)]

@ -30,11 +30,95 @@ use super::{Interner, SymbolId, SymbolIndexSize};
use crate::global;
use std::array;
/// Static symbol identifier that is stable between runs of the same version
/// of TAMER.
///
/// This symbol id is allocated at compile-time.
///
/// _All types implementing this trait must have the same byte
/// representation as their inner [`SymbolId`]_.
pub unsafe trait StaticSymbolId<Ix: SymbolIndexSize = global::ProgSymSize>:
private::Sealed
{
// Traits cannot contain constant functions.
// See [`st_as_sym`] below.
}
/// Convert any [`StaticSymbolId`] into its inner [`SymbolId`].
///
/// Static symbols are typed to convey useful information to newtypes that
/// wish to wrap or compose them.
/// This function peels back that type information to expose the inner
/// symbol.
///
/// Safety and Rationale
/// ====================
/// This function does its best to work around the limitation in Rust that
/// traits cannot contain constant functions
/// (at the time of writing).
///
/// To do this,
/// we require that every object of type [`StaticSymbolId`] have _the same
/// byte representation_ as [`SymbolId`].
/// Since Rust optimizes away simple newtype wrappers,
/// this means that we can simply cast the value to a symbol.
///
/// For example, if we have `StaticSymbolId<u32>`,
/// this would cast to a `SymbolId<u32>`.
/// The inner value of `SymbolId<u32>` is
/// `<u32 as SymbolIndexSize>::NonZero`,
/// which has the same byte representation as `u32`.
///
/// This would normally be done using [`std::mem::transmute`],
/// which ensures that the two types have compatible sizes.
/// Unfortunately,
/// the types here do not have fixed size and constant functions are
/// unable to verify that they are compatible at the time of writing.
/// We therefore must use [`std::mem::transmute_copy`] to circumvent this
/// size check.
///
/// Circumventing this check is safe given our trait bounds for all static
/// symbols in this module and its children.
/// However,
/// for this safety to hold,
/// we must ensure that no outside modules can implement
/// [`StaticSymbolId`] on their own objects.
/// For this reason,
/// [`StaticSymbolId`] implements [`private::Sealed`].
///
/// With that,
/// we get [`SymbolId`] polymorphism despite Rust's limitations.
///
/// A Note About Nightly
/// ====================
/// At the time of writing,
/// though,
/// this _does_ require two unstable features:
/// `const_fn_trait_bound` and `const_transmute_copy`.
/// We can get rid of the latter using raw pointer casts,
/// just as [`std::mem::transmute_copy`] itself does,
/// but since we're already relying on unstable flags,
/// we may as well use it while we require nightly for other things as
/// well.
///
/// `const_fn_trait_bound` cannot be removed in this situation without
/// another plan.
/// `const_panic` could be used with an enum,
/// but that still requires nightly.
pub const fn st_as_sym<T, Ix>(st: &T) -> SymbolId<Ix>
where
T: StaticSymbolId<Ix>,
Ix: SymbolIndexSize,
{
// SAFETY: A number of precautions are taken to make this a safe and
// sensible transformation; see function doc above.
SymbolId(unsafe { std::mem::transmute_copy(st) })
}
/// Generate a newtype containing a condensed [`SymbolId`].
macro_rules! static_symbol_newtype {
($(#[$attr:meta])* $name:ident<$size:ty>) => {
$(#[$attr])*
#[doc=""]
/// This is a statically-allocated symbol.
///
/// This symbol is generated at compile-time and expected to be
@ -43,6 +127,13 @@ macro_rules! static_symbol_newtype {
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct $name(<$size as SymbolIndexSize>::NonZero);
// Mark this as a static symbol type, ensuring that its size is fully
// compatible with the underlying `SymbolId` so as not to cause
// problems with `st_as_sym`.
impl private::Sealed for $name {}
unsafe impl StaticSymbolId<$size> for $name {}
assert_eq_size!($name, SymbolId<$size>);
impl $name {
const fn new(id: $size) -> Self {
Self(unsafe {
@ -132,7 +223,7 @@ macro_rules! static_symbol_consts {
/// and schedule their static strings to be interned upon initialization
/// of the global interner.
///
/// This generates [`fill`],
/// This generates `fill`,
/// which the global interners call by default.
/// Any interner may optionally invoke this,
/// immediately after initialization,
@ -369,6 +460,25 @@ pub mod st16 {
}
}
/// Non-public module that can contain public traits.
///
/// The problem this module tries to solve is preventing anything outside of
/// this crate from implementing the `StaticSymbolId` trait,
/// since doing so opens us up to undefined behavior when transmuting
/// via [`st_as_sym`](super::st_as_sym).
mod private {
/// Extend this trait to prevent other modules from implementing the
/// subtype.
///
/// Since other modules extend [`StaticSymbolId`](super::StaticSymbolId)
/// for their own traits,
/// this trait must be `pub`.
/// But, since it is contained within a private module,
/// it is not possible to import the trait to implement it on other
/// things.
pub trait Sealed {}
}
#[cfg(test)]
mod test {
use super::{st, st16, DecStaticSymbolId};