tame/tamer/src/lib.rs

// TAME in Rust (TAMER)
//
// Copyright (C) 2014-2022 Ryan Specialty Group, LLC.
//
// This file is part of TAME.
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.
//! An incremental rewrite of TAME in Rust.
// Constant functions are still in their infancy as of the time of writing
// (October 2021).
// This feature is used by [`sym::prefill::st_as_sym`] to provide
// polymorphic symbol types despite Rust's lack of support for constant
// trait methods.
// See that function for more information.
#![feature(const_transmute_copy)]
// This is used to unwrap const Option results rather than providing
// panicking alternatives.
#![feature(const_option)]
// Trait aliases are convenient for reducing verbosity in situations where
// type aliases cannot be used.
// To remove this feature if it is not stabilized,
// simply replace each alias reference with its definition,
// or possibly write a trait with a `Self` bound.
#![feature(trait_alias)]
// Can be replaced with `assert!(matches!(...))`,
// but at a loss of a better error message.
#![feature(assert_matches)]
// Simplifies creating `Option` default values.
// To remove this feature,
// this can be done more verbosely in the usual way,
// or we can write our own version.
#![feature(option_get_or_insert_default)]
// For `Try` and `FromResidual`,
// allowing us to write our own `?`-compatible types.
#![feature(try_trait_v2)]
// Used primarily for convenience,
// rather than having to create type constructors as type aliases that are
// not associated with a trait.
// However,
// this also allows for the associated type default to be overridden by
// the implementer,
// in which case this feature's only substitute is a type parameter.
#![feature(associated_type_defaults)]
// Convenience features that are easily replaced if not stabilized.
#![feature(nonzero_min_max)]
#![feature(nonzero_ops)]
// Note: this is the first time TAMER was hit by a change in an unstable
// feature,
// when `log10` et al. were renamed to `ilog10` et al.:
// <https://github.com/rust-lang/rust/pull/100332>
#![feature(int_log)]
// Enabled for qualified paths in `matches!`.
#![feature(more_qualified_paths)]
// Used for const params like `&'static str` in `crate::fmt`.
// If this is not stabilized,
// then we can do without by changing the abstraction;
// this is largely experimentation to see if it's useful.
#![allow(incomplete_features)]
#![feature(adt_const_params)]
// We build docs for private items.
#![allow(rustdoc::private_intra_doc_links)]
// For sym::prefill recursive macro `static_symbols!`.
#![recursion_limit = "512"]
pub mod global;
#[macro_use]
extern crate static_assertions;
#[macro_use]
pub mod xir;
pub mod asg;
pub mod convert;
pub mod diagnose;
pub mod fmt;
pub mod fs;
pub mod iter;
pub mod ld;
pub mod nir;
pub mod num;
pub mod obj;
pub mod parse;
pub mod span;
pub mod sym;
#[cfg(test)]
pub mod test;
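
// The feature comments above describe stable fallbacks for `trait_alias`,
// `option_get_or_insert_default`, and `assert_matches`.
// The following is a minimal sketch of what those fallbacks could look
// like on stable Rust.
// It is not code from TAMER:
// `ShowAll`, `show_both`, `push_lazy`, and `check_some_even` are
// hypothetical names chosen only for illustration.

```rust
use std::fmt::{Debug, Display};

// Stand-in for the unstable `trait ShowAll = Debug + Display;`:
// a trait with supertrait bounds plus a blanket implementation,
// so every type satisfying the bounds implements it automatically.
// (`ShowAll` is a hypothetical name.)
pub trait ShowAll: Debug + Display {}
impl<T: Debug + Display> ShowAll for T {}

// Uses the pseudo-alias as a single bound.
pub fn show_both<T: ShowAll>(x: T) -> String {
    format!("{} {:?}", x, x)
}

// Stand-in for `Option::get_or_insert_default`,
// written "more verbosely in the usual way".
pub fn push_lazy(slot: &mut Option<Vec<u32>>, n: u32) -> usize {
    let v = slot.get_or_insert_with(Default::default);
    v.push(n);
    v.len()
}

// Stand-in for `assert_matches!`.
// On failure this reports only the stringified expression,
// not the value that failed to match.
pub fn check_some_even(x: Option<u32>) {
    assert!(matches!(x, Some(n) if n % 2 == 0));
}
```

// The blanket `impl` is what makes the supertrait approach behave like an
// alias; without it, each type would need an explicit `impl ShowAll`.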