employer/tame - tame - Mike Gerwitz's Forge

Author	SHA1	Message	Date
Mike Gerwitz	ab181670b5	tamer: xir::reader: Initial introduction of spans This is a large change, and was a bit of a tedious one, given the comprehensive tests. This introduces proper offsets and lengths for spans, with the exception of some quick-xml errors that still need proper mapping. Further, this still uses `UNKNOWN_CONTEXT`, which will be resolved shortly. This also introduces `SpanlessError`, which `Error` explicitly _does not_ implement `From<SpanlessError>` for---this forces the caller to provide a span before the error is compatible with the return value, ensuring that spans will actually be available rather than forgotten for errors. This is important, given that errors are generally less tested than the happy path, and errors are when users need us the most (so, need span information). Further, I had to use pointer arithmetic in order to calculate many of the spans, because quick-xml does not provide enough information. There are no safety considerations here, and the comprehensive unit test will ensure correct behavior if the implementation changes in the future. I would like to introduce typed spans at some point---I made some opinionated choices when it comes to what the spans ought to represent. Specifically, whether to include the `<` or `>` with the open span (depends), whether to include quotes with attribute values (no), and some other details highlighted in the test cases. If we provide typed spans, then we could, knowing the type of span, calculate other spans on request, e.g. to include or omit quotes for attributes. Different such spans may be useful in different situations when presenting information to the user. This also highlights gaps in the tokens emitted by XIR, such as whitespace between attributes, the `=` between name and value, and so on. These are important when it comes to code formatting, so that we can reliably reconstruct the XML tree, but it's not important right now. I anticipate future changes would allow the XIR reader to be configured (perhaps via generics, like a strategy-type pattern) to optionally omit these tokens if desired. Anyway, more to come. DEV-10934	2022-04-08 13:59:37 -04:00
Mike Gerwitz	f42288f3a2	tamer: obj::xmlo::reader: Begin symbol table parsing This wasn't the simplest thing to start with, but I wanted to explore something with a higher level of complexity. There is some boilerplate to observe here, including: 1. The state stitching (as I guess I'm calling it now) of SymtableState with XmloReaderState is all boilerplate and requires no lookahead, presenting an abstraction opportunity that I was holding off on previously (attr parsing for XIRF requires lookahead). 2. This is simply collecting attributes into a struct. This can be abstracted away in the future. 3. Creating stub parsers to verify that generics are stitched rather than being tightly coupled with another state is boilerplate that maybe can be abstracted away after a pattern is observed in future tests. DEV-10863	2022-03-29 11:14:47 -04:00
Mike Gerwitz	aba89f809d	tamer: xir::parse: UnexpectedEof Span at final offset I'm not rendering errors yet in practice, so this wouldn't have been noticed, but we want error messages to reference the final byte in a file on EOF, not the offset of the last-encountered token, which would be confusing. This doesn't _directly_ pertain to what I'm working on; I just happened to notice it. DEV-10863	2022-03-17 21:33:05 -04:00
Mike Gerwitz	ce48a654b1	tamer: span::Span::offset_add: Make const This behavior is unchanged, but it allows us to create more constant spans for testing. For example: const S: Span = DUMMY_SPAN.offset_add(1).unwrap(); This, in turn, will allow for removing lazy_static! for tests that use it for span generation. DEV-10863	2022-03-16 14:16:28 -04:00
Mike Gerwitz	7873d46afb	tamer: Replace all &'static str in errors with SymbolId Now that SymbolId implements Display and resolves, this works out well.	2021-10-11 15:39:53 -04:00
Mike Gerwitz	7e9271e189	tamer: span: Primitive Display impl This outputs enough information to be a little bit useful in the event of an error. In the future, we'll want to provide a (likely non-Display) implementation that provides line number and source file context with the problem characters indicated, like Rust.	2021-10-11 14:14:43 -04:00
Mike Gerwitz	cde08b125c	tamer: span (DUMMY_SPAN): New constant Rather than having to use lazy_static! in all these tests, we can derive an unlimited number of dummy spans from this one using e.g. `offset_add`.	2021-10-11 10:29:58 -04:00
Mike Gerwitz	cf239531e0	tamer: span (offset_add): New method More will come in the future, including the ability to add two spans.	2021-10-11 10:28:47 -04:00
Mike Gerwitz	de3d7ef393	tamer: span: Introduce twospan The intent is to support the composition and decomposition of spans such that (A, B) is as documented here. This only performs the trivial case for the sake of providing a convenient API when the developer would otherwise just type (S, S).	2021-10-11 09:56:48 -04:00
Mike Gerwitz	6864fbc1cd	tamer: Start of XIR-based xmle writer This has been a long time coming, and has been repeatedly stashed as other parts of the system have evolved to support it. The introduction of the XIR tree was to write tests for this (which are sloppy atm). This currently writes out the `xmle` header and _most_ of the `l:dep` section; it's missing the object-type-specific attributes. There is, relatively speaking, not much more work to do here. The feature flag `wip-xir-xmle-writer` was introduced to toggle this system in place of `XmleWriter`. Initial benchmarks show that it will be competitive with the quick-xml-based writer, but remember that is not the goal: the purpose of this is to test XIR in a production system before we continue to implement it for a frontend, and to refactor so that we do not have multiple implementations writing XML files (once we echo the source XML files). I'm excited to get this done with so that I can move on. This has been rather exhausting.	2021-09-28 14:52:53 -04:00
Mike Gerwitz	e91aeef478	tamer: Remove Ix generalization throughout system This had the writing on the wall all the same as the `'i` interner lifetime that came before it. It was too much of a maintenance burden trying to accommodate both 16-bit and 32-bit symbols generically. There is a situation where we do still want 16-bit symbols---the `Span`. Therefore, I have left generic support for symbol sizes, as well as the different global interners, but `SymbolId` now defaults to 32-bit, as does `Asg`. Further, the size parameter has been removed from the rest of the code, with the exception of `Span`. This cleans things up quite a bit, and is much nicer to work with. If we want 16-bit symbols in the future for packing to increase CPU cache performance, we can handle that situation then in that specific case; it's a premature optimization that's not at all worth the effort here.	2021-09-23 14:52:54 -04:00
Mike Gerwitz	0ff0f88e5f	tamer: Introduce span This is an initial implementation optimized for expected use cases. Hopefully that pans out and doesn't come back to bite me. Regarding the context: it only allows for interned paths atm, which are strings (and so must be valid UTF-8, which is fine for us, but sucks for something more general-purpose). I'll be curious if the context needs extension later on, or if different contexts will be stored in IRs (e.g. to store a template application site as well as the location of the expansion within the template body).	2021-08-13 15:16:39 -04:00
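
Several of the commits above (cf239531e0, cde08b125c, ce48a654b1) describe deriving test spans from a single `DUMMY_SPAN` through a `const` `offset_add`. The following is a minimal sketch of that idea, not the actual tamer code: the `Span` layout here (a `u32` offset and a `u16` length, with no context), the `new` and `offset` helpers, and the name `S1` are all assumptions made for illustration. A `match` is used in place of `.unwrap()` so that the constant also evaluates on compilers where `Option::unwrap` is not yet usable in const contexts.

```rust
/// Hypothetical, simplified span: the real tamer `Span` also carries a
/// context and may use different field types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    offset: u32,
    len: u16,
}

impl Span {
    /// Assumed constructor, for illustration only.
    pub const fn new(offset: u32, len: u16) -> Self {
        Self { offset, len }
    }

    /// Shift the span's offset by `value`, returning `None` on overflow.
    /// Being `const`, this can be evaluated at compile time.
    pub const fn offset_add(self, value: u32) -> Option<Self> {
        match self.offset.checked_add(value) {
            Some(offset) => Some(Self { offset, len: self.len }),
            None => None,
        }
    }

    /// Assumed accessor, for illustration only.
    pub const fn offset(&self) -> u32 {
        self.offset
    }
}

/// A single dummy span from which others are derived.
pub const DUMMY_SPAN: Span = Span::new(0, 0);

/// A derived constant span; no `lazy_static!` is needed because
/// `offset_add` is evaluated in a const context.
pub const S1: Span = match DUMMY_SPAN.offset_add(1) {
    Some(span) => span,
    None => panic!("span offset overflow"),
};
```

Because every derived span is an ordinary `const`, tests can declare as many distinct spans as they need without any runtime initialization.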

12 Commits (c49510646bc686a964cb44a6e6afe0da22da2fdf)
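
Commit ab181670b5 describes `Error` deliberately not implementing `From<SpanlessError>`, so that a caller must attach a span before the error fits the return type. The sketch below illustrates that pattern under stated assumptions: the `InvalidUtf8` variants, the `with_span` method, and the `decode` and `read_value` functions are hypothetical names chosen for this example, and the `Span` type is a simplified stand-in rather than the tamer definition.

```rust
/// Simplified stand-in for a source span (assumed layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub offset: u32,
    pub len: u16,
}

/// An error produced where no span is known yet.
#[derive(Debug, PartialEq, Eq)]
pub enum SpanlessError {
    InvalidUtf8,
}

/// The error type callers actually return; every variant carries a span.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    InvalidUtf8(Span),
}

impl SpanlessError {
    /// The only way to turn a `SpanlessError` into an `Error`
    /// (hypothetical method name).
    pub fn with_span(self, span: Span) -> Error {
        match self {
            SpanlessError::InvalidUtf8 => Error::InvalidUtf8(span),
        }
    }
}

// Note: there is intentionally no `impl From<SpanlessError> for Error`.
// Using `?` directly on a `Result<_, SpanlessError>` inside a function
// returning `Result<_, Error>` therefore fails to compile.

/// Low-level helper that has no span information available.
fn decode(bytes: &[u8]) -> Result<&str, SpanlessError> {
    std::str::from_utf8(bytes).map_err(|_| SpanlessError::InvalidUtf8)
}

/// The caller knows the span and must supply it explicitly.
pub fn read_value(bytes: &[u8], span: Span) -> Result<&str, Error> {
    decode(bytes).map_err(|e| e.with_span(span))
}
```

The design trades a little convenience for a compile-time guarantee: an error cannot reach the user without someone having decided which span it refers to.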