employer/tame - tame - Mike Gerwitz's Forge

Author	SHA1	Message	Date
Mike Gerwitz	61d556c89e	tamer: pipeline: Generate recoverable sum error types

This was a significant undertaking, with a few thrown-out approaches. The documentation describes what approach was taken, but I'd also like to provide some insight into the approaches that I rejected for various reasons, or because they simply didn't work.

The problem that this commit tries to solve is encapsulation of error types. Prior to the introduction of the lowering pipeline macro `lower_pipeline!`, all pipelines were written by hand using `Lower` and specifying the applicable types. This included creating sum types to accommodate each of the errors so that `Lower` could widen automatically.

The introduction of the `lower_pipeline!` macro resolved the boilerplate and type complexity concerns for the parsers by allowing the pipeline to be concisely declared. However, it still accepted an error sum type `ER` for recoverable errors, which meant that we had to break a level of encapsulation, peering into the pipeline to know both what parsers were in play and what their error types were.

These error sum types were also the source of a lot of tedious boilerplate that made adding new parsers to the pipeline unnecessarily unpleasant; the purpose of the macro is to make composition both easy and clear, and error types were undermining it.

Another benefit of sum types per pipeline is that callers need only aggregate those pipeline types, if they care about them, rather than every error type used as a component of the pipeline.

So, this commit generates the error types. Doing so was non-trivial.

Associated Types and Lifetimes
------------------------------
Error types are associated with their `ParseState` as `ParseState::Error`. As described in this commit, TAMER's approach to errors is that they never contain non-static lifetimes; interning and copying are used to that effect. And, indeed, no errors in TAMER have lifetimes.

But, some `ParseState`s may. In this case, `AsgTreeToXirf`:

```
impl<'a> ParseState for AsgTreeToXirf<'a> {
    // [...]
    type Error = AsgTreeToXirfError;
    // [...]
}
```

Even though `AsgTreeToXirfError` does not have a lifetime, the `ParseState` it is associated with _does_. So to reference that type, we must use `<AsgTreeToXirf<'a> as ParseState>::Error`. So if we have a sum type:

```
enum Sum<'a> {
    //   ^^ oh no!                    vv
    AsgTreeToXirfError(<AsgTreeToXirf<'a> as ParseState>::Error),
}
```

There's no way to elide or make anonymous that lifetime, since it's not used, at the time of writing. `for<'a>` also cannot be used in this context.

The solution in this commit is to use a macro (`lower_error_sum`) to rewrite lifetimes to `'static`:

```
enum Sum {
    AsgTreeToXirfError(<AsgTreeToXirf<'static> as ParseState>::Error),
}
```

The `Error` associated type will resolve to `AsgTreeToXirfError` all the same either way, since it has no lifetimes of its own, let alone any referencing trait bounds.

That's not to say that we _couldn't_ support lifetimes as long as they're attached to context, but we have no need to at the moment, and it adds significant cognitive overhead. Further, the diagnostic system doesn't deal in lifetimes, and so would need reworking as well. Not worth it.

An alternative solution to this that was rejected is an explicit `Error` type in the macro application:

```
// in the lowering pipeline
|> AsgTreeToXirf<'a> {                // lifetime
    type Error = AsgTreeToXirfError;  // no lifetime
}
```

But this requires peeling back the `ParseState` to see what its error is and _duplicating_ it here. Silly, and it breaks encapsulation, since the lowering pipeline is supposed to return its own error type.

Yet another option considered was to standardize a submodule convention whereby each `ParseState` would have a module exporting `Error`, among other types. This would decouple it from the parent type. However, we still have the duplication between that and an associated type. Further, there's no way to enforce this convention (effectively a module API)---the macro would just fail in obscure ways, at least with `macro_rules!`. It would have been an ugly kluge.

Overlapping Error Types
-----------------------
Another concern with generating the sum type, resolved in a previous commit, was overlapping error types, which prohibited `impl From<E> for ER` generation.

The problem was that a number of `ParseState`s used `Infallible` as their `Error` type. This was resolved in a previous commit by creating Infallible-like newtypes (variantless enums).

This was not the only option. `From` fits naturally into how TAMER handles sum types, and fits naturally into `Lower`'s `WidenedError`. The alternative is generating explicit `map_err`s in `lower_pipeline!`. This would have allowed for overlapping error types because the _caller_ knows what the correct target variant is in the sum type.

The problem with an explicit `map_err` is that it places more power in `lower_pipeline!`, which is _supposed_ to be a macro that simply removes boilerplate; it's not supposed to increase expressiveness. It's also not fun dealing with complexity in macros; they're much more confusing than normal code.

With the decided-upon approach (newtypes + `From`), hand-written `Lower` pipelines are just as expressive---just more verbose---as `lower_pipeline!`, and widening is handled for you. Rust's type system will also handle the complexity of widening automatically for us without us having to reason about it in the macro. This is not always desirable, but in this case, I feel that it is.	2023-06-13 14:49:43 -04:00
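The `'static` rewrite described in this commit can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ParseState` trait is a pared-down stand-in that models only the associated `Error` type, the `Other` parser and the `run`/`lower` helpers are hypothetical, and the sum type is written by hand rather than generated by `lower_error_sum`.

```rust
// Pared-down stand-in for TAMER's `ParseState`; only `Error` is modeled.
trait ParseState {
    type Error;
}

// A parser that borrows its context, so the type carries a lifetime.
struct AsgTreeToXirf<'a> {
    ctx: &'a str,
}

// The error owns its data and has no lifetime of its own.
#[derive(Debug, PartialEq)]
struct AsgTreeToXirfError(usize);

impl<'a> ParseState for AsgTreeToXirf<'a> {
    type Error = AsgTreeToXirfError;
}

// Hypothetical second parser without a lifetime.
struct Other;

#[derive(Debug, PartialEq)]
struct OtherError;

impl ParseState for Other {
    type Error = OtherError;
}

// The sum type names each error only through its `ParseState`. The parser's
// lifetime is written as `'static`, so `Sum` needs no lifetime parameter.
#[derive(Debug, PartialEq)]
enum Sum {
    AsgTreeToXirf(<AsgTreeToXirf<'static> as ParseState>::Error),
    Other(<Other as ParseState>::Error),
}

impl From<<AsgTreeToXirf<'static> as ParseState>::Error> for Sum {
    fn from(e: <AsgTreeToXirf<'static> as ParseState>::Error) -> Self {
        Sum::AsgTreeToXirf(e)
    }
}

impl From<<Other as ParseState>::Error> for Sum {
    fn from(e: <Other as ParseState>::Error) -> Self {
        Sum::Other(e)
    }
}

// Fails with an error that is reached through a non-'static parser type.
fn run<'a>(parser: AsgTreeToXirf<'a>) -> Result<(), <AsgTreeToXirf<'a> as ParseState>::Error> {
    Err(AsgTreeToXirfError(parser.ctx.len()))
}

// `?` widens into `Sum` via `From`, whatever the borrow's lifetime is,
// because the projection resolves to the same lifetime-free error type.
fn lower(src: &str) -> Result<(), Sum> {
    run(AsgTreeToXirf { ctx: src })?;
    Ok(())
}
```

Calling `lower` with a short-lived `&str` still produces a `Sum`, which is the point: the lifetime belongs to the parser, not to its error.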
Mike Gerwitz	31f6a102eb	tamer: pipeline::macro: Partially applied pipeline This configures the pipeline and returns a closure that can then be provided with the source and sink. The next obvious step would be to curry the source and sink. But I wanted to commit this before I take a different (but equivalent) approach that makes the pipeline operations more explicit and helps to guide the user (developer) in developing and composing them. The FP approach is less boilerplate, but is also more general and provides less guidance. Composition at the topmost levels of the system, especially with all the types involved, is one of the most confusing aspects of the system---and one of the most important to get right and make clear, since it's intended to elucidate the entire system at a high level, and guide the reader. Well, it does a poor job at that now, but that's the ultimate goal. In essence---brutally general abstractions make sense at lower levels, but the complexity at higher levels benefits from rigid guardrails, even though it does not necessitate it. DEV-13162	2023-06-13 10:02:51 -04:00
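As a rough illustration of what "partially applied" means here, the sketch below configures a toy pipeline first and supplies the source and sink afterwards. The `configure_pipeline` function, its `scale` parameter, and the integer "tokens" are all hypothetical; the real macro deals in parsers and their token types.

```rust
// Hypothetical configuration step: fixes the pipeline's parameters and
// returns a closure that still awaits the source and the sink.
fn configure_pipeline(scale: i32) -> impl Fn(&[i32], &mut dyn FnMut(i32)) {
    move |source: &[i32], sink: &mut dyn FnMut(i32)| {
        for tok in source {
            // Stand-in "lowering": transform each token, then hand it
            // to the sink.
            sink(*tok * scale);
        }
    }
}
```

The returned closure can be reused with different sources and sinks, which is the flexibility (and the lack of guidance) that the commit message weighs against a more explicit approach.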
Mike Gerwitz	1bb25b05b3	tamer: Newtypes for all Infallible ParseState errors More information will be presented in the commit that follows to generalize these, but this sets the stage. The recently-introduced pipeline macro takes care of most of the job of a declarative pipeline, but it's still leaky, since it requires that the _caller_ create error sum types. This not only exposes implementation details and so undermines the goal of making pipelines easy to declare and compose, but it's also one of the last major components of boilerplate for the lowering pipeline. My previous attempts at generating error sum types automatically for pipelines ran into a problem because of overlapping `impl`s for the various `<S as ParseState>::Error` types; this resolves that issue via newtypes. I had considered other approaches, including explicitly generating code to `map_err` as part of the lowering pipeline, but in the end this is the easier way to reason about things that also keeps manual `Lower` pipelines on the same level of expressiveness as the pipeline macro; I want to restrict its unique capabilities as much as possible to elimination of boilerplate and nothing more. DEV-13162	2023-06-12 12:33:22 -04:00
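A minimal sketch of why the newtypes help, using hypothetical parser error names (`ParserAError`, `ParserBError`, `ParserCError`) and hand-written `From` impls in place of anything the pipeline macro generates. If two parsers both used `std::convert::Infallible` as their error type, the sum type would need two `impl From<Infallible> for Sum` blocks, which the compiler rejects as conflicting implementations (E0119). Giving each parser its own variantless enum keeps every impl distinct.

```rust
// Infallible-like newtypes: variantless enums, so no value can ever exist.
#[derive(Debug, PartialEq)]
enum ParserAError {}

#[derive(Debug, PartialEq)]
enum ParserBError {}

// A parser that can actually fail.
#[derive(Debug, PartialEq)]
struct ParserCError;

#[derive(Debug, PartialEq)]
enum Sum {
    A(ParserAError),
    B(ParserBError),
    C(ParserCError),
}

// One `From` impl per distinct error type; none of them overlap.
impl From<ParserAError> for Sum {
    fn from(e: ParserAError) -> Self {
        Sum::A(e)
    }
}

impl From<ParserBError> for Sum {
    fn from(e: ParserBError) -> Self {
        Sum::B(e)
    }
}

impl From<ParserCError> for Sum {
    fn from(e: ParserCError) -> Self {
        Sum::C(e)
    }
}

fn step_a(x: u8) -> Result<u8, ParserAError> {
    Ok(x + 1)
}

fn step_b(x: u8) -> Result<u8, ParserBError> {
    Ok(x * 2)
}

fn step_c(x: u8) -> Result<u8, ParserCError> {
    if x > 10 {
        Err(ParserCError)
    } else {
        Ok(x)
    }
}

// Each `?` widens its step's error into `Sum` through `From`,
// with no explicit `map_err` at the call sites.
fn pipeline(x: u8) -> Result<u8, Sum> {
    let a = step_a(x)?;
    let b = step_b(a)?;
    let c = step_c(b)?;
    Ok(c)
}
```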
Mike Gerwitz	6a99ee3cb3	tamer: pipeline::lower_xmli: Use `lower_pipeline!` All lowering pipelines are now using `lower_pipeline!`. Finally. The macro does require some refactoring and documentation, but it's working, and we now have three pipelines whose definitions are smaller than a single one was previously. I've been hoping to do this for many months, so it's nice to finally see this come to fruition. I had been putting it off, but doing so has made it difficult to compose other parts of the system, not knowing what abstractions I'll have at my disposal. DEV-13162	2023-06-05 13:44:49 -04:00
Mike Gerwitz	109ba5f797	tamer: pipeline::lower_xmli: Generalize sink like other pipelines This makes the sink similar to other pipelines without creating a new ParseState, and so will allow for integrating into the `lower_pipeline!` abstraction. DEV-13162	2023-06-05 13:44:49 -04:00
Mike Gerwitz	9c6b00a124	tamer: pipeline: Initial concept for declarative pipeline definition This has been the ultimate goal for the pipeline for some time---the ability to declaratively define the lowering pipeline in a way that is clear, concise, and is correct by definition. The reason that the lowering pipeline required so much boilerplate was because of the robust types involved, which ensures that everything in the pipeline is compatible with one-another---it's not possible to construct a pipeline that will not work. Of course, there is nuance involved in some cases---I didn't want to include the `until` clause, which makes it fail the "obviously correct" criterion, but that can be improved over time. This only abstracts away `load_xmlo` and `parse_package_xml`; next I'll have to evolve the abstraction to support lifetimes for `lower_xmli`'s `AsgTreeToXirf`. That pipeline also ends with a custom sink that really ought to become its own parser, but I don't want to jump down that rabbit hole right now, so we may just support custom sinks for now with the intent of removing it in the future. This has been a long time coming. The ultimate goal is that you should be able to look at the parser pipelines to have a clear, high-level overview of how everything fits together. I'm not generating documentation yet, but that'll help serve as a guide as well. DEV-13162	2023-06-05 13:44:49 -04:00
Mike Gerwitz	f34f2644e9	tamer: pipeline: Allow reporting on entire Result The report acts as the sink for `load_xmlo` and `parse_package_xml`. At the moment, the type is `()`, and so there's nothing to report on but the error. But the idea is to add logging via `AirAggregate::Object`, which is currently just `()`. This change therefore is only a refactoring---it changes no functionality but sets up for future changes. This also introduces consistency with `lower_xmli` in use of `terminal` for the final operation. DEV-13162	2023-06-05 13:44:49 -04:00
Mike Gerwitz	b5187de5dc	tamer: pipeline::load_xmlo: Accept reporter This makes the API of `load_xmlo` much closer to `parse_package_xml`, both accepting a reporter and distinguishing between recoverable and unrecoverable errors. The linker still does not use a reporter and still fails on the first error, as before; I wanted to keep this change small. DEV-13162	2023-06-05 13:44:49 -04:00
Mike Gerwitz	1f2315436c	tamer: tamec: Extract xmli lowering into pipeline module This is the same idea as the previous two commits: get all the lowering pipelines into the same place so that we can observe commonalities and attempt to derive an appropriate abstraction. `lower_xmli` could have invoked `tree_reconstruction` itself, since it has all the information that it needs to do so, but the idea is that these will accept sources from the caller. This also demonstrates that sinks need to be flexible. In an ideal abstraction, perhaps this would be able to produce an iterator that accepts the first token type and yields the last, which can then be directed to a sink, but that's not compatible with how the lowering operations currently work, which requires a single value to be returned. But if it did work that way, then they'd be able to compose just as any other parser. Maybe for the future. DEV-13162	2023-06-05 13:44:48 -04:00
Mike Gerwitz	9e8b809c14	tamer: tamec: Extract package parsing into pipeline module The previous commit extracted xmlo loading, because that will be a common operation between `tamec` and `tameld`. This extracts parsing, which will only be used by `tamec` for now, though components of the pipeline are similar to xmlo loading. Not only does it need to be removed from `tamec` and better abstracted, but the intent now is to get all of these things into one place so that the patterns become obvious and a better abstraction can be created to remove all of this boilerplate and type complexity. Furthermore, xmlo loading needs to use reporting and recovery, so having `parse_package_xml` here will help show how to make that happen easily. I'm pleased that it ended up being trivial to extract error reporting from the lowering pipeline as a simple (mutable) callback. I'm not pleased about the side-effects, but, this works well for now given how the system works today. DEV-13162	2023-06-05 13:44:46 -04:00
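The "simple (mutable) callback" idea can be sketched as follows. The `parse_tokens` function, both error types, and the string tokens are hypothetical stand-ins; the sketch only shows the shape of the split between errors that are reported and recovered from and errors that abort the pipeline.

```rust
// Hypothetical error types: one the pipeline can recover from,
// one that aborts it.
#[derive(Debug, PartialEq)]
struct Recoverable(String);

#[derive(Debug, PartialEq)]
struct Unrecoverable(String);

// Lowers string tokens to integers. Recoverable errors go to the
// caller-supplied callback and lowering continues; unrecoverable
// errors short-circuit through the return value.
fn parse_tokens<I>(
    source: I,
    mut report_err: impl FnMut(Recoverable),
) -> Result<Vec<u32>, Unrecoverable>
where
    I: Iterator<Item = Result<String, Unrecoverable>>,
{
    let mut lowered = Vec::new();
    for item in source {
        let raw = item?; // unrecoverable: stop immediately
        match raw.parse::<u32>() {
            Ok(n) => lowered.push(n),
            // Recoverable: report as a side effect and keep going.
            Err(_) => report_err(Recoverable(raw)),
        }
    }
    Ok(lowered)
}
```

The callback mutating caller state is the side effect the commit message is uneasy about; it keeps the pipeline itself free of any reporter type.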
Mike Gerwitz	7857460c1d	tamer: Re-use prior AirAggregateCtx for subsequent parsers A new AirAggregate parser is utilized for each package import. This prevents us from moving the index from `Asg` onto `AirAggregateCtx` because the index would be dropped between each import. This allows re-using that context and solves for problems that result from attempting to do so, as explained in the new `resume_previous_parsing_context` test case. But, it's now clear that there's a missing abstraction, and that reasoning about this problem at the topmost level of the compiler/linker in terms of internal parsing details like "context" is not appropriate. What we're doing is suspending parsing and resuming it later on for another package, aggregating into the same destination (ASG + index). An abstraction ought to be formed in terms of that. DEV-13162	2023-05-19 13:38:15 -04:00
Mike Gerwitz	a686855e9d	tamer: Introduce desugaring operation for shorthand template application This moves translation from NirToAir into TplShortDesugar, and changes the output from AIR to NIR. This is going to be much easier to reason about as a desugaring operation (and indeed that's always how TAME has implemented it, in XSLT); this keeps the complexity isolated. Ideally, NirToAir wouldn't even accept tokens that it can't handle, but that's going to take quite a bit more work and I don't have the time right now. Instead, we'll fail at runtime with some hopefully-useful information. It shouldn't actually happen in practice. DEV-13708	2023-03-29 12:58:34 -04:00
Mike Gerwitz	f8c1ef5ef2	tamer: tamec: MILESTONE: POC end-to-end lowering This has been a long time coming. The wiring of it all together is a little rough around the edges right now, but this commit represents a working POC to begin to fill in the gaps for the entire lowering pipeline. I had hoped to be at this point a year ago. Yeah. This marks a significant milestone in the project because this allows me to begin to observe the implementation end-to-end, testing it on real-life inputs as part of a production build pipeline. ...and now, with that, we can begin. So much work has gone into this project so far, but aside from the linker (which has been in production for years), most of this work has been foundational. It's been a significant investment that I intend to have pay off in many different ways. (All this outputs right now is `<package/>`.) DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	33d2b4f0b8	tamer: tamec: POC lowering pipeline with XirfAutoClose and XirfToXir This replaces the stub `derive_xmli` with the same result (well, minus a space before the '/' in the output) using what will become the lowering pipeline. Once again, this is quite verbose, and the lowering pipeline in general needs to be further abstracted away. Unlike the rest of the pipeline, an error during the derivation process will immediately terminate with an unrecoverable error, because we do not want to write partial files. This does not remove the garbage file, because the build system ought to do that itself (e.g. `make`)...but that is certainly open for debate. DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	29178f2360	tamer: xir::reader: Divorce from `parse` The reader previously yielded a `ParsedResult`, presumably to simplify lowering operations. But the reader is not a `ParseState`, and does not otherwise use the parsing API, so this was an inappropriate and confusing coupling. This resolves that, introducing a new `lowerable` which will translate an iterator into something that can be placed in a lowering pipeline. See the previous commit for more information. DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	963688f889	tamer: parse::lower::ParsedObject: Include Token type parameter The token type was previously hard-coded to `UnknownToken`, since the use case was the beginning of the lowering pipeline at the start of the program, where there was no token type because the first parser (`XirReader`, currently) is responsible for producing the first token type. But when we're lowering from the graph (so, the other side of the lowering pipeline), we _do_ have token types to deal with. This also emphasizes the inappropriate coupling of `<XirReader as Iterator>::Item` with `ParsedResult`; I'd like to follow the same approach that I'm about to introduce with `tamec`, so see a future commit. DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	52e5242af2	tamer: bin/tamec: wip-asg-derive-xmli-gated xmli output This will begin to derive `xmli` output from the graph. DEV-13708	2023-03-10 14:27:57 -05:00
Mike Gerwitz	954b5a2795	Copyright year and name update Ryan Specialty Group (RSG) rebranded to Ryan Specialty after its IPO.	2023-01-20 23:37:30 -05:00
Mike Gerwitz	e6640c0019	tamer: Integrate clippy This invokes clippy as part of `make check` now, which I had previously avoided doing (I'll elaborate on that below). This commit represents the changes needed to resolve all the warnings presented by clippy. Many changes have been made where I find the lints to be useful and agreeable, but there are a number of lints, rationalized in `src/lib.rs`, where I found the lints to be disagreeable. I have provided rationale, primarily for those wondering why I desire to deviate from the default lints, though it does feel backward to rationalize why certain lints ought to be applied (the reverse should be true). With that said, this did catch some legitimate issues, and it was also helpful in getting some older code up-to-date with new language additions that perhaps I used in new code but hadn't gone back and updated old code for. My goal was to get clippy working without errors so that, in the future, when others get into TAMER and are still getting used to Rust, clippy is able to help guide them in the right direction. One of the reasons I went without clippy for so long (though I admittedly forgot I wasn't using it for a period of time) was because there were a number of suggestions that I found disagreeable, and I didn't take the time to go through them and determine what I wanted to follow. Furthermore, it was hard to make that judgment when I was new to the language and lacked the necessary experience to do so. One thing I would like to comment further on is the use of `format!` with `expect`, which is also what the diagnostic system convenience methods do (which clippy does not cover). Because of all the work I've done trying to understand Rust and looking at disassemblies and seeing what it optimizes, I falsely assumed that Rust would convert such things into conditionals in my otherwise-pure code...but apparently that's not the case, when `format!` is involved. 
I noticed that, after making the suggested fix with `get_ident`, Rust proceeded to then inline it into each call site and then apply further optimizations. It was also previously invoking the thread lock (for the interner) unconditionally and invoking the `Display` implementation. That is not at all what I intended for, despite knowing the eager semantics of function calls in Rust. Anyway, possibly more to come on that, I'm just tired of typing and need to move on. I'll be returning to investigate further diagnostic messages soon.	2023-01-20 23:37:29 -05:00
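The eager-evaluation point about `format!` with `expect` can be demonstrated with a small sketch. The `Ident` type and its render counter are hypothetical, standing in for an identifier whose `Display` implementation is costly; the pattern shown appears to be the one clippy's `expect_fun_call` lint targets, though the commit does not name the lint.

```rust
use std::cell::Cell;
use std::fmt;

// Hypothetical identifier whose `Display` impl counts how often it runs,
// standing in for an expensive lookup (e.g. one that takes a lock).
struct Ident {
    name: &'static str,
    renders: Cell<u32>,
}

impl fmt::Display for Ident {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.renders.set(self.renders.get() + 1);
        write!(f, "{}", self.name)
    }
}

// Eager: `format!` is an argument, so it runs before `expect` is called,
// even when `value` is `Some` and the message is never used.
fn eager(value: Option<u32>, id: &Ident) -> u32 {
    value.expect(&format!("missing value for {}", id))
}

// Lazy: the message is built inside the closure, only on the `None` path.
fn lazy(value: Option<u32>, id: &Ident) -> u32 {
    value.unwrap_or_else(|| panic!("missing value for {}", id))
}
```

On the success path, `eager` still formats the identifier once per call, while `lazy` never touches it.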
Mike Gerwitz	56d1ecf0a3	tamer: Air{Token=>} Consistency with `Nir` et al. DEV-13430	2022-12-13 14:36:38 -05:00
Mike Gerwitz	daeaade53c	tamer: tamec: Expose ASG context in lowering pipeline The previous commit had the ASG implicitly constructed and then discarded. This will keep it around, which will be necessary not only for imports, but for passing the ASG off to the next phases of lowering. DEV-13429	2022-12-13 13:46:31 -05:00
Mike Gerwitz	aa1ca06a0e	tamer: tamec: Introduce NIR->AIR->ASG lowering This does not yet yield the produced ASG, but does set up the lowering pipeline to prepare to produce it. It's also currently a no-op, with `NirToAsg` just yielding `Incomplete`. The goal is to begin to move toward vertical slices for TAMER as I start to return to the previous approach of a handoff with the old compiler. Now that I've gained clarity from my previous failed approach (which I documented in previous commits), I feel that this is the best way forward that will allow me to incrementally introduce more fine-grained performance improvements, at the cost of some throwaway work as this progresses. But the cost of delay with these build times is far greater. DEV-13429	2022-12-13 13:37:07 -05:00
Mike Gerwitz	8d2d273932	tamer: nir::interp: Integrate NIR interpolation into lowering pipeline This is the culmination of all the recent work---the third attempt at trying to integrate this. It ended up much cleaner than what was originally going to be done, but only after gutting portions of the system and changing my approach to how NIR is parsed (WRT attributes). See prior commits for more information. The final step is to fill the error branches with actual errors rather than `todo!`s. What a relief. DEV-13156	2022-12-05 16:32:00 -05:00
Mike Gerwitz	6d39474127	tamer: NIR re-simplification Alright, this has been a rather tortured experience. The previous commit began to state what is going on. This is reversing a lot of prior work, with the benefit of hindsight. Little bit of history, for the people who will probably never read this, but who knows: As noted at the top of NIR, I've long wanted a very simple set of general primitives where all desugaring is done by the template system---TAME is a metalanguage after all. Therefore, I never intended on having any explicit desugaring operations. But I didn't have time to augment the template system to support parsing on attribute strings (nor am I sure if I want to do such a thing), so it became clear that interpolation would be a pass in the compiler. Which led me to the idea of a desugaring pass. That in turn spiraled into representing the status of whether NIR was desugared, and separating primitives, etc, which led to a lot of additional complexity. The idea was to have a Sugared and Plain NIR, and further within them have symbols that have latent types---if they require interpolation, then those types would be deferred until after template expansion. The obvious problem there is that now: 1. NIR has the complexity of various types; and 2. Types were tightly coupled with NIR and how it was defined in terms of XML destructuring. The first attempt at this didn't go well: it was clear that the symbol types would make mapping from Sugared to Plain NIR very complicated. Further, since NIR had any number of symbols per Sugared NIR token, interpolation was a pain in the ass. So that led to the idea of interpolating at the _attribute_ level. That seemed to be going well at first, until I realized that the token stream of the attribute parser does not match that of the element parser, and so that general solution fell apart. It wouldn't have been great anyway, since then interpolation was _also_ coupled to the destructuring of the document. 
Another goal of mine has been to decouple TAME from XML. Not because I want to move away from XML (if I did, I'd want S-expressions, not YAML, but I don't think the team would go for that). This decoupling would allow the use of a subset of the syntax of TAME in other places, like CSVMs and YAML test cases, for example, if appropriate. This approach makes sense: the grammar of TAME isn't XML, it's _embedded within_ XML. The XML layer has to be stripped to expose it. And so that's what NIR is now evolving into---the stripped, bare representation of TAME's language. That also has other benefits too down the line, like a REPL where you can use any number of syntaxes. I intend for NIR to be stack-based, which I'd find to be intuitive for manipulating and querying packages, but it could have any number of grammars, including Prolog-like for expressing Horn clauses and querying with a Prolog/Datalog-like syntax. But that's for the future... The next issue is that of attribute types. If we have a better language for NIR, then the types can be associated with the NIR tokens, rather than having to associate each symbol with raw type data, which doesn't make a whole lot of sense. That also allows for AIR to better infer types and determine what they ought to be, and further makes checking types after template application natural, since it's not part of NIR at all. It also means the template system can naturally apply to any sources. Now, if we take that final step further, and make attributes streaming instead of aggregating, we're back to a streaming pipeline where all aggregation takes place on the ASG (which also resolves the memcpy concerns worked around previously, also further simplifying `ele_parse` again, though it sucks that I wasted that time). 
And, without the symbol types getting in the way, since now NIR has types more fundamentally associated with tokens, we're able to interpolate on a token stream using simple SPairs, like I always hoped (and reverted back to in the previous commit). Oh, and what about that desugaring pass? There's the issue of how to represent such a thing in the type system---ideally we'd know statically that desugaring always lowers into a more primitive NIR that reduces the mapping that needs to be done to AIR. But that adds complexity, as mentioned above. The alternative is to just use the template system, as I originally wanted to, and resolve shortcomings by augmenting the template system to be able to handle it. That not only keeps NIR and the compiler much simpler, but exposes more powerful tools to developers via TAME's metalanguage, if such a thing is appropriate. Anyway, this creates a system that's far more intuitive, and far simpler. It does kick the can to AIR, but that's okay, since it's also better positioned to deal with it. Everything I wrote above is a thought dump and has not been proof-read, so good luck! And let's hope this finally works out...it's actually feeling good this time. The journey was necessary to discover and justify what came out of it---everything I'm stripping away was like a cocoon, and within it is a more beautiful and more elegant TAME. DEV-13346	2022-12-01 11:09:25 -05:00
Mike Gerwitz	d195eedacb	tamer: nir: Sugared and plain flavors This introduces the concept of sugared NIR and provides the boilerplate for a desugaring pass. The earlier commits dealing with cleaning up the lowering pipeline were to support this work, in particular to ensure that reporting and recovery properly applied to this lowering operation without adding a ton more boilerplate. DEV-13158	2022-10-26 14:19:19 -04:00
Mike Gerwitz	dbe834b48a	tamer: tamec: Remove lowering pipeline refactoring comment I'm struggling to go much further yet without sorting out some other things first with regards to mutable `Context` and, in particular, the ASG. I'm going to pause on refactoring the lowering pipeline---it's been improved significantly with the recent work---and I will continue in the next few weeks. DEV-13158	2022-10-26 12:44:20 -04:00
Mike Gerwitz	7c4c0ebdda	tamer: parse::lower: Separate error types for lowering and return Lowering errors in tamec end up utilizing recovery and reporting, so there is a distinction between recoverable and unrecoverable errors. tameld aborts on the first error, since recovery is not currently supported (we'll want to add it, since tameld should output e.g. lists of unresolved externs). Note that tamec does not yet handle `FinalizeError` like tameld because it uses `Lower::lower`, which does not yet finalize (though it does in practice when it reaches the end of the stream and auto-finalizes, but that is widened into a `ParseError`). DEV-13158	2022-10-26 12:44:20 -04:00
Mike Gerwitz	7e62276907	tamer: Revert "tamer: diagnose::report::Report: {Mutable=>immutable} self reference" This reverts commit 85ec626fcd804eb2fac3fd6f0339182554f72cfd. This revert had to be modified to work alongside other changes. Interior mutability is fortunately no longer needed after the previous commit which allows reporting to occur in a single place in the lowering pipeline (at the terminal parser). DEV-13158	2022-10-26 12:44:18 -04:00
Mike Gerwitz	1c181fe546	tamer: parse::lower: Propagate widened errors to terminal parser The term "terminal parser" isn't formalized yet in the system, but is meant to refer to the innermost parser that is responsible for pulling tokens through the lowering pipeline. This approach is more of what one would expect when dealing with `Result`-like monads---we are effectively chaining the inner operation while propagating errors to short-circuit lowering and let the caller decide whether recovery ought to be permitted with diagnostic messages. This will become more clear as it is further refactored. This also means that the previous changes for introducing interior mutability for a shared mutable `Reporter` can be reverted, which is great, since that approach was antithetical to how the streaming pipeline operates (and introduces awkward mutable state into an otherwise-mostly-immutable system). DEV-13158	2022-10-26 12:32:51 -04:00
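One way to picture errors propagating to a terminal parser is a chain of iterators over `Result`s, where each stage widens the errors it receives and the innermost consumer decides what to do with them. This is only a sketch under that assumption: `lex`, `halve`, `terminal`, and the error types are hypothetical and do not reflect TAMER's `Lower` API.

```rust
#[derive(Debug, PartialEq)]
struct LexError(String);

#[derive(Debug, PartialEq)]
struct OddError(i64);

// Widened error type for the whole chain.
#[derive(Debug, PartialEq)]
enum Widened {
    Lex(LexError),
    Odd(OddError),
}

impl From<LexError> for Widened {
    fn from(e: LexError) -> Self {
        Widened::Lex(e)
    }
}

impl From<OddError> for Widened {
    fn from(e: OddError) -> Self {
        Widened::Odd(e)
    }
}

// First stage: yields one result per whitespace-separated word.
fn lex<'a>(src: &'a str) -> impl Iterator<Item = Result<i64, LexError>> + 'a {
    src.split_whitespace()
        .map(|w| w.parse::<i64>().map_err(|_| LexError(w.to_string())))
}

// Second stage: passes upstream errors along (widened via `From`)
// instead of handling them, and adds its own error for odd inputs.
fn halve<I, E>(toks: I) -> impl Iterator<Item = Result<i64, Widened>>
where
    I: Iterator<Item = Result<i64, E>>,
    Widened: From<E>,
{
    toks.map(|tok| -> Result<i64, Widened> {
        let n = tok?;
        if n % 2 == 0 {
            Ok(n / 2)
        } else {
            Err(OddError(n).into())
        }
    })
}

// Terminal consumer: pulls items through the chain and is the single
// place where errors are observed; here it collects them and continues.
fn terminal<I>(pipeline: I) -> (Vec<i64>, Vec<Widened>)
where
    I: Iterator<Item = Result<i64, Widened>>,
{
    let mut out = Vec::new();
    let mut errs = Vec::new();
    for item in pipeline {
        match item {
            Ok(n) => out.push(n),
            Err(e) => errs.push(e),
        }
    }
    (out, errs)
}
```

Because only `terminal` sees errors, reporting can live there alone, and no stage needs a shared reference to a reporter.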
Mike Gerwitz	2ccdaf80fe	tamer: diagnose::report: Error tracking This extracts error tracking into the Reporter itself, which is already shared between lowering operations. This can then be used to display the number of errors. A new formatter (in tamer::fmt) will be added to handle the singular/plural conversion in place of "error(s)" in the future; I have more important things to work on right now. DEV-13158	2022-10-26 12:32:51 -04:00
Mike Gerwitz	f049da4496	tamer: tamec: Apply reporting (and continuing) to XirToXirf failure Previously these errors would immediately abort. This results in some duplicate code, but it's beginning to derive a common implementation. Check out the commits that follow; this is really an intermediate refactoring state. DEV-13158	2022-10-26 12:32:51 -04:00
Mike Gerwitz	733f44a616	tamer: diagnose::report::Report: {Mutable=>immutable} self reference VisualReporter now uses interior mutability so that we can hold multiple references to it for upcoming lowering pipeline changes. DEV-13158	2022-10-26 12:32:51 -04:00
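A minimal sketch of interior mutability for a shared reporter, using a hypothetical `Reporter` type rather than TAMER's `VisualReporter`. Because `report` takes `&self`, several stages can hold references to the same reporter at once while it still tracks an error count.

```rust
use std::cell::{Cell, RefCell};

// Hypothetical reporter whose state lives behind `Cell`/`RefCell`,
// so reporting only needs a shared reference.
struct Reporter {
    errors: Cell<usize>,
    messages: RefCell<Vec<String>>,
}

impl Reporter {
    fn new() -> Self {
        Reporter {
            errors: Cell::new(0),
            messages: RefCell::new(Vec::new()),
        }
    }

    // Note `&self`, not `&mut self`.
    fn report(&self, msg: &str) {
        self.errors.set(self.errors.get() + 1);
        self.messages.borrow_mut().push(msg.to_string());
    }

    fn error_count(&self) -> usize {
        self.errors.get()
    }

    fn messages(&self) -> Vec<String> {
        self.messages.borrow().clone()
    }
}

// A stand-in stage that drops negative tokens and reports each one.
fn check_stage(stage: &str, inputs: &[i32], reporter: &Reporter) -> Vec<i32> {
    inputs
        .iter()
        .copied()
        .filter(|n| {
            if *n < 0 {
                reporter.report(&format!("{}: negative token {}", stage, n));
                false
            } else {
                true
            }
        })
        .collect()
}
```

The trade-off is hidden mutable state in an otherwise streaming design, which is why later commits in this log revert the approach once reporting happens in a single place.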
Mike Gerwitz	a6e72b87f7	tamer: tamec: Extract compilation from main Another baby step. The small commits are intended to allow comprehension of what changes when looking at the diffs. This also removes a comment stating that errors do not fail compilation, since they most certainly do. DEV-13158	2022-10-26 12:32:51 -04:00
Mike Gerwitz	20ea83af1a	tamer: tamec: Extract source reading and writing This begins refactoring the lowering pipeline to begin to make abstraction boundaries obvious. The lowering pipeline is the backbone of the system, and so it needs to become clear and self-documenting, which will take a little bit of work. DEV-13158	2022-10-26 12:32:51 -04:00
Mike Gerwitz	3456bd593a	tamer: tamec: Fail with non-zero status if any NIR parsing errors This is a quick-and-dirty change. The lowering pipeline needs a proper abstraction, but I'm about to be on vacation at the end of the week and would like to get NIR->AIR lowering started before I consider that abstraction further, so this will do for now. NIR parsing has been tested in production without failing for over a week. DEV-7145	2022-09-19 10:11:47 -04:00
Mike Gerwitz	419b24f251	tamer: Introduce NIR (accepting only) This introduces NIR, but only as an accepting grammar; it doesn't yet emit the NIR IR, beyond TODOs. This modifies `tamec` to, while copying XIR, also attempt to lower NIR to produce parser errors, if any. It does not yet fail compilation, as I just want to be cautious and observe that everything's working properly for a little while as people use it, before I potentially break builds. This is the culmination of months of supporting effort. The NIR grammar is derived from our existing TAME sources internally, which I use for now as a test case until I introduce test cases directly into TAMER later on (I'd do it now, if I hadn't spent so much time on this; I'll start introducing tests as I begin emitting NIR tokens). This is capable of fully parsing our largest system with >900 packages, as well as `core`. `tamec`'s lowering is a mess; that'll be cleaned up in future commits. The same can be said about `tameld`. NIR's grammar has some initial documentation, but this will improve over time as well. The generated docs still need some improvement, too, especially with generated identifiers; I just want to get this out here for testing. DEV-7145	2022-08-29 15:52:04 -04:00
Mike Gerwitz	8d92667388	tamer: Integrate xir::reader as a parser in the lowering pipeline This allows `XmlXirReader` to be used in a `Lower` operation, just as everything else, bringing me one step closer to a pipeline that can be concisely represented; this is finally beginning to unify in a clear way, though it is still a bit of a mess. This causes `XmlXirReader` to _act_ like a `parse::Parser` in that it yields a `ParsedResult`, but it does not use `parse::Parser` itself; that was the _original_ plan: convert it into a `ParseState` where `XmlXirReader` became a context, and force `Parser` to yield by feeding it a stream of tokens with `repeat`, but that ended up performing poorly relative to this change. I did some investigation, which I might write about in the future, but for now, this solution works just fine. DEV-7145	2022-06-02 10:30:44 -04:00
Mike Gerwitz	f218c452b9	tamer: iter::trip: Flatten Result The `*_iter_while_ok` functions now compose like monads, flattening `Result` at each step and drastically simplifying handling of error types. This also removes the bunch of `?`s at the end of the expression, and allows me to use `?` within the callback itself. I had originally not used `Result` as the return type of the callback because I was not entirely sure how I was going to use them, but it's now clear that I _always_ use `Result` as the return type, and so there's no use in trying to be too accommodating; it can always change in the future. This is desirable not just for cleanup, but because trying to refactor `asg_builder` into a pair of `Parser`s is really messy to chain without flattening, especially given some state that has to leak temporarily to the caller. More on that in a future commit. DEV-11864	2022-05-20 16:08:16 -04:00
Mike Gerwitz	0281dfdf0d	tamer: Remove wip-frontends feature flag We want the new system to be used so that we can start catching any problems that may arise. Further changes will be flagged as necessary. DEV-10936	2022-05-04 09:37:10 -04:00
Mike Gerwitz	1ad2fb1dc8	Copyright year update 2022 RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no "Group"), but I'm not sure if the legal name has been changed yet or not, so I'll wait on that.	2022-05-03 14:14:29 -04:00
Mike Gerwitz	3dbab881da	tamer: diagnose::report: Produce Report object Rather than writing to the provided `Write` object, this produces a `Report` object. While a lifetime still exists for the diagnostic data (labels, specifically), I was able to remove the other lifetime resulting from `ResolvedSpan` by transferring ownership of the data to the `Report` itself. Once actual source lines are integrated shortly, `Report` will include those as well. This has been a tedious process, but it's coming together. Hopefully these commits documenting the progressive and ugly refactoring are found useful by some reader in the future. DEV-12151	2022-04-27 15:00:30 -04:00
Mike Gerwitz	a22e8e79f7	tamer: diagnose: Integrate resolver for source lines This does not yet resolve columns, and omits the length of the span, but it's starting to come together. This is particularly exciting for me to see because I've been wanting line numbers in TAME error messages for over a decade. DEV-10935	2022-04-21 12:34:17 -04:00
Mike Gerwitz	725dc3fb54	tamer: tamec: Use diagnostic system for errors This is a POC, minimal-effort integration that also creates the TamecError sum type analogous to TameldError. I'll work on reducing the boilerplate in the future. A note regarding the type and boilerplate vs. dynamic dispatch, for any future readers: the purpose of this is to be explicit about the error types so that the system is self-documenting and it forces an understanding of its error conditions. `Box<dyn Error>` is basically "eh idk anything can happen!", which is not what I'm interested in having. DEV-10935	2022-04-20 09:42:11 -04:00
Mike Gerwitz	a1a4ad3e8e	tamer: Introduce context into XirReader tamec and tameld will now both introduce a `Context` to XIR, which will use it to create spans. Here's an example of an error, now that it's all working well together: $ target/release/tameld --emit xmle -o /dev/null path/to/package.xmlo error: invalid preproc:sym/@dim `9` at [/../path/to/package.xmlo offset 1175451-1175452] A future task will make this human-readable by producing line and column numbers, and perhaps even a snippet (if not now, then eventually). It's exciting to see this coming together finally. DEV-10934	2022-04-08 16:16:23 -04:00
Mike Gerwitz	99aacaf7ca	tamer: tamec: Replace copy with XIR parsing/writing When wip-frontends is on, this will parse the input file using XIR and then immediately output it again. This makes the necessary changes to be able to read every source file we have in our largest project, such that the output is identical after having been formatted with `xmllint --format -` (there are differences because e.g. whitespace between attributes is not yet maintained). This is performant too, with times remaining essentially identical despite the additional work. DEV-10413	2022-04-07 12:13:49 -04:00
Mike Gerwitz	ca6ef3ed36	tamer: frontend: Begin basic XML parsing The first step in the process is to emit the raw XML events that can then be immediately output again to echo the results into another file. This will then allow us to begin parsing the input incrementally, and begin to morph the output into a real `xmlo` file.	2021-07-27 00:37:13 -04:00
Mike Gerwitz	fb8422d670	tamer: Initial frontend concept This introduces the beginnings of frontends for TAMER, gated behind a `wip-features` flag. This will be introduced in stages: 1. Replace the existing copy with a parser-based copy (echo back out the tokens), when the flag is on. 2. Begin to parse portions of the source, augmenting the output xmlo (xmli at the moment). The XSLT-based compiler will be modified to skip compilation steps as necessary. As portions of the compilation are implemented in TAMER, they'll be placed behind their own feature flags and stabilized, which will incrementally remove the compilation steps from the XSLT-based system. The result should be substantial incremental performance improvements. Short-term, the priorities for loading identifiers into an IR are (though the order may change): 1. Echo 2. Imports 3. Extern declarations. 4. Simple identifiers (e.g. param, const, template, etc). 5. Classifications. 6. Documentation expressions. 7. Calculation expressions. 8. Template applications. 9. Template definitions. 10. Inline templates. After each of those is done, the resulting xmlo (xmli) will have fully reconstructed the source document from the IR produced during parsing.	2021-07-23 22:24:08 -04:00
Mike Gerwitz	2e50af1220	Copyright year update 2021	2021-07-22 15:00:15 -04:00
Joseph Frazer	2c587e2d9d	[DEV-7147] Add "tamec" executable Add a stub executable that will eventually become a full-featured TAME compiler. The first implementation will only copy the source file to an intermediary file that will be compiled by the XSLT compiler.	2020-04-09 09:46:46 -04:00

49 Commits (61d556c89e587628f83f6acdf0db8f12788bf071)