2019-11-27 09:18:17 -05:00
|
|
|
// Proof-of-concept TAME linker
|
|
|
|
//
|
2023-01-17 23:09:25 -05:00
|
|
|
// Copyright (C) 2014-2023 Ryan Specialty, LLC.
|
2020-03-06 11:05:18 -05:00
|
|
|
//
|
|
|
|
// This file is part of TAME.
|
2019-11-27 09:18:17 -05:00
|
|
|
//
|
|
|
|
// This program is free software: you can redistribute it and/or modify
|
|
|
|
// it under the terms of the GNU General Public License as published by
|
|
|
|
// the Free Software Foundation, either version 3 of the License, or
|
|
|
|
// (at your option) any later version.
|
|
|
|
//
|
|
|
|
// This program is distributed in the hope that it will be useful,
|
|
|
|
// but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
// GNU General Public License for more details.
|
|
|
|
//
|
|
|
|
// You should have received a copy of the GNU General Public License
|
|
|
|
// along with this program. If not, see <http://www.gnu.org/licenses/>.
|
|
|
|
|
2021-10-11 23:54:24 -04:00
|
|
|
//! **This contains the remaining portions of the proof-of-concept linker.**
|
|
|
|
//! It is feature-complete and just needs final refactoring.
|
2019-11-27 09:18:17 -05:00
|
|
|
|
2021-10-12 09:42:09 -04:00
|
|
|
use super::xmle::{
|
|
|
|
lower::{sort, SortError},
|
|
|
|
xir::lower_iter,
|
2021-10-14 12:37:32 -04:00
|
|
|
XmleSections,
|
2020-04-06 22:07:39 -04:00
|
|
|
};
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
use crate::{
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
asg::{
|
2022-12-13 14:36:38 -05:00
|
|
|
air::{Air, AirAggregate},
|
2022-12-15 12:07:58 -05:00
|
|
|
Asg, AsgError, DefaultAsg, Object,
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
},
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
diagnose::{AnnotatedSpan, Diagnostic},
|
2021-10-14 12:37:32 -04:00
|
|
|
fs::{
|
|
|
|
Filesystem, FsCanonicalizer, PathFile, VisitOnceFile,
|
|
|
|
VisitOnceFilesystem,
|
|
|
|
},
|
|
|
|
ld::xmle::Sections,
|
2022-06-02 13:23:41 -04:00
|
|
|
obj::xmlo::{
|
|
|
|
XmloAirContext, XmloAirError, XmloError, XmloReader, XmloToAir,
|
|
|
|
XmloToken,
|
|
|
|
},
|
2022-10-26 11:29:26 -04:00
|
|
|
parse::{
|
|
|
|
FinalizeError, Lower, ParseError, Parsed, ParsedObject, UnknownToken,
|
|
|
|
},
|
tamer: asg: Track roots on graph
Previously, since the graph contained only identifiers, discovered roots
were stored in a separate vector and exposed to the caller. This not only
leaked details, but added complexity; this was left over from the
refactoring of the proof-of-concept linker some time ago.
This moves the root management into the ASG itself, mostly, with one item
being left over for now in the asg_builder (eligibility classifications).
There are two roots that were added automatically:
- __yield
- __worksheet
The former has been removed and is now expected to be explicitly mapped in
the return map, which is now enforced with an extern in `core/base`. This
is still special, in the sense that it is explicitly referenced by the
generated code, but there's nothing inherently special about it and I'll
continue to generalize it into oblivion in the future, such that the final
yield is just a convention.
`__worksheet` is the only symbol of type `IdentKind::Worksheet`, and so that
was generalized just as the meta and map entries were.
The goal in the future will be to have this more under the control of the
source language, and to consolodate individual roots under packages, so that
the _actual_ roots are few.
As far as the actual ASG goes: this introduces a single root node that is
used as the sole reference for reachability analysis and topological
sorting. The edges of that root node replace the vector that was removed.
DEV-11864
2022-05-17 10:42:05 -04:00
|
|
|
sym::{GlobalSymbolResolve, SymbolId},
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
xir::{
|
tamer: Xirf::Text refinement
This teaches XIRF to optionally refine Text into RefinedText, which
determines whether the given SymbolId represents entirely whitespace.
This is something I've been putting off for some time, but now that I'm
parsing source language for NIR, it is necessary, in that we can only permit
whitespace Text nodes in certain contexts.
The idea is to capture the most common whitespace as preinterned
symbols. Note that this heuristic ought to be determined from scanning a
codebase, which I haven't done yet; this is just an initial list.
The fallback is to look up the string associated with the SymbolId and
perform a linear scan, aborting on the first non-whitespace character. This
combination of checks should be sufficiently performant for now considering
that this is only being run on source files, which really are not all that
large. (They become large when template-expanded.) I'll optimize further
if I notice it show up during profiling.
This also frees XIR itself from being concerned by Whitespace. Initially I
had used quick-xml's whitespace trimming, but it messed up my span
calculations, and those were a pain in the ass to implement to begin with,
since I had to resort to pointer arithmetic. I'd rather avoid tweaking it.
tameld will not check for whitespace, since it's not important---xmlo files,
if malformed, are the fault of the compiler; we can ignore text nodes except
in the context of code fragments, where they are never whitespace (unless
that's also a compiler bug).
Onward and yonward.
DEV-7145
2022-07-27 15:49:38 -04:00
|
|
|
flat::{Text, XirToXirf, XirToXirfError, XirfToken},
|
2022-06-02 13:41:24 -04:00
|
|
|
reader::XmlXirReader,
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
writer::{Error as XirWriterError, XmlWriter},
|
|
|
|
DefaultEscaper, Error as XirError, Escaper, Token as XirToken,
|
|
|
|
},
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
};
|
2020-04-07 10:50:57 -04:00
|
|
|
use fxhash::FxBuildHasher;
|
2020-04-30 14:33:10 -04:00
|
|
|
use petgraph_graphml::GraphMl;
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
use std::{
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
error::Error,
|
|
|
|
fmt::{self, Display},
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
fs,
|
|
|
|
io::{self, BufReader, BufWriter, Write},
|
tamer: Integrate clippy
This invokes clippy as part of `make check` now, which I had previously
avoided doing (I'll elaborate on that below).
This commit represents the changes needed to resolve all the warnings
presented by clippy. Many changes have been made where I find the lints to
be useful and agreeable, but there are a number of lints, rationalized in
`src/lib.rs`, where I found the lints to be disagreeable. I have provided
rationale, primarily for those wondering why I desire to deviate from the
default lints, though it does feel backward to rationalize why certain lints
ought to be applied (the reverse should be true).
With that said, this did catch some legitimage issues, and it was also
helpful in getting some older code up-to-date with new language additions
that perhaps I used in new code but hadn't gone back and updated old code
for. My goal was to get clippy working without errors so that, in the
future, when others get into TAMER and are still getting used to Rust,
clippy is able to help guide them in the right direction.
One of the reasons I went without clippy for so long (though I admittedly
forgot I wasn't using it for a period of time) was because there were a
number of suggestions that I found disagreeable, and I didn't take the time
to go through them and determine what I wanted to follow. Furthermore, it
was hard to make that judgment when I was new to the language and lacked
the necessary experience to do so.
One thing I would like to comment further on is the use of `format!` with
`expect`, which is also what the diagnostic system convenience methods
do (which clippy does not cover). Because of all the work I've done trying
to understand Rust and looking at disassemblies and seeing what it
optimizes, I falsely assumed that Rust would convert such things into
conditionals in my otherwise-pure code...but apparently that's not the case,
when `format!` is involved.
I noticed that, after making the suggested fix with `get_ident`, Rust
proceeded to then inline it into each call site and then apply further
optimizations. It was also previously invoking the thread lock (for the
interner) unconditionally and invoking the `Display` implementation. That
is not at all what I intended for, despite knowing the eager semantics of
function calls in Rust.
Anyway, possibly more to come on that, I'm just tired of typing and need to
move on. I'll be returning to investigate further diagnostic messages soon.
2023-01-12 10:46:48 -05:00
|
|
|
path::Path,
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
};
|
2019-11-27 09:18:17 -05:00
|
|
|
|
2022-05-12 15:44:32 -04:00
|
|
|
type LinkerAsg = DefaultAsg;
|
2020-03-06 12:44:22 -05:00
|
|
|
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
pub fn xmle(package_path: &str, output: &str) -> Result<(), TameldError> {
|
2020-04-06 16:13:32 -04:00
|
|
|
let mut fs = VisitOnceFilesystem::new();
|
2021-11-12 16:07:57 -05:00
|
|
|
let escaper = DefaultEscaper::default();
|
2019-11-27 09:18:17 -05:00
|
|
|
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
let (depgraph, state) = load_xmlo(
|
2020-04-07 10:49:21 -04:00
|
|
|
package_path,
|
2020-04-06 16:13:32 -04:00
|
|
|
&mut fs,
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
LinkerAsg::with_capacity(65536, 65536),
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
&escaper,
|
2022-06-02 13:23:41 -04:00
|
|
|
XmloAirContext::default(),
|
2020-04-07 10:43:54 -04:00
|
|
|
)?;
|
|
|
|
|
2022-06-02 13:23:41 -04:00
|
|
|
let XmloAirContext {
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
prog_name: name,
|
2020-04-07 10:43:54 -04:00
|
|
|
relroot,
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
..
|
2020-04-07 10:43:54 -04:00
|
|
|
} = state;
|
2019-12-01 01:17:37 -05:00
|
|
|
|
2022-12-16 15:02:57 -05:00
|
|
|
let sorted = sort(&depgraph, Sections::new())?;
|
2019-11-27 09:18:17 -05:00
|
|
|
|
2020-01-14 16:26:36 -05:00
|
|
|
output_xmle(
|
2021-10-14 12:37:32 -04:00
|
|
|
sorted,
|
2020-01-14 16:26:36 -05:00
|
|
|
name.expect("missing root package name"),
|
|
|
|
relroot.expect("missing root package relroot"),
|
2020-03-04 15:31:20 -05:00
|
|
|
output,
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
&escaper,
|
2020-01-14 16:26:36 -05:00
|
|
|
)?;
|
2019-11-27 09:18:17 -05:00
|
|
|
|
|
|
|
Ok(())
|
|
|
|
}
|
|
|
|
|
2022-05-19 12:48:43 -04:00
|
|
|
// TODO: This needs to be further generalized.
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
pub fn graphml(package_path: &str, output: &str) -> Result<(), TameldError> {
|
2020-04-30 14:33:10 -04:00
|
|
|
let mut fs = VisitOnceFilesystem::new();
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
let escaper = DefaultEscaper::default();
|
2020-04-30 14:33:10 -04:00
|
|
|
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
let (depgraph, _) = load_xmlo(
|
2020-04-30 14:33:10 -04:00
|
|
|
package_path,
|
|
|
|
&mut fs,
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
LinkerAsg::with_capacity(65536, 65536),
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
&escaper,
|
2022-06-02 13:23:41 -04:00
|
|
|
XmloAirContext::default(),
|
2020-04-30 14:33:10 -04:00
|
|
|
)?;
|
|
|
|
|
|
|
|
// if we move away from petgraph, we will need to abstract this away
|
|
|
|
let g = depgraph.into_inner();
|
|
|
|
let graphml =
|
|
|
|
GraphMl::new(&g)
|
|
|
|
.pretty_print(true)
|
|
|
|
.export_node_weights(Box::new(|node| {
|
2023-01-10 15:06:24 -05:00
|
|
|
let (name, kind, generated) = match node.get() {
|
|
|
|
Object::Ident(n) => {
|
2020-04-30 14:33:10 -04:00
|
|
|
let generated = match n.src() {
|
|
|
|
Some(src) => src.generated,
|
|
|
|
None => false,
|
|
|
|
};
|
|
|
|
|
|
|
|
(
|
2022-12-15 12:07:58 -05:00
|
|
|
n.name().symbol().lookup_str().into(),
|
2021-09-29 23:18:23 -04:00
|
|
|
n.kind().unwrap().as_sym(),
|
2020-04-30 14:33:10 -04:00
|
|
|
format!("{}", generated),
|
|
|
|
)
|
|
|
|
}
|
2022-05-19 12:48:43 -04:00
|
|
|
// TODO: We want these filtered.
|
2023-01-10 15:06:24 -05:00
|
|
|
_ => (
|
2022-05-19 12:48:43 -04:00
|
|
|
String::from("non-ident"),
|
|
|
|
"non-ident".into(),
|
|
|
|
"false".into(),
|
|
|
|
),
|
2020-04-30 14:33:10 -04:00
|
|
|
};
|
|
|
|
|
|
|
|
vec![
|
|
|
|
("label".into(), name.into()),
|
2021-10-11 12:58:48 -04:00
|
|
|
("kind".into(), kind.lookup_str().into()),
|
2020-04-30 14:33:10 -04:00
|
|
|
("generated".into(), generated.into()),
|
|
|
|
]
|
|
|
|
}));
|
|
|
|
|
|
|
|
fs::write(output, graphml.to_string())?;
|
|
|
|
|
|
|
|
Ok(())
|
|
|
|
}
|
|
|
|
|
tamer: Integrate clippy
This invokes clippy as part of `make check` now, which I had previously
avoided doing (I'll elaborate on that below).
This commit represents the changes needed to resolve all the warnings
presented by clippy. Many changes have been made where I find the lints to
be useful and agreeable, but there are a number of lints, rationalized in
`src/lib.rs`, where I found the lints to be disagreeable. I have provided
rationale, primarily for those wondering why I desire to deviate from the
default lints, though it does feel backward to rationalize why certain lints
ought to be applied (the reverse should be true).
With that said, this did catch some legitimage issues, and it was also
helpful in getting some older code up-to-date with new language additions
that perhaps I used in new code but hadn't gone back and updated old code
for. My goal was to get clippy working without errors so that, in the
future, when others get into TAMER and are still getting used to Rust,
clippy is able to help guide them in the right direction.
One of the reasons I went without clippy for so long (though I admittedly
forgot I wasn't using it for a period of time) was because there were a
number of suggestions that I found disagreeable, and I didn't take the time
to go through them and determine what I wanted to follow. Furthermore, it
was hard to make that judgment when I was new to the language and lacked
the necessary experience to do so.
One thing I would like to comment further on is the use of `format!` with
`expect`, which is also what the diagnostic system convenience methods
do (which clippy does not cover). Because of all the work I've done trying
to understand Rust and looking at disassemblies and seeing what it
optimizes, I falsely assumed that Rust would convert such things into
conditionals in my otherwise-pure code...but apparently that's not the case,
when `format!` is involved.
I noticed that, after making the suggested fix with `get_ident`, Rust
proceeded to then inline it into each call site and then apply further
optimizations. It was also previously invoking the thread lock (for the
interner) unconditionally and invoking the `Display` implementation. That
is not at all what I intended for, despite knowing the eager semantics of
function calls in Rust.
Anyway, possibly more to come on that, I'm just tired of typing and need to
move on. I'll be returning to investigate further diagnostic messages soon.
2023-01-12 10:46:48 -05:00
|
|
|
fn load_xmlo<P: AsRef<Path>, S: Escaper>(
|
2020-04-07 11:40:19 -04:00
|
|
|
path_str: P,
|
|
|
|
fs: &mut VisitOnceFilesystem<FsCanonicalizer, FxBuildHasher>,
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
asg: Asg,
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
escaper: &S,
|
2022-06-02 13:23:41 -04:00
|
|
|
state: XmloAirContext,
|
|
|
|
) -> Result<(Asg, XmloAirContext), TameldError> {
|
2022-04-08 16:16:23 -04:00
|
|
|
let PathFile(path, file, ctx): PathFile<BufReader<fs::File>> =
|
|
|
|
match fs.open(path_str)? {
|
|
|
|
VisitOnceFile::FirstVisit(file) => file,
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
VisitOnceFile::Visited => return Ok((asg, state)),
|
2022-04-08 16:16:23 -04:00
|
|
|
};
|
2020-04-06 22:07:39 -04:00
|
|
|
|
2022-10-25 23:35:57 -04:00
|
|
|
let src = &mut XmlXirReader::new(file, escaper, ctx)
|
|
|
|
.map(|result| result.map_err(TameldError::from));
|
|
|
|
|
2022-04-11 16:08:50 -04:00
|
|
|
// TODO: This entire block is a WIP and will be incrementally
|
|
|
|
// abstracted away.
|
2022-06-02 13:26:46 -04:00
|
|
|
let (mut asg, mut state) = Lower::<
|
|
|
|
ParsedObject<XirToken, XirError>,
|
2022-08-01 13:58:51 -04:00
|
|
|
XirToXirf<4, Text>,
|
2022-10-25 23:35:57 -04:00
|
|
|
_,
|
|
|
|
>::lower(src, |toks| {
|
|
|
|
Lower::<XirToXirf<4, Text>, XmloReader, _>::lower(toks, |xmlo| {
|
|
|
|
let mut iter = xmlo.scan(false, |st, rtok| match st {
|
|
|
|
true => None,
|
|
|
|
false => {
|
|
|
|
*st =
|
|
|
|
matches!(rtok, Ok(Parsed::Object(XmloToken::Eoh(..))));
|
|
|
|
Some(rtok)
|
|
|
|
}
|
|
|
|
});
|
|
|
|
|
|
|
|
Lower::<XmloReader, XmloToAir, _>::lower_with_context(
|
|
|
|
&mut iter,
|
|
|
|
state,
|
|
|
|
|air| {
|
|
|
|
let (_, asg) =
|
|
|
|
Lower::<XmloToAir, AirAggregate, _>::lower_with_context(
|
2022-06-02 13:23:41 -04:00
|
|
|
air,
|
|
|
|
asg,
|
|
|
|
|end| {
|
|
|
|
end.fold(
|
|
|
|
Result::<(), TameldError>::Ok(()),
|
|
|
|
|x, _| x,
|
|
|
|
)
|
|
|
|
},
|
|
|
|
)?;
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
|
2022-10-26 11:29:26 -04:00
|
|
|
Ok::<_, TameldError>(asg)
|
2022-10-25 23:35:57 -04:00
|
|
|
},
|
|
|
|
)
|
|
|
|
})
|
|
|
|
})?;
|
2020-01-12 22:59:16 -05:00
|
|
|
|
tamer: Integrate clippy
This invokes clippy as part of `make check` now, which I had previously
avoided doing (I'll elaborate on that below).
This commit represents the changes needed to resolve all the warnings
presented by clippy. Many changes have been made where I find the lints to
be useful and agreeable, but there are a number of lints, rationalized in
`src/lib.rs`, where I found the lints to be disagreeable. I have provided
rationale, primarily for those wondering why I desire to deviate from the
default lints, though it does feel backward to rationalize why certain lints
ought to be applied (the reverse should be true).
With that said, this did catch some legitimage issues, and it was also
helpful in getting some older code up-to-date with new language additions
that perhaps I used in new code but hadn't gone back and updated old code
for. My goal was to get clippy working without errors so that, in the
future, when others get into TAMER and are still getting used to Rust,
clippy is able to help guide them in the right direction.
One of the reasons I went without clippy for so long (though I admittedly
forgot I wasn't using it for a period of time) was because there were a
number of suggestions that I found disagreeable, and I didn't take the time
to go through them and determine what I wanted to follow. Furthermore, it
was hard to make that judgment when I was new to the language and lacked
the necessary experience to do so.
One thing I would like to comment further on is the use of `format!` with
`expect`, which is also what the diagnostic system convenience methods
do (which clippy does not cover). Because of all the work I've done trying
to understand Rust and looking at disassemblies and seeing what it
optimizes, I falsely assumed that Rust would convert such things into
conditionals in my otherwise-pure code...but apparently that's not the case,
when `format!` is involved.
I noticed that, after making the suggested fix with `get_ident`, Rust
proceeded to then inline it into each call site and then apply further
optimizations. It was also previously invoking the thread lock (for the
interner) unconditionally and invoking the `Display` implementation. That
is not at all what I intended for, despite knowing the eager semantics of
function calls in Rust.
Anyway, possibly more to come on that, I'm just tired of typing and need to
move on. I'll be returning to investigate further diagnostic messages soon.
2023-01-12 10:46:48 -05:00
|
|
|
let mut dir = path;
|
2019-11-27 09:18:17 -05:00
|
|
|
dir.pop();
|
|
|
|
|
2020-04-07 10:43:54 -04:00
|
|
|
let found = state.found.take().unwrap_or_default();
|
|
|
|
|
2019-11-27 09:18:17 -05:00
|
|
|
for relpath in found.iter() {
|
|
|
|
let mut path_buf = dir.clone();
|
tamer: Integrate clippy
This invokes clippy as part of `make check` now, which I had previously
avoided doing (I'll elaborate on that below).
This commit represents the changes needed to resolve all the warnings
presented by clippy. Many changes have been made where I find the lints to
be useful and agreeable, but there are a number of lints, rationalized in
`src/lib.rs`, where I found the lints to be disagreeable. I have provided
rationale, primarily for those wondering why I desire to deviate from the
default lints, though it does feel backward to rationalize why certain lints
ought to be applied (the reverse should be true).
With that said, this did catch some legitimage issues, and it was also
helpful in getting some older code up-to-date with new language additions
that perhaps I used in new code but hadn't gone back and updated old code
for. My goal was to get clippy working without errors so that, in the
future, when others get into TAMER and are still getting used to Rust,
clippy is able to help guide them in the right direction.
One of the reasons I went without clippy for so long (though I admittedly
forgot I wasn't using it for a period of time) was because there were a
number of suggestions that I found disagreeable, and I didn't take the time
to go through them and determine what I wanted to follow. Furthermore, it
was hard to make that judgment when I was new to the language and lacked
the necessary experience to do so.
One thing I would like to comment further on is the use of `format!` with
`expect`, which is also what the diagnostic system convenience methods
do (which clippy does not cover). Because of all the work I've done trying
to understand Rust and looking at disassemblies and seeing what it
optimizes, I falsely assumed that Rust would convert such things into
conditionals in my otherwise-pure code...but apparently that's not the case,
when `format!` is involved.
I noticed that, after making the suggested fix with `get_ident`, Rust
proceeded to then inline it into each call site and then apply further
optimizations. It was also previously invoking the thread lock (for the
interner) unconditionally and invoking the `Display` implementation. That
is not at all what I intended for, despite knowing the eager semantics of
function calls in Rust.
Anyway, possibly more to come on that, I'm just tired of typing and need to
move on. I'll be returning to investigate further diagnostic messages soon.
2023-01-12 10:46:48 -05:00
|
|
|
path_buf.push(relpath.lookup_str());
|
2019-11-27 09:18:17 -05:00
|
|
|
path_buf.set_extension("xmlo");
|
|
|
|
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
(asg, state) = load_xmlo(path_buf, fs, asg, escaper, state)?;
|
2019-11-27 09:18:17 -05:00
|
|
|
}
|
|
|
|
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
Ok((asg, state))
|
2019-11-27 09:18:17 -05:00
|
|
|
}
|
|
|
|
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
fn output_xmle<'a, X: XmleSections<'a>, S: Escaper>(
|
|
|
|
sorted: X,
|
2021-09-23 14:52:53 -04:00
|
|
|
name: SymbolId,
|
2021-10-11 16:00:19 -04:00
|
|
|
relroot: SymbolId,
|
2020-03-04 15:31:20 -05:00
|
|
|
output: &str,
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
escaper: &S,
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
) -> Result<(), TameldError> {
|
2020-03-04 15:31:20 -05:00
|
|
|
let file = fs::File::create(output)?;
|
2021-10-08 21:57:51 -04:00
|
|
|
let mut buf = BufWriter::new(file);
|
2021-09-28 14:52:31 -04:00
|
|
|
|
tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.
Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.
Given that, we need only unescape on read and escape on write. This is
customary, so why didn't I do that to begin with?
The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming. However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around. If we share the Escaper between _all_
readers and the writer, the result is that
1. Duplicate strings between source files and object files (many of which
are read by both the linker and compiler) avoid re-unescaping; and
2. Writers can use this cache to avoid re-escaping when we've already seen
the escaped variant of the string during read.
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.
DEV-11081
2021-11-12 13:59:14 -05:00
|
|
|
lower_iter(sorted, name, relroot).write(
|
|
|
|
&mut buf,
|
|
|
|
Default::default(),
|
|
|
|
escaper,
|
|
|
|
)?;
|
2021-10-08 21:57:51 -04:00
|
|
|
buf.flush()?;
|
2020-01-13 16:41:06 -05:00
|
|
|
|
|
|
|
Ok(())
|
2019-11-27 09:18:17 -05:00
|
|
|
}
|
|
|
|
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
// TODO: This, like everything else here, needs a home.
|
|
|
|
// TODO: Better encapsulation for `*ParseError` types.
|
|
|
|
/// Linker (`tameld`) error.
|
|
|
|
///
|
|
|
|
/// This represents the aggregation of all possible errors that can occur
|
|
|
|
/// during link-time.
|
|
|
|
/// This cannot include panics,
|
|
|
|
/// but efforts have been made to reduce panics to situations that
|
|
|
|
/// represent the equivalent of assertions.
|
|
|
|
#[derive(Debug)]
|
|
|
|
pub enum TameldError {
|
|
|
|
Io(io::Error),
|
|
|
|
SortError(SortError),
|
2022-06-02 10:30:44 -04:00
|
|
|
XirParseError(ParseError<UnknownToken, XirError>),
|
2022-06-02 13:41:24 -04:00
|
|
|
XirfParseError(ParseError<XirToken, XirToXirfError>),
|
tamer: Xirf::Text refinement
This teaches XIRF to optionally refine Text into RefinedText, which
determines whether the given SymbolId represents entirely whitespace.
This is something I've been putting off for some time, but now that I'm
parsing source language for NIR, it is necessary, in that we can only permit
whitespace Text nodes in certain contexts.
The idea is to capture the most common whitespace as preinterned
symbols. Note that this heuristic ought to be determined from scanning a
codebase, which I haven't done yet; this is just an initial list.
The fallback is to look up the string associated with the SymbolId and
perform a linear scan, aborting on the first non-whitespace character. This
combination of checks should be sufficiently performant for now considering
that this is only being run on source files, which really are not all that
large. (They become large when template-expanded.) I'll optimize further
if I notice it show up during profiling.
This also frees XIR itself from being concerned by Whitespace. Initially I
had used quick-xml's whitespace trimming, but it messed up my span
calculations, and those were a pain in the ass to implement to begin with,
since I had to resort to pointer arithmetic. I'd rather avoid tweaking it.
tameld will not check for whitespace, since it's not important---xmlo files,
if malformed, are the fault of the compiler; we can ignore text nodes except
in the context of code fragments, where they are never whitespace (unless
that's also a compiler bug).
Onward and yonward.
DEV-7145
2022-07-27 15:49:38 -04:00
|
|
|
XmloParseError(ParseError<XirfToken<Text>, XmloError>),
|
2022-06-02 13:23:41 -04:00
|
|
|
XmloLowerError(ParseError<XmloToken, XmloAirError>),
|
2022-12-13 14:36:38 -05:00
|
|
|
AirLowerError(ParseError<Air, AsgError>),
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
XirWriterError(XirWriterError),
|
2022-10-26 11:29:26 -04:00
|
|
|
FinalizeError(FinalizeError),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
Fmt(fmt::Error),
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
}
|
|
|
|
|
|
|
|
impl From<io::Error> for TameldError {
|
|
|
|
fn from(e: io::Error) -> Self {
|
|
|
|
Self::Io(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
impl From<SortError> for TameldError {
|
|
|
|
fn from(e: SortError) -> Self {
|
|
|
|
Self::SortError(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2022-06-02 10:30:44 -04:00
|
|
|
impl From<ParseError<UnknownToken, XirError>> for TameldError {
|
|
|
|
fn from(e: ParseError<UnknownToken, XirError>) -> Self {
|
|
|
|
Self::XirParseError(e)
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
tamer: Xirf::Text refinement
This teaches XIRF to optionally refine Text into RefinedText, which
determines whether the given SymbolId represents entirely whitespace.
This is something I've been putting off for some time, but now that I'm
parsing source language for NIR, it is necessary, in that we can only permit
whitespace Text nodes in certain contexts.
The idea is to capture the most common whitespace as preinterned
symbols. Note that this heuristic ought to be determined from scanning a
codebase, which I haven't done yet; this is just an initial list.
The fallback is to look up the string associated with the SymbolId and
perform a linear scan, aborting on the first non-whitespace character. This
combination of checks should be sufficiently performant for now considering
that this is only being run on source files, which really are not all that
large. (They become large when template-expanded.) I'll optimize further
if I notice it show up during profiling.
This also frees XIR itself from being concerned by Whitespace. Initially I
had used quick-xml's whitespace trimming, but it messed up my span
calculations, and those were a pain in the ass to implement to begin with,
since I had to resort to pointer arithmetic. I'd rather avoid tweaking it.
tameld will not check for whitespace, since it's not important---xmlo files,
if malformed, are the fault of the compiler; we can ignore text nodes except
in the context of code fragments, where they are never whitespace (unless
that's also a compiler bug).
Onward and yonward.
DEV-7145
2022-07-27 15:49:38 -04:00
|
|
|
impl From<ParseError<XirfToken<Text>, XmloError>> for TameldError {
|
|
|
|
fn from(e: ParseError<XirfToken<Text>, XmloError>) -> Self {
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
Self::XmloParseError(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2022-06-02 13:41:24 -04:00
|
|
|
impl From<ParseError<XirToken, XirToXirfError>> for TameldError {
|
|
|
|
fn from(e: ParseError<XirToken, XirToXirfError>) -> Self {
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
Self::XirfParseError(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2022-06-02 13:23:41 -04:00
|
|
|
impl From<ParseError<XmloToken, XmloAirError>> for TameldError {
|
|
|
|
fn from(e: ParseError<XmloToken, XmloAirError>) -> Self {
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
Self::XmloLowerError(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2022-12-13 14:36:38 -05:00
|
|
|
impl From<ParseError<Air, AsgError>> for TameldError {
|
|
|
|
fn from(e: ParseError<Air, AsgError>) -> Self {
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
Self::AirLowerError(e)
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2022-10-26 11:29:26 -04:00
|
|
|
impl From<FinalizeError> for TameldError {
|
|
|
|
fn from(e: FinalizeError) -> Self {
|
|
|
|
Self::FinalizeError(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
impl From<XirWriterError> for TameldError {
|
|
|
|
fn from(e: XirWriterError) -> Self {
|
|
|
|
Self::XirWriterError(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
impl From<fmt::Error> for TameldError {
|
|
|
|
fn from(e: fmt::Error) -> Self {
|
|
|
|
Self::Fmt(e)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
impl Display for TameldError {
|
|
|
|
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
|
|
|
match self {
|
|
|
|
Self::Io(e) => Display::fmt(e, f),
|
|
|
|
Self::SortError(e) => Display::fmt(e, f),
|
2022-06-02 10:30:44 -04:00
|
|
|
Self::XirParseError(e) => Display::fmt(e, f),
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
Self::XirfParseError(e) => Display::fmt(e, f),
|
|
|
|
Self::XmloParseError(e) => Display::fmt(e, f),
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
Self::XmloLowerError(e) => Display::fmt(e, f),
|
|
|
|
Self::AirLowerError(e) => Display::fmt(e, f),
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
Self::XirWriterError(e) => Display::fmt(e, f),
|
2022-10-26 11:29:26 -04:00
|
|
|
Self::FinalizeError(e) => Display::fmt(e, f),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
Self::Fmt(e) => Display::fmt(e, f),
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
impl Error for TameldError {
|
|
|
|
fn source(&self) -> Option<&(dyn Error + 'static)> {
|
|
|
|
match self {
|
|
|
|
Self::Io(e) => Some(e),
|
|
|
|
Self::SortError(e) => Some(e),
|
2022-06-02 10:30:44 -04:00
|
|
|
Self::XirParseError(e) => Some(e),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
Self::XirfParseError(e) => Some(e),
|
|
|
|
Self::XmloParseError(e) => Some(e),
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
Self::XmloLowerError(e) => Some(e),
|
|
|
|
Self::AirLowerError(e) => Some(e),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
Self::XirWriterError(e) => Some(e),
|
2022-10-26 11:29:26 -04:00
|
|
|
Self::FinalizeError(e) => Some(e),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
Self::Fmt(e) => Some(e),
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
impl Diagnostic for TameldError {
|
|
|
|
fn describe(&self) -> Vec<AnnotatedSpan> {
|
|
|
|
match self {
|
2022-06-02 10:30:44 -04:00
|
|
|
Self::XirParseError(e) => e.describe(),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
Self::XirfParseError(e) => e.describe(),
|
|
|
|
Self::XmloParseError(e) => e.describe(),
|
tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`. This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate. I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore. I don't like making large changes such
as this one.
There is still work to do here. First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.
Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`. The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.
Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well. Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.
That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together. Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.
DEV-11864
2022-05-27 13:51:29 -04:00
|
|
|
Self::XmloLowerError(e) => e.describe(),
|
|
|
|
Self::AirLowerError(e) => e.describe(),
|
2022-10-26 11:29:26 -04:00
|
|
|
Self::FinalizeError(e) => e.describe(),
|
2022-12-15 12:07:58 -05:00
|
|
|
Self::SortError(e) => e.describe(),
|
tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve. I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.
This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does. The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them. I need to
balance this work with everything else I have going on.
This is a large commit, but it converts the existing Error Display impls
into Diagnostic. This separation is a bit verbose, so I'll see how this
ends up evolving.
Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.
Output is integrated into tameld only in this commit; I'll add tamec
next. Examples of what this outputs are available in the test cases in this
commit.
DEV-10935
2022-04-13 14:41:54 -04:00
|
|
|
|
2022-12-16 15:02:57 -05:00
|
|
|
Self::Io(_) | Self::XirWriterError(_) | Self::Fmt(_) => vec![],
|
tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary. I've been wanting to do this for a long time,
so it's nice seeing this come together. This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them. This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.
This just maintains compatibility with the dynamic dispatch that was
previous occurring. This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.
The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.
More to come.
DEV-10935
2022-04-11 15:15:04 -04:00
|
|
|
}
|
|
|
|
}
|
2019-11-27 09:18:17 -05:00
|
|
|
}
|