Compare commits

...

1416 Commits
v3.2.0 ... main

Author SHA1 Message Date
Mike Gerwitz d889aca13a tamer: asg::graph::AsgRelMut: API cleanup
This does two things:

  1. Removes callback; it didn't add anything of practical value.
     The operation will simply be performed as long as no error is provided
     by the callee.
  2. Consolidates three arguments into `ProposedRel`.  This makes blocks in
     `object_rel!` less verbose and boilerplate-y; a rough sketch of the
     pattern follows.
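
Roughly, this is the usual consolidate-arguments-into-a-struct cleanup; a
generic sketch (the field names below are made up, not the actual
`ProposedRel`):

```
// Illustration only: three loose arguments collapse into one struct that
// each `object_rel!` block can accept and destructure as needed.
struct ProposedRel<S, T> {
    source: S,
    target: T,
    span: (u32, u32), // placeholder for whatever the third argument carries
}

fn propose<S, T>(rel: ProposedRel<S, T>) -> Result<(), &'static str> {
    // The edge is added so long as the callee provides no error.
    let ProposedRel { source: _, target: _, span: _ } = rel;
    Ok(())
}
```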

I'll probably implement `TplShape::Unknown` via the dynamic `Ident` `Tpl`
edge before continuing with any cleanup.  This is getting pretty close to
reasonable for future implementations.

DEV-13163
2023-07-27 03:33:06 -04:00
Mike Gerwitz 579575a358 tamer: asg::graph::object: ObjectIndex::try_map_obj_inner
Continuing to clean house and make things more concise, not just for `tpl`
but for all future changes.

DEV-13163
2023-07-27 03:07:33 -04:00
Mike Gerwitz 66512bf20d tamer: f: impl_mono_map! macro
This helps to remove some boilerplate.  Testing this out in
`asg::graph::object::tpl` before applying it to other things; really `Map`
can just go away entirely then since it can be implemented in terms of
`TryMap`, but maybe it should stick around for manual impls (implementing
`TryMap` manually is more work).
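
The relationship between the two is the usual fallible/infallible one; a
minimal sketch of the idea (trait shapes here are illustrative, not the
actual `f` module API):

```
use std::convert::Infallible;

// Illustrative only: an infallible `Map` falls out of a fallible `TryMap`
// by choosing an error type that cannot be constructed.
trait TryMap<T>: Sized {
    fn try_map<E, F: FnOnce(T) -> Result<T, E>>(self, f: F) -> Result<Self, E>;
}

trait Map<T>: Sized {
    fn map<F: FnOnce(T) -> T>(self, f: F) -> Self;
}

impl<T, U: TryMap<T>> Map<T> for U {
    fn map<F: FnOnce(T) -> T>(self, f: F) -> Self {
        match self.try_map(|x| Ok::<_, Infallible>(f(x))) {
            Ok(mapped) => mapped,
            Err(never) => match never {},
        }
    }
}
```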

DEV-13163
2023-07-27 02:56:29 -04:00
Mike Gerwitz 3c9e1add20 tamer: f: Add TryMap
This implements TryMap and utilizes it in `asg::graph::object::tpl`.

DEV-13163
2023-07-27 01:44:12 -04:00
Mike Gerwitz 38c0161257 tamer: f::{Functor=>Map}: It's not really a functor
At least not how most people expect functors to be.  I'm really just using
this as a map with powerful inference properties that make writing code more
pleasant.

And I need fallible methods now too.

DEV-13163
2023-07-26 16:43:09 -04:00
Mike Gerwitz e14854a555 tamer: asg::air::tpl: Resolve shape against inner template application
Things are starting to get interesting, and this shows how caching
information about template shape (rather than having to query the graph any
time we want to discover it) makes it easy to compose shapes.
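
A rough sketch of what "composing cached shapes" means (the variant set
beyond `Unknown` and the method are illustrative guesses, not the actual
definitions):

```
// Illustrative only.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum TplShape {
    Empty,
    Expr,
    Unknown,
}

impl TplShape {
    // Applying an inner template contributes its cached shape to the outer
    // template, with no graph query needed.
    fn compose(self, inner: TplShape) -> TplShape {
        use TplShape::*;
        match (self, inner) {
            (Empty, s) | (s, Empty) => s,
            (Unknown, _) | (_, Unknown) => Unknown,
            // two expressions inlined into an expression context are
            // rejected as an error elsewhere; Unknown is a stand-in here
            (Expr, Expr) => Unknown,
        }
    }
}

fn main() {
    use TplShape::*;
    assert_eq!(Empty.compose(Expr), Expr);
    assert_eq!(Expr.compose(Unknown), Unknown);
}
```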

This does not yet handle the unknown case.  Before I do that, I'll want to
do some refactoring to address duplication in the `tpl` module.

DEV-13163
2023-07-26 16:09:17 -04:00
Mike Gerwitz c19ecba6ef tamer: asg::air::object::tpl: Reject multi-expression shape
This enforces the new constraint that templates expanding into an `Expr`
context must only inline a single `Expr`.

Perhaps in the future we'll support explicit splicing, like `,@` in
Lisp.  But this new restriction is intended for two purposes:

  - To make templates more predictable (if you have a list of expressions
    inlined then they will act differently depending on the type of
    expression that they are inlined into, which means that more defensive
    programming would otherwise be required); and
  - To make expansion easier, since we're going to have to set aside an
    expansion workspace ahead of time to ensure ordering (Petgraph can't
    replace edges in-place).  If we support multi-expansion, we'd have to
    handle associativity in all expression contexts.

This'll become more clear in future commits.

It's nice to see all this hard work coming together now, though; it's easy
now to perform static analysis on the system, and any part of the graph
construction can throw errors with rich diagnostic information and still
recover properly.  And, importantly, the system enforces its own state, and
the compiler helps us with that (the previous commits).

DEV-13163
2023-07-26 04:03:52 -04:00
Mike Gerwitz aec721f4fa tamer: asg::graph: Work AsgRelMut specialization into `object_rel!` macro
This formalizes the previous commit a bit more and adds documentation
explaining why it exists and how it works.  Look there for more
information.

This has been a lot of setup work.  Hopefully things are now easier in the
future.  And now we have nice declarative type-level hooks into the graph!

DEV-13163
2023-07-26 04:03:49 -04:00
Mike Gerwitz 37c962a7ee tamer: asg::graph::object::tpl::TplShape: Introduce template "shapes"
This change is the first to utilize matching on edges to determine the state
of the template (to begin to derive its shape).

But this is notable for my finally caving on `min_specialization`.

The commit contains a bunch of rationale for why I introduced it.  I've been
sitting on trying it for _years_.  I had hoped for further progress in
determining a stabilization path, but that doesn't seem to be happening.

The reason I caved is because _not_ using it is a significant barrier to
utilizing robust types in various scenarios.  I've been having to work
around that with significant efforts to write boilerplate code to match on
types and branch to various static paths accordingly.  It makes it really
expensive to make certain types of changes, and it makes the code really
difficult to understand once you start to peel back abstractions that try to
hide it.

I'll see how this goes and, if it goes well, begin to replace old methods
with specialization.
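
For reference, the general form of the feature itself (not TAMER code, just
what `min_specialization` permits):

```
#![feature(min_specialization)] // nightly

trait Describe {
    fn describe(&self) -> &'static str;
}

impl<T> Describe for T {
    // The blanket impl marks its item `default`, permitting specialization.
    default fn describe(&self) -> &'static str {
        "some object"
    }
}

impl Describe for u32 {
    // The more specific impl is chosen statically---no converting types to
    // runtime values and branching to static paths by hand.
    fn describe(&self) -> &'static str {
        "a u32"
    }
}

fn main() {
    assert_eq!("hello".describe(), "some object");
    assert_eq!(42u32.describe(), "a u32");
}
```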

See the next commit for some cleanup.  I purposefully left this a bit of a
mess (at the bottom of `asg::graph::object::tpl`) to emphasize what I'm
doing and why I introduced it.

DEV-13163
2023-07-25 15:28:53 -04:00
Mike Gerwitz 4168c579fd tamer: asg::graph::Asg{Object=>Rel}Mut: Trait-level target type
This allows for a declarative matching on edge targets using the trait
system, rather than having to convert the type to a runtime value to match
on (which doesn't make a whole lot of sense).

See a commit to follow shortly (with Tpl) for an example use case.

DEV-13163
2023-07-25 11:15:41 -04:00
Mike Gerwitz 2ecc143e02 tamer: asg::graph::pre_add_edge: Use pre-narrowed source ObjectIndex
Since we're statically invoking a particular ObjectKind's method, we already
know the source type.  Let's pre-narrow it for their (my) convenience.

DEV-13163
2023-07-25 10:55:08 -04:00
Mike Gerwitz 28b83ad6a3 tamer: asg::graph::AsgObjectMut: Allow objects to assert ownership over relationships
There's a lot to say about this; it's been a bit of a struggle figuring out
what I wanted to do here.

First: this allows objects to use `AsgObjectMut` to control whether an edge
is permitted to be added, or to cache information about an edge that is
about to be added.  But no object does that yet; it just uses the default
trait implementation, and so this _does not change any current
behavior_.  It also is approximately equivalent cycle-count-wise, according
to Valgrind (within ~100 cycles out of hundreds of millions on large package
tests).

Adding edges to the graph is still infallible _after having received
permission_ from an `ObjectIndexRelTo`, but the object is free to reject the
edge with an `AsgError`.
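
The shape of that hook, roughly (illustrative only; the real `AsgObjectMut`
trait and its types differ):

```
// Placeholders, not TAMER's types.
struct ProposedEdge;
struct AsgError(&'static str);

trait ObjectMutHook {
    // Default: permit the edge and cache nothing, so existing behavior is
    // unchanged for objects that don't opt in.
    fn pre_add_edge(&mut self, _proposed: &ProposedEdge) -> Result<(), AsgError> {
        Ok(())
    }
}

struct Tpl {
    // e.g. a template could cache information about its body (its "shape")
    // as edges are added, instead of walking the tree after a close
    body_edge_count: usize,
}

impl ObjectMutHook for Tpl {
    fn pre_add_edge(&mut self, _proposed: &ProposedEdge) -> Result<(), AsgError> {
        self.body_edge_count += 1;
        Ok(())
    }
}
```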

As an example of where this will be useful: the template system needs to
keep track of what is in the body of a template as it is defined.  But the
`TplAirAggregate` parser is sidelined while expressions in the body are
parsed, and edges are added to a dynamic source using
`ObjectIndexRelTo`.  Consequently, we cannot rely on a static API to cache
information; we have to be able to react dynamically.  This will allow `Tpl`
objects to know any time edges are added and, therefore, determine their
shape as the graph is being built, rather than having to traverse the tree
after encountering a close.

(I _could_ change this, but `ObjectIndexRelTo` removes a significant amount
of complexity for the caller, so I'd rather not.)

I did explore other options.  I rejected the first one, then rejected this
one, then rejected the first one again before returning back to this one
after having previously sidelined the entire thing, because of the above
example.  The core point is: I need confidence that the graph isn't being
changed in ways that I forgot about, and because of the complexity of the
system and the heavy refactoring that I do, I need the compiler's help;
otherwise I risk introducing subtle bugs as objects get out of sync with the
actual state of the graph.

(I wish the graph supported these things directly, but that's a project well
outside the scope of my TAMER work.  So I have to make do, as I have been
all this time, by layering atop of Petgraph.)

(...I'm beginning to ramble.)

(...beginning?)

Anyway: my other rejected idea was to provide attestation via the
`ObjectIndex` APIs to force callers to go through those APIs to add an edge
to the graph; it would use sealed objects that are inaccessible to any
modules other than the objects, and assert that the caller is able to
provide a zero-sized object of that sealed type.

The problem with this is...exactly what was mentioned above:
`ObjectIndexRelTo` is dynamic.  We don't always know the source object type
statically, and so we cannot make those static assertions.

I could have tried the same tricks to store attestation at some other time,
but what a confusing mess it would be.

And so here we are.

Most of this work is cleaning up the callers---adding edges is now fallible,
from the `ObjectIndex` API standpoint, and so AIR needed to be set up to
handle those failures.  There _aren't_ any failures yet, but again, since
things are dynamic, they could appear at any moment.  Furthermore, since
ref/def is commutative (things can be defined and referenced in any order),
there could be surprise errors on edge additions in places that might not
otherwise expect it in the future.  We're now ready for that, and I'll be
able to e.g. traverse incoming edges on a `Missing->Transparent` definition
to notify dependents.

This project is going to be the end of me.  As interesting as it is.

I can see why Rust just chose to require macro definitions _before_ use.  So
much less work.

DEV-13163
2023-07-24 16:41:32 -04:00
Mike Gerwitz e414782def tamer: asg::graph: Encapsulate edge additions
AIR is no longer able to explicitly add edges without going through an
object-specific `ObjectIndex` API.  `Asg::add_edge` was already private, but
`ObjectIndex::add_edge_{to,from}` was not.

The problem is that I want to augment the graph with other invariants, such
as caches.  I'd normally have this built into the graph system itself, but I
don't have the time for the engineering effort to extend or replace
Petgraph, so I'm going to build atop of it.

To have confidence in any sort of caching, I need assurances that the graph
can't change out from underneath an object.  This gets _close_ to
accomplishing that, but I'm still uncomfortable:

  - We're one `pub` addition away from breaking these invariants; and
  - Other `Object` types can still manipulate one another's edges.

So this is a first step that at least proves encapsulation within
`asg::graph`, but ideally we'd have the system enforce, statically, that
`Objects` own their _outgoing_ edges, and no other `Object` is able to
manipulate them.  This would ensure that any accidental future changes, or
bugs, will cause compilation failures rather than e.g. allowing caches to
get out of sync with the graph.

DEV-13163
2023-07-21 10:21:57 -04:00
Mike Gerwitz 0f93f3a498 tamer: NIR->xmli interpolation and template param
The fixpoint tests for `meta-interp` are finally working.  I could have
broken this up more, but I'm exhausted with this process, so, you get what
you get.

NIR will now recognize basic `<text>` and `<param-value>` nodes (note the
caveat for `<text>` in the comment, for now), and I finally include abstract
binding in the lowering pipeline.  `xmli` output is also now able to cope
with metavariables with a single lexical association, and continues to
become more of a mess.

DEV-13163
2023-07-18 12:31:28 -04:00
Mike Gerwitz 85b08eb45e tamer: nir::interp: Do not include original specification in generated desc
This is a really obvious problem in retrospect, which makes me feel rather
silly.

The output was useful, but I don't have time to deal with this any further
right now.  The comments in the commit explain the problem---that the output
ends up being interpolated as part of the fixpoint test, in an incorrect
context, and so the code that we generate is invalid.  Also goes to show why
the fixpoint tests are important.

(Yes, they're still disabled for meta-interp, I'm trying to get them
enabled.)

DEV-13163
2023-07-18 11:17:51 -04:00
Mike Gerwitz 507669cb30 tamer: asg::graph::object::ObjectIndexRefined: New narrowing type
The provided documentation provides rationale, and the use case is the
ontree change.  I was uncomfortable without the exhaustive match, and I was
further annoyed by the lack of easy `ObjectIndex` narrowing.

DEV-13163
2023-07-18 10:31:33 -04:00
Mike Gerwitz 5a301c1548 tamer: asg::graph::visit::ontree: Source ordering of ontological tree
This introduces the ability to specify an edge ordering for the ontological
tree traversal.  `tree_reconstruction` will now use a
`SourceCompatibleTreeEdgeOrder`, which will traverse the graph in an order
that will result in a properly ordered source reconstruction.  This is
needed for template headers, because interpolation causes
metavariables (exposed as template params) to be mixed into the body.

There's a lot of information here, including some TODOs on possible
improvements.  I used the unstable `is_sorted` to output how many templates
were already sorted, based on one of our very large packages internally that
uses templates extensively, and found that none of the desugared shorthand
template expansions were already ordered.  If I tweak that a bit, then
nearly all templates will already be ordered, reducing the work that needs
to be done, leaving only template definitions with interpolation to be
concerned about, which is infrequent relative to everything else.
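
For reference, the unstable `is_sorted` used for that measurement is simply:

```
#![feature(is_sorted)] // unstable at the time of this commit

fn main() {
    assert!([1, 2, 3].iter().is_sorted());
    assert!(![2, 1, 3].iter().is_sorted());
}
```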

DEV-13163
2023-07-18 10:31:31 -04:00
Mike Gerwitz b30018c23b tamer: xmli reconstruction of desugared interpolated metavars
Well, this is both good news and bad news.

The good news is that this finally produces the expected output and
reconstructs sources from interpolated values on the ASG.  Yay!

...the bad news is that it's wrong.  Notice how the fixpoint test is
disabled.

So, my plan was originally to commit it like this first and see if I was
comfortable relaxing the convention that `<param>` nodes had to appear in
the header.  That's nice to do, that's cleaner to do, but would the
XSLT-based compiler really care?  I had to investigate.

Well, turns out that TAMER does care.  Because, well over a decade ago, I
re-used `<param>`, which could represent not only a template param, but also
a global param, or a function param.

So, XML->NIR considers all `<param>` nodes at the head of a template to be
template parameters.  But after the first non-header element, we transition
to another state that allows it to be pretty much anything.

And so, I can't relax that restriction.

And because of that, I can't just stream the tree to the xmli generator,
I'll have to queue up nodes and order them.

Oh well, I tried.

DEV-13163
2023-07-17 14:20:05 -04:00
Mike Gerwitz 85892caeb2 tamer: asg: Root abstract identifiers in active container
I'm not sure how I overlooked this previously, and I didn't notice until
trying to generate xmli output.  I think I distracted myself with the
use of dangling status, which was not appropriate, and that has since
changed so that we have a dedicated concept.

This introduces the term "instantiation", or more specifically "lexical
instantiation".  This is more specific and meaningful than simply
"expansion", which is what occurs during instantiation.  I'll try to adjust
terminology and make things more consistent as I go.

DEV-13163
2023-07-17 14:20:04 -04:00
Mike Gerwitz 760223f0c9 tamer: asg::air: Extract abstract definition into context
This logic ought to live alongside other definition logic...which in turn
needs its own extraction, but that's a separate concern.

This makes the definition of abstract identifiers very similar to
concrete.  But, treating these as dangling, even if that's technically true,
has to change---we still want an edge drawn to the abstract identifier via
e.g. a template since we want the graph to mirror the structure of what it
will expand into concretely.  I didn't notice this problem until trying to
generate the xmli for it.

So, see the commit to follow.

DEV-13163
2023-07-13 11:16:10 -04:00
Mike Gerwitz b4b85a5e85 tamer: asg::air: Support Meta::ConcatList with lexemes and refs
This handles the common cases for meta, which includes what interpolation
desugars into.  Most of this work was in testing and reasoning about the
issue; `asg::graph::visit::ontree::test` has a good summary of the structure
of the graph that results.

The last remaining steps to make this work end-to-end is for NIR->AIR to
lower `Nir::Ref` into `Air::BindIdent`, and then for `asg::graph::xmli` to
reconstruct concatenation lists.  I'll then be able to commit the xmli test
case I've been sitting on, whose errors have been guiding my development.

DEV-13163
2023-07-13 10:48:45 -04:00
Mike Gerwitz d2d29d8957 tamer: asg::air::meta: Use term "metalinguistic" over "metasyntactic"
The term "metasyntactic" made sense literally---it's a variable in a
metalanguage that expands into a context that is able to contribute to the
language's syntax.  But, the term has a different conventional use in
programming that is misleading.

The term "metalinguistic" is used in mathematics, to describe a metalanguage
or schema atop of a language.  This is more fitting.

DEV-13163
2023-07-13 10:48:45 -04:00
Mike Gerwitz b4bbc0d8f0 tamer: asg::air: Use new parse::util::spair function to reduce test ceremony
This makes `SPair` construction more concise, getting rid of the `into`
invocations.  For now I have only made this change in AIR's tests, since
that's what I'm working on and I want to observe how this convention
evolves.  This may also encourage other changes, e.g. placing spans within
the `toks` array, rather than having to jump around the test for them.

The comment for `spair` mentions why this is a test-only function.  But it
also shows how dangerous `impl Into<SymbolId> for &str` can be, since it
seems so innocuous---it uses a global interner.  I'll be interested to see a
year from now if I decided to forego that impl in favor of explicit
internment, since I'm not sure it's worth the convenience anymore.
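
The helper itself is nothing more than a thin constructor; a stand-in
sketch (every type below is a stand-in, not the real `SPair`, `Span`, or
interner):

```
#[derive(Clone, Copy)]
struct Span(u32);
#[derive(Clone, Copy)]
struct SymbolId(u32);
struct SPair(SymbolId, Span);

impl From<&str> for SymbolId {
    fn from(_s: &str) -> Self {
        SymbolId(0) // stand-in for interning via a global interner
    }
}

// Test-only convenience: hides the `.into()` (and with it, the interning).
fn spair(name: &str, span: Span) -> SPair {
    SPair(name.into(), span)
}

fn main() {
    let s1 = Span(1);
    let _before = SPair("foo".into(), s1);
    let _after = spair("foo", s1);
}
```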

DEV-13163
2023-07-13 10:48:44 -04:00
Mike Gerwitz 8a10f8bbbe tamer: asg::air: Remove unnecessary vec![] usage in tests
This has been bothering me for quite a long time, and is just more test
cleanup before I introduce more.  I suspect this came from habit with the
previous Rust edition where `into_iter()` on arrays was a much more verbose
operation.
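
For context, the array form has been sufficient since the 2021 edition:

```
fn main() {
    // In pre-2021 editions, `[1, 2, 3].into_iter()` yielded references via
    // auto-ref to the slice, so `vec![...]` was the easy way to get owned
    // items.  That workaround is no longer needed:
    let doubled: Vec<i32> = [1, 2, 3].into_iter().map(|n| n * 2).collect();
    assert_eq!(doubled, vec![2, 4, 6]);
}
```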

To be clear: this change isn't for performance.  It's about not doing
something silly when it's unnecessary, which also sets a bad example for
others.

There are many other tests in other modules that will need updating at some
point.

DEV-13163
2023-07-13 10:48:27 -04:00
Mike Gerwitz 2e33e9e93e tamer: asg::air: Remove `Air::` token variant prefixes from tests
This just removes noise from test, as has become standard in various other
tests in TAMER.

DEV-13163
2023-07-13 10:48:27 -04:00
Mike Gerwitz 24ee041373 tamer: asg::air: Support abstract binding of `Expr`s
This produces a representation of abstract identifiers on the graph, for
`Expr`s at least.  The next step will probably be to get this working
end-to-end in the xmli output before extending it to the other remaining
bindable contexts.

DEV-13163
2023-07-13 10:48:26 -04:00
Mike Gerwitz a144730981 tamer: nir::abstract_bind: Require @-padding of metavariable names
This enforces the naming convention that is utilized to infer whether an
identifier binding must be translated to an abstract binding.
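
The convention itself is just `@`-padding, as in `@foo@`; an illustrative
check (not the actual implementation):

```
// Illustration only: the convention is that metavariable names are padded
// with `@` on both ends.  Other characters remain unrestricted for now.
fn follows_meta_convention(name: &str) -> bool {
    name.len() > 2 && name.starts_with('@') && name.ends_with('@')
}

fn main() {
    assert!(follows_meta_convention("@foo@"));
    assert!(!follows_meta_convention("foo"));
    assert!(!follows_meta_convention("@"));
}
```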

This does not yet place any restrictions on other characters in identifier
names; both the placement of and flexibility of that has yet to be
decided.  This change is sufficient enough to make abstract binding
translation reliable.

DEV-13163
2023-07-10 10:28:01 -04:00
Mike Gerwitz 8449a2b759 tamer: parse::prelude: Include Display, Debug, and Error-related exports
Cut down on the import boilerplate some more for `ParseState`s.

DEV-13163
2023-07-10 10:27:57 -04:00
Mike Gerwitz 8685527feb tamer: nir: New token BindIdentMeta
The previous commit made me uncomfortable; we're already parsing with great
precision (and effort!) the grammar of NIR, and know for certain whether
we're in a metavariable binding context, so it makes no sense to have to try
to guess at another point in the lowering pipeline.

This introduces a new token to retain that information from XIR->NIR
lowering and then re-simplifies the lowering operation that was just
introduced in the previous commit (`AbstractBindTranslate`).

DEV-13163
2023-06-28 09:48:15 -04:00
Mike Gerwitz 7314562671 tamer: nir::abstract_bind: New lowering operation
This builds upon the concepts of the previous commit to translate identifier
binding into an abstract binding if it utilizes a symbol that follows a
metavariable naming convention.

See the provided documentation for more information.

This commit _does not_ integrate this into the lowering pipeline yet, since
the abstract identifiers are still rejected (as TODOs) by AIR.

DEV-13163
2023-06-28 09:48:14 -04:00
Mike Gerwitz 15071a1824 tamer: nir: Interpolate concrete binds into abstract binds
This introduces the notion of an abstract identifier, where the previous
identifiers are concrete.  This serves as a compromise to either introducing
a new object type (another `Ident`), or having every `Ident` name be defined
by a `Meta` edge, which would bloat the graph significantly.

This change causes interpolation within a bind context to desugar into a new
`BindIdentAbstract` token, but AIR will throw an error if it encounters it
for now; that implementation will come soon.

This does not yet handle non-interpolation cases,
e.g. `<classify as="@foo@">`.  This is a well-established shorthand for
`as="{@foo@}"`, but is unfortunately ambiguous in the context of
metavariable definitions (template parameters).  This language ambiguity
will have to be handled here, and will have to fall back to today's behavior
of assuming concrete in that `param/@name` context but abstract everywhere else,
unless of course interpolation is triggered using `{}` to disambiguate (as
in `<param name="{@foo@}"`).

I was going to handle the short-hand meta binding case as part of
interpolation, but I decided it may be appropriate for its own lowering
operation, since it is intended to work regardless of whether interpolation
takes place; it's a _translation_ of a binding into an abstract one, and it
can clearly delineate the awkward syntactic rules that we have to inherit,
as mentioned above.

DEV-13163
2023-06-27 12:48:19 -04:00
Mike Gerwitz 828d8918a3 tamer::asg::graph::object::ident::Ident::name: Produce Option
This prepares to make the name of an `Ident` optional to support abstract
identifiers derived from metavariables.

This is an unfortunate change to have to prepare for, since it complicates
how Idents are interpreted, but the alternative (a new object type) is not
good either.  We'll see how this evolves.

DEV-13163
2023-06-26 15:37:08 -04:00
Mike Gerwitz 6b54eafd70 tamer: asg::air: Hoist metavars in expressions
This is intended to support NIR's lexical interpolation, which expands in
place into metavariables.

This commit does not yet contain the NIR portion (or xmli system test)
because Meta needs to be able to handle concatenation first; that's next.

DEV-13163
2023-06-20 15:14:38 -04:00
Mike Gerwitz d10bf00f5d tamer: Initial template/param support through xmli
This introduces template/param and regenerates it in the xmli output.  Note
that this does not check that applications reference known params; that's a
later phase.

DEV-13163
2023-06-14 16:38:05 -04:00
Mike Gerwitz 9887abd037 tamer: nir::air: Include mention of .experimental file in TODO help
The previous commit introduced support for a `.experimental` file to trigger
`xmlo-experimental`.  This modifies the error message for unsupported
features to mention it, to help the user track down the problem.

DEV-13162
2023-06-14 13:03:21 -04:00
Mike Gerwitz a9bbb87612 build-aux/Makefile.am: Introduce .experimental files
If a source file is paired with a `.experimental` file (for example,
`foo.xml` has a sibling `foo.experimental` file), then it will be
precompiled using `--emit xmlo-experimental` instead of `--emit
xmlo`.  Further, the contents of the experimental file may contain
`#`-prefixed comments describing why it exists, as well as additional
options to pass to `tamec`.

For example, if this is an experimental file:

```

--foo
--bar=baz
```

Then the tamec invocation will contain:

  tamec [...] --emit xmlo-experimental --foo --bar=baz -o foo.xmli

This allows for package-level conditional compilation with new features so
that I am able to focus on packages that will provide the most meaningful
benefits to our team, whether they be performance or features.

DEV-13162
2023-06-14 12:02:57 -04:00
Mike Gerwitz 7487bdccc3 tamer: nir::air: Recoverable error instead of panic for TODO tokens
Now that the feature flag for the parser is a command line option, it is
useful to be able to run it on any package and see what errors arise, to use
as a guide for development with the goal of getting a particular package to
compile.

This converts the TODO panic into a recoverable error so that the parser can
spit out as many errors as it can.

DEV-13162
2023-06-14 10:24:50 -04:00
Mike Gerwitz 454f5f4d04 tamer: Initial clarifying pipeline docs
This provides some initial information to help guide a user to discover how
TAMER works, through either the source code or the generated
documentation.  This will improve over time, since all of the high-level
abstractions are still under development.

DEV-13162
2023-06-13 23:43:04 -04:00
Mike Gerwitz 9eeb18bda2 tamer: Replace wip-asg-derived-xmli flag with command line option
This introduces `xmlo-experimental` for `--emit`, allowing the new parser to
be toggled selectively for individual packages.  This has a few notable
benefits:

  1. We'll be able to conditionally compile packages as they are
     supported (TAMER will target specific packages in our system to try to
     achieve certain results more quickly);

  2. This cleans up the code a bit by removing awkward gated logic, allowing
     natural abstractions to form; and

  3. Removing the compile-time feature flag ensures that the new features
     are always built and tested; there are fewer configuration combinations
     to test.

DEV-13162
2023-06-13 23:23:51 -04:00
Mike Gerwitz 341af3fdaf tamer: nir::air: Dynamic configuration in place of static wip-asg-derived-xmli flag
This flag should have never been sprinkled here; it makes the system much
harder to understand.

But, this is working toward a command-line tamec option to toggle NIR
lowering on/off for various packages.

DEV-13162
2023-06-13 15:07:03 -04:00
Mike Gerwitz 61d556c89e tamer: pipeline: Generate recoverable sum error types
This was a significant undertaking, with a few thrown-out approaches.  The
documentation describes what approach was taken, but I'd also like to
provide some insight into the approaches that I rejected for various
reasons, or because they simply didn't work.

The problem that this commit tries to solve is encapsulation of error
types.

Prior to the introduction of the lowering pipeline macro
`lower_pipeline!`, all pipelines were written by hand using `Lower` and
specifying the applicable types.  This included creating sum types to
accommodate each of the errors so that `Lower` could widen automatically.

The introduction of the `lower_pipeline!` macro resolved the boilerplate and
type complexity concerns for the parsers by allowing the pipeline to be
concisely declared.  However, it still accepted an error sum type `ER` for
recoverable errors, which meant that we had to break a level of
encapsulation, peering into the pipeline to know both what parsers were in
play and what their error types were.

These error sum types were also the source of a lot of tedious boilerplate
that made adding new parsers to the pipeline unnecessarily unpleasant;
the purpose of the macro is to make composition both easy and clear, and
error types were undermining it.

Another benefit of sum types per pipeline is that callers need only
aggregate those pipeline types, if they care about them, rather than every
error type used as a component of the pipeline.

So, this commit generates the error types.  Doing so was non-trivial.

Associated Types and Lifetimes
------------------------------
Error types are associated with their `ParseState` as
`ParseState::Error`.  As described in this commit, TAMER's approach to
errors is that they never contain non-static lifetimes; interning and
copying are used to that effect.  And, indeed, no errors in TAMER have
lifetimes.

But, some `ParseState`s may.  In this case, `AsgTreeToXirf`:

```
impl<'a> ParseState for AsgTreeToXirf<'a> {
  // [...]
  type Error = AsgTreeToXirfError;
  // [...]
}
```

Even though `AsgTreeToXirfError` does not have a lifetime, the `ParseState`
it is associated with _does_.  So to reference that type, we must use
`<AsgTreeToXirf<'a> as ParseState>::Error`.  So if we have a sum type:

```
enum Sum<'a> {
  //     ^^ oh no!                  vv
  AsgTreeToXirfError(<AsgTreeToXirf<'a> as ParseState>::Error),
}
```

There's no way to elide or make anonymous that lifetime, since it's not
used, at the time of writing.  `for<'a>` also cannot be used in this
context.

The solution in this commit is to use a macro (`lower_error_sum`) to rewrite
lifetimes to `'static`:

```
enum Sum {
  AsgTreeToXirfError(<AsgTreeToXirf<'static> as ParseState>::Error),
}
```

The `Error` associated type will resolve to `AsgTreeToXirfError` all the
same either way, since it has no lifetimes of its own, let alone any
referencing trait bounds.

That's not to say that we _couldn't_ support lifetimes as long as they're
attached to context, but we have no need to at the moment, and it adds
significant cognitive overhead.  Further, the diagnostic system doesn't deal
in lifetimes, and so would need reworking as well.  Not worth it.

An alternative solution to this that was rejected is an explicitly `Error`
type in the macro application:

```
// in the lowering pipeline
|> AsgTreeToXirf<'a> {  // lifetime
    type Error = AsgTreeToXirfError;   // no lifetime
}
```

But this requires peeling back the `ParseState` to see what its error is and
_duplicate_ it here.  Silly, and it breaks encapsulation, since the lowering
pipeline is supposed to return its own error type.

Yet another option considered was to standardize a submodule convention
whereby each `ParseState` would have a module exporting `Error`, among other
types.  This would decouple it from the parent type.  However, we still have
the duplication between that and an associated type.  Further, there's no
way to enforce this convention (effectively a module API)---the macro would
just fail in obscure ways, at least with `macro_rules!`.  It would have been
an ugly kluge.

Overlapping Error Types
-----------------------
Another concern with generating the sum type, resolved in a previous commit,
was overlapping error types, which prohibited `impl From<E> for ER`
generation.

The problem was that a number of `ParseState`s used `Infallible` as their
`Error` type.  This was resolved in a previous commit by creating
Infallible-like newtypes (variantless enums).
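
The newtype approach looks roughly like this (a general illustration, not
the generated code):

```
use std::convert::Infallible;

// Each infallible `ParseState` gets its own uninhabited error type, so the
// `From` impls on the generated sum type no longer overlap.
#[derive(Debug, PartialEq)]
enum FooError {} // variantless: can never actually be constructed

impl From<Infallible> for FooError {
    fn from(never: Infallible) -> Self {
        match never {}
    }
}

#[derive(Debug, PartialEq)]
enum PipelineError {
    Foo(FooError),
    // ...one variant per parser in the pipeline
}

impl From<FooError> for PipelineError {
    fn from(e: FooError) -> Self {
        PipelineError::Foo(e)
    }
}
```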

This was not the only option.  `From` fits naturally into how TAMER handles
sum types, and fits naturally into `Lower`'s `WidenedError`.  The
alternative is generating explicit `map_err`s in `lower_pipeline!`.  This
would have allowed for overlapping error types because the _caller_ knows
what the correct target variant is in the sum type.

The problem with an explicit `map_err` is that it places more power in
`lower_pipeline!`, which is _supposed_ to be a macro that simply removes
boilerplate; it's not supposed to increase expressiveness.  It's also not
fun dealing with complexity in macros; they're much more confusing than
normal code.

With the decided-upon approach (newtypes + `From`), hand-written `Lower`
pipelines are just as expressive---just more verbose---as `lower_pipeline!`,
and handles widening for you.  Rust's type system will also handle the
complexity of widening automatically for us without us having to reason
about it in the macro.  This is not always desirable, but in this case, I
feel that it is.
2023-06-13 14:49:43 -04:00
Mike Gerwitz 31f6a102eb tamer: pipeline::macro: Partially applied pipeline
This configures the pipeline and returns a closure that can then be provided
with the source and sink.

The next obvious step would be to curry the source and sink.

But I wanted to commit this before I take a different (but equivalent)
approach that makes the pipeline operations more explicit and helps to guide
the user (developer) in developing and composing them.  The FP approach is
less boilerplate, but is also more general and provides less
guidance.  Composition at the topmost levels of the system, especially with
all the types involved, is one of the most confusing aspects of the
system---and one of the most important to get right and make clear, since
it's intended to elucidate the entire system at a high level and guide the
reader.  Well, it does a poor job at that now, but that's the ultimate
goal.

In essence---brutally general abstractions make sense at lower levels, but
the complexity at higher levels benefits from rigid guardrails, even though
it does not necessitate them.

DEV-13162
2023-06-13 10:02:51 -04:00
Mike Gerwitz 26c4076579 tamer: obj::xmlo::reader: Emit token after symbol dependencies
This will allow a tamec xmlo reading pipeline to stop before fragment
loading.

DEV-13162
2023-06-12 12:37:12 -04:00
Mike Gerwitz 0b9e91b936 tamer: obj::xmlo::reader::XmloReader: Remove generics
This cleanup is an interesting one, because I think the present me may
disagree with the past me.

The use of generics here to compose the parser from smaller parsers was due
to how I wrote my object-oriented code in other languages: where a class was
an independently tested unit.  I was trying to reproduce the same here,
utilizing generics in the same way that one would use compoisition via
object constructors in other languages.

But it's been a long time since then, and I've come to settle on different
standards in Rust.  The components of `XmloReader` really are just
implementation details.  As I find myself about to want to modify its
behavior, I don't _want_ to compose `XmloReader` from _different_ parsers;
that may result in an invalid parse.  There's one correct way to parse an
xmlo file.

If I want to parse the file differently, then `XmloReader` ought to expose
a way of doing so.  This is more rigid, but that rigidity buys us confidence
that the system has been explicitly designed to support those
operations.  And that confidence gives us peace of mind knowing that the
system won't compose in ways that we don't intend for it to.

Of course, I _could_ design the system to compose in generic ways.  But
that's an over-generalization that I don't think will be helpful; it's not
only a greater cognitive burden, but it's also a lot more work to ensure
that invariants are properly upheld and to design an API that will ensure
that parsing is always correct.  It's simply not worth it.

So, this makes `XmloReader` consistent with other parsers now, like
`AirAggregate` and nir::parse (ele_parse).  This prepares for a change to
make `XmloReader` configurable to avoid loading fragments from object files,
since that's very wasteful for `tamec`.

DEV-13162
2023-06-12 12:37:12 -04:00
Mike Gerwitz 1bb25b05b3 tamer: Newtypes for all Infallible ParseState errors
More information will be presented in the commit that follows to generalize
these, but this sets the stage.

The recently-introduced pipeline macro takes care of most of the job of a
declarative pipeline, but it's still leaky, since it requires that the
_caller_ create error sum types.  This not only exposes implementation
details and so undermines the goal of making pipelines easy to declare and
compose, but it's also one of the last major components of boilerplate for
the lowering pipeline.

My previous attempts at generating error sum types automatically for
pipelines ran into a problem because of overlapping `impl`s for the various
`<S as ParseState>::Error` types; this resolves that issue via
newtypes.  I had considered other approaches, including explicitly
generating code to `map_err` as part of the lowering pipeline, but in the
end this is the easier way to reason about things that also keeps manual
`Lower` pipelines on the same level of expressiveness as the pipeline macro;
I want to restrict its unique capabilities as much as possible to
elimination of boilerplate and nothing more.
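
To restate the overlap concretely (general Rust, not TAMER's code): two
parsers sharing `Infallible` as their `Error` type would both need the same
`From` impl on the sum type, which coherence rejects:

```
use std::convert::Infallible;

enum Sum {
    A(Infallible),
    B(Infallible),
}

impl From<Infallible> for Sum {
    fn from(e: Infallible) -> Self {
        Sum::A(e)
    }
}

// error[E0119]: conflicting implementations of trait `From<Infallible>`
// impl From<Infallible> for Sum {
//     fn from(e: Infallible) -> Self {
//         Sum::B(e)
//     }
// }
```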

DEV-13162
2023-06-12 12:33:22 -04:00
Mike Gerwitz 672cc54c14 compiler/js.xsl: Derive supplier name from base package name
At or around 00492ace01, I modified packages
to output canonical `@name`s, which contains a leading forward
slash.  Previously, names omitted that slash.  I did not believe that this
caused any problems.

It seems that the XSLT-based `standalones` system utilizes this package name
to derive a supplier name, which is supposed to be the filename of the
package without any path.  Since the package name changed from
`suppliers/foo` to `/suppliers/foo`, for example, this was now producing
"suppliers/name" instead of "name".

Of course, it was never a good idea to strip off only the first path
component.  But, this is how it has been since TAME was originally created
well over a decade ago.

I did not catch this since I was diff'ing the output of the xmle files, not
the final JS files.  I had thought that was sufficient, given what I was
changing, but I was wrong.

DEV-14502
2023-06-08 16:46:18 -04:00
Mike Gerwitz d6e9ec7207 tamer: nightly pin: Describe problems with adt_const_param's ConstParamTy
See commit for description of the problem, describing why I'm not yet
upgrading to a currently nightly version.

DEV-14476
2023-06-06 11:00:44 -04:00
Mike Gerwitz 6769f0c280 tamer: Support nightly Rust toolchain pinning
I had never intended to avoid pinning nightly.  This is an unfortunate thing
to have to do---require a _specific_ version of a compiler to build your
software; it's madness.  But the unstable features utilized by TAMER (as
rationalized in `src/lib.rs`) are still worth the effort.

It's not _actually_ that case that we need a specific version of the
compiler, granted; this is outlined in `rust-toolchain.toml`'s
rationale.  You should look there for more information; my approach still
utilizes explicit channels via cargo.  Unfortunately, I had hard-coded it
previously, putting me in a bit of a bind an unable to override the behavior
without modifying the software.

The reason for this change is that `adt_const_params` has a BC break
involving the introduction of `ConstParamTy`.  This is only the second time
I've been bitten by a nightly BC break; the other was the renaming of
`int_log`'s API, as mentioned in
709291b107.  This pinning will in fact
mitigate those future issues---TAMER will be able to resolve the issue at
its leisure, and will further be able to continue to build earlier commits
in the future by simply re-bootstrapping with the committed nightly
version.

If you're curious of my rationale for wanting to inhibit toolchain
downloading during build, or use system libraries, have a look at GNU Guix's
approach to building software safely and reproducibly.  In particular,
dependencies are also built from source (rather than downloading binaries
from external sources), and builds take place in network-isolated
containers.  The `TAMER_RUST_TOOLCHAIN` configure parameter is meant to
facilitate these situations by giving more flexibility to packagers.

DEV-14476
2023-06-05 16:42:31 -04:00
Mike Gerwitz 1706c55645 tamer: nir::air: Feature-flag SYM_TRUE
The code utilizing this is flagged, and so the build would output warnings
saying that it was not used.  This resolves that (I've been aware of it for
far too long; I'm developing behind the `wip-asg-derived-xmli` flag where I
don't usually see it).

DEV-13162
2023-06-05 16:27:56 -04:00
Mike Gerwitz 93cc0d2ce1 tamer: pipeline::macro::lower_pipeline: Doc generation
This generates some documentation helping to describe the lowering pipeline,
since the function type signature can be daunting to those unfamiliar with
it (and I'm sure to the future me too).

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 65c1b2d083 tamer: pipeline: Remove explicit source token type specification
Like the previous commit's removal of the error type, this eliminates the
explicit source token type since we're able to infer it from the pipeline
definition.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 0e0f3e658d tamer: pipeline: Remove explicit error specification in pipeline definition
It does not matter what the error of the source is as long as the caller is
able to deal with it, especially given that the particular error is a
property of the source, which is under control of the caller.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 3800109530 tamer: pipeline: Extract macro into own module
The macro is off-putting and more complicated than the pipeline definitions
themselves (of course), so this tucks it away so that readers are able to
more easily observe the definitions that they're probably looking for
without feeling compelled to try to understand the macro definition.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 6a99ee3cb3 tamer: pipeline::lower_xmli: Use `lower_pipeline!`
All lowering pipelines are now using `lower_pipeline!`.  Finally.

The macro does require some refactoring and documentation, but it's working,
and we now have three pipelines whose definitions are smaller than a single
one was previously.  I've been hoping to do this for many months, so it's
nice to finally see this come to fruition.

I had been putting it off, but doing so has made it difficult to compose
other parts of the system, not knowing what abstractions I'll have at my
disposal.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 109ba5f797 tamer: pipeline::lower_xmli: Generalize sink like other pipelines
This makes the sink similar to other pipelines without creating a new
ParseState, and so will allow for integrating into the `lower_pipeline!`
abstraction.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 9c6b00a124 tamer: pipeline: Initial concept for declarative pipeline definition
This has been the ultimate goal for the pipeline for some time---the ability
to declaratively define the lowering pipeline in a way that is clear,
concise, and is correct by definition.

The reason that the lowering pipeline required so much boilerplate was
because of the robust types involved, which ensures that everything in the
pipeline is compatible with one-another---it's not possible to construct a
pipeline that will not work.

Of course, there is nuance involved in some cases---I didn't want to include
the `until` clause, which makes it fail the "obviously correct" criterion,
but that can be improved over time.

This only abstracts away `load_xmlo` and `parse_package_xml`; next I'll have
to evolve the abstraction to support lifetimes for `lower_xmli`'s
`AsgTreeToXirf`.  That pipeline also ends with a custom sink that really
ought to become its own parser, but I don't want to jump down that rabbit
hole right now, so we may just support custom sinks for now with the intent
of removing it in the future.

This has been a long time coming.  The ultimate goal is that you should be
able to look at the parser pipelines to have a clear, high-level overview of
how everything fits together.  I'm not generating documentation yet, but
that'll help serve as a guide as well.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz f34f2644e9 tamer: pipeline: Allow reporting on entire Result
The report acts as the sink for `load_xmlo` and `parse_package_xml`.  At the
moment, the type is `()`, and so there's nothing to report on but the
error.  But the idea is to add logging via `AirAggregate::Object`, which is
currently just `()`.

This change therefore is only a refactoring---it changes no functionality
but sets up for future changes.

This also introduces consistency with `lower_xmli` in use of `terminal` for
the final operation.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 0f20f22cfb tamer: diagnose::Diagnostic: Remove Error trait bound
Diagnostic events need not be errors.  While that was the original intent,
it'd also be nice to be able to use the diagnostic system for any type of
logging, where the verbosity level would determine the type of report that
is output (whether source information should be provided).

Then we could have e.g. AirAggregate produce events describing what actions
are occurring, which could be much more useful than a trace in many
contexts, and would be able to operate via a runtime toggle/filter without
having an adverse effect on performance (since the diagnostic rendering
itself is the hit; the underlying data are cheap).

Anyway---I'm addressing this now to generalize the reporter in the lowering
pipeline, so that it can report on not just errors but anything.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 2bf3122402 tamer: pipeline::load_xmlo: Hoist context decomposition and format
This formats the pipeline to mirror the style of
`parse_package_xml`.  Based on the previous commits, the end goal (though
not necessarily now) will be to derive a concise abstraction for all the
lowering pipelines, which means first factoring them into a common form.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz b5187de5dc tamer: pipeline::load_xmlo: Accept reporter
This makes the API of `load_xmlo` much closer to `parse_package_xml`, both
accepting a reporter and distinguishing between recoverable and
unrecoverable errors.

The linker still does not use a reporter and still fails on the first
error, as before; I wanted to keep this change small.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 896fb3a0e5 tamer: asg::air::ir::AirPkg::PkgImport: New token
This allows us to drop `AirIdent::IdentRef`, which in turn allows dropping
`AirIdent` entirely from `AirPkgAggregate`.

This is also a more appropriate abstraction; having to track all the ways in
which `IdentRef` was used can be confusing.  This means that `AirIdent` is
true to its name---used only for identifiers.  The new token type makes it
very clear where package imports are recognized, and it's also easier to
search for.

DEV-13162
2023-06-05 13:44:49 -04:00
Mike Gerwitz 1f2315436c tamer: tamec: Extract xmli lowering into pipeline module
This is the same idea as the previous two commits: get all the lowering
pipelines into the same place so that we can observe commonalities and
attempt to derive an appropriate abstraction.

`lower_xmli` could have invoked `tree_reconstruction` itself, since it has
all the information that it needs to do so, but the idea is that these will
accept sources from the caller.  This also demonstrates that sinks need to
be flexible.  In an ideal abstraction, perhaps this would be able to produce
an iterator that accepts the first token type and yields the last, which can
then be directed to a sink, but that's not compatible with how the lowering
operations currently work, which requires a single value to be
returned.  But if it did work that way, then they'd be able to compose just
as any other parser.

Maybe for the future.

DEV-13162
2023-06-05 13:44:48 -04:00
Mike Gerwitz 9e8b809c14 tamer: tamec: Extract package parsing into pipeline module
The previous commit extracted xmlo loading, because that will be a common
operation between `tamec` and `tameld`.  This extracts parsing, which will
only be used by `tamec` for now, though components of the pipeline are similar
to xmlo loading.

Not only does it need to be removed from `tamec` and better abstracted, but
the intent now is to get all of these things into one place so that the
patterns are obviated and a better abstraction can be created to remove all
of this boilerplate and type complexity.

Furthermore, xmlo loading needs to use reporting and recovery, so having
`parse_package_xml` here will help show how to make that happen easily.  I'm
pleased that it ended up being trivial to extract error reporting from the
lowering pipeline as a simple (mutable) callback.  I'm not pleased about
the side-effects, but, this works well for now given how the system works
today.

DEV-13162
2023-06-05 13:44:46 -04:00
Mike Gerwitz 57a805b495 tamer: src::pipeline: Eliminate most error type references
Just cleaning up a bit, removing some unnecessary types, since there are so
many involved.

DEV-13162
2023-05-25 16:58:44 -04:00
Mike Gerwitz ea6259570e tamer: ld::poc: Extract xmlo loading pipeline into new pipeline module
I want to clean this up a bit further.  The motivation is that we need this
for imports in `tamec`.

Eventually this will be cleaned up to the point where it's declarative and
easy to understand---there's a mess of types involved now and, when
something goes wrong, it can be brutally confusing.

DEV-13162
2023-05-25 16:38:41 -04:00
Mike Gerwitz 4ac8bf5981 tamer: asg::air: Isolate scope boundary rules
This extracts and decouples the boundary rules from the stack frames
themselves, which not only clarifies what the rules are (and makes them
match the scope diagrams), but paves the way for future isolation.

DEV-13162
2023-05-24 15:31:10 -04:00
Mike Gerwitz 294caaa35a tamer: asg::graph::object: Remove declare_local
This was used for metavariable declaration before scoping was sorted
out.  That was just resolved, and so this is no longer needed (and is indeed
not desirable, since it side-steps the scope index and so will not be found
except by `lookup_local_linear`).

DEV-13162
2023-05-24 14:56:22 -04:00
Mike Gerwitz 19a5ec1e0f tamer: asg: Reduce Debug output of `Asg` and `AirAggregateCtx`
The ASG had its output reduced previously but I had apparently stashed it; I
found it while trying to clean up after so many failed or partial attempts
and the various scoping changes.

The most fundamental issue is that there's too much information: it's very
difficult to interrogate so I seldom look at it, and it slows down Parser
trace output to the point where it's useless on even one of our smallest
systems, generating 1.5GiB of output for a graph of ~10k
objects (via tameld).

DEV-13162
2023-05-23 16:15:38 -04:00
Mike Gerwitz ac9b7f620e tamer: asg::air: Remove comment about AIR being light
It used to be, but has not been for quite some time.  Further, it _does_
replace (encapsulate, rather) the ASG's API.

DEV-13162
2023-05-23 14:46:09 -04:00
Mike Gerwitz e8335c57d4 tamer: asg::air::ir::AirMeta: Remove `Tpl` prefix from tokens
Cleanup from the previous commit.

DEV-13162
2023-05-23 14:44:16 -04:00
Mike Gerwitz c12bf439ae tamer: asg::air: Index scope of local metavariables
The scope system works with the AIR stack frames, expecting all parent
environments to be on that stack.  Since metavariables were (awkwardly) part
of the template parser, that didn't happen.

This change extracts metavariable parsing (with some remaining TODOs) into
its own parser, so that `AirTplAggregate` will be on the stack; then it's a
simple matter of using the existing `AirAggregateCtx` methods to define a
variable and index its shadow scope, which addresses TODOs in the existing
scope test cases.

This also involved separating the tokens from `AirTpl` into `AirMeta`; they
need to be renamed, which will happen in a following commit, since this is
large enough as it is.

Another change that had to be included here---which I would have done
separately had it not been too much work---was to permit overlapping
identifier shadows.  Local variables have to cast a shadow so that we can
figure out if they would in turn shadow an identifier (which would be an
error), but they don't conflict with one-another if they don't have a
shared (visible) scope.

`AirAggregate` can be simplified even further, e.g. to eliminate the
expression stack and just use the ctx stack (which didn't previously exist),
but I need to continue; I'll return to it.

DEV-13162
2023-05-23 14:38:01 -04:00
Mike Gerwitz da4d7f83ea tamer: asg::air::test::scope: Explicitly assert against indexed identifier span
That was being done automatically before this change, but the change that
I'm about to introduce for metavariables will require this distinction, at
the very least to emphasize the behavior of the indexing.

See the next commit for more information.

(The next commit has a bit too much going on, so I wanted to at least
attempt to separate things where it wasn't much work to do so.)

DEV-13162
2023-05-23 11:59:45 -04:00
Mike Gerwitz 434365543e .gitlab-ci.yml: build: Clean before build
The motivating factor here is some out-of-date or corrupted rustc cache;
however, we really ought to be doing fresh builds for TAME anyway---a clean
build doesn't add enough time to be worth sacrificing those assurances.
2023-05-19 13:43:41 -04:00
Mike Gerwitz e940fc5aa0 tamer: asg: Move index from Asg to AirAggregateCtx
This finally removes the awkward index from the ASG.  This will need much
more documentation and a better organized abstraction, but in the meantime,
previous commit dive into some of the rationale.

In essence: it only really makes sense to have indexing on the ASG itself if
it is used to cache queries or other expensive operations.  But that is not
what we were using it for---it was used for caching _lexical_ properties,
which are useful only during parsing for the sake of forming relationships
on the graph.  Once those relationships have formed, different types of
indexes will be useful in different lowering, optimization, or querying
contexts.

This formalizes that, and in doing so, ensures that the index will always
be accurate relative to the content of the ASG.  Once the index becomes
separated from it---through the `AirAggregateCtx::finish` operation---then
it is discarded and the ASG exposed.

This is also important because the index is incomplete---it contains only
the information necessary for the parser to carry out its task.

This change was a long time coming, and has reduced ASG to its essence.

DEV-13162
2023-05-19 13:38:17 -04:00
Mike Gerwitz 7857460c1d tamer: Re-use prior AirAggregateCtx for subsequent parsers
A new AirAggregate parser is utilized for each package import.  This
prevents us from moving the index from `Asg` onto `AirAggregateCtx` because
the index would be dropped between each import.

This allows re-using that context and solves problems that result from
attempting to do so, as explained in the new
`resume_previous_parsing_context` test case.

But, it's now clear that there's a missing abstraction, and that reasoning
about this problem at the topmost level of the compiler/linker in terms of
internal parsing details like "context" is not appropriate.  What we're
doing is suspending parsing and resuming it later on for another package,
aggregating into the same destination (ASG + index).  An abstraction ought
to be formed in terms of that.

DEV-13162
2023-05-19 13:38:15 -04:00
Mike Gerwitz 92214c7e05 tamer: asg::air: Include package in opaque identifier scope
This was the remaining of my stashed changes that I had mentioned in a
previous commit, but is accomplished differently than I had prototyped.  My
initial approach was a bit too klugey: to accept as an argument in various
scope contexts the active parser, as if it were the top stack frame.  This
was prototyped before the `AirPkgAggregate` parser was even created.

So we've since created a Pkg parser and now an opaque parser for opaque
idents.  There may be other opaque objects in the future.

Because of this change, the parent `AirPkgAggregate` gets stored on the
stack and just naturally becomes part of the lexical scope determination,
and so everything Just Works!

This commit was _supposed_ to be moving the index from `Asg` onto
`AirAggregateCtx`, but I wasn't able to do that because that context is
re-created for each package import currently.

DEV-13162
2023-05-19 09:46:36 -04:00
Mike Gerwitz e266d42c48 tamer: asg::air::AirAggregateCtx: Tuple struct => named struct
As evidenced by this change, the tuple syntax was no longer serving us
well.  But the real reason for this change is to prepare for the addition of
a fourth field: the index, taken from `Asg`.

DEV-13162
2023-05-18 01:32:28 -04:00
Mike Gerwitz 8b3dfe9149 tamer: asg::air: Utilize AirAggregateCtx for index lookups
This change means that `asg::air` is now the only module that directly
invokes index-related methods on `Asg`.  This clears the way, finally, to
removing the index from `Asg` entirely.

Not only does this result in a less awkward architecture, it also ensures
that lookups are forced to go through the system that understands and
controls lexical scoping, which will be able to give the correct answer.

Of course, the caveat is that the "correct" answer depends on what's
currently on the stack, depending on what type of lookup is being performed,
but those details are still encapsulated within the `asg::air` module and
its tests.

DEV-13162
2023-05-18 01:10:23 -04:00
Mike Gerwitz 94bbc2d725 tamer: asg::air: Root AirIdent operations using AirAggregateCtx
This is the culmination of a great deal of work over the past few
weeks.  Indeed, this change has been prototyped a number of different ways
and has lived in a stash of mine, in one form or another, for a few weeks.

This is not done just yet---I have to finish moving the index out of Asg,
and then clean up a little bit more---but this is a significant
simplification of the system.  It was very difficult to reason about prior
approaches, and this finally moves toward doing something that I wasn't sure
if I'd be able to do successfully: formalize scope using AirAggregate's
stack and encapsulate indexing as something that is _supplemental_ to the
graph, rather than an integral component of it.

This _does not yet_ index the AirIdent operation on the package itself
because the active state is not part of the stack; that is one of the
remaining changes I still have stashed.  It will be needed shortly for
package imports.

This rationale will have to appear in docs, which I intend to write soon,
but: this means that `Asg` contains _resolved_ data and itself has no
concept of scope.  The state of the ASG immediately after parsing _can_ be
used to derive what the scope _must_ be (and indeed that's what
`asg::air::test::scope::derive_scopes_from_asg` does), but once we start
performing optimizations, that will no longer be true in all cases.

This means that lexical scope is a property of parsing, which, well, seems
kind of obvious from its name.  But the awkwardness was that, if we consider
scope to be purely a parse-time thing---used only to construct the
relationships on the graph and then be discarded---then how do we query for
information on the graph?  We'd have to walk the graph in search of an
identifier, which is slow.

But when do we need to do such a thing?  For tests, it doesn't matter if
it's a little bit slow, and the graphs aren't all that large.  And for
operations like template expansion and optimizations, if they need access to
a particular index, then we'll be sure to generate or provide the
appropriate one.  If we need a central database of identifiers for tooling
in the future, we'll create one then.  No general-purpose identifier lookup
_is_ actually needed.

And with that, `Asg::lookup_or_missing` is removed.  It has been around
since the beginning of the ASG, when the linker was just a prototype, so
it's the end of TAMER's early era as I was trying to discover exactly what I
wanted the ASG to represent.

DEV-13162
2023-05-17 12:23:36 -04:00
Mike Gerwitz 716e217c9f tamer: asg: Restrict index-related operations to AIR
This is in the same spirit as previous commits modifying (or removing)
tests and benchmarks related to accessing the ASG and its indexes directly.

With this change, only `asg::air` uses the indexing and lookup methods on
`Asg`.  This will allow me to extract the index from `Asg` entirely and have
`Air` solely responsible for lookup; the graph will be responsible only for,
well, being a graph.  Indexing is an optimization strategy.

More information in the commit to follow.  But notice how this moves
environment-related concerns away from `Asg` and into AIR, and how the
remaining environment concerns are index-related.

But there is one remaining barrier: to fully move the indexing away from
`Asg`, we have to use an alternative (and complete)
abstraction---AirAggregateCtx with its ability to resolve and introduce
scope based on the stack.  The `AirIdent` token subset doesn't yet do that,
and all the work up to this point was in preparation for doing that.  Since
introducing indexing at Root a few commits ago, it's now possible to
proceed.

DEV-13162
2023-05-17 11:37:03 -04:00
Mike Gerwitz 92eb991df3 tamer: benches: Remove asg and asg_lower_xmle microbenchmarks
These benchmarks were useful as TAMER was in its infancy and I was trying to
gain an intuition for working with Rust.  But they are now out of date, and
there are better ways to measure TAMER's performance, including running it
on real-world data (which wasn't possible previously) and through profiling
tools like Valgrind.

With that said, these types of benchmarks _would_ be useful for helping to
dig down into improvements that could be made, at a glance.  The problem is,
they aren't testing anything new, and they're also testing something I'm
about to extract from `Asg`.  It is not worth the ongoing maintenance cost.

So benchmarks may be reintroduced in the future if they are found to be
valuable.

DEV-13162
2023-05-17 11:14:00 -04:00
Mike Gerwitz b61e1ce952 tamer: asg::air: Common asg_from_toks for tests
The previous commit introduced a duplicate `asg_from_toks`; this just makes
it available publicly for any tests that might utilize AIR to lower the
barrier to writing such tests and provide some guidance in doing so.

DEV-13162
2023-05-17 10:57:10 -04:00
Mike Gerwitz 79fa10f26b tamer: ld::xmle::lower::test: Use AIR (decouple from Asg and index)
This uses AIR---the ASG's proper public interface now---to construct the
graph for tests, just as all the other modern tests do.  This change
works towards encapsulating index operations (both creation and lookups) so
that the index can be moved off of Asg and into AIR, where it belongs.  More
information on that and rationale to come.

DEV-13162
2023-05-17 10:50:57 -04:00
Mike Gerwitz b238366bee tamer: asg::air::test::scope: Individual test per scope assertion
This separates the previous scope assertions per identifier into individual
tests, so that a single failure will not preempt all others.

DEV-13162
2023-05-17 00:25:31 -04:00
Mike Gerwitz bfbaa6e528 tamer: asg::air::test::scope: More useful panic function/line
This will panic within the test function so that RUST_BACKTRACE is not
necessary.

It also prepares for the changes coming up next.

DEV-13162
2023-05-16 23:49:23 -04:00
Mike Gerwitz ba38a3c1ba tamer: src::asg::air: Pool identifiers into global environment
This, finally, introduces identifier pooling in the global environment,
represented by `Root`.  All package-level identifiers will be scoped as
such, which at the moment means anything that's not within a template.

As mentioned in recent commits, this does require additional cleanup to
finalize, and some more tests will make the additional rationale clearer.

It's also worth noting the intent of storing the `ObjectIndex<Root>`---not
only does it mean that the active root can be derived solely from the
current parsing state, but it also means that in the future we can
contribute to any, potentially multiple, roots.  I had previously used Neo4J
to effectively diff two dependency graphs between versions in the current
XSLT-based TAMER; I'd like to be able to do that with TAMER in the future,
which is an important concept when considering automated data migration, as
well as querying for the effects of changes.

More to come.  I'm hoping this is finally nearing a conclusion and I can
finally tie everything together with package imports.  `AirIdent` will be
introduced into the mix soon now too, now that this commit is able to root
them.

DEV-13162
2023-05-16 23:28:47 -04:00
Mike Gerwitz 1cf5488756 tamer: asg::air::EnvScopeKind::Pool: Remove (Visible is sufficient)
At least, I think so.

See previous commit for more information, and the commit that follows for
actually using it at Root.

DEV-13146
2023-05-16 23:28:45 -04:00
Mike Gerwitz 33f34bf244 tamer: asg: Initial identifier scoping
Okay, this is finally distilling into something fairly simple and
reasonable, but I'm not quite there yet.

In particular, the responsibility is simply split between `Asg` (as the owner of
the index) and `AirAggregateCtx` (as the owner of the stack frames from
which environments and scope are derived).  This was inevitable and I was
waiting for it, but now I have a good idea of how to clean it up and
proceed.

This also doesn't index in root yet (`active_rooting_oi` is still `None` for
`Root`), and I think I may remove `Pool` and just make it `Visible` at that
point, since it won't be going any further anyway.  I don't think the
distinction is meaningful and will just complicate implementations.

The tests also need some more cleanup---the assertions ideally would live in
independent tests, and the assertion failure is in a function call rather
than the test (function) itself, so a Rust backtrace is required to locate
the line number (unless you look at the failure data).

So I suppose this is more of a mental synchronization point than
anything.  Nothing's broken, though.

DEV-13162
2023-05-16 14:58:21 -04:00
Mike Gerwitz 9fb2169a06 tamer: asg::air: Begin to introduce explicit scope testing
There's a lot of documentation on this in the commit itself, but this stems
from

  a) frustration with trying to understand how the system needs to operate
     with all of the objects involved; and
  b) recognizing that if I'm having difficulty, then others reading the
     system later on (including myself) and possibly looking to improve upon
     it are going to have a whole lot of trouble.

Identifier scope is something I've been mulling over for years, and more
formally for the past couple of months.  This finally begins to formalize
that, out of frustration with package imports.  But it will be a weight
lifted off of me as well, with issues of scope always looming.

This demonstrates a declarative means of testing for scope by scanning the
entire graph in tests to determine where an identifier has been
scoped.  Since no such scoping has been implemented yet, the tests
demonstrate how they will look, but otherwise just test for current
behavior.  There is more existing behavior to check, and further there will
be _references_ to check, as they'll also leave a trail of scope indexing
behind as part of the resolution process.

See the documentation introduced by this commit for more information on
that part of this commit.

Introducing the graph scanning, with the ASG's static assurances, required
more lowering of dynamic types into the static types required by the
API.  This was itself a confusing challenge that, while not all that bad in
retrospect, was something that I initially had some trouble with.  The
documentation includes clarifying remarks that hopefully make it all
understandable.
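
As a rough sketch of the declarative style this enables (hypothetical helper
and types; the real tests scan the actual graph), a test states the full set
of environments in which an identifier is expected to be visible, and a scan
over every environment produces the actual set to compare against:

  use std::collections::{BTreeSet, HashMap};

  type Env = &'static str;

  // Where each identifier has been indexed, keyed by (name, environment).
  type ScopeIndex = HashMap<(&'static str, Env), u32>;

  // Scan *every* environment and report those in which `name` is visible,
  // so a test asserts scope exhaustively rather than probing a few spots.
  fn scopes_of(index: &ScopeIndex, envs: &[Env], name: &'static str) -> BTreeSet<Env> {
      envs.iter()
          .copied()
          .filter(|env| index.contains_key(&(name, *env)))
          .collect()
  }

  fn main() {
      let envs = ["Root", "Pkg /pkg-a", "Tpl _tpl_"];
      let mut index = ScopeIndex::new();
      index.insert(("foo", "Root"), 1);
      index.insert(("foo", "Pkg /pkg-a"), 1);

      // Declarative assertion: `foo` is scoped to Root and its package,
      // and nowhere else.
      assert_eq!(
          scopes_of(&index, &envs, "foo"),
          BTreeSet::from(["Root", "Pkg /pkg-a"]),
      );
  }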

DEV-13162
2023-05-12 14:07:29 -04:00
Mike Gerwitz 00ff660008 tamer: asg::air: Begin lexical identifier resolution from bottom up
This begins demonstrating that the root will be utilized for identifier
lookup and indexing, as it was originally for TAME and is currently for the
linker.

This was _not_ the original plan---the plan was to have identifiers indexed
only at the package level, at least until we need a global lookup for
something else---but that plan was upended by how externs are currently
handled.  So, for now, we need a global scope.

(Externs are resolved by the linker in such a way that _any_ package that
happens to be imported transitively may resolve the import.  This is a
global environment, which I had hoped to get rid of, and which will need to
eventually go away (possibly along with externs) to support loading multiple
programs into the graph simultaneously for cross-program analysis.)

This commit renames the base state for `AirAggregate` to emphasize the fact,
especially when observing it in the `AirStack`, and changes
`AirAggregateCtx::lookup_lexical_or_missing` to resolve from the _bottom_ of
the stack upward, rather than reverse, to prove that the system still
operates correctly with this change in place.

The reason for this direction change is to simplify lookup in the most
general case of non-local identifiers, which are almost all of them in
practice---they'll be immediately resolved at the root once they're
indexed.  This can be done because I determined that I will _not_ support
shadowing; rationale for that will come later, but TAME is intended to be a
language suitable for non-programmer audiences as well.  Note that
identifiers will be resolved lexically within templates in TAMER, unlike
TAME, which means that the expansion context will _not_ be considered when
checking for shadowing, so templates will still be able to compose without a
problem so long as they do not shadow in their definition context.  (I'll
have to consider how that affects template-generating templates later on,
but that's an ambiguous construction in TAME today anyway.)

This _does not_ yet index anything at the root where it wasn't already being
indexed explicitly.
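
To make the direction change concrete, here is a small self-contained sketch
(hypothetical types, not the actual AirStack API) of resolving an identifier
by walking scope frames from the bottom of the stack (root) upward; with no
shadowing permitted, non-local identifiers are found immediately:

  use std::collections::HashMap;

  // Each frame on the parser stack carries the identifiers it has indexed.
  struct Frame {
      idents: HashMap<&'static str, u32>, // ident name -> node index
  }

  // Resolve from the bottom (root) upward; with no shadowing permitted,
  // most identifiers are found in the very first frame checked.
  fn lookup_lexical(stack: &[Frame], ident: &str) -> Option<u32> {
      stack.iter().find_map(|frame| frame.idents.get(ident).copied())
  }

  fn main() {
      let root = Frame {
          idents: HashMap::from([("global-rate", 1)]),
      };
      let tpl = Frame {
          idents: HashMap::from([("local-param", 7)]),
      };
      let stack = vec![root, tpl]; // index 0 is the bottom (root)

      assert_eq!(lookup_lexical(&stack, "global-rate"), Some(1));
      assert_eq!(lookup_lexical(&stack, "local-param"), Some(7));
      assert_eq!(lookup_lexical(&stack, "unknown"), None);
  }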

DEV-13162
2023-05-10 14:43:33 -04:00
Mike Gerwitz dd6a6dd196 tamer: asg::air::ir::AirPkg::PkgStart: Require name
This requires the name as part of the package definition, which in turn
removes a state (and all the combinations resulting from it) from
AirAggregate, which results in significant complexity reduction for a very
complex part of the system.

Pushing this complexity outward results in a reduction of overall
complexity, and obviates the question of where NIR will receive a generated
name.

DEV-13162
2023-05-10 13:57:45 -04:00
Mike Gerwitz 7a6aef00b2 tamer: nir::air::NirToAir: Note about intent to refactor
The comment speaks for itself.

My concern is that this will be especially off-putting to people looking at
TAMER and wondering how one could possibly work with this system.

DEV-13162
2023-05-10 13:57:41 -04:00
Mike Gerwitz 4510de38ed tamer: asg::air::pkg: Extract AirPkgAggregate from AirAggregate
This is something I've wanted to do for some time, but the system is
becoming hard enough to reason about (with some attempted future changes)
that I require the consistency afforded by this change.

It's not entirely done---as noted by the TODO for `UnnamedPkg`---but it's
close, and then `AirAggregate` will just be a delegating superstate, like
`ele_parse!`.

Importantly, this also puts a package parser on the stack, which will work
better with the stack-based scoping system being developed.  It will also
make it easier to fall back to a base case that I had really wanted to
avoid, and will have more information on in the future: root indexing for a
shared global environment for package-level identifiers.  (Imports are still
package-scoped, but only in appearance, by contributing to the global
environment of the compilation unit during import.  Well, it doesn't do that
yet.  The XSLT compiler works in that way.)

DEV-13162
2023-05-10 13:57:39 -04:00
Mike Gerwitz ebdae9ac38 tamer: ld::xmle::lower: Sort only rooted Idents
This is one of many changes that have been lingering that I need to start to
break apart in an attempt to commit the confusing and disappointing
conclusion to this package loading madness.

More information to come.

DEV-13162
2023-05-09 15:20:39 -04:00
Mike Gerwitz 5f275fb801 tamer: asg::air::air: Eliminate todo! for unexpected AirIdent
I had apparently forgotten about this, because I didn't benefit from the
exhaustiveness check; this needs to be eliminated so that this doesn't
happen again, and to provide a proper non-panicking error.

DEV-13162
2023-05-09 12:35:06 -04:00
Mike Gerwitz 4ec4857360 Revert "tamer: asg::air::ir::AirBind::RefIdent: New optional canonical name"
This reverts commit da7fe96254e425bc7b75f8cf454465b71e27e372.

I'm a fool---this would be pursuant to a future plan that removes AirIdent
opaque tokens.  But for now, I need it on IdentDecl and others, which
currently has a `Source` (that I want to go away, as just mentioned), which
contains the same information.

So maybe more to come on this...

DEV-13162
2023-05-09 12:35:06 -04:00
Mike Gerwitz 572337505c tamer: asg::air::ir::AirBind::RefIdent: New optional canonical name
This allows for a canonical package name to be optionally provided to
explicitly resolve a reference against, avoiding a lexical lookup.

This change doesn't actually utilize this new value yet; it just
retains BC.  The new argument will be used for the linker, since it already
knows the package that defined an identifier while reading the object file's
symbol table.  It will also be used by tamec for the same purposes while
processing package imports.

DEV-13162

-- squashed with --

tamer: asg::air::ir::RefIdent: CanonicalName=SPair

The use of CanonicalName created an asymmetry between RefIdent and
BindIdent.  The hope was to move CanonicalName instantiation outside of AIR
and into NIR, but doing so would be confusing and awkward without doing
something with BindIdent.

I don't have the time to deal with that for now, so let's observe how the
system continues to evolve and see whether hoisting it out makes sense in the
end.  For now, this works just fine and I need to move on with the actual
goal of finishing package imports so that I can expand templates.

DEV-13162
2023-05-09 12:35:06 -04:00
Mike Gerwitz 00492ace01 tamer: obj::xmlo: Bind packages to canonical name
NOTE: This fixes the aforementioned commit that caused the linker to
temporarily fail (670c5d3a5d at time of
writing).  This does introduce an extra forward slash into
`l:dep/preproc:sym/@src`, but that does not appear to cause any
problems.  That will eventually go away, so I'm not going to bother with it
any further.

As the `xmlo` file is lowered into AIR, the name will be prefixed with a
leading slash (if necessary, which it is atm) and will emit an
`Air::BindIdent`.

This means that packages will be properly indexed by their canonical name on
load, which will be important when we share this with tamec.

DEV-13162
2023-05-09 12:34:12 -04:00
Mike Gerwitz 13bac8382f tamer: obj::xmlo::{air,reader}::test: Format test cases
Simple reformatting that's consistent with other more recent tests, before I
go making changes.

DEV-13162
2023-05-05 10:26:58 -04:00
Mike Gerwitz 48bcb0cdab tamer: asg: Integrate package CanonicalName
This change requires every package to have a canonical name, and performs
namespec canonicalization on imports.

Since all package names are canonicalized, this opens the door to being able
to index package names at import, allowing the object to be shared on the
graph and properly reference a package after it has been resolved.

Note that the system tests' canonicalization is relative to the hard-coded
`/TODO` presently; that will change in the near future once `tamec`
generates names from the provided path.

DEV-13162
2023-05-05 10:26:58 -04:00
Mike Gerwitz a9d0f43684 tamer: src::asg::graph::object::pkg::name: New module
This introduces, but does not yet integrate, `CanonicalName`, which not only
represents canonicalized package names, but handles namespec resolution.

The term "namespec" is motivated by Git's use of *spec (e.g. refspec)
referring to various ways of specifying a particular object.  Names look
like paths, and are derived from them, but they _are not paths_.  Their
resolution is a purely lexical operation, and they include a number of
restrictions to simplify their clarity and handling.  I expect them to
evolve more in the future, and I've had ideas to do so for quite some time.

In particular, resolving packages in this way and then loading them from the
filesystem relative to the project root will ensure that
traversing (conceptually) to a parent directory will not operate
unintuitively with symlinks.  The path will always resolve unambiguously.

(With that said, if the symlink is to a shared directory with different
directory structures, that doesn't solve the compilation problem---we'll
have to move object files into a project-specific build directory to handle
that.)
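
As a rough illustration of what purely lexical resolution means here (the
helper below is hypothetical and simplified; the real rules live in the new
`name` module and are more restrictive), a namespec can be joined against
the importing package's canonical name and normalized without ever
consulting the filesystem:

  // Lexically resolve a namespec against a canonical package name.
  // Canonical names are rooted at '/', look like paths, but are not paths:
  // no filesystem access or symlink resolution happens here.
  fn resolve_namespec(importer: &str, spec: &str) -> String {
      // Start from the importing package's directory for relative specs.
      let mut parts: Vec<&str> = if spec.starts_with('/') {
          Vec::new()
      } else {
          importer.split('/').filter(|s| !s.is_empty()).collect()
      };

      if !spec.starts_with('/') {
          parts.pop(); // drop the importer's own package name
      }

      for seg in spec.split('/').filter(|s| !s.is_empty()) {
          match seg {
              "." => {}
              ".." => { parts.pop(); }
              other => parts.push(other),
          }
      }

      format!("/{}", parts.join("/"))
  }

  fn main() {
      assert_eq!(resolve_namespec("/suppliers/common", "rates"), "/suppliers/rates");
      assert_eq!(resolve_namespec("/suppliers/common", "../core/util"), "/core/util");
      assert_eq!(resolve_namespec("/suppliers/common", "/abs/pkg"), "/abs/pkg");
  }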

Span Slicing
------------
Okay, it's worth commenting on the horridity of the path name slicing that
goes on here.  Care has been taken to ensure that spans will be able to be
properly sliced in all relevant contexts, and there are plenty of words
devoted to that in the documentation committed here.

But there is a more fundamental problem here that I regret not having solved
earlier, because I don't have the time for it right now: while we do have
SPair, it makes no guarantees that the span associated with the corresponding
SymbolId is actually the span that matches the original source lexeme.  In
fact, it's often not.

This is a problem when we want to slice up a symbol in an SPair and produce
a sensible span.  If it _is_ a source lexeme with its original span, that's
no problem.  But if it's _not_, then the two are not in sync, and slicing up
the span won't produce something that actually makes sense to the user.  Or,
worse (or maybe it's not worse?), it may cause a panic if the slicing is out
of bounds.

The solution in the future might be to store explicitly the state of an
SPair, or call it Lexeme, or something, so that we know the conditions under
which slicing is safe.  If I ever have time for that in this project.

But the result of the lack of a proper abstraction really shows here: this
is some of the most confusing code in TAMER, and it's really not doing
anything all that complicated.  It is disproportionately confusing.

DEV-13162
2023-05-05 10:26:56 -04:00
Mike Gerwitz e3a68aaf9e tamer: span: Introduce rslice, slice_{head,tail}
These are used by an upcoming commit, where I'll have much more to say about
the topic of span slicing.

DEV-13162
2023-05-05 10:25:35 -04:00
Mike Gerwitz 670c5d3a5d tamer: asg::graph: Require name for non-imports
NOTE: This temporarily breaks `tameld`.  It is fixed in a future commit when
names are bound.  This was an oversight when breaking apart changes into
separate commits, because the linker does not yet have system tests like
tamec does.

This is preparing for a full transition to requiring a canonical package
name.  The previous `Unnamed` variant has been removed and `AirAggregate`
will provide a default `WS_EMPTY` name, as `Pkg` had done before.

The intent of this change is to allow for consulting the index before a
new `Pkg` object is created on the graph, but we're not quite ready for that
yet.

Well, that's not entirely true---the linker can be ready for that.  But the
compiler needs to canonicalize import paths relative to the active package
canonical name, which it can't even do yet because tamec isn't generating a
name.

So maybe the linker will be first; it's useful to have that in a separate
commit anyway to emphasize the change.

DEV-13162
2023-05-05 10:24:47 -04:00
Mike Gerwitz 799f2c6d96 tamer: tameld: Produce first error
...this has apparently been consuming errors for some time.  This would
cause the parser to enter an invalid state in some cases and terminate.

This would _not_ permit an invalid link, as the graph would not be correct,
but it was masking the actual error.

This part of linker is in dire need of tests.  This also ought to be
replaced with tamec's approach of reporting all errors.

DEV-13162
2023-05-04 16:04:52 -04:00
Mike Gerwitz 7cfe6a6f8d tamer: asg::graph: Index Root->Pkg with canonical names
The previous commit introduced canonical names, and this uses them to index.

The next step will be to utilize those names to look up packages on
definition rather than creating a new package node, so that references to
yet-to-be-defined (or yet-to-be-imported) packages can be resolved on the
graph.

DEV-13162
2023-05-02 16:15:07 -04:00
Mike Gerwitz 92c9c9ba2f tamer: asg: Introduce package canonical name concept
This is already a concept in the XSLT-based compiler, where each package has
a `package/@name` generated from its path.  The same will happen with tamec.

Before we can load packages into the graph, we need canonical identifiers so
that they can be indexed.  The next commit will handle indexing using this
information.

DEV-13162
2023-05-02 16:08:39 -04:00
Mike Gerwitz 56ab671363 tamer: nir::interp: Ignore Text tokens
The documentation explains the intent here---existing LaTeX documentation.

The intent was to simply copy the documentation into a LaTeX document based
on the lvspec package that I had created long ago.  Of course, that's not
appropriate---we're a DSL and should provide first-class support for
documentation that will compile properly into the target format, whether it
be LaTeX, HTML, JS, or anything else.

DEV-13162
2023-04-30 15:06:58 -04:00
Mike Gerwitz 068804b397 tamer: Remove {ret}map:___{head,tail}
These have been a pain in the ass since TAMER began.

It seemed like a good idea at the time to have static code generated in this
way, but the lack of explicit dependencies just makes this a mess and works
against the operating theory of the system.

Furthermore, the _same_ static fragments were generated for each and every
map package.

There is still a post-link step (standalones) handled in XSLT; the
previously-static code has been moved there.  This will eventually be
integrated into tameld itself, once TAMER has facilities for JS generation.

(This was discovered while trying to parent identifiers to packages.)

DEV-13162
2023-04-30 15:06:47 -04:00
Mike Gerwitz 77ada079e1 tamer: asg::graph::Asg.graph: Finally encapsulate
With the previous commit using a visitor implemented within the `asg`
module, we can now finally encapsulate the graph.  This is a wonderfully
liberating, long-awaited change, since I have been fighting with the lack of
encapsulation for some time; it has made certain changes challenging and has
made the system more difficult to reason about.  It also made it impossible
to assert that invariants were _actually_ properly enforced, if things could
just peer into and modify the graph directly, out from underneath the API
that provides those assurances.

This also removes our dependency on Petgraph outside of the `asg`
module.  There are no plans to migrate away from it currently; we'll see how
the graph continues to evolve over time and what redundancies are introduced
with our data structures.  It may render petgraph unnecessary.

Interestingly, because my DFS implementation is so similar to Petgraph's,
the emitted ordering is _identical_ between this commit and the previous.

DEV-13162
2023-04-28 15:36:07 -04:00
Mike Gerwitz 78c1a9136e tamer: ld::xmle::lower: Use asg::graph::visit::topo::topo_sort
This integrates the new topological sort, replacing the previous
implementation in the linker.

This will now allow encapsulating the graph, finally, and ensures that
future changes can be fully maintained within the `asg` module.

More cleanup will come over time.

DEV-13162
2023-04-28 15:26:47 -04:00
Mike Gerwitz 9b53a5e176 tamer: asg::graph::visit::topo: Cut cycles
This commit includes plenty of documentation, so you should look there.

It's desirable to describe the sorting that TAME performs as a topological
sort, since that's the end result we want.  This uses the ontology to
determine what to do to the graph when a cycle is encountered.  So
technically we're sorting a graph with cycles, but you can equivalently view
this as first transforming the graph to cut all cycles and then sorting it.

For the sake of trivia, the term "cut" is used for two reasons: (1) it's an
intuitive visualization, and (2) the term "cut" has precedent in logic
programming (e.g. Prolog), where it (`!`) is used to prevent
backtracking.  We're also preventing backtracking, via a back edge, which
would produce a cycle.
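
For a sense of the overall shape, here is a minimal adjacency-list sketch of
a DFS-based topological sort that treats back edges as cuts rather than
errors; it is illustrative only, not the ontology-aware implementation in
`asg::graph::visit::topo`:

  #[derive(Clone, Copy, PartialEq)]
  enum Mark { White, Gray, Black }

  // Recursive DFS; back edges (which would form cycles) are "cut": recorded
  // and skipped rather than aborting the sort.
  fn visit(
      n: usize,
      adj: &[Vec<usize>],
      marks: &mut [Mark],
      order: &mut Vec<usize>,
      cuts: &mut Vec<(usize, usize)>,
  ) {
      marks[n] = Mark::Gray;
      for &m in &adj[n] {
          match marks[m] {
              Mark::White => visit(m, adj, marks, order, cuts),
              Mark::Gray => cuts.push((n, m)), // back edge: cut the cycle
              Mark::Black => {}
          }
      }
      marks[n] = Mark::Black;
      order.push(n); // post-order; reverse for dependency order if needed
  }

  fn topo_sort(adj: &[Vec<usize>]) -> (Vec<usize>, Vec<(usize, usize)>) {
      let mut marks = vec![Mark::White; adj.len()];
      let (mut order, mut cuts) = (Vec::new(), Vec::new());
      for n in 0..adj.len() {
          if marks[n] == Mark::White {
              visit(n, adj, &mut marks, &mut order, &mut cuts);
          }
      }
      (order, cuts)
  }

  fn main() {
      // 0 -> 1 -> 2 -> 0 forms a cycle; 2 -> 3 does not.
      let adj = vec![vec![1], vec![2], vec![0, 3], vec![]];
      let (order, cuts) = topo_sort(&adj);
      assert_eq!(cuts, vec![(2, 0)]); // the back edge that was cut
      assert_eq!(order, vec![3, 2, 1, 0]);
  }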

DEV-13162
2023-04-28 14:33:48 -04:00
Mike Gerwitz c2c1434afe tamer: asg::graph::visit::topo: Cycle detection
This introduces cycle detection, but it does not yet filter ontologically
permitted cycles, which will be needed prior to utilizing this in `tameld`.

There's a considerable amount of documentation here.  While the
implementation is fairly simple, there are important algorithmic decisions,
both in the DFS construction and the derivation of the cycle path from data
that already exists.

This also supports recovery (by ignoring cycles), which can then be utilized
to find more cycles and other errors in the system.

DEV-13162
2023-04-27 16:28:57 -04:00
Mike Gerwitz e3094e0bad tamer: asg::graph::visit::topo: Introduce topological sort
This is an initial implementation that does not yet produce errors on
cycles.  Documentation is not yet complete.

The implementation is fairly basic, and similar to Petgraph's DFS.

A terminology note: the DFS will be ontology-aware (or at least aware of
edge metadata) to avoid traversing edges that would introduce cycles in
situations where they are permitted, which effectively performs a
topological sort on an implicitly _filtered_ graph.

This will end up replacing ld::xmle::lower::sort.

DEV-13162
2023-04-26 09:51:45 -04:00
Mike Gerwitz be05fbb833 tamer: asg::graph::visit{=>::ontree}: Move into submodule
This reorganization makes way for more traversals.

DEV-13162
2023-04-24 13:51:04 -04:00
Mike Gerwitz 42aa5bd407 tamer: asg::graph: Root->Ident {tree=>cross} edge
tameld isn't yet adding edges to Idents from their associated Pkg (see
previous commit), but this formalizes how the ontology will interpret such a
relationship.  The idea is that Idents are always owned by Pkgs, but they
may be optionally explicitly rooted, which will be used by a particular type
of DFS walk that is about to be written, which can ignore Root->Pkg and
focus instead on cross edges to Idents.

Though it's not lost on me that now that I'll be introducing a DFS for the
linker, the terms "cross" and "tree" edge now become ambiguous; I used to
call them "ontological X edge", but I had fallen out of that habit; perhaps
I need to reintroduce that rigor.

DEV-13162
2023-04-24 09:44:02 -04:00
Mike Gerwitz 48d9bca3b7 tamer: obj::xmlo: Add Pkg nodes for identifiers
This modifies the xmlo reader, xmlo->AIR lowering, and AIR->ASG to introduce
a package for identifiers.  It does not yet, however, add edges from the
package to the identifier.

Once edges are added, the DFS will change in undesirable ways, which will
require a new implementation.  This is desirable to decouple from Petgraph
anyway, and then will be able to restore the prior single-pass sort+cycle
check.

That will also encapsulate visiting behavior within the `asg::graph` module
and, in turn, allow encapsulating `Asg.graph` finally.

DEV-13162
2023-04-21 16:24:11 -04:00
Mike Gerwitz 34dad122fd tamer: asg::air::test: Remove `Air` and `Parsed` enum prefixes from variants
Same spirit as previous commits.  This is committed separately from the
changes that follow to eliminate its distraction.

DEV-13162
2023-04-21 15:37:27 -04:00
Mike Gerwitz 1f371b6ba6 tamer: obj::xmlo::air: More explicit dead states
This doesn't go far enough, but it elaborates a bit---the existing was far
too much of a catch-all.  It's important to take advantage of exhaustiveness
checks to ensure each transition is properly accounted for.

This parser is going to get more work over time, including right now, so I'm
not going to go too deep into this yet, but it'd be useful (as a reader) to
compare it to e.g. asg::air's parsers' explicit enumeration of states and
favoring of explicit errors over dead state transitions.

DEV-13162
2023-04-21 11:48:56 -04:00
Mike Gerwitz dd47fc564d tamer: obj::xmlo: More concise identifiers
This follows conventions of other, more recently written, systems.

DEV-13162
2023-04-21 11:48:55 -04:00
Mike Gerwitz e13817c203 tamer: obj::xmlo::air::test: Extract into own file
2023-04-20 16:46:32 -04:00
Mike Gerwitz 6f68292df5 tamer: asg::graph::{index_identifier=>index}: Generalize
This may now index _any_ type of object, in preparation for indexing package
import paths.  In practice, this only makes sense (at least currently) for
`Pkg` and `Ident`.

This generalization also applies to `Asg::lookup_or_missing`.

DEV-13162
2023-04-20 16:46:30 -04:00
Mike Gerwitz f183600c3a tamer: asg: Move Ident-specific methods off of Asg
Historically, the ASG was better described as a "dependency graph",
containing only identifiers (which are simply called "symbols" in the
XSLT-based compiler).  Consequently, it was appropriate for the graph to
have operations specific to identifiers.  (Indeed, that's the only type of
object the graph supported.)

Much has changed since then.  This cleans things up, and makes parenting
identifiers to root an _explicit_ operation.  This will make it easier to
move forward with handling of scope, and importing identifiers into
packages, and removing `Source`, and so on.

DEV-13162
2023-04-19 12:40:35 -04:00
Mike Gerwitz 46551ee298 tamer: ld::xmle::lower::test: Extract into own file
DEV-13162
2023-04-19 12:40:35 -04:00
Mike Gerwitz 778e90c81d tamer: asg::air: Index package identifiers on `Pkg` rather than `Root`
I've been torturing myself trying to figure out how I want to generalize
indexing, lookups, and value numbering in a way that is appropriate for this
project (that is, not over-engineered relative to my needs).

Before I can do much of anything, though, I need to stop having indexing
only as a `Root` thing (previously it wasn't even tied to `Root`).  This
makes that change for tamec, but temporarily removes scoping concerns until
I can add more specific types of indexing.

Not only does this allow cleaning up some `Ident`-specific stuff from `Asg`,
but the cleanup also helps to show that portions of the system aren't still
using Root-based globals.

The linker (`tameld`) still uses the old `global` methods for now; those
will eventually go away, but this needs to change to unify both tamec and
tameld once we get to imports as part of the compiler.

DEV-13162
2023-04-19 12:40:34 -04:00
Mike Gerwitz c367666d8e tamer: nir::tplshort: Generate @desc for generated template
This is required by the XSLT-based compiler, since the `xmli` we're
producing acts as a new source file.

DEV-13708
2023-04-13 09:30:27 -04:00
Mike Gerwitz 590c4b7b06 tamer: NIR->xmli: Support template/@desc
This is needed to then support `@desc` for shorthand desugaring; it's
required by the XSLT-based compiler (and will eventually be required by
TAMER too).

DEV-13708
2023-04-12 15:53:16 -04:00
Mike Gerwitz 5dd77e7b41 tame: rater.xsd: templateName: Permit multiple leading/trailing underscores
This is needed by TAMER's template desugaring.  The XSD is superseded by
`nir::parse`, but can't go away until TAMER fully supplants the XSLT-based
compiler.

...and after all this time, I still never got rid of the duplicate XSD.  Or
even recall which one is the duplicate.

DEV-13708
2023-04-12 14:54:00 -04:00
Mike Gerwitz 2325eb1b2f tame: preproc/template.xsl: param-copy: Utilize TAMER application convention
TAMER desugars shorthand template application bodies (`@values@`) into _the
name of a closed template_ whose body should be expanded into place.  This
change recognizes that convention, and makes use of it.

Desugaring is part of `nir::tplshort`.

DEV-13708
2023-04-12 14:52:06 -04:00
Mike Gerwitz b7aae207c2 tamer: Rust v1.{68=>70}: Stabilized nonzero_min_max and is_some_and
These two features have been stabilized in Rust 1.70.
2023-04-12 12:04:13 -04:00
Mike Gerwitz af43e35567 tamer: nir::air: Reject Todo* tokens
XIRF->Nir produces `Todo` and `TodoAttr` tokens for many different
things.  The previous approach was to ignore those things so that I could
begin adding portions of packages to the graph and observe how that goes.

But now that I'm starting to be able to compile certain packages that
utilize only small subsets of TAME features, I need to have confidence that
I'm fully parsing them.  This means rejecting tokens that I haven't yet
gotten to.

DEV-13708
2023-04-12 12:04:13 -04:00
Mike Gerwitz e88800af42 tamer: asg: Basic `Doc::Text` support
This supports arbitrary documentation as sibling text (mixed content, in XML
terms).  The motivation behind this change is to permit existing system
tests to succeed when `Todo | TodoAttr` are both rejected, rather than
having to ignore this.

TAME has always had a philosophy of literate documentation, however it was
never fully realized.  This just maintains the status quo; the text is
unstructured, and maybe will be parsed in the future.

Unfortunately, this does _not_ include the output in the `xmli` file or the
system tests.  The reason has nothing to do with TAMER---`xmllint` does not
format the output when there is mixed content, it seems, and I need to move
on for now; I'll consider my options in the future.  But, it's available on
the graph and ready to go.

DEV-13708
2023-04-12 12:04:12 -04:00
Mike Gerwitz 647e0ccbbd tamer: Re-introduce literal parsing for xmlns in NIR for PackageStmt
This _only_ re-introduces for PackageStmt since that's all I have tests for
at present.  More will be re-added later.

They were previously removed when the attribute parsing was upended in
`ele_parse!`.

This does lose the attribute name, compared to before; that'll ideally be
re-added, and I'll explore options for doing so later, since I also want
them in other contexts.  But it needs to be done generically (not
XML-related).

This had to be done before blowing up on TODOs, or system tests would fail.

DEV-13708
2023-04-12 11:59:49 -04:00
Mike Gerwitz acafe91ab9 tamer: nir::Nir::Todo: Add Span
This is in preparation for throwing errors (with diagnostic information) on
yet-to-be-supported tokens, so that I can confidently compile individual
packages without worrying that something is just being ignored.

This makes obvious that `ele_parse!` had a different design in mind
previously, and it's now resulting in a lot of boilerplate; I'll address
that in the future once I'm certain requirements have been settled on, since
I've spent far too much time on it to waste more.

DEV-13708
2023-04-12 11:59:49 -04:00
Mike Gerwitz 9cb6195046 tamer: asg: Add basic Doc support (for @desc)
This introduces a new `Doc` object that can be owned by `Expr` (only atm)
and contain what it describes as a concise independent clause.  This
construction is not enforced, and is only really obvious today via the
Summary Pages.

There's a lot of latent and unrealized potential in TAME's documentation
philosophy that was never realized, so this will certainly evolve over
time.  But for now, the primary purpose was to get `@desc` working on things
like classifications so that `xmli` output can compile for certain
packages.

DEV-13708
2023-04-12 11:59:48 -04:00
Mike Gerwitz 0163391498 tamer: asg::graph::object::prelude: New module to reduce imports
These are used by virtually every `ObjectKind`; I've been meaning to do this
for a while, but now that I'm about to introduce a new one (`Doc`), let's
just get it out of the way.

DEV-13708
2023-04-07 09:56:50 -04:00
Mike Gerwitz f4653790da tamer: NIR->xmli: Represent package imports
This doesn't do the actual hard work yet of resolving and loading a package,
but it does place it on the graph and re-derive it into the xmli output.

DEV-13708
2023-04-07 09:44:16 -04:00
Mike Gerwitz 82e228009d tamer: NIR->xmli: Basic match support
This introduces `<match on="foo" />` and `<match on="foo" value="bar" />`,
which are both equality predicates.  Other types of predicates are not yet
supported.

This change is a bit messy and leaves a bit to be desired.  `NirToAir` is
quite messy and needs some cleanup.  There's also the issue of introducing
XML-specific errors in NIR so that users know what things like "subject"
mean, but not being able to do so yet because NIR is agnostic to the source
document type; another layer of abstraction is needed.

But, my priority is first to get derivation of a particularly
expensive (generated) package in our internal systems working first.

DEV-13708
2023-04-06 22:40:18 -04:00
Mike Gerwitz 1f2ead7f9b tamer: nir: Introduce disambiguating RefSubject
The alternative I was floating was a tagged `Ref` (that is, with an enum
within it), but I settled on this for now, in part for a more concise
notation with the mapping in nir::parse.

We'll see how this evolves.  For now, it's not important; the only thing
that uses ref in nir::parse is template application.

This was introduced for `match`, which is to come shortly.

DEV-13708
2023-04-06 10:28:27 -04:00
Mike Gerwitz 7b2acb65c5 tamer: nir::air::test: Formatting and enum prefix elision
This just makes things easier to read and more concise.  I'm about to add a number
of tests and the verbosity was off-putting.

DEV-13708
2023-04-06 09:33:44 -04:00
Mike Gerwitz e8371c452e tamer: Remove wip-nir-to-air feature flag in favor of existing wip-asg-derived-xmli
The latter has always enabled the former, and there's really no reason I'd
enable one but not the other at this point.  It's just confusing.

DEV-13708
2023-04-05 22:28:30 -04:00
Mike Gerwitz c0e5b1d750 tamer: asg::air: Template application within expressions
This recognizes template application within expressions.  Since expressions
can occur within templates, this can occur arbitrarily deeply.

And with that, we have the core of the template system represented on the
graph.  Of course, there are some glaring scoping issues to be resolved, but
those aren't unique to template application.

DEV-13708
2023-04-05 15:49:25 -04:00
Mike Gerwitz daa8c6967b tamer: asg: Initial nested template supported
I had hoped this would be considerably easier to implement, but there are
some confounding factors.

First of all: this accomplishes the initial task of getting nested template
applications and definitions re-output in the `xmli` file.  But to do so
successfully, some assumptions had to be made.

The primary issue is that of scope.  The old (XSLT-based) TAME relied on the
output JS to handle lexical scope for it at runtime in most situations.  In
the case of the template system, when scoping/shadowing were needed, complex
and buggy XPaths were used to make a best effort.  The equivalent here would
be a graph traversal, which is not ideal.

I had begun going down the rabbit hole of formalizing lexical scope for
TAMER with environments, but I want to get this committed and working first;
I've been holding onto this and breaking off changes for some time now.

DEV-13708
2023-04-05 15:46:44 -04:00
Mike Gerwitz a738a05461 tamer: asg::graph::object::rel: Hash impls for ObjectIndexTo{,Tree}
All ObjectIndex-like objects hash using only the underlying identifier,
which ultimately boils down to a `NodeIndex` (petgraph), which is just a
u32.  And so in that sense, the only purpose we have for hashing it is to
(a) reduce the space required to store mappings, and (b) compose with other
`Hash`es.
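
A minimal sketch of the idea (hypothetical types; the real impls are over
the ObjectIndex-like wrappers named above): hashing delegates to the
underlying node index alone, so wrappers compose cheaply with other
`Hash`es:

  use std::collections::HashSet;
  use std::hash::{Hash, Hasher};

  // Stand-in for petgraph's NodeIndex: ultimately just a u32.
  #[derive(Clone, Copy, PartialEq, Eq, Hash)]
  struct NodeIndex(u32);

  // An ObjectIndex-like wrapper that may carry extra data (e.g. a span),
  // but whose identity, and therefore its hash, is the node index alone.
  #[derive(Clone, Copy)]
  struct ObjectIndexTo {
      index: NodeIndex,
      span: (u32, u32), // not part of identity
  }

  impl PartialEq for ObjectIndexTo {
      fn eq(&self, other: &Self) -> bool {
          self.index == other.index
      }
  }
  impl Eq for ObjectIndexTo {}

  impl Hash for ObjectIndexTo {
      fn hash<H: Hasher>(&self, state: &mut H) {
          self.index.hash(state); // hash only the underlying identifier
      }
  }

  fn main() {
      let a = ObjectIndexTo { index: NodeIndex(3), span: (0, 5) };
      let b = ObjectIndexTo { index: NodeIndex(3), span: (9, 12) };

      let mut set = HashSet::new();
      set.insert(a);
      assert!(set.contains(&b)); // same node index => same key
  }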

DEV-13708
2023-04-05 15:46:42 -04:00
Mike Gerwitz 3660c15d5a tamer: asg::graph::rel::ObjectIndexTreeRelTo: New trait and related
This creates another trait and struct `ObjectIndexToTree` that assert a
stronger invariant than `ObjectIndexRelTo`---that not only does it uphold
the invariants of `ObjectIndexRelTo`, but also that it represents a _tree_
edge, which indicates _ownership_ rather than just a reference.

This will be used to statically infer what can serve as a scope boundary for
upcoming changes.  Specifically, anything that can own an `Ident` introduces
a new level of scope.

DEV-13708
2023-04-04 14:33:34 -04:00
Mike Gerwitz f1495f8cf4 tamer: asg::graph::object: Move `lookup_local_linear` to `ObjectIndexRelTo`
This allows this method to be used on anything that is able to relate to an
identifier, which is needed for the changes being made for the template
system.

This linear lookup is actually going away (as hinted at by preceding
commits); this is extracted as part of a larger change and I wanted to get
it committed to make it easier to follow upcoming changes.

DEV-13708
2023-04-03 16:14:31 -04:00
Mike Gerwitz 02dba0d63a tamer: asg::graph::Asg: Index by (SymbolId, NodeIndex) pair
The prior commit begins to explain the end goal of being able to index
identifiers outside of the global environment.

This change continues to index things as before, but introduces a new key
based on the pair of the symbol id together with a node that is _part of_
its target environment.  The only environment utilized at the moment (in this
commit) is that of the root node (which is the global scope), in both
indexing and lookup.  Future commits will extend this, and contain more
information about and rationale for the implementation.
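
A small sketch of this keying scheme (hypothetical types; TAMER uses
FxHashMap and its own SymbolId and NodeIndex): the index maps a symbol
paired with a node from its target environment to the indexed object, with
the root node standing in for the global environment:

  use std::collections::HashMap;

  type SymbolId = u32;
  type NodeIndex = u32;

  const ROOT: NodeIndex = 0; // global environment

  // Index keyed by (symbol, environment node) rather than symbol alone,
  // so the same name may be indexed in different environments.
  #[derive(Default)]
  struct Index(HashMap<(SymbolId, NodeIndex), NodeIndex>);

  impl Index {
      fn index(&mut self, sym: SymbolId, env: NodeIndex, obj: NodeIndex) {
          self.0.insert((sym, env), obj);
      }

      fn lookup(&self, sym: SymbolId, env: NodeIndex) -> Option<NodeIndex> {
          self.0.get(&(sym, env)).copied()
      }
  }

  fn main() {
      let (foo, pkg_a, obj) = (42, 7, 9);
      let mut ix = Index::default();

      ix.index(foo, ROOT, obj); // index `foo` in the global environment
      assert_eq!(ix.lookup(foo, ROOT), Some(obj));
      assert_eq!(ix.lookup(foo, pkg_a), None); // not indexed in that package
  }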

The new general index methods are restricted to `pub(super)` until an
abstraction can be put in place that is responsible for environment
indexing; that's a responsibility that is currently handled by
`AirAggregateCtx` for tamec, and the linker has no scoping
requirements since all of that has already been dealt with.

DEV-13708
2023-04-03 16:14:30 -04:00
Mike Gerwitz 5b0a4561a2 Revert "Revert "tamer: asg::graph::index: Use FxHashMap in place of Vec""
This reverts commit 1b7eac337cd5909c01ede3a5b3fba577898d5961.

This is a revert of the previous revert, just so that I (and you) have
references to prior rationale.

This was previously reverted because it wasn't worth doing, but now we have
a situation where we need to begin implementing lexical scoping rules for
nested containers (packages and templates).  In particular, as you'll see in
the commits that follow, we need to be able to look up an identifier that
may have been created as Missing at one level of scope (certain types of
blocks), but then define it at another level.

Or, even more simply at this point, since I'm not yet doing anything
sophisticated with scope: we're only indexing in the global environment, and
we need to be able to index elsewhere too.

The next commit will go into more information, but suffice it to say for now
that indexing is going to get more complicated than a SymbolId.

Sticking with FxHash for now; we don't need a stable hash now.

DEV-13708
2023-04-03 15:15:54 -04:00
Mike Gerwitz 6d35e8776c tamer: asg::air: InvalidExpansionContext in place of TODO
There are no such invalid expansion contexts yet, but this gets rid of the
final remaining TODO from introducing the stack.  With the existing feature
set, at least.

DEV-13708
2023-03-31 14:23:26 -04:00
Mike Gerwitz e3d60750a9 tamer: asg::air: Errors for rooting_ci() TODOs
This eliminates the TODOs that existed when looking for an OI for rooting an
identifier.

The change to `rooting_ci` is ridiculous, but I want to get other things
done before I jump down the rabbit hole of generalizing that (indexing local
identifiers).  Though I have an approach in mind.

DEV-13708
2023-03-31 13:57:11 -04:00
Mike Gerwitz a33d0c4ea5 tamer: asg::air: Consolidate nested PkgStart
Just some continued cleanup.

Unfortunately, we have sacrificed knowing a package OI must exist
statically, even though one will always be available.

DEV-13708
2023-03-30 22:28:22 -04:00
Mike Gerwitz 0e0b72ff5f tamer: asg::air: Generalize control transfer convention
The diff should make this refactoring obvious.  The provided documentation
explains why it operates the way that it does.

DEV-13708
2023-03-30 16:38:03 -04:00
Mike Gerwitz 558f1c96b1 tamer: asg::air: Extract AirExpr parsing from AirTplAggregate
This has AirAggregate preempt Expr parsing in the same way as templates,
rather than having `AirTplAggregate` concern itself with expression
tokens.  This continues to simplify `AirTplAggregate`, which was getting
quite complex not too long ago.

A pattern is now emerging for the call/ret convention for preemption.  That
was intentional, but it's nice to see it manifest so obviously before I
abstract it away.

DEV-13708
2023-03-30 15:44:14 -04:00
Mike Gerwitz f29e3cfce1 tamer: asg::air: Use StateStack
This was extracted from xir::parse::ele in previous commits.  The
conventions help to ensure that pushing and returns are being performed
correctly.  The abstraction will continue to evolve.

This ends up using `Ready` as the dead state.  I need to determine if this
is ideal, and if so, maybe just use `Default`, otherwise yield an error.

DEV-13708
2023-03-30 15:44:14 -04:00
Mike Gerwitz e6c6028b37 tamer: xir::parse::ele: Move StateStack into parse::state
This will be utilized by `AirAggregate`.

DEV-13708
2023-03-30 15:44:12 -04:00
Mike Gerwitz 11a4fdfb26 tamer: xir::parse::ele::StateStack: {Array=>}Vec
The use of ArrayVec doesn't buy us anything anymore.  There is no difference
in performance through my own benchmarking (at least on our systems), and
the game has changed since this was written: the size of the states is much
smaller since we're no longer aggregating attributes.  Further, the use of
ArrayVec during development was also to keep memory allocation away from
various parts of the code, which simplified analysis of the binary that was
produced.  Maybe it also reduced memory contention, but clearly that has no
observable impact.

The use of `Vec` removes the arbitrary bound, though I still kept one around
just in case something goes wrong, so TAMER will terminate.  Even though the
token stream is bounded in size, lookahead does create recursion, and the
system cannot (as written) prove that it doesn't.
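
As a minimal illustration of that safeguard (hypothetical; not the actual
StateStack implementation), a Vec-backed stack can keep an arbitrary upper
bound purely as a tripwire against runaway recursion:

  // Vec-backed state stack with an arbitrary upper bound kept only as a
  // tripwire: the token stream is bounded, but lookahead creates recursion
  // that the system cannot statically prove is bounded.
  const MAX_DEPTH: usize = 1024;

  struct StateStack<S>(Vec<S>);

  impl<S> StateStack<S> {
      fn new() -> Self {
          Self(Vec::new())
      }

      fn push(&mut self, st: S) {
          assert!(
              self.0.len() < MAX_DEPTH,
              "state stack exceeded {MAX_DEPTH} frames; runaway recursion?"
          );
          self.0.push(st);
      }

      fn pop(&mut self) -> Option<S> {
          self.0.pop()
      }
  }

  fn main() {
      let mut stack = StateStack::new();
      stack.push("PkgExpected");
      stack.push("InExpr");
      assert_eq!(stack.pop(), Some("InExpr"));
      assert_eq!(stack.pop(), Some("PkgExpected"));
      assert_eq!(stack.pop(), None);
  }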

This is preparing for extracting `StateStack` into `parse` for use with
`AirAggregate`.

DEV-13708
2023-03-30 10:17:15 -04:00
Mike Gerwitz d091103983 tamer: asg::air::tpl: Remove Expr delegation (move to parent)
`AirAggregate` now handles all delegation to `AirExprAggregate`.  This is
possible because `AirAggregate` is now the superstate for each of these
parsers, so `AirTplAggregate` is able to transition to a state that is not
its own.

This does not go so far as reaching the ultimate objective---having nested
template support---even though it'd be fairly simple to do now; there's
going to be a number of interesting consequences to these changes, and a bit
of cleanup is still needed, and I want tests observing this functionality to
accompany those changes.  That is: let's keep this a refactoring, to the
extent that it's possible.

Things are getting much easier to understand now, and much cleaner.

DEV-13708
2023-03-30 09:26:11 -04:00
Mike Gerwitz c59b92370c tamer: parse::state: Superstate support for Token type lifting
What hell have I gotten myself into.

In the end, this wasn't too bad, but the initial batch of errors was really
demotivating; the diff does this no justice.  `Lookahead::into_super` was
created to help tame those errors.

...now I can move forward.  Imagine my disappointment when I ran into this
when expecting from previous work that superstates would now work properly
for the AirAggregate parsers.

(The reason this was needed is because AirAggregate splits tokens into
subtypes for child parsers.)
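
A rough sketch of the token-lifting problem (hypothetical token types, not
the actual AIR tokens): the superstate's token is split into subsets for
child parsers, so a lookahead token held by a child must be lifted back into
the superstate's token type via a `From` impl:

  // Superstate token type, split into subsets for child parsers.
  #[derive(Debug, PartialEq)]
  enum Air {
      PkgStart,
      Expr(AirExpr),
  }

  #[derive(Debug, PartialEq)]
  enum AirExpr {
      ExprStart,
      ExprEnd,
  }

  // Lifting a child-subset token back into the superstate's token type,
  // as needed when a child yields a lookahead token.
  impl From<AirExpr> for Air {
      fn from(tok: AirExpr) -> Self {
          Air::Expr(tok)
      }
  }

  // A lookahead wrapper generic over its token type.
  #[derive(Debug, PartialEq)]
  struct Lookahead<T>(T);

  impl<T> Lookahead<T> {
      // Lift the held token into a wider ("super") token type.
      fn into_super<U: From<T>>(self) -> Lookahead<U> {
          Lookahead(self.0.into())
      }
  }

  fn main() {
      let la: Lookahead<AirExpr> = Lookahead(AirExpr::ExprEnd);
      let lifted: Lookahead<Air> = la.into_super();
      assert_eq!(lifted, Lookahead(Air::Expr(AirExpr::ExprEnd)));
  }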

DEV-13708
2023-03-29 15:50:06 -04:00
Mike Gerwitz 68e2d5d10e tamer: parse::util::expand: Delete module
Oh, boy, I had forgotten about this, until I started working on some
SuperState stuff and discovered this again due to a compiler error.  Don't
want to fix something that isn't used.

But this does not bring back great memories.  It's unfortunate that it
didn't work out; I'm pretty sure this was part of ~1mo of wasted effort
going down a path that I ultimately had to abort.  Not good times.  I'm
still behind from it.

DEV-13708
2023-03-29 15:30:51 -04:00
Mike Gerwitz 15fd2de437 tamer: asg::air::expr: Eliminate RootStrategy
I love deleting code I just wrote...

This doesn't solve the underlying problems with identifiers, but it does at
least lift it into the `AirAggregateCtx`, allowing `AirExprAggregate` to be
even further simplified.  Now the `From` implementation is not specialized
and we can readily convert to a SuperState.

There's still a lot of TODOs here, though.  And some of them will
unfortunately require runtime checks where there was previously a
compile-time check.  But that's okay in a lot of the cases, because the
empty behavior will replace existing error checks.

DEV-13708
2023-03-29 13:49:05 -04:00
Mike Gerwitz 26ddb2ae9d tamer: asg::air::expr: Remove RootStrategy::hold_dangling
Whether or not dangling expressions are permitted is now based solely off of
the stack context, which is also much more intuitive.

`RootStrategy` now only does one thing, and the existing comments describe
why it exists despite that one thing seeming very similar.

`RootStrategy` further alludes to how `ExprStack` could also be
eliminated, should it be worth doing so.  It is a tad redundant now with the
new stack.

DEV-13708
2023-03-29 13:02:01 -04:00
Mike Gerwitz 525adb8a6c tamer: asg::air: Eliminate parent context from AirExprAggregate
This does the same thing to `AirExprAggregate` that was previously done for
`AirAggregate`, taking all parent context from the stack.

This results in a fairly significant simplification of the code, which is
nice, and it makes the `RootStrategy` obviously obsolete in the dangling
case, which will result in more refactoring to simplify it even more.

I regret not taking this route to begin with, but not only was I hoping I
wouldn't need to, but I was still deriving the graph structure and wasn't
sure how this would eventually turn out.  These commits serve as a proof of
necessity.  Or, at least, concrete rationale.

It's worth noting that this also introduces `From` implementations for
`AirAggregate` and the child parsers, and then uses _that_ to push context
from the `AirTplAggregate` parser.  This means that we're just about ready
for it to serve as a superstate.  But there is still a specialization of
`AirExprAggregate` in that `From` impl, which must be removed.

DEV-13708
2023-03-29 13:02:00 -04:00
Mike Gerwitz 755c91e04a tamer: asg::air: Merge AirStack into AirAggregateCtx
Having an extra layer of abstraction was inconvenient, and unnecessary.

DEV-13708
2023-03-29 12:58:36 -04:00
Mike Gerwitz a5b4eda369 tamer: asg::air::AirAggregate: Remove Pkg context from child parser states
This is more of the same as the previous commit, but in a more digestible
chunk.  We now have child states that are able to be constructed using a
simple `From`, which is important to making `AirAggregate` a `SuperState`.

This also makes `AirStack` act like a prototype chain for `ObjectIndex`es,
creating environments where context shadows.  The linear search should only
have to check the last two frames (e.g. an Expr has a parent Pkg or Tpl
context which will have a `rooting_oi` value), and this is only done during
a rooting operation.
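
A small sketch of that prototype-chain behavior (hypothetical frame type,
not the actual AirStack): rooting searches frames from the top of the stack
downward, and only the nearest frame that provides a rooting target matters:

  // Each stack frame may provide a rooting target (e.g. a Pkg or Tpl node).
  enum Frame {
      Root,
      Pkg { rooting_oi: u32 },
      Tpl { rooting_oi: u32 },
      Expr, // expressions provide no rooting target of their own
  }

  // Walk the stack top-down; the nearest frame with a rooting target wins,
  // so inner contexts (e.g. a Tpl) shadow outer ones (e.g. its Pkg).
  fn rooting_oi(stack: &[Frame]) -> Option<u32> {
      stack.iter().rev().find_map(|frame| match frame {
          Frame::Pkg { rooting_oi } | Frame::Tpl { rooting_oi } => Some(*rooting_oi),
          _ => None,
      })
  }

  fn main() {
      let stack = vec![
          Frame::Root,
          Frame::Pkg { rooting_oi: 1 },
          Frame::Tpl { rooting_oi: 5 },
          Frame::Expr,
      ];
      assert_eq!(rooting_oi(&stack), Some(5)); // Tpl shadows the Pkg context
      assert_eq!(rooting_oi(&stack[..2]), Some(1));
      assert_eq!(rooting_oi(&[Frame::Root]), None);
  }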

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 1ef1290ee9 tamer: asg::air: Begin to derive context from stack
This begins to introduce `AirStack` and starts to migrate context away from
the individual `ParseState`s onto the stack.

I should have started to commit earlier; this is getting a bit large and
makes it hard to follow what I'm doing so, hopefully stopping a little bit
short will allow the following commit to show that.

This is a work-in-progress change.  All tests pass, but the refactoring is
incomplete.  The `AirStack` abstraction is _also_ incomplete and will have
better, more domain-specific operations that make it harder to mess up
pairing pushes with pops.

The purpose of doing this is to allow `AirAggregate` to serve exclusively as
a sum state, which can then become a SuperState, much like `ele_parse!`'s
approach.

The _end_ goal of all of this is arbitrary template nesting.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 2ae33a1dfa tamer: asg::graph::object: ObjectIndexTo and ObjectIndexRelTo
The graph's ontology is defined in the direction of the edge: from OA
to OB.  This is enforced by the type system to ensure that no code path is
able to generate an invalid graph.

But that also makes it very difficult to work with a generic source to a
specific target.

This introduces a `ObjectIndexRelTo` trait that says whether `Self` is able
to be related to some `ObjectKind` `OB`, implements it for `ObjectIndex
where ObjectRelTo<OB>`, and introduces a new semi-opaque type
`ObjectIndexTo` that allows for the source `ObjectIndex` to be generic.

This then redefines some existing graph primitives in terms of
`ObjectIndexRelTo`, in particular creating edges, so that `ObjectIndex` can
be used as today, and the new `ObjectIndexTo` can be used in the same way
with the same API, without violating the graph ontology.

This will be used by `AirAggregate` to create dynamic targets for rooting
and splicing/expansion.
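
To illustrate the general pattern with a simplified, hypothetical sketch
(the real traits are defined over the ASG's ontology and petgraph indexes),
a marker trait can state that a source index may hold an edge to a given
target kind, and edge creation can then be written against that trait
rather than a concrete source type:

  use std::marker::PhantomData;

  // Object kinds (greatly simplified).
  struct Pkg;
  struct Ident;
  struct Expr;

  // A typed index into the graph for object kind O.
  struct ObjectIndex<O>(u32, PhantomData<O>);

  // "Self may be the source of an edge to an object of kind OB."
  trait ObjectIndexRelTo<OB> {
      fn src_index(&self) -> u32;
  }

  // The ontology: which sources may point at which targets.
  impl ObjectIndexRelTo<Ident> for ObjectIndex<Pkg> {
      fn src_index(&self) -> u32 { self.0 }
  }
  impl ObjectIndexRelTo<Expr> for ObjectIndex<Ident> {
      fn src_index(&self) -> u32 { self.0 }
  }

  // Edge creation written against the trait: the source may be any index
  // that the ontology permits to relate to OB.
  fn add_edge<OB>(src: &impl ObjectIndexRelTo<OB>, dst: &ObjectIndex<OB>) -> (u32, u32) {
      (src.src_index(), dst.0)
  }

  fn main() {
      let pkg = ObjectIndex::<Pkg>(0, PhantomData);
      let ident = ObjectIndex::<Ident>(1, PhantomData);
      let expr = ObjectIndex::<Expr>(2, PhantomData);

      assert_eq!(add_edge(&pkg, &ident), (0, 1));
      assert_eq!(add_edge(&ident, &expr), (1, 2));
      // add_edge(&pkg, &expr) would not compile: the ontology forbids it.
  }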

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz eebacb52cc tamer: asg::air::AirAggregate: Remove waiting AirExprAggregate
To simplify things in support of upcoming changes, we'll just instantiate a
new one as needed.  This doesn't have an appreciable performance impact, so
the optimization is premature.  It was done just because it was more of the
same that TAMER was already doing, but now it's making things more
difficult.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz b1ce7aaf29 tamer: asg::air: AirAggregateCtx: New AirAggregate::Context
Future changes to `AirAggregate` are going to require additional context (a
stack, specifically), but the `Context` is currently utilized
by `Asg`.  This introduces a layer of abstraction that will allow us to add
the stack.

Alongside these changes, `ParseState` has been augmented with a `PubContext`
type that is utilized on public APIs, both maintaining BC with existing code
and keeping these implementation details encapsulated.

This does make a bit of a mess of the internal implementation, though, with
`asg_mut()` sprinkled about, so maybe the next commit can clean that up a
bit.  EDIT: After adding `AsMut` to a bunch of asg::graph::object::*
methods, I decided against it, because it messes with the inferred
ownership, requiring explicit borrows via `as_mut()` where they were not
required before.  I think the existing code is easier to reason about than
what would otherwise result from having `mut asg: impl AsMut<Asg>`
everywhere.
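
For context, a tiny illustration (hypothetical methods; not the actual
TAMER code) of the ergonomic difference being weighed here:

  // Illustration only: why `impl AsMut<Asg>` forces explicit
  // `as_mut()` reborrows where a plain `&mut Asg` does not.
  struct Asg;

  impl Asg {
      fn add_thing(&mut self) {}
  }

  impl AsMut<Asg> for Asg {
      fn as_mut(&mut self) -> &mut Asg {
          self
      }
  }

  // With a plain mutable reference, calls read naturally and the
  // borrow checker handles reborrows for us:
  fn with_ref(asg: &mut Asg) {
      asg.add_thing();
      asg.add_thing();
  }

  // With `impl AsMut<Asg>`, every use needs an explicit `as_mut()`:
  fn with_asmut(mut asg: impl AsMut<Asg>) {
      asg.as_mut().add_thing();
      asg.as_mut().add_thing();
  }

  fn main() {
      let mut asg = Asg;
      with_ref(&mut asg);
      with_asmut(&mut asg); // `&mut Asg` forwards to `Asg: AsMut<Asg>`
  }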

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz fc569f7551 tamer: asg::air::tpl: Distinct, generalized root and targets
Previously, `AirTplAggregate` worked only in a `Pkg` context, being able to
root `Tpl` `Ident`s in `Pkg` and expand only into `Pkg`.  This still does
the same, but generalizes to allow for different roots and expansion
targets.

This will be utilized to parse nested templates.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz e1c8e371d5 tamer: nir::tplshort: Desugar nested template applications
I'm happy with how this ended up turning out---I was able to accomplish this
without having to introduce any additional state to the parser (I _removed_
a state, actually) by tweaking NIR a bit in a previous commit.

We can't update the system test yet, though, because nested templates are
not yet supported by asg::air::tpl; that'll come next.  If you try, you'll
be greeted with this error presently (which is worth showing since you'll
never see it unless you're hacking TAMER):

,=====[ ./tests/xmli/template/ logs ]======
|
| thread 'main' panicked at 'not yet implemented: internal error:
| note: nested tpl open
|    --> ./tests/xmli/template/src.xml:129:5
|     |
| 129 |     <t:inner-short />
|     |     -------------- note: for this template
|
|
| !!! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!!
| !!!        THIS IS AN UNFINISHED FEATURE IN TAMER         !!!
| !!! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!!
| !!! This message means that TAMER has encountered an      !!!
| !!! unrecoverable error that forced it to terminate       !!!
| !!! processing.                                           !!!
| !!!                                                       !!!
| !!! TAMER has attempted to provide you with contextual    !!!
| !!! information above that might allow you to work around !!!
| !!! this problem until it can be fixed.                   !!!
| !!!                                                       !!!
| !!! Please report this error, including the above         !!!
| !!! diagnostic output beginning with 'internal error:'.   !!!
| !!! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!!
| ', src/asg/air/tpl.rs:207:55
| note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
| Command exited with non-zero status 101
| 0/165fault 0/8io 3528rss 14/2ctx
| /home/[...]/tame/tamer/target/debug/tamec -o ./tests/xmli/template/out.xmli --emit xmlo ./tests/xmli/template/src.xml
|
`====[ end ./tests/xmli/template/ logs ]====

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 6581c9946c tamer: nir::air: Remove Nir and NirEntity enum prefixes from variants
This is a long-overdue change to make this easier to read, but I'm _still_
holding off on refactoring, since there's still a lot of room for different
patterns to form with all of NIR that is left.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz e595698309 tamer: nir: Apply*Short variants
This adds explicit variants for shorthand template application.  This is
less cryptic, and we'll be able to check for the close directly during
desugaring.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 975f60bff9 tamer: nir::tplshort: Desugar body into @values@
This represents a significant departure from how the XSLT-based TAME handles
the `@values@` param, but it will end up having the same effect.  It builds
upon prior work, utilizing the fact that referencing a template in TAMER
will expand it.

The problem is this: allowing trees in `Meta` would add yet another
container; we have `Pkg` and `Tpl` already.  This was the same problem with
template application---I didn't want to add support for binding arguments
separately, and so re-used templates themselves, reaching the generalization
I just mentioned above.

`Meta` is intended to be a lexical metasyntactic variable.  That keeps its
implementation quite simple.  But if we start allowing trees, that gets
rather complicated really quickly, and starts to require much more complex
AIR parser state.

But we can accomplish the same behavior by desugaring into an existing
container---a template---and placing the body within it.  Then, in the
future, we'll parse `param-copy` into a simple `Air::RefIdent`, which will
expand the closed template and produce the same result as it does today in
the XSLT-based system.

This leaves open issues of closure (variable binding) in complex scenarios,
such as in templates that introduce metavariables to be utilized by the
body.  That's never a practice I liked, but we'll see how things evolve.

Further, this does not yet handle nested template applications.

But this saved me a ton of work.  Desugaring is much simpler.

The question is going to be how the XSLT-based compiler responds to this for
large packages with thousands of template applications.  I'll have to see
if it's worth the hit at that time, or if we should inline it when
generating the `xmli` file, producing the same `@values@` as
before.  But as it stands at this moment, the output is _not_ compatible
with the current compiler, as it expects `@values@` to be a tree, so a
modification would have to be made there.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 120f5bdfef tamer: nir::tplshort: Remove variant enum prefixes
This just cleans up a little before I introduce more code, making this
easier to read.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 9c0e20e58c tamer: asg: Shorthand and long-form template arguments
This applies to template application only; there's still some work to do for
template parameters in definitions (well, for deriving them in `xmli` at
least).  And, as you can see, there's still a lot of TODO items here.

I ended up backtracking on tree edges to Meta, and even on cross edges to
Meta, because it complicated xmli derivation with no benefit right now;
maybe a cross edge will be re-added in the future, but I need to move on and
see where this takes me.

But, it works.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz fcd25d581c tamer: asg::air::expr: Do not cache (globally) identifiers created with StoreDangling
I'm not happy with this implementation.  The linear search is undesirable,
but not too bad (and maybe wouldn't even be worth caching, if this were the
whole story), but we _also_ need to prevent duplicate identifiers.  We are
not going to want to perform a linear search of a linked list (effectively)
every time we add an identifier to check for uniqueness, so I think the
caching is going to have to be generalized very shortly anyway.

As it stands now, a duplicate identifier would cause an error at expansion
time.  That's not what we want, but it's not terrible, because you can have
that same problem in normal circumstances without local conflicts.

But this'll be used for metavariables as well, where we absolutely _do_ want
to fail at template definition time.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 1c7df894ea tamer: asg::graph: *lookup{=>_global}*
Identifier lookups, as done using the graph methods today, look up from a
cache representing the global environment.

Templates must not contribute to this environment until expansion.  Further,
metavariables will not be present in this environment.  To avoid confusion
and help obviate accidental contributions to this environment, the methods
have been renamed.  This will also allow for the creation of more general
methods down the line.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 25121c1086 tamer: asg::air: Test formatting (token nesting)
This makes the tests quite a bit easier to understand visually.  I've been
doing this with all new tests but had to go back to some old ones, and still
have more to go back to.  Baby steps.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz bef68e1634 tamer: nir: Desugar shorthand template params and yield AIR
I had intended for this to be a full vertical slice initially, but AIR's
parser is going to need enough work that it'll muddy this patch a bit too
much.

This keeps the desugaring simple, which is what I was hoping for.

The next step is to load it into the graph and emit regenerated longhand
sources.

I also don't like how the namespace prefix is just being ignored for
shorthand param desugaring.  This is also the case in the XSLT-based
compiler, but this violates TAMER's principle that it should parse every bit
of information; nothing should be ignored.  If something does not contribute
useful information, then it is not a useful construct and ought to be
rejected.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz 3dcb2cb03c tamer: nir::NirEntity::TplParam: Optional name/value pair
This will be used for shorthand desugaring.

DEV-13708
2023-03-29 12:58:35 -04:00
Mike Gerwitz a686855e9d tamer: Introduce desugaring operation for shorthand template application
This moves translation from NirToAir into TplShortDesugar, and changes the
output from AIR to NIR.

This is going to be much easier to reason about as a desugaring
operation (and indeed that's always how TAME has implemented it, in XSLT);
this keeps the complexity isolated.

Ideally, NirToAir wouldn't even accept tokens that it can't handle, but
that's going to take quite a bit more work and I don't have the time right
now.  Instead, we'll fail at runtime with some hopefully-useful
information.  It shouldn't actually happen in practice.

DEV-13708
2023-03-29 12:58:34 -04:00
Mike Gerwitz 3e9f407527 tamer: asg::air::ir: Remove TplApply
The implementation decided upon in the previous commits has made this
unnecessary, using `RefIdent` to produce `Tpl->Ident[->Tpl]` instead.

DEV-13708
2023-03-29 12:58:34 -04:00
Mike Gerwitz 669302700a tamer: build-aux/asg-ontviz: Vary arrowhead for cross edges
This makes it more visually apparent, when looking directly at a node,
whether an edge could represent a tree edge.

Dynamic edges could be tree edges, so I left those solid; that's the more
important visual indicator that I'm interested in, and it's disambiguated by
the dashed line.

DEV-13708
2023-03-29 12:58:34 -04:00
Mike Gerwitz 893da0ed20 tamer: asg: Dynamically determined cross edges
Previous to this commit, ontological cross edges were declared
statically.  But this doesn't fare well with the decided implementation for
template application.

The documentation details it, but we have Tpl->Ident which could mean "I
define this Ident once expanded", or it could mean "this is a reference to a
template I will be applying".  The former is a tree edge, the latter is a
cross edge, and that determination can only be made by inspecting edge data
at runtime.
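
A rough illustration of that runtime determination (hypothetical,
simplified types; not the actual edge representation):

  // Simplified illustration: the same Tpl->Ident pairing is a tree
  // edge when it defines an identifier upon expansion, but a cross
  // edge when it merely references a template to be applied---so the
  // node types alone cannot decide it.
  #[derive(Clone, Copy, PartialEq)]
  enum ObjectTy {
      Tpl,
      Ident,
      Expr,
  }

  struct DynEdge {
      from: ObjectTy,
      to: ObjectTy,
      /// Hypothetical stand-in for the edge data distinguishing a
      /// reference from a definition.
      is_ref: bool,
  }

  fn is_cross_edge(edge: &DynEdge) -> bool {
      match (edge.from, edge.to) {
          // Depends on the edge itself, not just the endpoints:
          (ObjectTy::Tpl, ObjectTy::Ident) => edge.is_ref,
          // An expression referencing an identifier is always a
          // cross edge into another tree:
          (ObjectTy::Expr, ObjectTy::Ident) => true,
          _ => false,
      }
  }

  fn main() {
      let apply = DynEdge { from: ObjectTy::Tpl, to: ObjectTy::Ident, is_ref: true };
      let define = DynEdge { from: ObjectTy::Tpl, to: ObjectTy::Ident, is_ref: false };

      assert!(is_cross_edge(&apply));
      assert!(!is_cross_edge(&define));
  }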

It could have been resolved by introducing new Object types, but that is a
lot of work for little benefit, especially given that only (right now) the
visitor uses this information.

DEV-13708
2023-03-29 12:58:34 -04:00
Mike Gerwitz e132f108e8 tamer: asg::air: {=>diagnostic_}todo!
I forgot about my `diagnostic_todo!` macro!  The purpose was to help guide
development by obviating what comes next in test failures.

DEV-13708
2023-03-29 12:58:34 -04:00
Mike Gerwitz 9d50157f8e tamer: Very basic support for template application NIR -> xmli
This is a big change that's difficult to break up, and I don't have the
energy after it.

This introduces nullary template application, short- and long-form.  Note
that a body of the short form is a `@values@` argument, so that's not
supported yet.

This continues to formalize the idea of what "template application" and
"template expansion" mean in TAMER.  It makes a separate `TplApply`
unnecessary, because now application is simply a reference to a
template.  Expansion and application are one and the same: when a template
expands, it'll re-bind metavariables to the parent context.  So in a
template context, this amounts to application.

But applying a closed template will have nothing to bind, and so is
equivalent to expansion.  And since `Meta` objects are not valid outside of
a `Tpl` context, applying a non-closed template outside of another template
will be invalid.

So we get all of this with a single primitive (getting the "value" of a
template).

The expansion is conceptually like `,@` in Lisp, where we're splicing trees.

It's a mess in some spots, but I want to get this committed before I do a
little bit of cleanup.
2023-03-29 12:58:32 -04:00
Mike Gerwitz aa229b827c tamer: Makefile.am: cargo clippy: Use active feature flags
This was missing `@FEATURES@`, which was causing more compilation than
necessary, but also causing clippy to evaluate different code.

This also adds RUSTFLAGS, for the same reason of not wanting to recompile.

DEV-13708
2023-03-17 10:20:56 -04:00
Mike Gerwitz 03b46ebeff tamer: asg::air::tpl::TplState: Explicitly store reachability of active template
This is a small part of a larger change that I'm still working on.

DEV-13708
2023-03-16 15:08:15 -04:00
Mike Gerwitz d930b26487 tamer: asg::air::ir: Decide on TplApply and expansion
This chooses Option B, as stated would likely be the case in the previous
commit.  The reasons are practical---I intend to support partial application
if doing so is worth it, either in implementation of the compiler or the
source language.

Closed templates can be referenced using `IdentRef` to trigger
expansion---their value is what they expand into, and they are spliced into
that point in the tree, like `,@` in Lisp.  We are able to overload this
behavior because we have the necessary type information.

However, I don't want to have to generate an Ident for every single template
expansion; there are many tens of thousands of them in our production
system.  Since AIR doesn't presently have a way to deal with this situation,
I'll for now add a special token that will close and expand a template in
place; it can be replaced with two separate tokens (`TplEnd` + `Ref`, for
example) in the future if such a need arises.

Are we there yet...?

DEV-13708
2023-03-15 16:40:08 -04:00
Mike Gerwitz be81878dd7 tamer: src::asg: Scaffolding for metasyntactic variables
Also known as metavariables or template parameters.

This is a bit of a tortured excursion, trying to figure out how I want to
best represent this.  I have a number of pages of hand-written notes that
I'd like to distill over time, but the rendered graph ontology (via
`asg-ontviz`) demonstrates the broad idea.

`AirTpl::TplApply` highlights some remaining questions.  What I had _wanted_
to do is to separate the concepts of application and expansion, and support
partial application and such.  But it's going to be too much work for now,
when it isn't needed---partial application can be worked around by simply
creating new templates and duplicating params, as we do today, although that
sucks and is a maintenance issue.  But I'd rather address that head-on in
the future.

So it's looking like Option B is going to be the approach for now, with
templates being closed (as in, no free metavariables) and expanded at the
same time.  This simplifies the parser and error conditions significantly
and makes it easier to utilize anonymous templates, since it'll still be the
active context.

My intent is to get at least the graph construction sorted out---not the
actual expansion and binding yet---enough that I can use templates to
represent parts of NIR that do not have proper graph representations or
desugaring yet, so that I can spit them back out again in the `xmli` file
and incrementally handle them.  That was an option I had considered some
months ago, but didn't want to entertain it at the time because I wasn't
sure what doing so would look like; while it was an attractive approach
since it pushes existing primitives into the template system (something I've
wanted to do for years), I didn't want to potentially tank performance or
compromise the design for it after I had spent so much effort on all of this
so far.

But my efforts have yielded a system that significantly exceeds my initial
performance expectations, with decent abstractions, and so this seems
viable.

DEV-13708
2023-03-15 16:40:07 -04:00
Mike Gerwitz 9e5958d89e tamer: asg::air::ir::Air: Open/Close => Start/End in token names
See the Air docblock for more information.  I'm introducing new tokens for
the template system, which uses the terms "free" and "closed".  I prefer
open/close for delimiters, as I've expressed elsewhere, but unfortunately it
conflicts too much (and too confusingly) with other standard terminology as
we get more into the formal side of the language.

DEV-13708
2023-03-15 10:59:25 -04:00
Mike Gerwitz 0e42788dcc tamer: asg::air: Restrict AirTplAggregate token domain to new AirTemplatable
This removes special cases, but it does complicate the parent `AirAggregate`
parser.  A pattern of delegation is forming, though abstracting it may be an
interesting challenge, given Rust's limitation on macro invocations as match
arms.  But, I think I can manage by generating the entire match using a
macro with a match-compatible syntax, augmenting where
needed...maybe.  This'll be messy.

...but if I can write the nightmare that is `ele_parse!`, I'm sure I can
manage this.  I just prefer to avoid complex macros unless I really need
them.

DEV-13708
2023-03-11 00:58:08 -05:00
Mike Gerwitz 2233c69bbf tamer: asg::graph::object: Some minor proofreading 2023-03-10 23:44:40 -05:00
Mike Gerwitz 18fa910e0f tamer: {tools=>build-aux}/asg-ontviz
Now that these are actually intended to be used as part of the build, this
is a more appropriate location.  I originally wrote it as a manual tool.

DEV-13708
2023-03-10 15:13:30 -05:00
Mike Gerwitz a5b03e8790 tamer: Embed ASG ontology visualization in rustdoc-generated docs
There, in-your-face and not hidden in some tools directory.

DEV-13708
2023-03-10 14:28:00 -05:00
Mike Gerwitz f733a85597 tamer: tools/asg-ontviz: ASG ontology visualization
This parses the declarative `object_rel!` definitions from the Rust sources
and produces a DOT representation of the ontology of the graph, which can
then be rendered using Graphviz.

This does not yet introduce it into the build; it ought to be run as part of
`make check` (without rendering with Graphviz) to ensure that we catch
breaking changes, and `make html` ought to integrate it into the
documentation, perhaps as part of `asg::graph` or `asg::graph::object`.

DEV-13708
2023-03-10 14:28:00 -05:00
Mike Gerwitz 0aa69c079d tamer: NIR->xmli: Ceil, Floor expressions
Small break from templates for something easier.  I have COVID-19, so I'll
use that as my excuse for wanting to be more lazy.

The real reason is to see some more concrete progress and ensure that
patterns hold for simple expressions before further refactoring.

But, before I proceed with such refactoring, I really ought to approach
something that requires a NIR desugaring step, like case statements.

DEV-13708
2023-03-10 14:28:00 -05:00
Mike Gerwitz b9f0fada51 tamer: asg::graph::object::expr::ExprOp: Doc comment fix {//=>///}
DEV-13708
2023-03-10 14:28:00 -05:00
Mike Gerwitz e6325c4c1d tamer: tests/xmli: Estimate tamec time in milliseconds
Going higher than that doesn't make sense because we're in shell and
invoking commands all around this, so even milliseconds isn't going to be
entirely accurate here.  However, what I am more interested in is observing
time relative to other runs; this isn't intended for profiling, but for
eyeballing unexpected behavior.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz b84ee356d5 tamer: tests/xmli: Formatted and more informative output
There's a lot to look at, especially in the event of failure.  Further, I
wanted to add additional statistics that could be eyeballed.

Right now, tamec is too fast (at least on my machine) for the precision of
/usr/bin/time: we need milliseconds, but we only get hundredths of a
second.  So it'll all show as 0:00.00s.  Which is okay, for now; it just
shouldn't exceed that. ;)

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz a261e75fe0 tamer: tests/xmli: Break apart single test case
This would have gotten unwieldy as time goes on, and already made looking at
traces very difficult.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 7ebd494752 tamer: tests/xmli/expected.xml: Align with src
This just makes this easier to compare side-by-side.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 343f5b34b3 tamer: asg::air: Template support for dangling expressions
The intent was to have a very simple implementation of `hold_dangling` and
have everything work.  But, I had a nasty surprise when the system tests
caught a bug caused by some interesting depth interactions as it relates to
`xmli` and auto-closing.

I added an extra test/example in `asg::graph::visit::test` to illustrate the
situation; it was difficult to derive from the traces, but trivially obvious
once I wrote it out as an example.

With that, templates can now aggregate tokens for dangling expressions.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 286f4cb679 tamer: tests/xmli: Reduce output on failure
This won't try the fixpoint test if the prior one fails, which will always
cause that one to fail.  And it further won't attempt the diff on
compilation failure.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 5c60c5fd15 tamer: asg::air::tpl: Parse template body expressions
And finally we have tokens aggregated onto the ASG in the context of a
template.  I expected to arrive here much more quickly, but there was a lot
of necessary refactoring.  There's a lot more that could be done, but I need
to continue; I had wanted this done a week ago.

It is worth noting, though, that this finally achieves something I had been
wondering about since the inception of this project---how I'd represent
templates on the graph.  I think this worked out rather nicely.  It wasn't
even until a few months ago that I decided to use AIR instead of NIR for
that purpose (NIR wouldn't have worked).

And note how I didn't have to touch the program derivation at all---the
system test just works with the AIR change, because of the consistent
construction of the graph.  Beautiful.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 431df6cecb tamer: asg::air::expr: Dead states for AirBind
This hoists the errors back into `AirAggregate`; I need dead states for the
`AirTplAggregate` parser so that it will know when to (and not to) interpret
tokens in the context of the template itself.

In a previous commit message, I had pondered whether it may be possible to
eliminate the dead state transition, and yet here I've used it with both of
the sub-parsers now.  So it seems like the better option in the future may
be to narrow the type further---to say precisely _what_ types of tokens may
yield a dead state transition; otherwise you lose the match information from
the parser that yielded it.

A stubbornly persistent problem in Rust, this magical and hidden match
knowledge.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 1770949b9a tamer: asg::air::expr: Move Dangling expression handling into RootStrategy
And with this, hopefully we are now finally prepared for dangling
expressions in templates.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 231296d003 tamer: asg::air::expr: Introduce RootStrategy
This sets us up to be able to determine how `Dangling` expressions will be
rooted into templates.

This new strategy isn't yet handling `Dangling`; I wanted to get this
committed first so that the `Dangling` refactoring is more clear.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz fc1d55c4c5 tamer: asg::air::expr: Generic target ObjectKind
Expressions were previously tied to packages.  This prepares for using a
`Tpl` as a container for expressions.

This does not yet handle the situation of auto-rooting dangling expressions
within the container.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 266c9eb05a tamer: parse::state::ParseState: Remove `Eq` derivation
This is unneeded and is just a pain.  If ever we need `Eq`, it could be
implemented only for `ParseState`s that actually need it.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 8cb781ccca tamer: asg::air::expr::ExprStack: {SPair=>ObjectIndex} reachable evidence
This results in less useful debug output, but it'll be needed for using
a (possibly-anonymous) template as evidence.

This evidence is simply for debugging, and to require some sort of value
during development to help obviate when maybe something is being done
incorrectly (if no obvious value exists).

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz c1d04f1cf4 tamer: asg::air: Extract template parsing into `tpl`
Same as the previous commit.  These commits have significantly reduced the
cognitive burden of working on this subsystem.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 4fd8e9ea40 tamer: asg::air: Extract expression parsing into `expr`
This is more of the same refactoring that has been happening.  This
extraction also helps emphasize the relationship between imported objects,
and isolates the growing number of test cases.  This parser will only grow.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz f307f2d70b tamer: asg::air: Extract template parsing into own parser
Just as was done with the expression parser, which this will utilize.  This
initializes it, but doesn't yet make use of it (`AirExprAggregate`).

Refactoring was definitely needed; decomposing this is quite a bit of work,
in no small part because of the complexity.  This helps significantly.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz 25c0aa180e parse::state::transition::TransitionResult::branch_dead: Add branch context
This works around limitations of Rust's borrow checker as of the time of
writing.  See the provided documentation for more information.

The branch context is not yet exposed to the `delegate` family of methods;
it will be added only as needed in the future.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz d99a8efbaf tamer: asg::air::ir: {ExprRef=>RefIdent}
This generalizes the IR, and relates the duals: identifying and referencing.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz e2714ce73f tamer: asg::air::ir::sum_ir: impl Token for IR sum type
This is necessary for the commit that follows.  Maybe it wasn't worth
separating this into its own commit.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz b6d0569b99 tamer: asg::air: Expression parser
This delegates expression parsing to `AirExprAggregate`, in an effort to
both begin to simplify the understanding and maintenance of `AirAggregate`;
and allow for parser composition for template parsing.

This utilizes the prior changes for token sum types to precisely define the
subset of AIR tokens supported by the expression parser.  This differs from
prior approaches which delegated until a dead state, relying on runtime
information to determine if a parser has finished.  This allows us to
determine that statically.

I do want to be able to eliminate the dead state from the parser so we can
get rid of the `unreachable!`, but I need to move on; that's something I had
tried to do in the past too, which ended up adding a bit of complexity, and
I'll have to consider my options in the future, including whether the dead
state transition can be entirely eliminated in favor of the combination of
these sum types and recovery; the parsing framework decisions were made
while recovery was still an open question, at least in practice.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz dfeef4ec25 tamer: asg::air::ir::sum_ir: Support arbitrary sum types
See the provided documentation.  This allows for precisely defining sum
types over all tokens accepted by parsers; see a following commit.

DEV-13708
2023-03-10 14:27:59 -05:00
Mike Gerwitz aec3b97e3f tamer: parse::parser::Parser: Prevent infinite iteration on finalize
This was a rather frustrating thing to encounter.  I was working on
refactoring `AirAggregate`, and found that my tests were hanging despite no
apparent cause in the parser itself.

As it turns out, rather than failing with a `FinalizeError` as I
expected (since I was mid-refactor), `collect()` was allocating space for an
endless stream of errors.  This was easily verified by adding a `take(x)`
and observing the assertion failure (in this case, in `close_pkg_mid_expr`).
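
Roughly what that verification looks like (illustration only, not the
actual test code):

  // If finalization keeps yielding errors instead of terminating, an
  // unbounded `collect()` never returns.  Bounding the stream with
  // `take(x)` surfaces the real assertion failure rather than hanging.
  fn main() {
      // Hypothetical stand-in for a parser erroring endlessly on finalize.
      let errors = std::iter::repeat("FinalizeError");

      let bounded: Vec<_> = errors.take(5).collect();
      assert_eq!(bounded.len(), 5);
  }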

This happens to be the first time in a long time that I actually had to
debug---the combination of robust types as proofs and tests to fill in the
gaps means that runtime issues are caught at build time in all but
exceptional cases (like this one).

It's also worth noting that, because of my policy of iterating only at the
higher levels of the program, it was clear that this must somehow be
Parser-related, since that's the only part of the system that has the
potential for unbounded recursion due to its cyclic state machines.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 34b64fd619 tamer: asg::air: AIR as a sum IR
This introduces a new macro `sum_ir!` to help with a long-standing problem
of not being able to easily narrow types in Rust without a whole lot of
boilerplate.  This patch includes a bit of documentation, so see that for
more information.

This was not a welcome change---I jumped down this rabbit hole trying to
decompose `AirAggregate` so that I can share portions of parsing with the
current parser and a template parser.  I can now proceed with that.

This is not the only implementation that I had tried.  I previously inverted
the approach, as I've been doing manually for some time: manually create
types to hold the sets of variants, and then create a sum type to hold those
types.  That works, but it resulted in a mess for systems that have to use
the IR, since now you have two enums to contend with.  I didn't find that to
be appropriate, because we shouldn't complicate the external API for
implementation details.

The enum for IRs is supposed to be like a bytecode---a list of operations
that can be performed with the IR.  They can be grouped if it makes sense
for a public API, but in my case, I only wanted subsets for the sake of
delegating responsibilities to smaller subsystems, while retaining the
context that `match` provides via its exhaustiveness checking but does not
expose as something concrete (which is deeply frustrating!).
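
A hand-written sketch of the kind of shape this produces (illustrative
only; the token names here are hypothetical and the real `sum_ir!` macro
also handles spans, `Display`, and more): a full token enum plus a
narrowed sum type that a sub-parser can accept, with conversions in both
directions so exhaustiveness checking is preserved.

  #[derive(Debug)]
  enum Air {
      PkgStart,
      PkgEnd,
      ExprStart,
      ExprEnd,
      BindIdent(&'static str),
  }

  /// Subset of `Air` accepted by the expression sub-parser.
  #[derive(Debug)]
  enum AirExpr {
      ExprStart,
      ExprEnd,
      BindIdent(&'static str),
  }

  impl From<AirExpr> for Air {
      fn from(tok: AirExpr) -> Self {
          match tok {
              AirExpr::ExprStart => Air::ExprStart,
              AirExpr::ExprEnd => Air::ExprEnd,
              AirExpr::BindIdent(name) => Air::BindIdent(name),
          }
      }
  }

  impl TryFrom<Air> for AirExpr {
      type Error = Air;

      fn try_from(tok: Air) -> Result<Self, Self::Error> {
          match tok {
              Air::ExprStart => Ok(AirExpr::ExprStart),
              Air::ExprEnd => Ok(AirExpr::ExprEnd),
              Air::BindIdent(name) => Ok(AirExpr::BindIdent(name)),
              other => Err(other),
          }
      }
  }

  fn main() {
      // The parent parser can statically narrow and delegate tokens.
      let tok = Air::BindIdent("foo");
      match AirExpr::try_from(tok) {
          Ok(sub) => println!("delegate to expression parser: {sub:?}"),
          Err(other) => println!("handled by parent parser: {other:?}"),
      }
  }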

Anyway, here we are; this'll be refined over time, hopefully, and
portions of it can be generalized for removing boilerplate from other IRs.

Another thing to note is that this syntax is really a compromise---I had to
move on, and I was spending too much time trying to get creative with
`macro_rules!`.  It isn't the best, and it doesn't seem very Rust-like in
some places and is therefore not necessarily all that intuitive.  This can
be refined further in the future.  But the end result, all things
considered, isn't too bad.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz d42a46d2b8 tamer: NIR->xmli template definition setup
This sets the stage for template parsing, and finally decides how we're
going to represent templates on the ASG.  This is going to start simple,
since my original plans for improving how templates are
handled (conceptually) are going to have to wait.

This is the last difficult object type to figure out, with respect to graph
representation and derivation, so I wanted to get it out of the way.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 08278bc867 tamer: asg::air::Air::{ExprIdent=>BindIdent}: Rename
I wasn't initially sure whether I'd want separate tokens for different types
of identifying operations, but it is now clear from the current state of
the parser that there's no need.

This matches the name of the token in NIR.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz dd2232b58b tamer: asg::graph: object_gen and object_rel macros
The previous commit demonstrated the amount of boilerplate necessary for
introducing new `ObjectKind`s; this abstracts away a lot of that
boilerplate, and allows for declarative relationship definition for the
ASG's ontology.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 454b91dfce tamer: asg::graph::object: New Tpl object
There's quite a bit of boilerplate here that'll eventually need factoring
out.  But it's also clear that it is somewhat onerous to add new object
types.

Note that a good chunk of this burden is _intentional_, via exhaustiveness
checks---adding a new type of object is an exceptional occurrence (well, in
principle, but we haven't added them all yet, so it'll be more common
initially), and we'd rather be safe and ensure that everything properly
considers how that new type of object interacts with it.

Let's not confuse coupling with safety---the latter causes a burden because
of the former, not because of itself; it provides a service to us.

But, nonetheless, we'll want to reduce this burden somewhat since there are
a number more to add.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 98fcb115da tamer: NIR->xmli: Initial classify, any, all support
Just as `rate` is a `sum`, `classify` is an `all` by default.  The `@any`
attribute will change that interpretation, though I only intend to recognize
that in parsing later on, not emit that in XMLI.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz d5cf276de2 tamer: nir::air::NirToAir::parse_token: Exhaustiveness without wildcard
Let's start to be explicit about what's missing as we continue to add new
tokens; the exhaustiveness checks throughout the system will guide the
changes that need to be made.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 5865d86485 tamer: NIR->xmli: Initial product expression
The element only, no attributes yet.

I'll keep forming boilerplate until abstraction points become obvious with
more variety; this is still pretty close to what was already supported.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz ebc16b7bdb tamer: asg::graph::xmli: Deduplicate with TreeContext
We already had `TreeContext`, and I'm passing the same arguments around, so
this uses it to lift arguments out of these functions, like partial
application.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 6cbcdb1774 tamer: tests/xmli: Add fixpoint test
See documentation for more information.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 506d3e9d11 tamer: asg::graph::xmli::AsgTreeToXirf::parse_token: Cleanup
This tidies this method up into a decent state that I'm fairly content
with.  This goes to emphasize my dislike of returns, which muddies control
flow and makes the code more difficult to read at a glance, which increases
the likelihood of logic bugs.

`match` statements in tail position, on the other hand, are very clear, and
less cognitively burdensome since you can see each individual code path at a
glance.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz e3e50c38c7 tamer: asg::graph::xmli: Extract xmli generation from parse_token
This begins to develop a pattern for doing these transformations.  I had
tried a number of things using iterators, but I wasn't satisfied with either
how they were turning out; had to fight too much with the type system; or
had to resort to heap allocations.  Sticking with an explicit
`push`/`push_all` for now works just fine.

Almost done cleaning up `AsgTreeToXirf::parse_token`, and then I can move on
to introducing more objects.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 9eb1d226b2 tamer: span: Resolve unusued import warning for debug diagnostic 2023-03-10 14:27:58 -05:00
Mike Gerwitz 3587d032c3 tamer: asg::graph::object::rel::DynObjectRel: Store source data
This is generic over the source, just as the target, defaulting just the
same to `ObjectIndex`.

This allows us to use only the edge information provided rather than having
to perform another lookup on the graph and then assert that we found the
correct edge.  In this case, we're dealing with an `Ident->Expr` edge, of
which there is only one, but in other cases, there may be many such edges,
and it wouldn't be possible to know _which_ was referred to without also
keeping context of the previous edge in the walk.

So, in addition to avoiding more indirection and being more immune to logic
bugs, this also allows us to avoid states in `AsgTreeToXirf` for the purpose
of tracking previous edges in the current path.  And it means that the tree
walk can seed further traversals in conjunction with it, if that is so
needed for deriving sources.

More cleanup will be needed, but this does well to set us up for moving
forward; I was too uncomfortable with having to do the separate
lookup.  This is also a more intuitive API.

But it does have the awkward effect that now I don't need the pair---I just
need the `Object`---but I'm not going to remove it because I suspect I may
need it in the future.  We'll see.

The TODO references the fact that I'm using a convenient `resolve_oi_pairs`
instead of resolving only the target first and then the source only in the
code path that needs it.  I'll want to verify that Rust will properly
optimize to avoid the source resolution in branches that do not need it.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz cb5d54b2db tamer: asg::graph::object: Generic Object inner type
This makes the inner `Object` type generic (but defaulting to the same inner
types as before) so that it can be used as a sum type for various types
where `ObjectKind`-based narrowing is required.

In this case, it's used to narrow `ObjectIndex` alongside the inner
`ObjectKind` so that the two are definitely in sync.  This not only results
in cleaner code and a more intuitive API that's approachable to people
less familiar with the system, but it also helps to eliminate logic bugs
that might result from manually narrowing (as was done before this change).

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz d078b24efd tamer: asg::graph::xmli::TokenStack::push_all: New method
Rust optimizes away the iterator and array, compiling into separate `push`
calls as before.
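
A minimal sketch of the idea (simplified; the real stack is a fixed-size
structure holding XIR tokens):

  // `push_all` pushes a fixed set of tokens in order; with a
  // const-size array argument, the iterator compiles down to the
  // equivalent of individual `push` calls.
  struct TokenStack(Vec<&'static str>);

  impl TokenStack {
      fn push(&mut self, tok: &'static str) {
          self.0.push(tok);
      }

      fn push_all(&mut self, toks: impl IntoIterator<Item = &'static str>) {
          toks.into_iter().for_each(|tok| self.push(tok));
      }
  }

  fn main() {
      let mut stack = TokenStack(Vec::new());
      stack.push_all(["open", "attr", "close"]);
      assert_eq!(stack.0, vec!["open", "attr", "close"]);
  }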

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 9990be58a7 tamer: Lower sum expressions
This was a fairly simple addition, since rate blocks already lower into sum
expressions; these are just non-identified.

This does emphasize that the nir::parse `ele_parse!` abstraction I spent so
much time on ended up not being a perfect fit, as it now has some
boilerplate after it was stripped of much of its capabilities some time ago.

Don't worry, `nir::air` and `asg::graph::xmli` will get cleaned up.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz ee9128fbe0 tamer: asg::graph::{object::xir=>xmli}: Rename module
This better reflects what is being done and makes it easier for someone to
find.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 82915f11af tamer: asg::graph::object::xir: Initial rate element reconstruction
This extends the POC a bit by beginning to reconstruct rate blocks (note
that NIR isn't producing sub-expressions yet).

Importantly, this also adds the first system tests, now that we have an
end-to-end system.  This not only gives me confidence that the system is
producing the expected output, but serves as a compromise: writing unit or
integration tests for this program derivation would be a great deal of work,
and wouldn't even catch the bugs I'm worried most about; the lowering
operation can be written in such a way as to give me high confidence in its
correctness without those more granular tests, or in conjunction with unit
or integration tests for a smaller portion.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 95272c4593 tamer: tests: System test support
This provides a test harness for running shell-based system tests.  The
first of such tests will be introduced in the following commit.

This is done in place of integration tests written in Rust because it will
invoke the final binary exactly as the user or build system (using TAMER)
will, providing greater confidence.  Besides, a lot of things are simply
more convenient to do in shell.  ...though some of you may debate that.

DEV-13708
2023-03-10 14:27:58 -05:00
Mike Gerwitz 9200d415f9 tamer: configure.ac: conf.sh: New configuration file
The intent is to source this in shell scripts, like tests.

This exposes feature flags to shell scripts, but it doesn't do so in quite
the same way that Rust does---it doesn't apply the dependencies.  While this
isn't needed now, it does make me a little uncomfortable, and so I may take
a different approach in the future.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 716247483f tamer: asg::graph::object::xir: POC use of token stack
Just some final POC setup for how this'll work; it's nothing
significant.  This just emits an `@xmlns` on the `package` element to
demonstrate use of the stack.

With that, it's time to formalize this.

I also need to document at some point why I choose to use `ArrayVec` still
over `Vec`---it's not a microoptimization.  It's intended to simplify the
runtime to keep execution simple with fewer code paths and make it more
amenable to analysis.  Memory allocation is a pretty complex thing and
muddies execution.  It's also another point of failure, though practically
speaking, I'm not worried about that---this is replacing a system that
consumes many GiB of memory (XSLT-based compiler) with one that consumes 10s
of MiB.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 7efd08a699 tamer: asg::graph::object::xir: New context to hold stack state
This (a) holds the state of a stack that I can populate with tokens rather
than introducing a state for every single e.g. attribute and such on
elements (so, more like the `xmle` XIR lowering).

It also hides the obvious awkwardness of the `&mut &'a Asg`, but that's not
the intent of it.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz fe925db47d tamer: parse::lower:::Lower::lower: Implement in terms of lower_with_context
This is just a special case of lowering with a context, and maintaining two
separate implementations has resulted in divergence.  I don't recall why I
didn't do this previously, though it's possible that the lowering pipeline
was in a state that made it more difficult to do (e.g. with error
handling).
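
The gist of the change, as a simplified sketch (hypothetical signatures;
not the actual `Lower` API):

  // Lowering without a context is just lowering with a unit context,
  // so one is defined in terms of the other and the two
  // implementations can no longer diverge.
  fn lower_with_context<C>(input: &str, ctx: &mut C) -> String {
      // Stand-in for the real lowering pipeline over `input`.
      let _ = ctx;
      input.to_uppercase()
  }

  fn lower(input: &str) -> String {
      // The context-free variant is a special case with a unit context.
      lower_with_context(input, &mut ())
  }

  fn main() {
      assert_eq!(lower("pkg"), "PKG");
  }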

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 6db70385d0 tamer: xir::flat: Introduce configurable acceptors
Technically, an "acceptor" in the context of state machines is actually a
state machine; the terminology here describes the configuration of
the state machine (`XirToXirf`) as an acceptor.

This change comes with significant documentation of the rationale and why
this is important; see that for more information.

This change is necessary so that we can enforce finalization on all parsers
in the lowering pipeline, which is not currently being done.  If we were to
do that now, then `tameld` would fail because it halts parsing of the token
stream at the end of the `xmlo` header.

This is also quite the type soup, but I'm not going to refine this further
right now, since my focus is elsewhere (XMLI lowering).

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz f8c1ef5ef2 tamer: tamec: MILESTONE: POC end-to-end lowering
This has been a long time coming.  The wiring of it all together is a little
rough around the edges right now, but this commit represents a working POC
to begin to fill in the gaps for the entire lowering pipeline.

I had hoped to be at this point a year ago.  Yeah.

This marks a significant milestone in the project because this allows me to
begin to observe the implementation end-to-end, testing it on real-life
inputs as part of a production build pipeline.

...and now, with that, we can begin.  So much work has gone into this
project so far, but aside from the linker (which has been in production for
years), most of this work has been foundational.  It's been a significant
investment that I intend to have pay off in many different ways.

(All this outputs right now is `<package/>`.)

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 33d2b4f0b8 tamer: tamec: POC lowering pipeline with XirfAutoClose and XirfToXir
This replaces the stub `derive_xmli` with the same result (well, minus a
space before the '/' in the output) using what will become the lowering
pipeline.  Once again, this is quite verbose, and the lowering pipeline in
general needs to be further abstracted away.

Unlike the rest of the pipeline, an error during the derivation process will
immediately terminate with an unrecoverable error, because we do not want to
write partial files.  This does not remove the garbage file, because the
build system ought to do that itself (e.g. `make`)...but that is certainly
open for debate.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 29178f2360 tamer: xir::reader: Divorce from `parse`
The reader previously yielded a `ParsedResult`, presumably to simplify
lowering operations.  But the reader is not a `ParseState`, and does not
otherwise use the parsing API, so this was an inappropriate and confusing
coupling.

This resolves that, introducing a new `lowerable` which will translate an
iterator into something that can be placed in a lowering pipeline.

See the previous commit for more information.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 963688f889 tamer: parse::lower::ParsedObject: Include Token type parameter
The token type was previously hard-coded to `UnknownToken`, since the use
case was the beginning of the lowering pipeline at the start of the program,
where there was no token type because the first parser (`XirReader`,
currently) is responsible for producing the first token type.

But when we're lowering from the graph (so, the other side of the lowering
pipeline), we _do_ have token types to deal with.

This also emphasizes the inappropriate coupling of `<XirReader as
Iterator>::Item` with `ParsedResult`; I'd like to follow the same approach
that I'm about to introduce with `tamec`, so see a future commit.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz a68930589e tamer: parse::lower: Handle EOF token
This was missed (because it was not used) when EOF tokens were originally
introduced via `ParseState::eof_tok`---`LowerIter` also needs to consider
the token.

This separation between the two iterators is a maintenance burden that needs
to be taken care of; I knew that at the time, and then I forgot about it,
and here we are.

This was caught while beginning to wire together a POC graph lowering
pipeline to emit derived sources.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 79cc61f996 tamer: xir::flat::XirfToXir: New lowering operation
This parser does exactly what it says it does.  Its implementation is
simple, but I added a test anyway just to prove that it works, and the test
seems more complicated than the implementation itself, given the types
involved.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz a5a5a99dbd tamer: asg::graph::visit::TreeWalkRel: New token type
This introduces a `Token` in place of the original tuple for
`TreePreOrderDfs` so that it can be used as input to a parser that will
lower into XIRF.

This requires that various things be describable (using `Display`), which
this also adds.  This is an example of where the parsing framework itself
enforces system observability by ensuring that every part of the system can
describe its state.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz bc8586e4b3 tamer: xir::autoclose: New lowering operation
This lowering operation is intended to allow me to write a more concise and
clear mapping from the graph to XIRF, without having to worry about
balancing tags, which really complicated the implementation.

This has detailed docs; see that for more information.

I can't help but be reminded of Wisp (the whitespace-based Lisp-like
syntax).  Which is unfortunate, because I'm not fond of Wisp; I like my
parenthesis.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 7f3ce44481 tamer: asg::graph: Formalize dynamic relationships (edges)
The `TreePreOrderDfs` iterator needed to expose additional edge context to
the caller (specifically, the `Span`).  This was getting a bit messy, so
this consolidates everything into a new `DynObjectRel`, which also
emphasizes that it is in need of narrowing.

Packing everything up like that also allows us to return more information to
the caller without complicating the API, since the caller does not need to
be concerned with all of those values individually.

Depth is kept separate, since that is a property of the traversal and is not
stored on the graph.  (Rather, it _is_ a property of the graph, but it's not
calculated until traversal.  But, depth will also vary for a given node
because of cross edges, and so we cannot store any concrete depth on the
graph for a given node.  Not even a canonical one, because once we start
doing inlining and common subexpression elimination, there will be shared
edges that are _not_ cross edges (the node is conceptually part of _both_
trees).  Okay, enough of this rambling parenthetical.)

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 2b2776f4e1 tamer: asg::graph::object::rel: Extract object relationships
DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 065dca88fc tamer: asg::graph::visit::tree_reconstruction: Include Depth
This information is necessary to be able to reconstruct the tree, since
the `ObjectIndex` alone does not give you enough information.  Even if you
inspected the graph, it _still_ wouldn't give you enough information, since
you don't know the current path of the traversal for nodes that may have
multiple incoming edges.  (Any assumptions you could make today won't
always be valid in the future.)
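
A small illustration of why depth is sufficient (and necessary) to rebuild
the tree from a pre-order walk (hypothetical, simplified types; names and
the depth convention here are for illustration only):

  // Given a pre-order walk emitting (name, depth) pairs, the depth
  // tells us how many levels to close before attaching the next node;
  // the node alone does not carry that information.
  fn reconstruct(walk: &[(&str, usize)]) -> Vec<String> {
      let mut path: Vec<&str> = Vec::new();
      let mut out = Vec::new();

      for &(name, depth) in walk {
          path.truncate(depth);     // pop back up to the parent level
          path.push(name);
          out.push(path.join("/")); // full path of each visited node
      }

      out
  }

  fn main() {
      let walk = [("pkg", 0), ("ident", 1), ("expr", 2), ("ident2", 1)];
      assert_eq!(
          reconstruct(&walk),
          vec!["pkg", "pkg/ident", "pkg/ident/expr", "pkg/ident2"],
      );
  }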

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz e6f736298b tamer: asg::graph::visit::tree_reconstruction: New graph traversal
This begins to introduce a graph traversal useful for a source
reconstruction from the current state of the ASG.  The idea is, after
having parsed and ingested the source through the lowering pipeline, to
re-output it to (a) prove that we have parsed correctly and (b) allow
progressively moving things from the XSLT-based compiler into TAMER.

There's quite a bit of documentation here; see that for more
information.  Generalizing this in an appropriate way took some time, but I
think this makes sense (that work began with the introduction of cross edges
in terms of the tree described by the graph's ontology).  But I do need to
come up with an illustration to include in the documentation.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 4afc8c22e6 tamer: asg::air: Merge Pkg closing span
The `Pkg` span will now properly reflect the entire definition of the
package including the opening and closing tags.

This was found while I was working on a graph traversal.

DEV-13597
2023-03-10 14:27:57 -05:00
Mike Gerwitz 39e98210be tamer: asg::graph::object::ident::ObjectIndex::<Ident>::bind_definition: Replace ident span
I noticed this while working on a graph traversal.  The unit test used the
same span for both the reference _and_ the binding, so I didn't notice. -_-

The problem with this, though, is that we do not have a separate span
representing the source location of the identifier reference.  The reason is
that we decided to re-use an existing node rather than creating another one,
which would add another inconvenient layer of indirection (and complexity).

So, I may have to add (optional?) spans to edges.

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 89700aa949 tamer: asg::graph::object::ObjectRel::is_cross_edge: New trait method
This introduces the concept of ontological cross edges.

The term "cross edge" is most often seen in the context of graph traversals,
e.g. the trees formed by a depth-first search.  This, however, refers to the
trees that are inherent in the ontology of the graph.

For example, an `ExprRef` will produce a cross edge to the referenced
`Ident`, since that is a different tree than the current expression.  (Well,
I suppose technically it _could_ be a back edge, but then that'd be a cycle
which would fail the process once we get to preventing it.  So let's ignore
that for now.)

DEV-13708
2023-03-10 14:27:57 -05:00
Mike Gerwitz 52e5242af2 tamer: bin/tamec: wip-asg-derive-xmli-gated xmli output
This will begin to derive `xmli` output from the graph.

DEV-13708
2023-03-10 14:27:57 -05:00
Goldsmith, Mark 8d618654d4 [DEV-13308] list2typedef automatically sets first item as _NONE with 0 value
See merge request floss/tame!59
2023-02-15 11:32:09 -05:00
Mark Goldsmith c8ceaf00f6 [DEV-13308] list2typedef automatically sets first item as _NONE with 0 value
This was done so we can use the t:param template with the generated
enum without having to provide the value in the YML test. Without
a _NONE enum value of 0, the default value of 0 in the YML test would
cause a domain violation.
2023-02-13 08:52:31 -05:00
Mike Gerwitz 2d3b27ac01 tamer: asg: Root package definition
This causes a package definition to be rooted (so that it can be easily
accessed for a graph walk).  This keeps things consistent with the new
`ObjectIndex`-based API by introducing a unit `Root` `ObjectKind` and the
boilerplate that goes with it.

This boilerplate, now glaringly obvious, will be refactored at some point,
since its repetition is onerous and distracting.

DEV-13159
2023-02-01 10:34:17 -05:00
Mike Gerwitz a7fee3071d tamer: asg::graph::object: Move {Ident,Expr}Rel into respective submodules
DEV-13159
2023-02-01 10:34:17 -05:00
Mike Gerwitz f753a23bad tamer: asg: Introduce edge from Package to Ident
Included in this diff are the corresponding changes to the graph to support
the change.  Adding the edge was easy, but we also need a way to get the
package for an identifier.  The easiest way to do that is to modify the edge
weight to include not just the target node type, but also the source.

DEV-13159
2023-02-01 10:34:17 -05:00
Mike Gerwitz 39d093525c tamer: nir, asg: Introduce package to ASG
This does not yet create edges from identifiers to the package; just getting
this introduced was quite a bit of work, so I want to get this committed.

Note that this also includes a change to NIR so that `Close` contains the
entity so that we can pattern-match for AIR transformations rather than
retaining yet another stack with checks that are already going to be done by
AIR.  This makes NIR stand less on its own from a self-validation point, but
that's okay, given that it's the language that the user entered and,
conceptually, they could enter invalid NIR the same as they enter invalid
XML (e.g. from a REPL).

In _practice_, of course, NIR is lowered from XML and the schema is enforced
during that lowering and so the validation does exist as part of that
parsing.

These concessions speak more to the verbosity of the language (Rust) than
anything.

DEV-13159
2023-02-01 10:34:16 -05:00
Mike Gerwitz 2f08985111 tamer: asg::graph::object::new_rel_dyn: Use Option
Rather than panicing at this level, let's panic at the caller, simplifying
impls and keeping them total.

This can't occur now, but an upcoming change introducing a package type will
allow for such a thing.

DEV-13159
2023-02-01 10:34:16 -05:00
Mike Gerwitz e6abd996b7 tamer: asg::graph::Asg: Non-exhaustive Debug impl
This hides information that's taking up a lot of space in the parser traces
and is not useful information.  In particular, the `index` contains a lot of
empty space due to pre-interned symbols.

The index was going to be converted into a HashMap, but that was reverted
because the tradeoff did not make sense, and so this problem remains; see
the previous commit for more information.

DEV-13159
2023-02-01 10:34:16 -05:00
Mike Gerwitz d066bb370f Revert "tamer: asg::graph::index: Use FxHashMap in place of Vec"
This reverts commit 1b7eac337cd5909c01ede3a5b3fba577898d5961.

I don't actually think this ends up being worth it in the end.  Sure, the
implementation is simpler at a glance, but it is more complex at runtime,
adding more cycles for little benefit.

There are ~220 pre-interned symbols at the time of writing, so ~880 bytes (4
bytes per symbol) are potentially wasted if _none_ of the pre-interned
symbols end up serving as identifiers in the graph.  The reality is that
some of them _will_, but using HashMap also introduces overhead, so in
practice, the savings are much less.  On a fairly small package, it was <100
bytes memory saving in `tamec`.  For `tameld`, it actually uses _more_
memory, especially on larger packages, because there are 10s of thousands of
symbols involved.  And we're incurring a rehashing cost on resize, unlike
this original plain `Vec` implementation.
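
For reference, a simplified sketch of the `Vec`-based shape being kept
(types are illustrative): the symbol's integer value indexes directly into
the vector, so unused pre-interned symbols cost a few bytes of `None`
padding rather than hashing and rehash-on-resize.

  #[derive(Debug, Clone, Copy)]
  struct SymbolId(u32);

  #[derive(Debug, Clone, Copy)]
  struct NodeIndex(u32);

  struct SymbolIndex {
      by_symbol: Vec<Option<NodeIndex>>,
  }

  impl SymbolIndex {
      fn get(&self, sym: SymbolId) -> Option<NodeIndex> {
          self.by_symbol.get(sym.0 as usize).copied().flatten()
      }
  }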

So, I'm leaving this in the history to reference in the future or return to
it if others ask; maybe it'll be worth it in the future.
2023-02-01 10:34:16 -05:00
Mike Gerwitz 417df548cf tamer: asg::graph::index: Use FxHashMap in place of Vec
This was originally written before there were a bunch of preinterned
symbols.  Now the index vector is very sparse.

This simplifies things a bit.  If this ends up manifesting as a bottleneck
in the future, we can revisit the implementation.  While this does result in
more cycles, it's negligible relative to the total cycle count.
2023-02-01 10:34:16 -05:00
Mike Gerwitz 24eecaa3fd tamer: nir: Basic rate block translation
This commit is what I've been sitting on for testing some of the recent
changes; it is a very basic demonstration of lowering all the way down
from source XML files into the ASG.  This can be run on real files to
observe, beyond unit tests, how the system reacts.

Once this outputs data from the graph, we'll finally have tamec end-to-end
and can just keep filling the gaps.

I'm hoping to roll the desugaring process into NirToAir rather than having a
separate process as originally planned a couple of months back.

This also introduces the `wip-nir-to-air` feature flag.  Currently,
interpolation will cause a `Nir::BindIdent` to be emitted in blocks that
aren't yet emitting NIR, and so results in an invalid parse.

DEV-13159
2023-02-01 10:34:15 -05:00
Mike Gerwitz 39ebb74583 tamer: asg: Expression identifier references
This adds support for identifier references, adding `Ident` as a valid edge
type for `Expr`.

There is nothing in the system yet to enforce ontology through levels of
indirection; that will come later on.

I'm testing these changes with a very minimal NIR parse, which I'll commit
shortly.

DEV-13597
2023-01-26 14:45:17 -05:00
Mike Gerwitz 055ff4a9d9 tamer: Remove graphml target
This was originally created to populate Neo4J for querying, but it has not
been utilized.  It's become a maintenance burden as I try to change the API
of and encapsulate the graph, which is important for upholding its
invariants.

This feature, or one like it, will return in the future.  I have other
related plans; we'll see if they materialize.

The graph can't be encapsulated fully just yet because of the linker; those
commits will come in the following days.

DEV-13597
2023-01-26 14:45:17 -05:00
Mike Gerwitz 8735c2fca3 tamer: asg::graph: Static- and runtime-enforced multi-kind edge ontology
This allows for edges to be multiple types, and gives us two important
benefits:

  (a) Compiler-verified correctness to ensure that we don't generate graphs
      that do not adhere to the ontology; and
  (b) Runtime verification of types, so that bugs are still memory safe.

There is a lot more information in the documentation within the patch.
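
As a rough illustration of the static half (the names here are assumptions,
not the actual API): permitted edges are stated as trait impls, so
constructing an edge outside of the ontology fails to compile.

  struct Pkg;
  struct Ident;

  /// Marker trait: an edge from `Self` to `Target` is permitted.
  trait ObjectRelTo<Target> {}
  impl ObjectRelTo<Ident> for Pkg {} // e.g. Pkg -> Ident is in the ontology

  /// Only edges declared by the ontology above can be constructed;
  /// anything else is rejected at compile time.
  fn add_edge<S: ObjectRelTo<T>, T>(_from: &S, _to: &T) {
      // ...add the edge to the underlying graph...
  }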

This took a lot of iterating to get something that was tolerable.  There's
quite a bit of boilerplate here, and maybe that'll be abstracted away better
in the future as the graph grows.

In particular, it was challenging to determine how I wanted to actually go
about narrowing and looking up edges.  Initially I had hoped to represent
the subsets as `ObjectKind`s as well so that you could use them anywhere
`ObjectKind` was expected, but that proved to be far too difficult because I
cannot return a reference to a subset of `Object` (the value would be owned
on generation).  And while in a language like C maybe I'd pad structures and
cast between them safely, since they _do_ overlap, I can't confidently do
that here since Rust's discriminant and layout are not under my control.

I tried playing around with `std::mem::Discriminant` as well, but
`discriminant` (the function) requires a _value_, meaning I couldn't get the
discriminant of a static `Object` variant without some dummy value; wasn't
worth it over `ObjectRelTy`.  We further can't assign values to enum
variants unless they hold no data.  Rust a decade from now may be different,
and it will be interesting to look back on this struggle.

DEV-13597
2023-01-26 14:45:14 -05:00
Mike Gerwitz 8739c2c570 tamer: asg::graph::object: AsRef in place of higher-rank trait bound
We only need a reference to the inner object, for which `AsRef` is the
proper and idiomatic solution.

There is a lot of boilerplate here that I hope to reduce in the future.

DEV-13597
2023-01-23 11:48:35 -05:00
Mike Gerwitz b87c078894 tamer: asg::error: Clarify DanglingExpr
DEV-13597
2023-01-23 11:48:35 -05:00
Mike Gerwitz 50afb2d359 tamer: asg::graph::object::ObjectRelFrom: Remove trait
ObjectRelTo is sufficient and, while I originally thought it was useful to
have it read left-to-right, it just ends up being a cognitive burden.

DEV-13597
2023-01-23 11:48:35 -05:00
Mike Gerwitz ee30600f67 tamer: asg::air::Air: {*Expr=>Expr*}
Makes grouping and code completion easier when they're prefixed.

DEV-13597
2023-01-23 11:48:28 -05:00
Mike Gerwitz ae675a8f29 tamer: asg::graph::object::ident::ObjectIndex::<Ident>: No edge reassignment yet
I'm spending a lot of time considering how the future system will work,
which is complicating the needs of the system now, which is to re-output the
source XML so that we can selectively start to replace things.

So I'm going to punt on this.

I was also planning out how that edge reassignment ought to work, along with
traits to try to enforce it, and that is also complicated, so I may wind up
wanting to leave them in the end, or handling this
differently.  Specifically, I'll want to know how `value-of` expressions are
going to work on the graph first, since its target is going to be dynamic
and therefore not knowable at compile-time.  (Rather, I know how I want to
make them work, but I want to observe that working in practice first.)

DEV-13597
2023-01-20 23:37:30 -05:00
Mike Gerwitz f1445961ee tamer: diagnose::panic::diagnostic_todo!: New macro
There is extensive rationale in the documentation for this new macro.  I'm
utilizing it to provide a more clear and friendly message for incomplete
ident resolution so that I can move on and return to those situations later.
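
A rough sketch of the concept (not the actual macro definition): wrap the
panic so that a span and a human-readable description accompany the
"not yet implemented" message.

  macro_rules! diagnostic_todo {
      ($span:expr, $($desc:tt)+) => {
          panic!(
              "not yet implemented: {}\n  at {:?}",
              format_args!($($desc)+),
              $span,
          )
      };
  }

  // e.g. diagnostic_todo!(span, "extern resolution for {}", name);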

It's worth noting that:

  - Externs _will_ need to be handled in the near-term;
  - Opaque and IdentFragment almost certainly won't be bound to a definition
    until I introduce LTO, which is quite a ways off; and
  - They may use the same mechanism and so may be able to be handled at the
    same time anyway.

DEV-13597
2023-01-20 23:37:30 -05:00
Mike Gerwitz 954b5a2795 Copyright year and name update
Ryan Specialty Group (RSG) rebranded to Ryan Specialty after its IPO.
2023-01-20 23:37:30 -05:00
Mike Gerwitz 1be0f2fe70 tamer: asg::object: Move into graph module
The ASG delegates certain operations to Objects so that they may enforce
their own invariants and ontology.  It is therefore important that only
objects have access to certain methods on `Asg`, otherwise those invariants
could be circumvented.

It should be noted that the nesting of this module is such that AIR should
_not_ have privileged access to the ASG---it too must utilize objects to
ensure those invariants are enforced in a single place.

DEV-13597
2023-01-20 23:37:30 -05:00
Mike Gerwitz cdfe9083f8 tamer: asg: Move {expr,ident} into object/
Starting to re-organize things to match my mental model of the new system;
the ASG abstraction has changed quite a bit since the early days.

This isn't quite enough, though; see next commit.

DEV-13597
2023-01-20 23:37:29 -05:00
Mike Gerwitz c9746230ef tamer: asg::graph::test: Extract into own file
DEV-13597
2023-01-20 23:37:29 -05:00
Mike Gerwitz 4e3a81d7f5 tamer: asg: Bind transparent ident
This provides the initial implementation allowing an identifier to be
defined (bound to an object and made transparent).

I'm not yet entirely sure whether I'll stick with the "transparent" and
"opaque" terminology when there's also "declare" and "define", but a
`Missing` state is a type of declaration and so the distinction does still
seem to be important.

There is still work to be done on `ObjectIndex::<Ident>::bind_definition`,
which will follow.  I'm going to be balancing work to provide type-level
guarantees, since I don't have the time to go as far as I'd like.

DEV-13597
2023-01-20 23:37:29 -05:00
Mike Gerwitz 378fe3db66 tamer: asg::Asg::lookup: SymbolId=>SPair
This seems to have been an oversight from when I recently introduced SPairs
to ASG; I noticed it while working on another change and receiving back a
`DUMMY_SPAN`.

DEV-13597
2023-01-20 23:37:29 -05:00
Mike Gerwitz 554bb81a63 tamer: asg::ident: Introduce distinction between opaque and transparent
`Ident` is now `Opaque`, but the new `Transparent` state isn't actually used
yet in any transitions; that'll come next.

The original (now "opaque") identifiers were added for the linker, which
does not need (at present) the associated expressions, since they've already
been compiled.  In the future I'd like to do LTO (link-time optimization),
and then the graph will need more information.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz a9e65300fb tamer: diagnose::panic: Require thunk or static ref for diagnostic data
Some investigation into the disassembly of TAMER's binaries showed that Rust
was not able to conditionalize `expect`-like expressions as I was hoping due
to eager evaluation language semantics in combination with the use of
`format!`.

This solves the problem for the diagnostic system be creating types that
prevent this situation from occurring statically, without the need for a
lint.
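
The shape of the fix, sketched with assumed names (this is not the actual
API): the description is either a static string or a thunk, so the message
is only built on the failure path rather than eagerly at every call site.

  enum DiagnosticDesc {
      Static(&'static str),
      Lazy(Box<dyn Fn() -> String>),
  }

  fn diagnostic_expect<T>(opt: Option<T>, desc: DiagnosticDesc) -> T {
      match opt {
          Some(x) => x,
          None => match desc {
              DiagnosticDesc::Static(s) => panic!("{s}"),
              DiagnosticDesc::Lazy(f) => panic!("{}", f()),
          },
      }
  }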
2023-01-20 23:37:29 -05:00
Mike Gerwitz e6640c0019 tamer: Integrate clippy
This invokes clippy as part of `make check` now, which I had previously
avoided doing (I'll elaborate on that below).

This commit represents the changes needed to resolve all the warnings
presented by clippy.  Many changes have been made where I find the lints to
be useful and agreeable, but there are a number of lints, rationalized in
`src/lib.rs`, where I found the lints to be disagreeable.  I have provided
rationale, primarily for those wondering why I desire to deviate from the
default lints, though it does feel backward to rationalize why certain lints
ought to be applied (the reverse should be true).

With that said, this did catch some legitimate issues, and it was also
helpful in getting some older code up-to-date with new language additions
that perhaps I used in new code but hadn't gone back and updated old code
for.  My goal was to get clippy working without errors so that, in the
future, when others get into TAMER and are still getting used to Rust,
clippy is able to help guide them in the right direction.

One of the reasons I went without clippy for so long (though I admittedly
forgot I wasn't using it for a period of time) was because there were a
number of suggestions that I found disagreeable, and I didn't take the time
to go through them and determine what I wanted to follow.  Furthermore, it
was hard to make that judgment when I was new to the language and lacked
the necessary experience to do so.

One thing I would like to comment further on is the use of `format!` with
`expect`, which is also what the diagnostic system convenience methods
do (which clippy does not cover).  Because of all the work I've done trying
to understand Rust and looking at disassemblies and seeing what it
optimizes, I falsely assumed that Rust would convert such things into
conditionals in my otherwise-pure code...but apparently that's not the case,
when `format!` is involved.

I noticed that, after making the suggested fix with `get_ident`, Rust
proceeded to then inline it into each call site and then apply further
optimizations.  It was also previously invoking the thread lock (for the
interner) unconditionally and invoking the `Display` implementation.  That
is not at all what I intended for, despite knowing the eager semantics of
function calls in Rust.

Anyway, possibly more to come on that, I'm just tired of typing and need to
move on.  I'll be returning to investigate further diagnostic messages soon.
2023-01-20 23:37:29 -05:00
Mike Gerwitz f1cf35f499 tamer: asg: Add expression edges
This introduces a number of abstractions, whose concepts are not fully
documented yet since I want to see how it evolves in practice first.

This introduces the concept of edge ontology (similar to a schema) using the
type system.  Even though we are not able to determine what the graph will
look like statically---since that's determined by data fed to us at
runtime---we _can_ ensure that the code _producing_ the graph from those
data will produce a graph that adheres to its ontology.

Because of the typed `ObjectIndex`, we're also able to implement operations
that are specific to the type of object that we're operating on.  Though,
since the type is not (yet?) stored on the edge itself, it is possible to
walk the graph without looking at node weights (the `ObjectContainer`) and
therefore avoid panics for invalid type assumptions, which is bad, but I
don't think that'll happen in practice, since we'll want to be resolving
nodes at some point.  But I'll address that more in the future.

Another thing to note is that walking edges is only done in tests right now,
and so there's no filtering or anything; once there are nodes (if there are
nodes) that allow for different outgoing edge types, we'll almost certainly
want filtering as well, rather than panicking.  We'll also want to be able to
query for any object type, but filter only to what's permitted by the
ontology.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz 5e13c93a8f tamer: asg: New ObjectContainer for Node type
Working with the graph can be confusing with all of the layers
involved.  This begins to provide a better layer of abstraction that can
encapsulate the concept and enforce invariants.

Since I'm better able to enforce invariants now, this also removes the span
from the diagnostic message, since the invariant is now always enforced with
certainty.  I'm not removing the runtime panic, though; we can revisit that
if future profiling shows that it makes a negative impact.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz 8786ee74fa tamer: asg::air: Expression building error cases
This addresses the two outstanding `todo!` match arms representing errors in
lowering expressions into the graph.  As noted in the comments, these errors
are unlikely to be hit when using TAME in the traditional way, since
e.g. XIR and NIR are going to catch the equivalent problems within their own
contexts (unbalanced tags and a valid expression grammar respectively).

_But_, the IR does need to stand on its own, and I further hope that some
tooling maybe can interact more directly with AIR in the future.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz dc3cd8bbc8 tamer: asg::air::AirAggregate: Reduce duplication
This refactors the previous commit a bit to remove the significant amount of
duplication, as planned.

DEV-7145
2023-01-20 23:37:29 -05:00
Mike Gerwitz 40c941d348 tamer: asg::air::AirAggregate: Initial impl of nested exprs
This introduces a number of concepts together, again to demonstrate that
they were derived.

This introduces support for nested expressions, extending the previous
work.  It also supports error recovery for dangling expressions.

The parser states are a mess; there is a lot of duplicate code here that
needs refactoring, but I wanted to commit this first at a known-good state
so that the diff will demonstrate the need for the change that will
follow; the opportunities for abstraction are plainly visible.

The immutable stack introduced here could be generalized, if needed, in the
future.

Another important note is that Rust optimizes away the `memcpy`s for the
stack that was introduced here.  The initial Parser Context was introduced
because of `ArrayVec` inhibiting that elision, but Vec never had that
problem.  In the future, I may choose to go back and remove ArrayVec, but I
had wanted to keep memory allocation out of the picture as much as possible
to make the disassembly and call graph easier to reason about and to have
confidence that optimizations were being performed as intended.

With that said---it _should_ be eliding in tamec, since we're not doing
anything meaningful yet with the graph.  It does also elide in tameld, but
it's possible that Rust recognizes that those code paths are never taken
because tameld does nothing with expressions.  So I'll have to monitor this
as I progress and adjust accordingly; it's possible a future commit will
call BS on everything I just said.

Of course, the counter-point to that is that Rust is optimizing them away
anyway, but Vec _does_ still require allocation; I was hoping to keep such
allocation at the fringes.  But another counter-point is that it _still_ is
allocated at the fringe, when the context is initialized for the parser as
part of the lowering pipeline.  But I didn't know how that would all come
together back then.

...alright, enough rambling.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz b8a7a78f43 tamer: asg::expr::ExprOp: Future implementation note
I had wanted to implement expression operations in terms of user-defined
functions (where primitives are just marked as intrinsic), and would still
like to, but I need to get this thing working, so I'll just include a note
for now.

Yes, TAMER's formalisms are inspired by APL, if that hasn't been documented
anywhere yet.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz 4b9b173e30 tamer: asg::air::Air::span: Provide spans
Not that they're loaded from object files yet, but this will at least work
once they are.

DEV-13160
2023-01-20 23:37:29 -05:00
Mike Gerwitz 8e328d2828 tamer: diagnose::report::VisualReporter::render: Remove superfluous comments 2023-01-20 23:37:29 -05:00
Mike Gerwitz edbfc87a54 tamer: f::Functor: New trait
This commit is purposefully coupled with changes that utilize it to
demonstrate that the need for this abstraction has been _derived_, not
forced; TAMER doesn't aim to be functional for the sake of it, since
idiomatic Rust achieves many of its benefits without the formalisms.

But, the formalisms do occasionally help, and this is one such
example.  There is other existing code that can be refactored to take
advantage of this style as well.

I do _not_ wish to pull an existing functional dependency into TAMER; I want
to keep these abstractions light, and eliminate them as necessary, as Rust
continues to integrate new features into its core.  I also want to be able
to modify the abstractions to suit our particular needs.  (This is _not_ a
general recommendation; it's particular to TAMER and to my experience.)

This implementation of `Functor` is one such example.  While it is modeled
after Haskell in that it provides `fmap`, the primitive here is instead
`map`, with `fmap` derived from it, since `map` allows for better use of
Rust idioms.  Furthermore, it's polymorphic over _trait_ type parameters,
not method, allowing for separate trait impls for different container types,
which can in turn be inferred by Rust and allow for some very concise
mapping; this is particularly important for TAMER because of the disciplined
use of newtypes.

For example, `foo.overwrite(span)` and `foo.overwrite(name)` are both
self-documenting, and better alternatives than, say, `foo.map_span(|_|
span)` and `foo.map_symbol(|_| name)`; the latter are perfectly clear in
what they do, but lack a layer of abstraction, and are verbose.  But the
clarity of the _new_ form does rely on either good naming conventions of
arguments, or explicit type annotations using turbofish notation if
necessary.
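
A hedged sketch of that style (trait and impls simplified; these are not
the actual definitions): the trait is parameterized over the mapped type,
so separate impls on a newtype-heavy structure let Rust pick the right one
from the argument alone.

  trait Functor<T>: Sized {
      /// Primitive operation: map over the inner value of type `T`.
      fn map(self, f: impl FnOnce(T) -> T) -> Self;

      /// Replace the inner value entirely; derived from `map`.
      fn overwrite(self, new: T) -> Self {
          self.map(|_| new)
      }
  }

  #[derive(Debug, Clone, Copy)]
  struct Span(u32);

  #[derive(Debug, Clone, Copy)]
  struct SymbolId(u32);

  #[derive(Debug)]
  struct Binding {
      name: SymbolId,
      span: Span,
  }

  impl Functor<Span> for Binding {
      fn map(self, f: impl FnOnce(Span) -> Span) -> Self {
          Self { span: f(self.span), ..self }
      }
  }

  impl Functor<SymbolId> for Binding {
      fn map(self, f: impl FnOnce(SymbolId) -> SymbolId) -> Self {
          Self { name: f(self.name), ..self }
      }
  }

  // `b.overwrite(span)` and `b.overwrite(name)` then both read naturally.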

This will be implemented on core Rust types as appropriate and as
possible.  At the time of writing, we do not yet have trait specialization,
and there's too many soundness issues for me to be comfortable enabling it,
so that limits that we can do with something like, say, a generic `Result`,
while also allowing for specialized implementations based on newtypes.

DEV-13160
2023-01-20 23:37:27 -05:00
Mike Gerwitz 0784dc306e tamer: .gitignore: Ignore files with common debugging conventions
Admittedly, these are _my_ debugging conventions.  But I'm also the only one
working on this project right now.

I want to keep various things around without cluttering untracked file
output, because finding new files can be annoying in all the output.
2023-01-04 11:56:03 -05:00
Mike Gerwitz 9103c93693 tamer: xir::writer: write{=>_all}
Really, with a C background, I should have known that `write` may not write
all bytes, and I'm pretty sure I was aware, so I'm not sure how that slipped
my mind for every call.  But it's not a great default, and I do feel like
`write_all` should be the default behavior, despite the syscall and C
library name.

It shouldn't take clippy to warn about something so significant.
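
For reference, the distinction in std (this much is standard library
behavior, not TAMER-specific): `write` may write fewer bytes than
requested, while `write_all` loops until the buffer is fully written or an
error occurs.

  use std::io::Write;

  fn emit<W: Write>(w: &mut W, buf: &[u8]) -> std::io::Result<()> {
      // `w.write(buf)?` could return having written only part of `buf`;
      // `write_all` retries until every byte has been written.
      w.write_all(buf)
  }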
2023-01-01 23:43:00 -05:00
Mike Gerwitz 0863536149 tamer: asg::Asg::get: Narrow object type
This uses `ObjectIndex` to automatically narrow the type to what is
expected.

Given that `ObjectIndex` is supposed to mean that there must be an object
with that index, perhaps the next step is to remove the `Option` from `get`
as well.

DEV-13160
2022-12-22 16:32:21 -05:00
Mike Gerwitz 6e90867212 tamer: asg::object::Object{Ref=>Index}: Associate object type
This makes the system a bit more ergonomic and introduces additional type
safety by associating the narrowed object type with the
`ObjectIndex` (previously `ObjectRef`).  Not only does this allow us to
explicitly state the type of object wherever those indices are stored, but
it also allows the API to automatically narrow to that type when operating
on it again without the caller having to worry about it.
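
Roughly, with the representation simplified (the real index wraps the
underlying graph index type): the object kind is carried as a phantom type
parameter, so storing and resolving the index stays typed at no runtime
cost.

  use std::marker::PhantomData;

  struct Expr; // example object kind

  /// Index into the graph that remembers what kind of object it refers to.
  struct ObjectIndex<O> {
      index: usize, // stand-in for the underlying graph index
      _kind: PhantomData<O>,
  }

  impl<O> ObjectIndex<O> {
      fn new(index: usize) -> Self {
          Self { index, _kind: PhantomData }
      }
  }

  // An `ObjectIndex<Expr>` can then only be used where an Expr is expected.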

DEV-13160
2022-12-22 15:18:08 -05:00
Mike Gerwitz 646633883f tamer: Initial concept for AIR/ASG Expr
This begins to place expressions on the graph---something that I've been
thinking about for a couple of years now, so it's interesting to finally be
doing it.

This is going to evolve; I want to get some things committed so that it's
clear how I'm moving forward.  The ASG makes things a bit awkward for a
number of reasons:

  1. I'm dealing with older code where I had a different model of doing
       things;
  2. It's mutable, rather than the mostly-functional lowering pipeline;
  3. We're dealing with an aggregate ever-evolving blob of data (the graph)
       rather than a stream of tokens; and
  4. We don't have as many type guarantees.

I've shown with the lowering pipeline that I'm able to take a mutable
reference and convert it into something that's both functional and
performant, where I remove it from its container (an `Option`), create a new
version of it, and place it back.  Rust is able to optimize away the memcpys
and such and just directly manipulate the underlying value, which is often a
register with all of the inlining.

_But_ this is a different scenario now.  The lowering pipeline has a narrow
context.  The graph has to keep hitting memory.  So we'll see how this
goes.  But it's most important to get this working and measure how it
performs; I'm not trying to prematurely optimize.  My attempts right now are
for the way that I wish to develop.

Speaking to #4 above, it also sucks that I'm not able to type the
relationships between nodes on the graph.  Rather, it's not that I _can't_,
but a project to create a typed graph library is beyond the scope of this
work and would take far too much time.  I'll leave that to a personal,
non-work project.  Instead, I'm going to have to narrow the type any time
the graph is accessed.  And while that sucks, I'm going to do my best to
encapsulate those details to make it as seamless as possible API-wise.  The
performance hit of performing the narrowing I'm hoping will be very small
relative to all the business logic going on (a single cache miss is bound to
be far more expensive than many narrowings which are just integer
comparisons and branching)...but we'll see.  Introducing branching sucks,
but branch prediction is pretty damn good in modern CPUs.

DEV-13160
2022-12-22 14:33:28 -05:00
Mike Gerwitz 97685fd146 tamer: span::Span::merge: New method
This will be used for expression start and end spans to merge into a span
that represents the entirety of the expression; see future commits for its
use.

Though, this has been generalized further than that to ensure that it makes
sense in any use case, to avoid potential pitfalls.
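
A minimal sketch of what such a merge can look like, assuming a simplified
span of offset and length within a single source context (the real type
also tracks that context):

  #[derive(Debug, Clone, Copy, PartialEq, Eq)]
  struct Span {
      offset: u32,
      len: u16,
  }

  impl Span {
      /// Produce a span covering both `self` and `other`, in either order.
      fn merge(self, other: Span) -> Span {
          let start = self.offset.min(other.offset);
          let end = (self.offset + u32::from(self.len))
              .max(other.offset + u32::from(other.len));
          Span {
              offset: start,
              len: (end - start) as u16,
          }
      }
  }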

DEV-13160
2022-12-21 12:35:01 -05:00
Mike Gerwitz 6d9ca6947a tamer: diagnose::report: Add line of padding above footer
This adds a line of padding between the last line of a source marking and
the first line of a footer, making it easier to read.  This also matches the
behavior of Rust's error message.

This is something I intended to do previously, but didn't have the
time.  Not that I do now, but now that we'll be showing some more robust
diagnostics to users, it ought to look decent.

DEV-13430
2022-12-16 16:24:50 -05:00
Mike Gerwitz 8c4923274a tamer: ld::xmle::lower: Diagnostic message for cycles
This moves the special handling of circular dependencies out of
`poc.rs`---and to be clear, everything needs to be moved out of there---and
into the source of the error.  The diagnostic system did not exist at the
time.

This is one example of how easy it will be to create robust diagnostics once
we have the spans on the graph.  Once the spans resolve to the proper source
locations rather than the `xmlo` file, it'll Just Work.

It is worth noting, though, that this detection and error will ultimately
need to be moved so that it can occur when performing other operations on the
graph during compilation, such as type inference and unification.  I don't
expect to go out of my way to detect cycles, though, since the linker will.

DEV-13430
2022-12-16 15:09:05 -05:00
Mike Gerwitz f3135940c1 tamer: fmt: JoinListWrap: New wrapper
This adds the same delimiter between each list element.

DEV-13430
2022-12-16 14:46:12 -05:00
Mike Gerwitz af91857746 tame: obj::xmlo: Use SPair where applicable
This simply pairs the individual `SymbolId` and `Span`.  Rationale for this
pairing was added as documentation to `SPair`.

DEV-13430
2022-12-16 14:46:10 -05:00
Mike Gerwitz c71f3247b1 tamer: Remove int_log feature flag (stabilized in 1.68-nightly)
This also bumps the minimum nightly version.
2022-12-16 14:44:39 -05:00
Mike Gerwitz 7d86fdd97d tamer: Make RUSTFLAGS explicit in the cargo invocation
Previously this just exported the variable into the environment, but I'm not
comfortable with the lack of visibility that provides; I want to be able to
see not only that it's happening, which will help to debug issues, but also
when it's _not_ happening so that I know that it needs to be introduced into
a configuration at a particular installation site.
2022-12-16 14:44:39 -05:00
Mike Gerwitz 0b2e563cdb tamer: asg: Associate spans with identifiers and introduce diagnostics
This ASG implementation is a refactored form of original code from the
proof-of-concept linker, which was well before the span and diagnostic
implementations, and well before I knew for certain how I was going to solve
that problem.

This was quite the pain in the ass, but introduces spans to the AIR tokens
and graph so that we always have useful diagnostic information.  With that
said, there are some important things to note:

  1. Linker spans will originate from the `xmlo` files until we persist
     spans to those object files during `tamec`'s compilation.  But it's
     better than nothing.
  2. Some additional refactoring is still needed for consistency, e.g. use
     of `SPair`.
  3. This is just a preliminary introduction.  More refactoring will come as
     tamec is continued.

DEV-13041
2022-12-16 14:44:38 -05:00
Mike Gerwitz 3cc40f387b tamer: RUSTFLAGS support
Primarily intended for `-C target-cpu=native`.
2022-12-14 19:56:57 -05:00
Mike Gerwitz 92afc19cf8 tamer: asg::ident::test: Extract into own file
DEV-13430
2022-12-13 23:29:30 -05:00
Mike Gerwitz 56d1ecf0a3 tamer: Air{Token=>}
Consistency with `Nir` et al.

DEV-13430
2022-12-13 14:36:38 -05:00
Mike Gerwitz be41d056bb tamer: nir::air: Lower to Air::TODO
This actually passes data to the next parser, whereas before we were
stopping short.

DEV-13160
2022-12-13 14:28:16 -05:00
Mike Gerwitz d55b3add77 tamer: asg::air::test: Extract into own file
Just minor preparatory work.

DEV-13160
2022-12-13 13:57:04 -05:00
Mike Gerwitz daeaade53c tamer: tamec: Expose ASG context in lowering pipeline
The previous commit had the ASG implicitly constructed and then
discarded.  This will keep it around, which will be necessary not only for
imports, but for passing the ASG off to the next phases of lowering.

DEV-13429
2022-12-13 13:46:31 -05:00
Mike Gerwitz aa1ca06a0e tamer: tamec: Introduce NIR->AIR->ASG lowering
This does not yet yield the produced ASG, but does set up the lowering
pipeline to prepare to produce it.  It's also currently a no-op, with
`NirToAsg` just yielding `Incomplete`.

The goal is to begin to move toward vertical slices for TAMER as I start to
return to the previous approach of a handoff with the old compiler.  Now
that I've gained clarity from my previous failed approach (which I
documented in previous commits), I feel that this is the best way forward
that will allow me to incrementally introduce more fine-grained performance
improvements, at the cost of some throwaway work as this progresses.  But
the cost of delay with these build times is far greater.

DEV-13429
2022-12-13 13:37:07 -05:00
Mike Gerwitz f0aa8c7554 tamer: nir::parse: Remove enum prefix from variants
This just makes things less verbose.  Doing so in its own commit before I
start making real changes.

DEV-13159
2022-12-07 12:50:21 -05:00
Mike Gerwitz cf2139a8ef tamer: nir::interp: Errors and recovery
This finalizes the implementation for interpolation.  There is some more
cleanup that can be done, but it is now functioning as intended and
providing errors.

Finally.  How deeply exhausting all of this has been.

DEV-13156
2022-12-07 10:54:21 -05:00
Mike Gerwitz 2f963fafb2 tamer: nir::interp::test: Remove significant duplication
This just cleans up these tests a bit before I add to them.  What we're left
with follows the structure of most other parser tests and is atm a good
balance between boilerplate and clarity in isolation (a fair level of
abstraction).

Could possibly do better by putting the inner objects in a callback so that
the `Close` can be asserted on commonly as well, but that's a bit awkward
with how the assertion is based on the collection; we'd have to keep the
last item from being collected from the iterator.  I'd rather not deal with
such restructuring right now and figuring out a decent pattern.  Perhaps in
the future.

DEV-13156
2022-12-06 12:04:48 -05:00
Mike Gerwitz 8d2d273932 tamer: nir::interp: Integrate NIR interpolation into lowering pipeline
This is the culmination of all the recent work---the third attempt at trying
to integrate this.  It ended up much cleaner than what was originally going
to be done, but only after gutting portions of the system and changing my
approach to how NIR is parsed (WRT attributes).  See prior commits for more
information.

The final step is to fill the error branches with actual errors rather than
`todo!`s.

What a relief.

DEV-13156
2022-12-05 16:32:00 -05:00
Mike Gerwitz 3050566062 tamer: nir::interp: Expand into new NIR tokens
This begins to introduce the new, simplified NIR by creating tokens that
serve as the expansion for interpolation.  Admittedly, `Text` may change, as
it doesn't really represent `<text>foo</text>`, and I'd rather that node
change as well, though I'll probably want to maintain some sort of BC.

DEV-13156
2022-12-02 00:15:31 -05:00
Mike Gerwitz 07dff3ba4e tamer: xir::parse::ele: Remove attr sum state
This removes quite a bit of work, and work that was difficult to reason
about.  While I'm disappointed that that hard work is lost (aside from
digging it up in the commit history), I am happy that it was able to be
removed, because the extra complexity and cognitive burden was significant.

This removes more `memcpy`s than the sum state could have hoped to, since
aggregation is no longer necessary.  Given that, there is a slight
performacne improvement.  The re-introduction of required and duplicate
checks later on should be more efficient than this was, and so this should
be a net win overall in the end.

DEV-13346
2022-12-01 11:09:26 -05:00
Mike Gerwitz f872181f64 tamer: xir::parse: Remove old `attr_parse!` and unused error variants
This cleans up the old implementation now that it's no longer used (as of
the previous commit) by `ele_parse!`.  It also removes the two error
variants that no longer apply: required attributes and duplicate
attributes.

DEV-13346
2022-12-01 11:09:26 -05:00
Mike Gerwitz ab0e4151a1 tamer: xir::parse::ele::ele_parse!: Integrate `attr_parse_stream!`
This handles the bulk of the integration of the new `attr_parse_stream!` as
a replacement for `attr_parse!`, which moves from aggregate attribute
objects to a stream of attribute-derived tokens.  Rationale for this change
is in the preceding commit messages.

The first striking change here is how it affects the test cases: nearly all
`Incomplete`s are removed.  Note that the parser has an existing
optimization whereby `Incomplete` with lookahead causes immediate recursion
within `Parser`, since those situations are used only for control flow and
to keep recursion out of `ParseState`s.

Next: this removes types from `nir::parse`'s grammar for attributes.  The
types will instead be derived from NIR tokens later in the lowering
pipeline.  This simplifies NIR considerably, since adding types into the mix
at this point was taking an already really complex lowering phase and making
it ever more difficult to reason about and get everything working together
the way that I needed.

Because of `attr_parse_stream!`, there are no more required attribute
checks.  Those will be handled later in the lowering pipeline, if they're
actually needed in context, with possibly one exception: namespace
declarations.  Those are really part of the document and they ought to be
handled _earlier_ in the pipeline; I'll do that at some point.  It's not
required for compilation; it's just required to maintain compliance with the
XML spec.

We also lose checks for duplicate attributes.  This is also something that
ought to be handled at the document level, and so earlier in the pipeline,
since XML cares, not us---if we get a duplicate attribute that results in an
extra NIR token, then the next parser will error out, since it has to check
for those things anyway.

A bunch of cleanup and simplification is still needed; I want to get the
initial integration committed first.  It's a shame I'm getting rid of so
much work, but this is the right approach, and results in a much simpler
system.

DEV-13346
2022-12-01 11:09:26 -05:00
Mike Gerwitz 1983e73c81 tamer: xir::parse::attrstream: Value from SPair
This really does need documentation.

With that said, this changes things up a bit: the value is now derived from
an `SPair` rather than an `Attr`, given that the name is redundant.  We do
not need the attribute name span, since the philosophy is that we're
stripping the document and it should no longer be important beyond the
current context.

It does call into question errors, but my intent in the future is to be able
to have the lowering pipeline augment errors with its current state---since
we're streaming, then an error that is encountered during lowering of an
element will still have the element parser in the state representing the
parsing of that element; so that information does not need to be propagated
down the pipeline, but can be augmented as it bubbles back up.

More on that at some point in the future; not right now.

DEV-13346
2022-12-01 11:09:25 -05:00
Mike Gerwitz 9ad7742ad2 tamer: xir::parse::attrstream: Streaming attribute parser
As I talked about in the previous commit, this is going to be the
replacement for the aggregate `attr_parse!`; the next commit will integrate
it into `ele_parse!` so that I can begin to remove the old one.

It is disappointing, since I did put a bit of work into this and I think the
end result was pretty neat, even if it was never fully utilized.  But, this
simplifies things significantly; no use in maintaining features that serve
no purpose but to confound people.

DEV-13346
2022-12-01 11:09:25 -05:00
Mike Gerwitz 6d39474127 tamer: NIR re-simplification
Alright, this has been a rather tortured experience.  The previous commit
began to state what is going on.

This is reversing a lot of prior work, with the benefit of
hindsight.  Little bit of history, for the people who will probably never
read this, but who knows:

As noted at the top of NIR, I've long wanted a very simple set of general
primitives where all desugaring is done by the template system---TAME is a
metalanguage after all.  Therefore, I never intended on having any explicit
desugaring operations.

But I didn't have time to augment the template system to support parsing on
attribute strings (nor am I sure if I want to do such a thing), so it became
clear that interpolation would be a pass in the compiler.  Which led me to
the idea of a desugaring pass.

That in turn spiraled into representing the status of whether NIR was
desugared, and separating primitives, etc, which lead to a lot of additional
complexity.  The idea was to have a Sugared and Plain NIR, and further within
them have symbols that have latent types---if they require interpolation,
then those types would be deferred until after template expansion.

The obvious problem there is that now:

  1. NIR has the complexity of various types; and
  2. Types were tightly coupled with NIR and how it was defined in terms of
     XML destructuring.

The first attempt at this didn't go well: it was clear that the symbol types
would make mapping from Sugared to Plain NIR very complicated.  Further,
since NIR had any number of symbols per Sugared NIR token, interpolation was
a pain in the ass.

So that lead to the idea of interpolating at the _attribute_ level.  That
seemed to be going well at first, until I realized that the token stream of
the attribute parser does not match that of the element parser, and so that
general solution fell apart.  It wouldn't have been great anyway, since then
interpolation was _also_ coupled to the destructuring of the document.

Another goal of mine has been to decouple TAME from XML.  Not because I want
to move away from XML (if I did, I'd want S-expressions, not YAML, but I
don't think the team would go for that).  This decoupling would allow the
use of a subset of the syntax of TAME in other places, like CSVMs and YAML
test cases, for example, if appropriate.

This approach makes sense: the grammar of TAME isn't XML, it's _embedded
within_ XML.  The XML layer has to be stripped to expose it.

And so that's what NIR is now evolving into---the stripped, bare
representation of TAME's language.  That also has other benefits down the
line, like a REPL where you can use any number of syntaxes.  I intend for
NIR to be stack-based, which I'd find to be intuitive for manipulating and
querying packages, but it could have any number of grammars, including
Prolog-like for expressing Horn clauses and querying with a
Prolog/Datalog-like syntax.  But that's for the future...

The next issue is that of attribute types.  If we have a better language for
NIR, then the types can be associated with the NIR tokens, rather than
having to associate each symbol with raw type data, which doesn't make a
whole lot of sense.  That also allows for AIR to better infer types and
determine what they ought to be, and further makes checking types after
template application natural, since it's not part of NIR at all.  It also
means the template system can naturally apply to any sources.

Now, if we take that final step further, and make attributes streaming
instead of aggregating, we're back to a streaming pipeline where all
aggregation takes place on the ASG (which also resolves the memcpy concerns
worked around previously, also further simplifying `ele_parse` again, though
it sucks that I wasted that time).  And, without the symbol types getting
in the way, since now NIR has types more fundamentally associated with
tokens, we're able to interpolate on a token stream using simple SPairs,
like I always hoped (and reverted back to in the previous commit).

Oh, and what about that desugaring pass?  There's the issue of how to
represent such a thing in the type system---ideally we'd know statically
that desugaring always lowers into a more primitive NIR that reduces the
mapping that needs to be done to AIR.  But that adds complexity, as
mentioned above.  The alternative is to just use the template system, as I
originally wanted to, and resolve shortcomings by augmenting the template
system to be able to handle it.  That not only keeps NIR and the compiler
much simpler, but exposes more powerful tools to developers via TAME's
metalanguage, if such a thing is appropriate.

Anyway, this creates a system that's far more intuitive, and far
simpler.  It does kick the can to AIR, but that's okay, since it's also
better positioned to deal with it.

Everything I wrote above is a thought dump and has not been proof-read, so
good luck!  And let's hope this finally works out...it's actually feeling
good this time.  The journey was necessary to discover and justify what came
out of it---everything I'm stripping away was like a cocoon, and within it
is a more beautiful and more elegant TAME.

DEV-13346
2022-12-01 11:09:25 -05:00
Mike Gerwitz 76beb117f9 Revert "tamer: nir::desugar::interp: Include attribute name in derived param name"
Also: Revert "tamer: nir::desugar::interp: Token {SPair=>Attr}"

This reverts commit 7fd60d6cdafaedc19642a3f10dfddfa7c7ae8f53.
This reverts commit 12a008c66414c3d628097e503a98c80687e3c088.

This has been quite a tortured experience, trying to figure out how to best
fit desugaring into the existing system.  The truth is that it ultimately
failed because I was not sticking with my intuition---I was trying to get
things out quickly by compromising on the design, and in the end, it saved
me nothing.

But I wouldn't say that it was a waste of time---the path was a dead end,
but it was full of experiences.

More to come, but interpolation is back to operating on NIR directly, and I
chose to treat it as a source-to-source mapping and not represent it using
the type system---interpolation can be an optional feature when writing TAME
frontends (the principal one being the XML-based one), and it's up to later
checks to assert that identifiers match a given domain.

I am disappointed by the additional context we lose here, but that can
always be introduced in the future differently, e.g. by maintaining a
dictionary of additional context for spans that can be later referenced for
diagnostic purposes.  But let's worry about that in the future; it doesn't
make sense to further complicate IRs for such a thing.

DEV-13346
2022-12-01 11:09:25 -05:00
Mike Gerwitz 9da6cb439f tamer: nir::desugar::interp: Include attribute name in derived param name
This is simply to aid with debugging.  See commit for information on why I
didn't include the attribute name in the param name itself.

DEV-13156
2022-12-01 11:09:25 -05:00
Mike Gerwitz 6a8befb98c tamer: convert::Expect{From,Into}: Diagnostic panics
Converts to use TAME's diagnostic panics, same as previous commits.  Also
introduces impl for `Result`, which I apparently hadn't needed yet.

In the future, I hope trait impl specializations will be available to
automatically derive and expose span information in these diagnostic
messages for certain types.

DEV-13156
2022-12-01 11:09:25 -05:00
Mike Gerwitz d0a728c27f tamer: nir::desugar::interp: Token {SPair=>Attr}
This changes the input token from a more generic `SPair` to `Attr`, which
reflects the new target integration point in the `attr_parse!`
parser-generator.

This is a compromise---I'd like for it to remain generic and have stitching
deal with all integration concerns, but I have spent far too much time on
this and need to keep moving.

With that said, we do benefit from knowing where this must fit in---it's
easier to reason about in a more concrete way, and we can take advantage of
the extra information rather than being burdened by its presence and
ignoring it.  We need to be able to convert back into `XirfToken` (see a
recent commit that discusses that) for `StitchExpansion`, which is why
`Attr` is here.  And since it is, we can use it to explain to the user not
just the interpolation specification used to derive params, but also the
attribute it is associated with.  This is what TAME (in XSLT) does today,
IIRC (I wrote it, I just forget exactly).  It also means that I can name the
parameters after the attribute.

So, that'll be in a following commit; I was disappointed when my prior
approach with `SPair` didn't give me enough information to be able to do
that, since I think it's important that the system be as descriptive as
possible in how it derives information.  Of course, traces would reveal how
the parser came about the derivation, but that requires recompilation in a
special tracing mode.

DEV-13156
2022-12-01 11:09:25 -05:00
Mike Gerwitz 99dcba690f tamer: parse: SP::Token: From<Self::Token>
Of course I would run into integration issues.  My foresight is lacking.

The purpose of this is to allow for type narrowing before passing data to a
more specialized ParseState, so that the other ParseState doesn't need to
concern itself with the entire domain of inputs that it doesn't need, and
repeat unnecessary narrowing.

For example, consider XIRF: it has an `Attr` variant, which holds an `Attr`
object.  We'll want to desugar that object.  It does not make sense to
require that the desugaring process accept `XirfToken` when we've already
narrowed it to an `Attr`---we should accept an Attr.

However, we run into a problem immediately: what happens with tokens that
bubble back up due to lookahead or errors?  Those tokens need to be
converted _back_ (widened).  Fortunately, widening is a much easier process
than narrowing---we can simply use `From`, as we do today so many other
places.
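
Concretely, the widening side looks something like this (variant and type
names are illustrative):

  struct Attr; // the narrowed token handled by the specialized ParseState

  enum XirfToken {
      Attr(Attr),
      // ...other variants elided...
  }

  // Tokens that bubble back up (lookahead, errors) are widened mechanically.
  impl From<Attr> for XirfToken {
      fn from(attr: Attr) -> Self {
          XirfToken::Attr(attr)
      }
  }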

So, this still keeps the onus of narrowing on the caller, but for now that
seems most appropriate.  I suspect Rust would optimize away duplicate
checks, but that still leaves the maintenance concern---the two narrowings
could get out of sync, and that's not acceptable.

Unfortunately, this is just one of the problems with integration...

DEV-13156
2022-12-01 11:09:14 -05:00
Mike Gerwitz 1aca0945df tamer: parse::util::expand::StitchExpansion: Began transition from ParseState to method
My initial plan with expansion was to wrap a `ParseState` in another that
unwraps `Expansion` and converts into a `Dead` state, so that existing
`TransitionResult` stitching methods (`delegate`, specifically) could be
used.

But the desire to use that existing method was primarily because stitching
was a complex operation that was abstracted away _as part of the `delegate`
method_, which made writing new ones verbose and difficult.  Thus began the
previous commits to begin to move that responsibility elsewhere so that it
could be more composable.

This continues with that, introducing a new trait that will culminate in the
removal of a wrapping `ParseState` in favor of a stitching method.  The old
`StitchableExpansionState` is still used for tests, which demonstrates that
the boilerplate problem still exists despite improvements made here.  These
will become more generalized in the future as I have time (and the
functional aspects of the code more formalized too, now that they're taking
shape).

The benefit of this is that we avoid having to warp our abstractions in ways
that don't make sense (use of a dead state transition) just to satisfy
existing APIs.  It also means that we do not need the boilerplate of a
`ParseState` any time we want to introduce this type of
stitching/delegation.  It also means that those methods can eventually be
extracted into more general traits in the future as well.

Ultimately, though, the two would have accomplished the same thing.  But the
difference is most emphasized in the _parent_---the actual stitching still
has to take place for desugaring in the attribute parser, and I'd like for
that abstraction to still be in terms of expansion.  But if I utilized
`StitchableExpansionState`, which converted into a dead state, I'd have to
either forego the expansion abstraction---which would make the parser even
more confusing---or I'd have to create _another_ abstraction around the dead
state, which would mean that I stripped one abstraction just to introduce
another one that's essentially the same thing.  It didn't feel right, but it
would have worked.

The use of `PhantomData` in `StitchableExpansionState` was also a sign that
something wasn't quite right, in terms of how the abstractions were
integrating with one-another.

And so here we are, as I struggle to wade my way through all of the yak
shavings and make any meaningful progress on this project, while others
continue to suffer due to slow build times.

I'm sorry.  Even if the system is improving.

DEV-13156
2022-11-17 15:12:25 -05:00
Mike Gerwitz 1ce36225f6 tamer: diagnose::panic::DiagnosticOptionPanic: New panic
This is just intended to simplify the job of panicking when something is
expected to be `None`.  In my case, `Lookahead`; see upcoming commits.

This is intended to be generalized to more than just `Option`, but I have no
use for it elsewhere yet; I primarily just needed to implement a method on
`Option` so that I could have the ergonomics of the dot notation.
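
Sketched as an extension trait, with the method name and signature assumed
rather than the actual one, which gives the dot-notation ergonomics
mentioned above:

  trait DiagnosticOptionPanic<T> {
      /// Panic with a diagnostic-style message if the value is `Some`.
      fn diagnostic_expect_none(self, msg: &str);
  }

  impl<T> DiagnosticOptionPanic<T> for Option<T> {
      fn diagnostic_expect_none(self, msg: &str) {
          if self.is_some() {
              // A real implementation would emit rich diagnostic output.
              panic!("{msg}");
          }
      }
  }

  // e.g. lookahead.diagnostic_expect_none("lookahead token unexpectedly replaced");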

DEV-13156
2022-11-17 14:36:00 -05:00
Mike Gerwitz 42618c5add tamer: parse: Abstract lookahead token replacement panic
There's no use in duplicating this in util::expand.

Lookahead tokens are one of the few invariants that I haven't taken the time
of enforcing using the type system, because it'd be quite a bit of work that
I do not have time for, and may not be worth it with changes that may make
the system less ergonomic.  Nonetheless, I do hope to address it at some
point in the (possibly-far) future.

If ever you encounter this diagnostic message, ask yourself how stable TAMER
otherwise is and how many other issues like this have been entirely
prevented through compile-time proofs using the type system.

DEV-13156
2022-11-16 15:25:52 -05:00
Mike Gerwitz a377261de3 tamer: parse::state::transition::TransitionResult::with_lookahead: {=>diagnostic_}panic!
As in previous commits, this continues to replace panics with
`diagnostic_panic!`, which provides much more useful information both for
debugging and to help the user possibly work around the problem.  And lets
the user know that it's not their fault, and it's a TAMER bug that should be
reported.

...am I going to rationalize it in each commit message?

DEV-13156
2022-11-16 14:20:58 -05:00
Mike Gerwitz 8cb4eb5b81 tamer: parse::util::expand::StitchableExpansionState: Utilize bimap
This is just a light refactoring to utilize the new
`TransitionResult::bimap` method.

DEV-13156
2022-11-16 14:09:14 -05:00
Mike Gerwitz 60ce1305cc tamer: parse::state: Further generalize ParseState::delegate
This moves enough of the handling of complex type conversions into the
various components of `TransitionResult` (and itself), which simplifies
delegation and opens up the possibility of having specialized
delegation/stitching methods implemented atop of `TransitionResult`.

DEV-13156
2022-11-16 14:09:11 -05:00
Mike Gerwitz a17e53258b tamer: parse::state: Begin to tame delegation methods
These delegation methods have been a pain in my ass for quite some time, and
their lack of generalization makes the introduction of new delegation
methods (in the general sense, not necessarily trait methods) very tedious
and prone to inconsistencies.

I'm going to progressively refactor them in separate commits so it's clear
what I'm doing, primarily for future me to reference if need be.

DEV-13156
2022-11-16 10:38:58 -05:00
Mike Gerwitz fc425ff1d5 tamer: parse::state: EchoState and TransitionResult constituent primitives
This beings to introduce more primitive operations to `TransitionResult` and
its components so that I can actually work with them without having to write
a bunch of concrete, boilerplate implementations.  This is demonstrated in
part by `EchoState` (which is nearly all boilerplate, but whose correctness
should be verifiable at a glance), which will be used going forward as a
basis for default implementations for parsers (e.g. expansion delegation).

DEV-13156
2022-11-16 10:37:10 -05:00
Mike Gerwitz 55c55cabd3 tamer: parse::util::expand: Move expansion into own module
This has evolved into a more robust and independent concept, but it is still
a utility in the sense that it's utilizing existing parsing framework
features and making them more convenient.

DEV-13156
2022-11-15 13:28:54 -05:00
Mike Gerwitz ddb4f24ea5 tamer: parse::util (ExpandableParseState, ExpandableInto): Clarifying traits
These traits serve to abstract away some of the type-level details and
clearly state what the end result is (something stitchable with a parent).

I'm admittedly battling myself on this concept a bit.  The proper layer of
abstraction is the concept of expansion, which is an abstraction that is
likely to be maintained all the way through, but we strip the abstraction
for the sake of delegation.  Maybe the better option is to provide a
different method of delegation and avoid the stripping at all, and avoid the
awkward interaction with the dead state.

The awkwardness comes from the fact that delegating right now is so rigid
and defined in terms of a method on state rather than a mapping between
`TransitionResult`s.  But I really need to move on... ;_;

The original design was trying to generalize this such that composition at
the attribute parser level (for NIR) would be able to just accept any
sitchable parser with the convention that the dead state is the replacement
token.  But that is the wrong layer of abstraction, which not only makes it
confusing, but is asking for trouble when someone inevitably violates that
contract.

With all of that said, `StitchableExpansionState` _is_ a delegation.  It
could just as easily be a function (`is_accepting` always delegates too), so
perhaps that should just be generalized as reifying delegation as a
`ParseState`.

DEV-13156
2022-11-15 12:56:25 -05:00
Mike Gerwitz 03cf652c41 tamer: parse::util: Introduce StitchableExpansionState
This parser really just allows me to continue developing the NIR
interpolation system using `Expansion` terminology, and avoid having to use
dead states in tests.  This allows for the appropriate level of abstraction
to be used in isolation, and then only be stripped when stitching is
necessary.

Future commits will show how this is actually integrated and may introduce
additional abstraction to help.

DEV-13156
2022-11-15 12:19:25 -05:00
Mike Gerwitz 4117efc50c tamer: nir::desugar::interp: Generalize without NIR symbol types
This is a shift in approach.

My original idea was to try to keep NIR parsing the way it was, since it's
already hard enough to reason about with the `ele_parse!` parser-generator
macro mess.  The idea was to produce an IR that would explicitly be denoted
as "maybe sugared", and have a desugaring operation as part of the lowering
pipeline that would perform interpolation and lower the symbol into a plain
version.

The problem with that is:

  1. The use of the type was going to introduce a lot of mapping for all the
     NIR token variants there are going to be; and
  2. _The types weren't even utilized for interpolation._

Instead, if we interpolated _as attributes are encountered_ while parsing
NIR, then we'd be able to expand directly into that NIR token stream and
handle _all_ symbols in a generic way, without any mapping beyond the
definition of NIR's grammar using `ele_parse!`.

This is a step in that direction---it removes `NirSymbolTy` and introduces a
generic abstraction for the concept of expansion, which will be utilized
soon by the attribute parser to allow replacing `TryFrom` with something
akin to `ParseFrom`, or something like that, which is able to produce a
token stream before finally yielding the value of the attribute (which will
be either the original symbol or the replacement metavariable, in the case
of interpolation).

(Note that interpolation isn't yet finished---errors still need to be
implemented.  But I want a working vertical slice first.)

DEV-13156
2022-11-10 12:33:30 -05:00
Mike Gerwitz 8a430a52bc tamer: xir::parse: Extract intermediate attribute aggregate state into Context
This was a substantial change.  Design and rationale are documented on
`AttrFieldSum` and related as part of this change, so please review the diff
for more information there.

If you're a Ryan employee, DEV-13209 gives plenty of profiling information,
including raw data and visualizations from kcachegrind.  For everyone else:
you're able to easily produce your own from this commit and the previous by
comparing the `__memcpy_avx_unaligned_erms` calls.  The reduction is
significant in this commit (~90%), and the number of Parsers invoking it has
been reduced.  Rust has been able to optimize more aggressively, and
compound some of those optimizations, with the smaller `NirParseState`
width.

It is also worth noting that `malloc` calls do not change at all between
these two commits, so when we refer to memory, we're referring to
pre-allocated memory on the stack, as TAMER was designed to utilize.

DEV-13209
2022-11-09 16:01:09 -05:00
Mike Gerwitz 6ae6ca716c tamer: diagnose::panic::diagnostic_unreachable!: New macro
This is a diagnostic replacement for `unreachable!`.

Eventually TAMER'll have build-time checks to enforce the use of these over
alternatives; I need to survey the old instances on a case-by-case basis to
see what diagnostic information can be reasonably presented in that context.

DEV-13209
2022-11-09 10:47:17 -05:00
Mike Gerwitz 5c5041f90e tamer: nir::desugar::interp: Proper span offsets
The spans were previously not being calculated relative to the offset of the
original symbol span.  Tests were passing because all of those spans began
at offset 0.
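
Roughly, the arithmetic at issue looks like this (a hypothetical `Span`
type for illustration; the actual span representation differs):

```rust
/// Hypothetical span type: absolute byte offset into the source plus length.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    offset: usize,
    len: usize,
}

impl Span {
    /// Derive a sub-span `rel` bytes into this span, `len` bytes long.
    /// The result stays relative to the original symbol's offset rather
    /// than to offset 0.
    fn slice(self, rel: usize, len: usize) -> Span {
        Span {
            offset: self.offset + rel,
            len,
        }
    }
}

fn main() {
    // Span of a symbol like `foo{@bar@}` beginning at byte 100.
    let sym = Span { offset: 100, len: 10 };
    let param = sym.slice(3, 7); // span of `{@bar@}`
    assert_eq!(param, Span { offset: 103, len: 7 });
}
```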

DEV-13156
2022-11-08 00:55:45 -05:00
Mike Gerwitz 6b9979da9a tamer: nir::desugar::interp: Valid parses
This completes the valid parses, though some more refactoring will be
done.  Next up is error handling and recovery.

DEV-13156
2022-11-07 23:59:47 -05:00
Mike Gerwitz 4a7fe887d5 tamer: nir::desugar: Initial interpolation desugaring
This demonstrates how desugaring of interpolated strings will work, testing
one of the happy paths.  The remaining work to be done is largely
refactoring; handling some other cases; and errors.  Each of those items are
marked with `todo!`s.

I'm pleased with how this is turning out, and I'm excited to see diagnostic
reporting within the specification string using the derived spans once I get
a bit further along; this robust system is going to be much more helpful to
developers than the existing system in XSLT.

This also eliminates the ~50% performance degradation mentioned in a recent
commit by eliminating the SugaredNirSymbol enum and replacing it with a
newtype; this is a much better approach, though it doesn't change that I do
need to eventually address the excessive `memcpy`s on hot code paths.

DEV-13156
2022-11-07 14:15:16 -05:00
Mike Gerwitz 66f09fa4c9 tamer: parse::prelude: New module
Not sure why I didn't add a prelude sooner, considering all the import
boilerplate.  This will evolve as needed and I'll go back and replace other
imports when I'm not in the middle of something.

DEV-13156
2022-11-02 14:56:26 -04:00
Mike Gerwitz 9922910d09 tamer: nir::NirSymbolTy (Display): Add impl
Add initial descriptions and consolidate some of the types.  There'll be
more to come; this is just to get `Display` derives working for types
that'll be using it.  I'd like to see where this description manifests
itself before I decide how user-friendly I'd like it to be.

DEV-13156
2022-11-01 16:23:51 -04:00
Mike Gerwitz 5e2d8f13a7 tamer: nir (SugaredNir): Mirror PlainNir
This mirror is only a `Todo` variant at the moment, but my hope had been to
try to creatively nest or use generics to simplify the conversion between
the two flavors without a lot of boilerplate.  But it doesn't seem like I'm
going to be successful, and may have to resort to macros to remove
boilerplate.

But I need to stop fighting with myself and move on.  Though I would still
like to keep the types purely compile-time via const generics if possible,
since they're not needed in memory (or disk) until we get to templates;
they're otherwise static relative to a NIR token variant.

DEV-13209
2022-11-01 15:22:42 -04:00
Mike Gerwitz 7f71f3f09f tamer: nir: Detect interpolated values
This simply detects whether a value will need to be further parsed for
interpolation; it does not yet perform the parsing itself, which will happen
during desugaring.
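
For illustration, a minimal sketch of the kind of check this implies (the
function name and `&str` input are assumptions; the real detection operates
on symbols during NIR attribute parsing):

```rust
/// Cheap check for whether a value will need interpolation parsing.
/// Hypothetical sketch; names and types are illustrative only.
fn needs_interpolation(value: &str) -> bool {
    // A specification string contains at least one `{...}` parameter.
    value.contains('{')
}

fn main() {
    assert!(needs_interpolation("prefix {@foo@} suffix"));
    assert!(!needs_interpolation("plain value"));
}
```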

This introduces a performance regression, for an interesting reason.  I
found that introducing a single new variant to `SugaredNir` (with a
`(SymbolId, Span)` pair), was causing the width of the `NirParseState` type
to increase just enough to cause Rust to be unable to optimize away a
significant number of memcpys related to `Parser` moves, and consequently
reducing performance by nearly 50% for `tamec`.  Yikes.

I suspected this would be a problem, and indeed have tried in all other
cases to avoid aggregation until the ASG---the problem is that I had wanted
to aggregate attributes for NIR so that the IR could actually make some
progress toward simplifying the stream (and therefore working with the
data), and be able to validate against a grammar defined in a single
place.  The problem is that the `NirParseState` type contains a sum type for
every attribute parser, and is therefore as wide as the largest one.  That
is what Rust is having trouble optimizing memcpy away for.

Indeed, reducing the number of attributes improves the situation
drastically.  However, it doesn't make it go away entirely.

If you look at a callgrind profile for `tameld` (or a disassembly), you'll
notice that I put quite a bit of effort into ensuring that the hot code path
for the lowering pipeline contains _no_ memcpys for the parsers.  But that
is not the case with `tamec`---I had to move on.  But I do still have the
same escape hatch that I introduced for `tameld`, which is the mutable
`Context`.

It seems that may be the solution there too, but I want to get a bit further
along first to see how these data end up propagating before I go through
that somewhat significant effort.

DEV-13156
2022-11-01 15:15:40 -04:00
Mike Gerwitz 37d44e42ad tamer: sym::symbol: Use {=>diagnostic_}panic! for resolution failure
Various parts of the system have to be converted to use `diagnostic_panic!`,
which makes it very clear that this is a bug in TAMER that should be
reported.  I just happened to see this one near code I was about to touch.

DEV-13156
2022-11-01 12:42:36 -04:00
Mike Gerwitz 2a70525275 tamer: sym::prefill::quick_contains_byte: New function
This will be utilized by NIR to avoid having to perform memory lookups for
preinterned static symbols.

DEV-13156
2022-11-01 12:42:32 -04:00
Mike Gerwitz d195eedacb tamer: nir: Sugared and plain flavors
This introduces the concept of sugared NIR and provides the boilerplate for
a desugaring pass.  The earlier commits dealing with cleaning up the
lowering pipeline were to support this work, in particular to ensure that
reporting and recovery properly applied to this lowering operation without
adding a ton more boilerplate.

DEV-13158
2022-10-26 14:19:19 -04:00
Mike Gerwitz dbe834b48a tamer: tamec: Remove lowering pipeline refactoring comment
I'm struggling to go much further yet without sorting out some other things
first with regards to mutable `Context` and, in particular, the ASG.

I'm going to pause on refactoring the lowering pipeline---it's been improved
significantly with the recent work---and I will continue in the next few
weeks.

DEV-13158
2022-10-26 12:44:20 -04:00
Mike Gerwitz 7c4c0ebdda tamer: parse::lower: Separate error types for lowering and return
Lowering errors in tamec end up utilizing recovery and reporting, so there
is a distinction between recoverable and unrecoverable errors.

tameld aborts on the first error, since recovery is not currently
supported (we'll want to add it, since tameld should output e.g. lists of
unresolved externs).

Note that tamec does not yet handle `FinalizeError` like tameld because it
uses `Lower::lower`, which does not yet finalize (though it does in practice
when it reaches the end of the stream and auto-finalizes, but that is
widened into a `ParseError`).

DEV-13158
2022-10-26 12:44:20 -04:00
Mike Gerwitz 26aaf6efc1 tamer: parse::error::ParseError: Extract some variants into FinalizeError
This helps to clarify the situations under which these errors can occur, and
the generality also helps to show why the inner types are as they
are (e.g. use of `String`).

But more importantly, this allows for an error type in `finalize` that is
detached from the `ParseState`, which will be able to be utilized in the
lowering pipeline as a more general error distinguishable from other
lowering errors.  At the moment I'm maintaining BC, but a following commit
will demonstrate the use case to introduce recoverable vs. non-recoverable
errors.

DEV-13158
2022-10-26 12:44:19 -04:00
Mike Gerwitz 2087672c47 tamer: parse::parser::finalize: Introduce FinalizedParser
This newtype allows a caller to prove (using types) that a parser of a given
type (`ParseState`) has been finalized.

This will be used by the lowering pipeline to ensure that all parsers in the
pipeline end up getting finalized (as you can see from a TODO added in the
code, one of them is missing).  The lack of such a type was an oversight
during the (rather stressed) development of the parsing system, and I
shouldn't need to resort to unit tests to verify that parsers have been
finalized.
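
The general shape of the pattern, sketched with placeholder types (this is
not the actual TAMER API, just the "proof by construction" idea):

```rust
/// Placeholder parser; stands in for a `ParseState`-driven parser.
struct Parser;

/// Error representing a parser left in a non-accepting state.
#[derive(Debug)]
struct FinalizeError;

/// Proof that a parser has been finalized: the only way to obtain a value
/// of this type is via `finalize`, so holding one is the proof.
struct Finalized<P>(P);

impl Parser {
    fn finalize(self) -> Result<Finalized<Parser>, FinalizeError> {
        // End-of-stream checks would go here.
        Ok(Finalized(self))
    }
}

/// Operations that must only run after finalization accept the proof type
/// rather than the raw parser.
fn report(_parser: &Finalized<Parser>) {}

fn main() {
    let parser = Parser;
    let done = parser.finalize().expect("parser failed to finalize");
    report(&done);
}
```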

DEV-13158
2022-10-26 12:44:19 -04:00
Mike Gerwitz 7e62276907 tamer: Revert "tamer: diagnose::report::Report: {Mutable=>immutable} self reference"
This reverts commit 85ec626fcd804eb2fac3fd6f0339182554f72cfd.

This revert had to be modified to work alongside other changes.  Interior
mutability is fortunately no longer needed after the previous commit which
allows reporting to occur in a single place in the lowering pipeline (at the
terminal parser).

DEV-13158
2022-10-26 12:44:18 -04:00
Mike Gerwitz 1c181fe546 tamer: parse::lower: Propagate widened errors to terminal parser
The term "terminal parser" isn't formalized yet in the system, but is meant
to refer to the innermost parser that is responsible for pulling tokens
through the lowering pipeline.

This approach is more of what one would expect when dealing with
`Result`-like monads---we are effectively chaining the inner operation while
propagating errors to short-circuit lowering and let the caller decide
whether recovery ought to be permitted with diagnostic messages.  This will
become more clear as it is further refactored.
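
As a rough analogy (with made-up error types, not the pipeline's real
ones), the flow resembles ordinary `Result` chaining, where inner errors
widen into the terminal parser's error type via `From`:

```rust
/// Error from an inner lowering stage (illustrative only).
#[derive(Debug)]
enum XirError {
    Malformed,
}

/// Widened error type owned by the terminal parser.
#[derive(Debug)]
enum PipelineError {
    Xir(XirError),
}

impl From<XirError> for PipelineError {
    fn from(e: XirError) -> Self {
        PipelineError::Xir(e)
    }
}

fn lower_xir(tok: &str) -> Result<String, XirError> {
    if tok.is_empty() {
        Err(XirError::Malformed)
    } else {
        Ok(tok.to_uppercase())
    }
}

/// The terminal step pulls tokens through; `?` short-circuits lowering and
/// leaves it to the caller to decide whether to recover and report.
fn terminal(tok: &str) -> Result<String, PipelineError> {
    let lowered = lower_xir(tok)?; // XirError widens into PipelineError
    Ok(lowered)
}

fn main() {
    assert!(terminal("token").is_ok());
    assert!(terminal("").is_err());
}
```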

This also means that the previous changes for introducing interior
mutability for a shared mutable `Reporter` can be reverted, which is great,
since that approach was antithetical to how the streaming pipeline
operates (and introduces awkward mutable state into an
otherwise-mostly-immutable system).

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz 2ccdaf80fe tamer: diagnose::report: Error tracking
This extracts error tracking into the Reporter itself, which is already
shared between lowering operations.  This can then be used to display the
number of errors.

A new formatter (in tamer::fmt) will be added to handle the singular/plural
conversion in place of "error(s)" in the future; I have more important
things to work on right now.

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz f049da4496 tamer: tamec: Apply reporting (and continuing) to XirToXirf failure
Previously these errors would immediately abort.

This results in some duplicate code, but it's beginning to derive a common
implementation.  Check out the commits that follow; this is really an
intermediate refactoring state.

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz 733f44a616 tamer: diagnose::report::Report: {Mutable=>immutable} self reference
VisualReporter now uses interior mutability so that we can hold multiple
references to it for upcoming lowering pipeline changes.

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz a6e72b87f7 tamer: tamec: Extract compilation from main
Another baby step.  The small commits are intended to allow comprehension of
what changes when looking at the diffs.

This also removes a comment stating that errors do not fail compilation,
since they most certainly do.

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz 20ea83af1a tamer: tamec: Extract source reading and writing
This begins refactoring the lowering pipeline to obviate abstraction
boundaries.  The lowering pipeline is the backbone of the
system, and so it needs to become clear and self-documenting, which will
take a little bit of work.

DEV-13158
2022-10-26 12:32:51 -04:00
Mike Gerwitz 8c32967cbf tamer: Cargo.toml: Sort dependencies
This always annoys me when I add a dependency and I don't know where I ought
to put it.

Anyway, I was originally going to add the `regex` crate, but with further
planning, I may not end up having use for it.  Nonetheless, at least this is
consistent.
2022-10-18 14:48:14 -04:00
Brandon Ellis b3d8f6c4cd RELEASES.md: Update for v19.1.0 2022-09-22 12:23:13 -04:00
Brandon Ellis 00f46b0032 [DEV-12990] Add gt, gte, lt, lte operators to if/unless
This includes updating Tamer's parser to account for the new
operator possibilities.
2022-09-22 11:38:06 -04:00
Mike Gerwitz 25babde084 .gitlab-ci.yml (build): Re-add tamer/target/doc
This was accidentally removed in a previous commit to reduce artifact sizes.
2022-09-20 09:52:42 -04:00
Mike Gerwitz 80d7de7376 tamer: nir: Remove token `todo!`s
Just preparing to actually define NIR itself.  The _grammar_ has been
represented (derived from our internal systems, using them as a test case),
but the IR itself has not yet received a definition.

DEV-7145
2022-09-19 16:21:42 -04:00
Mike Gerwitz 3456bd593a tamer: tamec: Fail with non-zero status if any NIR parsing errors
This is a quick-and-dirty change.  The lowering pipeline needs a proper
abstraction, but I'm about to be on vacation at the end of the week and
would like to get NIR->AIR lowering started before I consider that
abstraction further, so this will do for now.

NIR parsing has been tested in production without failing for over a week.

DEV-7145
2022-09-19 10:11:47 -04:00
Mike Gerwitz 5403dd06c6 tamer: Provide links to `tame{c,ld}`
DEV-7145
2022-09-19 10:04:40 -04:00
Mike Gerwitz 9966b82b9d tamer: nir::parse: Grammar summary docs
This is intended to provide just enough information to help elucidate how
the system works and why.

DEV-7145
2022-09-19 09:26:38 -04:00
Mike Gerwitz dcb42b6e4b tamer: xir::parse: Improvements to generated docs for NIR attributes
This hides the internal state machine and provides better language for what
remains.

DEV-7145
2022-09-16 13:37:46 -04:00
Mike Gerwitz 1dc691160b tamer: nir: Re-define "NIR"
This was originally the "normalized" IR, but that's not possible to do
without template expansion, which is going to happen at a later point.  So,
this is just "NIR", pronounced "near", which is an IR that is "near" to the
source code.  You can define it as "Near IR" if you want, but it's just a
homonym with a not-quite-defined acronym to me.

DEV-7145
2022-09-16 09:59:38 -04:00
Mike Gerwitz f9bdcc2775 tamer: xir::parse::ele: Remove `*Error_` types
A type alias was added for BC before errors were hoisted out in a previous
commit, but they are unnecessary because of the associated type on
`ParseState`.

This also corrects the long-existing issue of using generated identifiers in
tests.

DEV-7145
2022-09-15 16:10:47 -04:00
Mike Gerwitz 071c94790f tamer: xir::ele::parse: Formatting: remove a level of indentation
This moves `paste::paste!` up a line and reduces a level of indentation,
since it's so squished.  Aside from docblock reformatting, there are no
other changes.

DEV-7145
2022-09-15 16:09:49 -04:00
Mike Gerwitz b3f4378517 tamer: xir::parse::ele: Hoist NT Display from `ele_parse!` macro
This slims out the macro even further.  It does result in an
awkwardly-placed `PhantomData` because I don't want to add another variant
that isn't actually used (since they represent states).

DEV-7145
2022-09-14 16:34:59 -04:00
Mike Gerwitz 80f29e9420 tamer: xir::parse::ele: Hoist NtState out of `ele_parse!` macro
This does the same as before with SumNtState, and takes advantage of the
preparations made by the preceding commit.  The macro is shrinking.

DEV-7145
2022-09-14 15:35:58 -04:00
Mike Gerwitz 1817659811 tamer: xir::parse::ele: Abstract child NT states in parent parser
This is in preparation for hoisting out the common states, as was done with
the Sum NT in a previous commit.

I also think that organizing states in this way is more clear.  The previous
embedding of the variants named after the NTs themselves was because the
parser was storing the child state within it, before the introduction of the
superstate trampoline.

DEV-7145
2022-09-14 14:47:54 -04:00
Mike Gerwitz d73a18d1a2 tamer: xir::parse::ele: Initial extraction of Sum NT state from macro
After introducing the superstate and trampoline some time ago, the Sum NT
states became fully generalized and can be hoisted out.

DEV-7145
2022-09-14 12:23:52 -04:00
Mike Gerwitz db3fd3f177 tamer: xir::parse::ele: Remove `unreachable!` in state transitions
This will instead fail at compile time.

DEV-7145
2022-09-14 10:00:10 -04:00
Mike Gerwitz a5c7067c68 tamer: xir::parse::ele: Remove NT `todo!` for state transition
Everything except for one state was already accounted for.  We can now have
confidence that the parser will never panic due to state transitions (beyond
legitimate error conditions).

There are some `unreachable!`s to contend with still.

DEV-7145
2022-09-14 09:41:53 -04:00
Mike Gerwitz 212ca06efe tamer: xir::parse: Extract and generalize NT errors
This is the same as the previous commits, but for non-sum NTs.

This also extracts errors into a separate module, which I had hoped to do in
a separate commit, but it's not worth separating them.  My _original_ reason
for doing so was debugging (I'll get into that below), but I had wanted to
trim down `ele.rs` anyway, since that mess is large and a lot to grok.

My debugging was trying to figure out why Rust was failing to derive
`PartialEq` on `NtError` because of `AttrParseError`.  As it turns out,
`AttrParseError::InvalidValue` was failing, thus the introduction of the
`PartialEq` trait bound on `AttrParseState::ValueError`.  Figuring this out
required implementing `PartialEq` myself without `derive` (well, using LSP,
which did all the work for me).

I'm not sure why this was not failing previously, which is a bit of a
concern, though perhaps in the context of the macro-expanded code, Rust was
able to properly resolve the types.

DEV-7145
2022-09-14 09:28:31 -04:00
Mike Gerwitz 5078bd8bda tamer: xir::parse::ele: Extract sum NT error from `ele_parse!`
The `ele_parse!` macro is a monstrosity, and expands into many different
identifiers.  The hope is that chipping away at things like this will not
only make the template easier to understand by framing portions of the
problem in terms of more traditional Rust code, but will also hopefully
reduce compile times by reducing the amount of code that is expanded by the
macro.

DEV-7145
2022-09-13 09:20:29 -04:00
Mike Gerwitz 0ac24baa87 build-aux/Makefile.am (bootstrap-if-necessary): New target
This introduces an order-only prerequisite `bootstrap-if-necessary` for the
generation of `suppliers.mk`.  Projects utilizing TAME as a dependency may
include a `bootstrap.mk` that overrides this target to trigger any
bootstrapping scripts that may be necessary due to toolchain updates.

DEV-7145
2022-09-07 11:18:42 -04:00
Mike Gerwitz 5edefde201 Makefile (bin): Target to build only binaries
Systems utilizing TAME as a build dependency are not interested in
everything else that gets built (tests and docs, primarily).

DEV-7145
2022-09-07 09:53:44 -04:00
Mike Gerwitz 1dfd5d89cb tame: bootstrap: Stop building after bootstrap
This was a relic of the old bootstrap system, where bootstrapping was in the
context of a parent project that utilized TAME (and so TAME needed to be
built).  But that doesn't make sense in the context of TAME itself, and what
_part_ of TAME should be built should be controlled by the project utilizing
it.

This is especially important now that TAMER builds are getting much longer
with the introduction of NIR and its parser-generator.

DEV-7145
2022-09-06 13:55:03 -04:00
Mike Gerwitz a712d8b279 .gitlab-ci.yml: Filter TAMER artifacts to binaries
The binaries are small, but the other data in target/ is huge (>1GiB on my
machine after numerous builds).

DEV-7145
2022-08-29 16:15:37 -04:00
Mike Gerwitz 419b24f251 tamer: Introduce NIR (accepting only)
This introduces NIR, but only as an accepting grammar; it doesn't yet emit
the NIR IR, beyond TODOs.

This modifies `tamec` to, while copying XIR, also attempt to lower NIR to
produce parser errors, if any.  It does not yet fail compilation, as I just
want to be cautious and observe that everything's working properly for a
little while as people use it, before I potentially break builds.

This is the culmination of months of supporting effort.  The NIR grammar is
derived from our existing TAME sources internally, which I use for now as a
test case until I introduce test cases directly into TAMER later on (I'd do
it now, if I hadn't spent so much time on this; I'll start introducing tests
as I begin emitting NIR tokens).  This is capable of fully parsing our
largest system with >900 packages, as well as `core`.

`tamec`'s lowering is a mess; that'll be cleaned up in future commits.  The
same can be said about `tameld`.

NIR's grammar has some initial documentation, but this will improve over
time as well.

The generated docs still need some improvement, too, especially with
generated identifiers; I just want to get this out here for testing.

DEV-7145
2022-08-29 15:52:04 -04:00
Mike Gerwitz c420ab2730 tamer: xir::parse: Correct doc xrefs
These weren't causing problems until they were output as part of NIR (in a
separate module).

NIR is about to be committed.

DEV-7145
2022-08-29 15:52:04 -04:00
Mike Gerwitz 638a9c483b tamer: xir::parse::ele: Hide internal NT enum variants
The user never sees or interacts with these; they're macro-generated, and
distract from the useful information in the generated docs.

DEV-7145
2022-08-29 15:52:04 -04:00
Mike Gerwitz 2b33a45985 tamer: xir::parse::ele: Support NT docs
This just modifies the macro to proxy attributes to generated NTs so that
they can be documented.

DEV-7145
2022-08-29 15:52:04 -04:00
Mike Gerwitz 51728545f7 tamer: xir::parse::ele: Properly handle previous state transitions
This includes when on the last state / expecting a close.

Previously, there were a couple major issues:

  1. After parsing an NT, we can't allow preemption because we must emit a
     dead state so that we can remove the NT from the stack, otherwise
     they'll never close (until the parent does) and that results in
     unbounded stack growth for a lot of siblings.  Therefore, we cannot
     preempt on `Text`, which causes the NT to receive it, emit a dead
     state, transition away from the NT, and not accept another NT of the
     same type after `Text`.

  2. When encountering an unknown element, the error message stated that a
     closing tag was expected rather than one of the elements accepted by the
     final NT.

For #1, this was solved by allowing the parent to transition back to the NT
if it would have been matched by the previous NT.  A future change may
therefore allow us to remove repetition handling entirely and allow the
parent to deal with it (maybe).

For #2, the trouble is with the parser generator macro---we don't have a
good way of knowing the last NT, and the last NT may not even exist if none
was provided.  This solution is a compromise, after having tried and failed
at many others; I desperately need to move on, and this results in the
correct behavior and doesn't sacrifice performance.  But it can be done
better in the future.

It's also worth noting for #2 that the behavior isn't _entirely_ desirable,
but in practice it is mostly correct.  Specifically, if we encounter an
unknown token, we're going to blow through all NTs until the last one, which
will be forced to handle it.  After that, we cannot return to a previous NT,
and so we've forfeited the ability to parse anything that came before it.

NIR's grammar is such that sequences are rare and, if present, there are
really only ever two NTs, and so this awkward behavior will rarely cause
practical issues.  With that said, it ought to be improved in the future,
but let's wait to see if other parts of the lowering pipeline provide more
appropriate places to handle some of these things (even though it really
ought to be handled at the grammar level).

But I'm well out of time to spend on this.  I have to move on.

DEV-7145
2022-08-29 15:52:04 -04:00
Mike Gerwitz 2fcd0b35ae core: vector/cmatch/match-* (@const@): Remove
This removes the deprecated `@const@` argument in favor of shorthand
`@value@` constants, which were introduced long ago precisely to avoid
having to define separate `@const@` parameters for all of these templates.

DEV-7145
2022-08-29 15:52:04 -04:00
Mike Gerwitz 5ee0ddd064 core: list2typedef: Include proper package namespace prefixes
Required by yet-to-be-committed TAMER grammar.

DEV-7145
2022-08-29 15:52:03 -04:00
Mike Gerwitz 8a286878f6 core: numeric/round: Add missing `item/@desc`
This was caught by TAMER using a yet-to-be-committed NIR.

DEV-7145
2022-08-22 15:02:53 -04:00
Mike Gerwitz 93fb6e78e4 core: numeric/round: Remove c:value-of/@type
This is not valid and never was; TAME just didn't validate inside templates,
unlike TAMER.

DEV-7145
2022-08-22 15:02:53 -04:00
Mike Gerwitz 6269f8de6e core: Remove @keep
"keep" is an old feature that forced the linker to retain symbols that were
unused.  This was removed long ago in favor of having all linker roots
defined by the return map.

This also removes an old `@always`, which seems like a typo for
`when="always"` or something...not entirely sure.

DEV-7145
2022-08-22 15:02:53 -04:00
Mike Gerwitz acd0aea6a4 core: Remove @accumulate
Accumulators were an ancient TAME feature removed long ago during The Great
Refactoring (...okay, that part didn't fit the definition of a "refactor",
but that's technically what that's referring to).

TAMER will not accept it.

DEV-7145
2022-08-22 15:02:52 -04:00
Mike Gerwitz 3e1bf48a45 core: Remove __DATE_YEAR__
This has not been in use for years and it's time to go away---it is the only
thing in TAME that causes nondeterminism, at least that I'm immediately
aware of.  Perhaps I'll find something else while reimplementing TAME in
TAMER.

_This does not remove the compiler code to produce this._  If something
still needs `__DATE_YEAR__` (because it's really old), it can define this
value itself, and still utilize it until TAMER (which will not include it).

DEV-7145
2022-08-22 15:02:52 -04:00
Mike Gerwitz 7466ecbe8b tamer: xir::parse::ele: Accept missing child
`ele_parse!` was recently converted to accept zero-or-more for every NT to
simplify the parser-generator, since NIR isn't going to be able to
accurately determine whether child requirements are met anyway (because of
the template system).

This ensures that `Close` can be accepted when we're expecting an
element.  It also adds a test for a scenario that's causing me some trouble
in stashed code so that I can ensure that it doesn't break.

DEV-7145
2022-08-22 09:43:59 -04:00
Mike Gerwitz 9366c0c154 tamer: xir::parse::ele: Increase parser nesting depth
This sets the maximum depth to 64, which is still arbitrary, but
unfortunately the sum types introduce multiple levels of nesting, in
particular for template applications, so nested applications can result in a
fairly large stack.

I have various ideas to improve upon that---limited a bit in that repetition
as it is currently implemented inhibits tail calls---but they're not worth
doing just yet relative to other priorities.  The impact of this change is
not significant.

DEV-7145
2022-08-18 16:16:45 -04:00
Mike Gerwitz abb2c80e22 tamer: xir::parse::ele: Always repeat
This removes support for configurable repetition.

What?  Why?

As it turns out, the complexity that repetition adds is quite significant
and is not worth the effort.  The truth is that NIR is going to have to
allow zero-or-more matches on virtually everything _anyway_ because template
application is allowed virtually anywhere---it is not possible to fully
statically analyze TAME's sources because templates can expand into just
about anything.  Given that, AIR (or something down the line) is going to
have to supply the necessary invariants instead.

It does suck, though, that this removes a lot of code that I fairly recently
wrote, and spent a decent amount of time on.  But it's important to know
when to cut your losses.

Perhaps I could have planned better, but deriving this whole system has been
quite the experiment.

DEV-7145
2022-08-18 15:19:40 -04:00
Mike Gerwitz 13d3c76a31 tamer: xir::parse::ele: Test to verify close after child recovery
Just want to be sure that we emit a closing object to match the emitted
opening one after recovery, otherwise the IR becomes unbalanced.

DEV-7145
2022-08-18 12:41:27 -04:00
Mike Gerwitz 955131217b tamer: xir::parse::ele: Attribute dead state recovery
If attributes fail to parse (e.g. missing required attribute) and parsing
reaches a dead state, this will recover by ignoring the entire element.  It
previously panicked with a TODO.

DEV-7145
2022-08-18 12:41:26 -04:00
Mike Gerwitz 77fd92bbb2 tamer: xir::parse::ele: Remove `_` suffix from error variants
These were initially used to prevent conflicts with generated variants, but
we are no longer generating such variants since they're being jumped to via
the trampoline.

DEV-7145
2022-08-17 14:58:54 -04:00
Mike Gerwitz b31ebc00a7 tamer: xir::parse::ele: Handle Close when expecting Open
I'm starting to clean up some TODOs, and this was a glaring one causing
panics when encountered.  The recovery for this is simple, because we have
no choice: just stop parsing; leave it to the next lowering operation(s) to
complain that we didn't provide what was necessary.  They'll have to,
anyway, since templates mean that NIR cannot ever have enough information to
guarantee that a document is well-formed, relative to what would expand from
the template.

DEV-7145
2022-08-17 14:49:34 -04:00
Mike Gerwitz 4c86c5b63c tamer: xir::parse::ele: Support nested Sum NTs
This allows for a construction like this:

```
ele_parse! {
  [...]

  StmtX := QN_X {
    [...]
  };

  StmtY := QN_Y {
    [...]
  };

  ExprA := QN_A {
    [...]
  };

  ExprB := QN_B {
    [...]
  };

  Expr := (ExprA | ExprB);
  Stmt := (StmtX | StmtY);

  // This previously was not allowed:
  StmtOrExpr := (Stmt | Expr);
}
```

There were initially two barriers to doing so:

  1. Efficiently matching; and
  2. Outputting diagnostic information about the union of all expected
     elements.

The first was previously resolved with the introduction of `NT::matches`,
which is macro-expanded in a way that Rust will be able to optimize a
bit.  Worst case, it's effectively a linear search, but our Sum NTs are not
that deep in practice, so I don't expect that to be a concern.

The concern that I was trying to avoid was heap-allocated `NodeMatcher`s to
avoid recursive data structures, since that would have put heap access in a
very hot code path, which is not an option.

That left problem #2, which ended up being the harder problem.  The solution
was detailed in the previous commit, so you should look there, but it
amounts to being able to format individual entries as if they were a part
of a list by making them a function of not just the matcher itself, but also
the number of items in (recursively) the sum type and the position of the
matcher relative to that list.  The list length is easily
computed (recursively) at compile-time using `const`
functions (`NT::matches_n`).
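
A rough sketch of the compile-time counting idea, using associated consts
as a stand-in for the const functions mentioned above (the names `Nt`,
`Sum2`, and `MATCHES_N` are illustrative, not the macro's output):

```rust
use std::marker::PhantomData;

/// Illustrative stand-in for an NT's QName match count.
trait Nt {
    const MATCHES_N: usize;
}

struct A;
struct B;
struct C;

impl Nt for A { const MATCHES_N: usize = 1; }
impl Nt for B { const MATCHES_N: usize = 1; }
impl Nt for C { const MATCHES_N: usize = 1; }

/// A Sum NT matches the union of what its constituents match, so its
/// count is the (recursive) sum, computed entirely at compile time.
struct Sum2<X, Y>(PhantomData<(X, Y)>);

impl<X: Nt, Y: Nt> Nt for Sum2<X, Y> {
    const MATCHES_N: usize = X::MATCHES_N + Y::MATCHES_N;
}

fn main() {
    // (A | B) | C matches three QNames in total.
    assert_eq!(<Sum2<Sum2<A, B>, C>>::MATCHES_N, 3);
}
```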

And with that, NIR can be abstracted in sane ways using Sum NTs without a
bunch of duplication that would have been a maintenance burden and an
inevitable source of bugs (from having to duplicate NT references).

DEV-7145
2022-08-17 10:44:53 -04:00
Mike Gerwitz fd3184c795 tamer: fmt (ListDisplayWrapper::fmt_nth): List display without a slice
This exposes the internal rendering of `ListDisplayWrapper::fmt` such that
we can output a list without actually creating a list.  This is used in an
upcoming change for =ele_parse!= so that Sum NTs can render the union of all
the QNames that their constituent NTs match on, recursively, as a single
list, without having to create an ephemeral collection only for display.
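
The underlying trick, sketched with illustrative names (not the actual
`ListDisplayWrapper` API): each item is rendered as a function of its
position and the total count, so a delimited list appears without ever
allocating one:

```rust
use std::fmt::{self, Display};

/// Render a single item as if it were the `pos`th of `total` entries in a
/// comma-delimited disjunction ("a, b, or c").  Illustrative only.
struct Nth<T>(T, usize, usize);

impl<T: Display> Display for Nth<T> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let Nth(item, pos, total) = self;
        if *pos == 0 {
            write!(f, "{}", item)
        } else if pos + 1 == *total {
            write!(f, ", or {}", item)
        } else {
            write!(f, ", {}", item)
        }
    }
}

fn main() {
    // Rendering the "list" one element at a time; no collection is built.
    let out: String = ["a", "b", "c"]
        .iter()
        .enumerate()
        .map(|(i, qname)| Nth(qname, i, 3).to_string())
        .collect();

    assert_eq!(out, "a, b, or c");
}
```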

If Rust supports const functions for arrays/Vecs in the future, we could
generate this at compile-time, if we were okay with the (small) cost, but
this solution seems just fine.  But output may be even _more_ performant
since they'd all be adjacent in memory.

This is used in these scenarios:

  1. Diagnostic messages;
  2. Error messages (overlaps with #1); and
  3. `Display::fmt` of the `ParseState`s themselves.

The reason that we want this to be reasonably performant is because #3
results in a _lot_ of output---easily GiB of output depending on what is
being traced.  Adding heap allocations to this would make it even slower,
since a description is generated for each individual trace.

Anyway, this is a fairly simple solution, albeit a little bit less clear,
and only came after I had tried a number of other different approaches
related to recursively constructing QName lists at compile time; they
weren't worth the effort when this was so easy to do.

DEV-7145
2022-08-17 10:44:28 -04:00
Mike Gerwitz 6b29479fd6 tamer: xir::fmt (DisplayFn): New fn wrapper
See the docblock for a description.  This is used in an upcoming commit for
=ele_parse!=.
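
Presumably something along these lines (a guess at the shape of the
abstraction, not the actual implementation): a wrapper that lets any
formatting function be used where a `Display` value is expected:

```rust
use std::fmt::{self, Display};

/// Wrap a formatting function so it can be passed anywhere `Display` is
/// expected.  Illustrative sketch only.
struct DisplayFn<F>(F)
where
    F: Fn(&mut fmt::Formatter<'_>) -> fmt::Result;

impl<F> Display for DisplayFn<F>
where
    F: Fn(&mut fmt::Formatter<'_>) -> fmt::Result,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        (self.0)(f)
    }
}

fn expected(f: &mut fmt::Formatter<'_>) -> fmt::Result {
    write!(f, "expected one of `a` or `b`")
}

fn main() {
    let msg = DisplayFn(expected);
    assert_eq!(msg.to_string(), "expected one of `a` or `b`");
}
```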

DEV-7145
2022-08-17 10:01:47 -04:00
Mike Gerwitz 4177b8ed71 tamer: xir::parse::ele: Streaming attribute parsing
This allows using a `[attr]` special form to stream attributes as they are
encountered rather than aggregating a static attribute list.  This is
necessary in particular for short-hand template application and short-hand
function application, since the attribute names are derived from template
and function parameter lists, which are runtime values.

The syntax for this is a bit odd since there's a semi-useless and confusing
`@ {} => obj` still, but this is only going to be used by a couple of NTs
and it's not worth the time to clean this up, given the rather significant
macro complexity already.

DEV-7145
2022-08-16 23:06:38 -04:00
Mike Gerwitz 43c64babb0 tamer: xir::parse::ele: Superstate element preemption
This uses the same mechanism that was introduced for handling `Text` nodes
in mixed content, allowing for arbitrary element `Open` matches for
preemption by the superstate.

This will be used to allow for template expansion virtually
anywhere.  Unlike the existing TAME, it'll even allow for it at the root,
though whether that's ultimately permitted really depends on how I
approach template expansion; it may fail during a later lowering operation.

This is interesting because this approach is only possible because of the
CPS-style trampoline implementation.  Previously, with the composition-based
approach, each and every parser would have to perform this check, like we
had to previously with `Text` nodes.

As usual, this is still adding to the mess a bit, and it'll need some future
cleanup.

DEV-7145
2022-08-16 15:47:41 -04:00
Mike Gerwitz 6f53c0971b tamer: xir::parse::ele: Superstate text node preemption
This introduces the concept of superstate node preemption generally, which I
hope to use for template application as well, since templates can appear in
essentially any (syntactically valid, for XML) position.

This implements mixed content handling by defining the mapping on the
superstate itself, which really simplifies the problem but foregoes
fine-grained text handling.  I had hoped to avoid that, but oh well.

This pushes the responsibility of whether text is semantically valid at that
position to NIR->AIR lowering (which we're not transitioning to yet), which is
really the better place for it anyway, since this is just the grammar.  The
lowering to AIR will need to validate anyway given that template expansion
happens after NIR.

Moving on!

DEV-7145
2022-08-16 12:26:24 -04:00
Mike Gerwitz 65b42022f0 tamer: xir::st: Prefix all preproc-namespaced constants with `QN_P_`
I had previously avoided this to keep names more concise, but now it's
ambiguous with parsing actual TAME sources.

DEV-7145
2022-08-15 13:00:10 -04:00
Mike Gerwitz 9f98cbf9b4 core: Remove `const/@type`
This has been optional for many years and is not actually used by the
current compiler.  TAMER can infer it, in situations where it actually
matters in the future.

So, rather than adding support for this in the new parser, let's clean up.

DEV-7145
2022-08-15 11:57:45 -04:00
Mike Gerwitz 709291b107 === COMMITS BEFORE HERE WILL NOT COMPILE ON RUST < 2022-08-10 ===
See previous commit for an explanation.  This marker is intended to be
useful while looking through commits.

This is because we utilize an unstable `int_log` feature, which is expected
to occasionally cause BC issues.
2022-08-12 16:42:30 -04:00
Mike Gerwitz 13641e1812 tamer: diagnose::report: `int_log` feature: {=>i}log10
https://github.com/rust-lang/rust/pull/100332

The above MR replaces `log10` and friends with `ilog10`; this is the first
time an unstable feature bit us in a substantially backwards-incompatible
way that's a pain to deal with.

Fortunately, I'm just not going to deal with it: this is used with the
diagnostic system, which isn't yet used by our projects (outside of me
testing), and so those builds shouldn't fail before people upgrade.

This is now pending stabilization with the new name, so hopefully we're good
now:

  https://github.com/rust-lang/rust/issues/70887#issuecomment-1210602692
2022-08-12 16:42:30 -04:00
Mike Gerwitz 2a36bc4210 tamer: (explicit_generic_args_with_impl_trait): Remove unstable feature flag
This was stabilized in Rust 1.63.  I was waiting to be sure our build
servers were updated properly before removing this (and they were, long
ago).
2022-08-12 16:42:30 -04:00
Mike Gerwitz ed8a2ce28a tamer: xir::parse::ele: Superstate not to accept early EOF
This was accepting an early EOF when the active child `ParseState` was in an
accepting state, because it was not ensuring that anything on the stack was
also accepting.

Ideally, there should be nothing on the stack, and hopefully in the future
that's what happens.  But with how things are today, it's important that, if
anything is on the stack, it is accepting.

Since `is_accepting` on the superstate is only called during finalization,
and because the check terminates early, and because the stack practically
speaking will only have a couple things on it max (unless we're in tail
position in a deeply nested tree, without TCO [yet]), this shouldn't be an
expensive check.

Implementing this did require that we expose `Context` to `is_accepting`,
which I had hoped to avoid having to do, but here we are.

DEV-7145
2022-08-12 00:47:15 -04:00
Mike Gerwitz a4419413fb tamer: parse::trace: Include context
This is something that I had apparently forgotten to do, but is now useful
in debugging `ele_parse!` issues with the trampoline.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz 54d8348e95 tamer: Add `--quiet` flag to `make check` (`cargo test`)
I wonder when this option was introduced, unless I never saw it because it
is called "quiet".  But this is what I always wanted (and how I write the
output for my own tools, like progtest in this repo); the output has long
gotten far too large.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz 22a9596cf4 tamer: xir::parse::ele: Hoist whitespace/comment handling to superstate
All child parsers do the same thing, so this simplifies things.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz f8a9e952e5 tamer: xir::parse::ele: Correct handling of sum dead state post-recovery
Along with this change we also had to change how we handle dead states in
the superstate.  So there were two problems here:

  1. Sum states were not yielding a dead state after recovery, which meant
     that parsing was unable to continue (we still have a `todo!`); and
  2. The superstate considered it an error when there was nothing left on
     the stack, because I assumed that ought not happen.

Regarding #2---it _shouldn't_ happen, _unless_ we have extra input after we
have completed parsing.  Which happens to be the case for this test case,
but more importantly, we shouldn't be panicking with errors about TAMER bugs
if somebody puts extra input after a closing root tag in a source file.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz b95ec5a9d8 tamer: xir::parse::ele: Adjust diagnostic display of expected element list
This does two things:

  1. Places the expected list on a separate help line as a footnote where
     it'll be a bit more tolerable when it inevitably overflows the terminal
     width in certain contexts (we may wrap in the future); and
  2. Removes angled brackets from the element names so that they (a) better
     correspond with the span which highlights only the element name and (b)
     do not imply that the elements take no attributes.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz 67ee914505 tamer: xir::parse::ele: Store matching QName on NS match
When we match a QName against a namespace, we ought to store the matching
QName to use (a) in error messages and (b) to make available as a
binding.  The former is necessary for sensible errors (rather than saying
that it's e.g. expecting a closing `t:*`) and the latter is necessary for
e.g. getting the template name out of `t:foo`.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz 8cb03d8d16 tamer: xir::parse::ele: Initial namespace prefix matching support
This allows matching on a namespace prefix by providing a `Prefix` instead
of a `QName`.  This works, but is missing a couple notable things (and
possibly more):

  1. Tracking the QName that is _actually_ matched so that it can be used in
     messages stating what the expected closing tag is; and
  2. Making that QName available via a binding.

This will be used to match on `t:*` in NIR.  If you're wondering how
attribute parsing is supposed to work with that (of course you're wondering
that, random person reading this)---that'll have to work differently for
those matches, since template shorthand application contains argument names
as attributes.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz f9fe4aa13b tamer: xir::st: Static namespace prefixes (c and t)
In particular, `t:*` will be recognized by NIR for short-hand template
application.  These will be utilized in an upcoming commit.

DEV-7145
2022-08-12 00:47:14 -04:00
Mike Gerwitz 88fa0688fa tamer: xir::parse::ele: Abstract node matching
This introduces `NodeMatcher`, with the intent of introducing wildcard QName
matches for e.g. `t:*` nodes.  It's not yet clear if I'll expand this to
support text nodes yet, or if I'll convert text nodes into elements to
re-use the existing system (which I had initially planned on doing, but
didn't because of the work and expense (token expansion) involved in the
conversion).

DEV-7145
2022-08-12 00:47:13 -04:00
Mike Gerwitz 7b9bc9e108 tamer: xir::parse::ele: Ignore Text nodes for now
I need to move on, and there are (a) a couple different ways to proceed that
I want to mull over and (b) upcoming changes that may influence my decision
one way or another.

DEV-7145
2022-08-12 00:47:12 -04:00
Mike Gerwitz 4aaf91a9e7 tamer: xir::parse::ele: Un-nest child parser errors
This will utilize the superstate's error object in place of nested errors,
which was the result of the previous composition-based delegation.

As you can see, all we had to do was remove the special handling of these
errors; the existing delegation setup continues to handle the types properly
with no change.  The composition continues to work for `*Attr_`.

The alternative was to box inner errors, since they're far from the hot code
path, but that's clearly unnecessary.

To be clear: this is necessary to allow for recursive grammars in
`ele_parse` without creating recursive data structures in Rust.

DEV-7145
2022-08-10 11:46:54 -04:00
Mike Gerwitz adf7baf115 tamer: xir::parse::ele: Handle comments like whitespace
Comments ought not have any more semantic meaning than whitespace.  Other
languages may have conventions that allow for various types of things in
comments, like annotations, but those are symptoms of language
limitations---we control the source language here.

DEV-7145
2022-08-10 11:46:54 -04:00
Mike Gerwitz 15e04d63e2 tamer: xir::parse::ele: Transition trampoline
This properly integrates the trampoline into `ele_parse!`.  The
implementation leaves some TODOs, most notably broken mixed text handling
since we can no longer intercept those tokens before passing to the
child.  That is temporarily marked as incomplete; see a future commit.

The introduced test `ParseState`s were to help me reason about the system
intuitively as I struggled to track down some type errors in the monstrosity
that is `ele_parse!`.  It will fail to compile if those invariants are
violated.  (In the end, the problems were pretty simple to resolve, and the
struggle was the type system doing its job in telling me that I needed to
step back and try to reason about the problem again until it was intuitive.)

This keeps around the NT states for now, which are quickly used to
transition to the next NT state, like a couple of bounces on a trampoline:

  NT -> Dead -> Parent -> Next NT

This could be optimized in the future, if it's worth doing.

This also makes no attempt to implement tail calls; that would have to come
after fixing mixed content and really isn't worth the added complexity
now.  I (desperately) need to move on, and still have a bunch of cleanup to
do.

I had hoped for a smaller commit, but that was too difficult to do with all
the types involved.

DEV-7145
2022-08-10 11:46:45 -04:00
Mike Gerwitz 233fa7de6a tamer: diagnose::panic: New module
This change introduces diagnostic messages for panics.  The intent is to be
able to use panics in situations where it is either not possible to or not
worth the time to recover from errors and ensure a consistent/sensible
system state.  In those situations, we still ought to be able to provide the
user with useful information to attempt to get unstuck, since the error is
surely in response to some particular input, and maybe that input can be
tweaked to work around the problem.
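
A hypothetical illustration of the flavor of output intended (this is not
the actual macro; the name, signature, and wording are all assumptions):

```rust
/// Panic with user-facing diagnostic context rather than a bare message.
/// Hypothetical sketch only.
macro_rules! diagnostic_panic {
    ($loc:expr, $($desc:tt)+) => {
        panic!(
            "internal error at {}: {}\nnote: this is a bug; please report it",
            $loc,
            format_args!($($desc)+),
        )
    };
}

fn main() {
    // The panic message carries the location of the offending input.
    let result = std::panic::catch_unwind(|| {
        diagnostic_panic!("foo.xml:12:3", "unexpected token `{}`", "<import>")
    });

    assert!(result.is_err());
}
```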

Ideally, invalid states are avoided using the type system and statically
verified at compile-time.  But this is not always possible, or in some cases
may be way more effort or cause way more code complexity than it's worth,
given the unlikelihood of the error occurring.

With that said, it's been interesting, over the past >10y that TAME has
existed, seeing how unlikely errors do sometimes pop up many years after
they were written.  It's also interesting to have my intuition of what is
"unlikely" challenged, but hopefully it holds generally.

DEV-7145
2022-08-09 15:20:37 -04:00
Mike Gerwitz 454b7a163f tamer: xir::parse::ele: Move repeat configuration out of Context
I had previously used `Context` to hold the parser configuration for
repetition, since that was the easier option.  But I now want to utilize the
`Context` for a stack for the superstate trampoline, and I don't want to
have to deal with the awkwardness of the repetition in doing so, since it
requires that the configuration be created during delegation, rather than
just being passed through to all child parsers.

This adds to a mess that needs cleaning up, but I'll do that after
everything is working.

DEV-7145
2022-08-08 15:23:55 -04:00
Mike Gerwitz 6bc872eb38 tamer: xir::parse::ele: Generate superstate
And here's the thing that I've been dreading, partly because of the
`macro_rules` issues involved.  But, it's not too terrible.

This module was already large and complex, and this just adds to it---it's
in need of refactoring, but I want to be sure it's fully working and capable
of handling NIR before I go spending time refactoring only to undo it.

_This does not yet use trampolining in place of the call stack._  That'll
come next; I just wanted to get the macro updated, the superstate generated,
and tests passing.  This does convert into the
superstate (`ParseState::Super`), but then converts back to the original
`ParseState` for BC with the existing composition-based delegation.  That
will go away and will then use the equivalent of CPS, using the
superstate+`Parser` as a trampoline.  This will require an explicit stack
via `Context`, like XIRF.  And it will allow for tail calls, with respect to
parser delegation, if I decide it's worth doing.

The root problem is that source XML requires recursive parsing (for
expressions and statements like `<section>`), which results in recursive
data structures (`ParseState` enum variants).  Resolving this with boxing is
not appropriate, because that puts heap indirection in an extremely hot code
path, and may also inhibit the aggressive optimizations that I need Rust to
perform to optimize away the majority of the lowering pipeline.

Once this is sorted out, this should be the last big thing for the
parser.  This unfortunately has been a nagging and looming issue for months,
that I was hoping to avoid, and in retrospect that was naive.

DEV-7145
2022-08-08 15:23:55 -04:00
Mike Gerwitz 53a689741b tamer: parse::state::ParseState::Super: Superstate concept
I'm disappointed that I keep having to implement features that I had hoped
to avoid implementing.

This introduces a "superstate" feature, which is intended really just to be
a sum type that is able to delegate to stitched `ParseState`s.  This then
allows a `ParseState` to transition directly to another `ParseState` and
have the parent `ParseState` handle the delegation---a trampoline.

This issue naturally arises out of the recursive nature of parsing a TAME
XML document, where certain statements can be nested (like `<section>`), and
where expressions can be nested.  I had gotten away with composition-based
delegation for now because `xmlo` headers do not have such nesting.

The composition-based approach falls flat for recursive structures.  The
typical naive solution is boxing, which I cannot do, because not only is
this on an extremely hot code path, but I require that Rust be able to
deeply introspect and optimize away the lowering pipeline as much as
possible.

Many months ago, I figured that such a solution would require a trampoline,
as it typically does in stack-based languages, but I was hoping to avoid
it.  Well, no longer; let's just get on with it.

This intends to implement trampolining in a `ParseState` that serves as that
sum type, rather than introducing it as yet another feature to `Parser`; the
latter would provide a more convenient API, but it would continue to bloat
`Parser` itself.  Right now, only the element parser generator will require
use of this, so if it's needed beyond that, then I'll debate whether it's
worth providing a better abstraction.  For now, the intent will be to use
the `Context` to store a stack that it can pop off of to restore the
previous `ParseState` before delegation.
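
In very rough terms (toy types only; the real superstate is generated by
`ele_parse!` and drives `ParseState` delegation), the shape is a flat sum
type plus an explicit stack in place of recursive states:

```rust
/// Toy "superstate": a sum of all states.  No state contains another, so
/// there is no recursive type and no boxing.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SuperState {
    Root,
    Section, // nestable, e.g. `<section>` within `<section>`
}

enum Tok {
    OpenSection,
    CloseSection,
    Text,
}

/// Mutable context holding the explicit return stack (the trampoline).
#[derive(Default)]
struct Ctx {
    stack: Vec<SuperState>,
}

fn step(st: SuperState, tok: Tok, ctx: &mut Ctx) -> SuperState {
    match tok {
        // Descend: remember where to return, then jump to the child state.
        Tok::OpenSection => {
            ctx.stack.push(st);
            SuperState::Section
        }
        // Leaf tokens are handled in place.
        Tok::Text => st,
        // Ascend: pop the stored parent rather than unwinding a call stack.
        Tok::CloseSection => ctx.stack.pop().unwrap_or(SuperState::Root),
    }
}

fn main() {
    let mut ctx = Ctx::default();
    let mut st = SuperState::Root;

    for tok in [
        Tok::OpenSection,
        Tok::OpenSection,
        Tok::Text,
        Tok::CloseSection,
        Tok::CloseSection,
    ] {
        st = step(st, tok, &mut ctx);
    }

    assert_eq!(st, SuperState::Root);
    assert!(ctx.stack.is_empty());
}
```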

DEV-7145
2022-08-08 15:23:54 -04:00
Mike Gerwitz 7a5f731cac tamer: tameld: XIRF nesting 64=>4
Since we'll never be reading past the header, this is all that is needed.

If in the future this is violated, XIRF will cause a nice diagnostic error
displaying precisely what opening tag caused the increased level of nesting,
which will aid in debugging and allow us to determine if it ought to be
increased.  Here's an example, if I set the max to `3`:

  error: maximum XML element nesting depth of `3` exceeded
     --> /home/.../foo.xmlo:261:10
      |
  261 |          <preproc:sym-ref name=":_vproduct:vector_a"/>
      |          ^^^^^^^^^^^^^^^^ error: this opening tag increases the level of nesting past the limit of 3

Of course, the longer-term goal is to do away with `xmlo` entirely.

This had no (perceivable via `/usr/bin/time -v`, at least) impact on memory
or CPU time.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 77efefe680 tamer: xir::attr::parse: Better parser state descriptions
The attribute name was neither quoted nor `@`-prefixed.  (I noticed this in
the traces.)

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 2d117a4864 tamer: xir::parse::ele: Mixed content parsing
"Mixed content" is the XML term representing element nodes mixed with text
nodes.  For example, `foo <strong>bar</strong> baz` is mixed.

TAME supports text nodes as documentation, intended to be in a literate
style but never fully realized.  In any case, we need to permit them, and I
wanted to do more than just ignore the nodes.

This takes a different approach than typical parser delegation---it has the
parent parser _preempt_ the child by intercepting text before delegation
takes place, rather than having the child reject the token (or possibly
interpret it itself!) and have to handle an error or dead state.

And while this makes it more confusing in terms of state machine stitching,
it does make sense, in the sense that the parent parser is really what
"owns" the text node---the parser is delegating _element_ parsing only, take
asserts authority when necessary to take back control where it shouldn't be
delegated.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 8779abe2bb tamer: xir::flat: Expose depth for all node-related tokens
Previously a `Depth` was provided only for `Open` and `Close`.  This depth
information, for example, will be used by NIR to quickly determine whether a
given parser ought to assert ownership of a text/comment token rather than
delegating it.

This involved modifying a number of test cases, but it's worth repeating in
these commits that this is intentional---I've been bit in the past using
`..` in contexts where I really do want to know if variant fields change so
that I can consider whether and how that change may affect the code
utilizing that variant.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz b3c0bdc786 tamer: xir::parse::ele: Ignore whitespace around elements
Recent changes regarding whitespace were all to support this change (though
it was also needed for XIRF, pre- and post-root).

Now I'll have to contend with how I want to handle text nodes in various
circumstances, in terms of `ele_parse!`.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 8f3301431c tamer: span::dummy: New module to hold DUMMY_SPAN and derivatives
Various DUMMY_SPAN-derived spans are used by many test cases, so this
finally extracts them---something I've been meaning to do for some time.

This also places DUMMY_SPAN behind a `cfg(test)` directive to ensure that it
is _only_ used in tests; UNKNOWN_SPAN should be used when a span is actually
unknown, which may also be the case during development.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 0edb21429d tamer: parse::error: Describe unexpected token of input
When Parser has an unhandled dead state and fails due to an unexpected token
of input, we should display what we interpreted that token as.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 18803ea576 tamer: xir: Format tokens without tt quotes
Whether or not quoting is appropriate depends on context, and that parent
context is already performing the quoting.  For example:

  error: expected `</rater>`, but found `<import>`
    --> /home/[...]/foo.xml:2:1
     |
   2 | <rater xmlns="http://www.lovullo.com/rater"
     | ------ note: element starts here

    --> /home/[...]/foo.xml:7:3
     |
   7 |   <import package="/rater/core/base" />
     |   ^^^^^^^ error: expected `</rater>`

In these cases (obviously I'm still working on the parser, since this is
nonsense), the parser is responsible for quoting the token "<import>".

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 8778976018 tamer: xir::flat: Ignore whitespace both before and after root
DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 4f2b27f944 tamer: xir: Attribute error formatting/typo fixes
There were two problematic errors: one showing "element element" and one showing
the value along with the name of the attribute.

The change for `<Attr as Display>::fmt` is debatable.  I'm going to do this
for now (only show `@name`) and adjust later if necessary.

I'll need to go use `crate::fmt` consistently in previously-existing format
strings at some point, too.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz 41b41e02c1 tamer: Xirf::Text refinement
This teaches XIRF to optionally refine Text into RefinedText, which
determines whether the given SymbolId represents entirely whitespace.

This is something I've been putting off for some time, but now that I'm
parsing source language for NIR, it is necessary, in that we can only permit
whitespace Text nodes in certain contexts.

The idea is to capture the most common whitespace as preinterned
symbols.  Note that this heuristic ought to be determined from scanning a
codebase, which I haven't done yet; this is just an initial list.

The fallback is to look up the string associated with the SymbolId and
perform a linear scan, aborting on the first non-whitespace character.  This
combination of checks should be sufficiently performant for now considering
that this is only being run on source files, which really are not all that
large.  (They become large when template-expanded.)  I'll optimize further
if I notice it show up during profiling.
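
As a minimal sketch of that combination of checks (operating on plain
strings here rather than on interned `SymbolId`s):

  fn is_whitespace(preinterned_ws: &[&str], text: &str) -> bool {
      // Fast path: the most common whitespace strings are preinterned.
      if preinterned_ws.contains(&text) {
          return true;
      }

      // Fallback: linear scan that aborts on the first non-whitespace
      // character (`all` short-circuits).
      text.chars().all(char::is_whitespace)
  }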

This also frees XIR itself from being concerned by Whitespace.  Initially I
had used quick-xml's whitespace trimming, but it messed up my span
calculations, and those were a pain in the ass to implement to begin with,
since I had to resort to pointer arithmetic.  I'd rather avoid tweaking it.

tameld will not check for whitespace, since it's not important---xmlo files,
if malformed, are the fault of the compiler; we can ignore text nodes except
in the context of code fragments, where they are never whitespace (unless
that's also a compiler bug).

Onward and yonward.

DEV-7145
2022-08-01 15:01:37 -04:00
Mike Gerwitz b38c16fd08 tamer: parse::trace: Generalize reason for trace output
The trace outputs a note in the footer indicating _why_ it's being output,
so that the reader understands both where the potentially-unexpected
behavior originates from and so they know (in the case of the feature flag)
how to inhibit it.

That information originally lived in `Parser`, where the `cfg` directive to
enable it lives, but it was moved into the abstraction.  This corrects that.

DEV-7145
2022-08-01 15:01:12 -04:00
Corey Vollmer 864f50c025 [DEV-9619] Support all UTF-8 characters 2022-07-27 12:58:00 -04:00
Corey Vollmer 2901f06318 [DEV-9619] Return sha256
This fixes the implementation of sha256 to be compatible with our
system.
2022-07-27 12:55:17 -04:00
Mike Gerwitz 17327f1b64 tamer: parse::trace: Extract tracing into new module
This has gotten large and was cluttering `feed_tok`.  This also provides the
ability to more easily expand into other types of tracing in the future.

DEV-7145
2022-07-26 09:29:17 -04:00
Mike Gerwitz 8f25c9ae0a tamer: parse::parser: Include object in parser trace
This information is likely redundant in a lowering pipeline, but is more
useful outside of such a pipeline.  It's also more clear.

`Object` does not implement `Display`, though, because that's too burdensome
for how it's currently used.  Many `Object`s are also `Token`s though and,
if fed to another `Parser` for lowering, it'll get `Display::fmt`'d.

DEV-7145
2022-07-26 09:28:39 -04:00
Mike Gerwitz 4b5e51b0f0 tamer: parse::parser::Parser::feed_tok: cfg note precedence
Rust was warning that `cfg` was unused if both `test` and
`parser-trace-stderr` were enabled.  This both allows that and adjusts the
precedence to make more sense for tests.

DEV-7145
2022-07-26 09:28:39 -04:00
Mike Gerwitz c3dfcc565c tamer: parse::parser::Parser: Include errors in parse trace
Because of recovery, the trace otherwise paints a really confusing-looking
picture when given unexpected input.

This is large enough now that it really ought to be extracted from
`feed_tok`, but I'll wait to see how this evolves further.  I considered
adding color too, but it's not yet clear to me that the visual noise will be
all that helpful.

DEV-7145
2022-07-26 09:28:37 -04:00
Corey Vollmer f667a1a58e [DEV-9619] Update sha256 script to handle UTF8
This commit replaces the sha256 script with a newer implementation which supports all UTF8 characters.

https://github.com/emn178/js-sha256/blob/master/src/sha256.js

Note that this commit breaks the system; the following commit fixes
this.
2022-07-22 08:46:35 -04:00
Mike Gerwitz 422f3d9c0c tamer: New parser-trace-stderr feature flag
This flag allows toggling the parser trace that was previously only
available to tests.  Unfortunately, at the time of writing, Cargo cannot
enable flags in profiles, so I have to check for either `test` or this flag
being set to enable relevant features.
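
The check itself amounts to something like this (a sketch, not the actual
trace code):

  fn trace(msg: &str) {
      // Compiled in when running tests _or_ when the flag is enabled.
      #[cfg(any(test, feature = "parser-trace-stderr"))]
      eprintln!("[trace] {msg}");

      // Silence the unused-parameter warning when the trace is compiled out.
      #[cfg(not(any(test, feature = "parser-trace-stderr")))]
      let _ = msg;
  }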

This trace is useful as I start to run the parser against existing code
written in TAME so that our existing systems can help to guide my
development.  Unlike the current tests, it also allows seeing real-world
data as part of the lowering pipeline, where multiple `Parser`s are in
play.

Having this feature flag also makes this feature more easily discoverable to
those wishing to observe how the lowering pipeline works.

DEV-7145
2022-07-21 22:10:08 -04:00
Mike Gerwitz de35cc37fd tamer: xir::writer::XmlWriter: Do not take Token ownership
impl for `&Token` instead of Token; the writer is just copying data into the
destination stream anyway.

This will allow us to continue writing the token while also using it for
further processing, like `tee`.
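
Roughly (hypothetical trait and types, simplified well beyond the real
`XmlWriter`):

  trait XmlWrite {
      fn write(&self, out: &mut String);
  }

  struct Token(String);

  // Implemented for the reference, not the owned value...
  impl XmlWrite for &Token {
      fn write(&self, out: &mut String) {
          out.push_str(&self.0);
      }
  }

  // ...so a caller can write the token and still hand it off afterward:
  fn tee(tok: Token, out: &mut String, next: impl FnOnce(Token)) {
      (&tok).write(out);
      next(tok);
  }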

DEV-7145
2022-07-21 15:29:55 -04:00
Mike Gerwitz 0504788a16 tamer: xir::parse::ele: Visibility specifier
We need to be able to export generated identifiers.  Trying to figure out a
syntax for this was a bit tricky considering how much is generated, so I
just settled on something that's reasonably clear and easy to parse with
`macro_rules!`.

I had intended to just make everything public by default and encapsulate
using private modules, but that then required making everything else that it
uses public (e.g. error and token objects), which would have been a bizarre
thing to do in e.g. test cases.

DEV-7145
2022-07-21 14:56:43 -04:00
Mike Gerwitz acced76788 tamer: xir::parse::ele: Expand types for external expansion for sum NT
Like a previous commit, this corrects the types for sum NTs so that they
properly resolve in contexts external to xir::parse.

DEV-7145
2022-07-21 13:44:30 -04:00
Mike Gerwitz 992c000b68 tamer: xir::parse::ele: AttrValueError for attr_parse!'s ValueError
This integrates the previous ValueError for `attr_parse!` into
`ele_parse!`.

DEV-7145
2022-07-21 09:23:34 -04:00
Mike Gerwitz 3a764d111e tamer: xir::parse::attr: Fallible value parsing
Values can be parsed using `TryFrom<Attr>`.  Previously only `From<Attr>`
was supported, which could not fail.

This is critical for parsing values into types, which will wrap `SymbolId`
to provide data assurances.
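
For example, a wrapping type might validate on conversion instead of merely
copying the value (a sketch with simplified stand-in types, not the real
`Attr`):

  struct Attr {
      name: String,
      value: String,
  }

  // A wrapper that provides data assurances about its contents.
  struct PkgPath(String);

  impl TryFrom<Attr> for PkgPath {
      type Error = String;

      fn try_from(attr: Attr) -> Result<Self, Self::Error> {
          if attr.value.starts_with('/') {
              Ok(PkgPath(attr.value))
          } else {
              Err(format!("invalid package path in @{}", attr.name))
          }
      }
  }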

DEV-7145
2022-07-21 09:23:11 -04:00
Mike Gerwitz 184ff6bdcc tamer: xir::parse: Fixes for {ele,attr}_parse! outside of module
The tests had certain things in scope, but now that I'm trying to use it
outside of those modules, some fixes are needed.

This is admittedly a sloppy commit, with a number of miscellaneous fixes.  I
didn't bother separating it more because most of them are type fixes, and
the `From<Attr>` stuff is going to have to change into, likely,
`TryFrom<Attr>` so that parse failures can occur when attributes do not
match certain patterns.

DEV-7145
2022-07-20 15:40:28 -04:00
Mike Gerwitz e517e15a29 tamer: parse::Token: Swap trait method order
This just places `ir_name` first in the trait definition so that it'll be
inserted in that same order when using LSP.

DEV-7145
2022-07-20 13:58:44 -04:00
Mike Gerwitz c856fd72d9 tamer: xir::parse::ele: Diagnostic output
The only additional information needed was opening spans so that we can
provide useful information regarding closing tags.

This uses a generic Span in place of {Open,Close}Span because the latter
wasn't necessary, but more descriptive types would be nice; it may be
beneficial later on to introduce newtypes for each of the spans generated by
{Open,Close}Span.

DEV-7145
2022-07-20 12:17:15 -04:00
Mike Gerwitz ce765d3b56 tamer: xir::parse::attr: Error and recovery on duplicate attr
This was a TODO for the attribute parser generator.  The first attribute
will be kept and later ones will be ignored, producing an error.  Recovery
permits further attribute parsing having ignored the duplicate.

DEV-7145
2022-07-20 12:16:13 -04:00
Mike Gerwitz 21dfff0110 tamer: xir::parse::attr::test: Extract into own file
It's not going to be getting any smaller.

DEV-7145
2022-07-20 10:02:41 -04:00
Mike Gerwitz 1ec9c963fd tamer: xir::parse::ele: Nonterminal repetition (Kleene star)
This allows an element to be repeated by the parent NT.  The easiest way I
saw to implement this for now was to abuse the Context to provide a runtime
configuration that would allow the state machine to reset after it has
completed parsing.

This also influences error recovery, in that if we're expecting zero or more
of something, we cannot provide an error for an unexpected name, and instead
must emit a dead state so that the caller can determine what to do.

DEV-7145
2022-07-19 16:14:12 -04:00
Mike Gerwitz e73c223a55 tamer: parser::Parser: cfg(test) tracing
This produces useful parse traces that are output as part of a failing test
case.  The parser generator macros can be a bit confusing to deal with when
things go wrong, so this helps to clarify matters.

This is _not_ intended to be machine-readable, but it does show that it
would be possible to generate machine-readable output to visualize the
entire lowering pipeline.  Perhaps something for the future.

I left these inline in Parser::feed_tok because they help to elucidate what
is going on, just by reading what the trace would output---that is, it helps
to make the method more self-documenting, albeit a tad bit more
verbose.  But with that said, it should probably be extracted at some point;
I don't want this to set a precedent where composition is feasible.

Here's an example from test cases:

  [Parser::feed_tok] (input IR: XIRF)
  |  ==> Parser before tok is parsing attributes for `package`.
  |   |  Attrs_(SutAttrsState_ { ___ctx: (QName(None, LocalPart(NCName(SymbolId(46 "package")))), OpenSpan(Span { len: 0, offset: 0, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10)), ___done: false })
  |
  |  ==> XIRF tok: `<unexpected>`
  |   |  Open(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1))
  |
  |  ==> Parser after tok is expecting opening tag `<classify>`.
  |   |  ChildA(Expecting_)
  |   |  Lookahead: Some(Lookahead(Open(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1))))
  = note: this trace was output as a debugging aid because `cfg(test)`.

  [Parser::feed_tok] (input IR: XIRF)
  |  ==> Parser before tok is expecting opening tag `<classify>`.
  |   |  ChildA(Expecting_)
  |
  |  ==> XIRF tok: `<unexpected>`
  |   |  Open(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1))
  |
  |  ==> Parser after tok is attempting to recover by ignoring element with unexpected name `unexpected` (expected `classify`).
  |   |  ChildA(RecoverEleIgnore_(QName(None, LocalPart(NCName(SymbolId(82 "unexpected")))), OpenSpan(Span { len: 0, offset: 1, ctx: Context(SymbolId(1 "#!DUMMY")) }, 10), Depth(1)))
  |   |  Lookahead: None
  = note: this trace was output as a debugging aid because `cfg(test)`.

DEV-7145
2022-07-19 14:44:18 -04:00
Mike Gerwitz f462c7daec tamer: xir::parse::attr: Display: element name
This resolves a TODO by including the name of the element whose attributes
are currently being parsed.

This also frees a parent from having to provide additional context, allowing
Display to be fully delegated when stitching.

DEV-7145
2022-07-18 14:43:29 -04:00
Mike Gerwitz 2f4c20dac8 tamer: xir::parse::ele: Remaining Display::fmt for nonterminals
The following commit (test tracing) requires non-panicking `Display` and
`Debug` values.

DEV-7145
2022-07-18 14:31:42 -04:00
Mike Gerwitz cf2cd882ca tamer: xir::parse::ele: Introduce sum nonterminals
This introduces `Nt := (A | ... | Z);`, where `Nt` is the name of the
nonterminal and `A ... Z` are the inner nonterminals---it produces a parser
that provides a choice between a set of nonterminals.

This is implemented efficiently by understanding the QName that is accepted
by each of the inner nonterminals and delegating that token immediately to
the appropriate parser.  This is a benefit of using a parser generator macro
over parser combinators---we do not need to implement backtracking by
letting inner parsers fail, because we know ahead of time exactly what
parser we need.

This _does not_ verify that each of the inner parsers accept a unique QName;
maybe at a later time I can figure out something for that.  However, because
this compiles into a `match`, there is no ambiguity---like a PEG parser,
there is precedence in the face of an ambiguous token, and the first one
wins.  Consequently, tests would surely fail, since the latter wouldn't be
able to be parsed.
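
Conceptually, the generated dispatch is nothing more than this
(illustrative element names; the real generated code also threads spans,
state, and recovery):

  fn dispatch(qname: &str) -> Result<&'static str, String> {
      match qname {
          // Each inner NT's accepted QName is known statically, so the
          // token is delegated immediately; no backtracking is needed.
          "classify" => Ok("ClassifyNt"),
          "rate" => Ok("RateNt"),
          // PEG-like precedence: the first arm wins; anything else is
          // rejected with the full list of expected names.
          other => Err(format!(
              "expected `classify` or `rate`, found `{other}`"
          )),
      }
  }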

This also demonstrates how we can have good error suggestions for this
parsing framework: because the inner nonterminals and their QNames are known
at compile time, error messages simply generate a list of QNames that are
expected.

The error recovery strategy is the same as previously noted, and subject to
the same concerns, though it may be more appropriate here: it is desirable
for the inner parser to fail rather than retrying, so that the sum parser is
able to fail and, once the Kleene operator is introduced, retry on another
potential element.  But again, that recovery strategy may happen to work in
some cases, but'll fail miserably in others (e.g. placing an unknown element
at the head of a block that expects a sequence of elements would potentially
fail the entire block rather than just the invalid one).  But more to come
on that later; it's not critical at this point.  I need to get parsing
completed for TAME's input language.

DEV-7145
2022-07-14 15:12:57 -04:00
Mike Gerwitz 1fdfc0aa4d tamer: xir::parse::ele: Introduce open/close span bindings
This adds the ability to bind identifiers to represent `OpenSpan` and
`CloseSpan`, available to the `@` and `/` maps.  Since identifiers in TAME
originate from attributes, this may not get a whole lot of use, but it's
important to be available.

There is some awkwardness in that the opening span appears to be scoped to
the entire nonterminal, but it's actually only available in the `@`
mapping.  I'll change this if it's actually needed; this keeps things simple
for now.

DEV-7145
2022-07-13 23:42:51 -04:00
Mike Gerwitz cceb8c7fb9 tamer: xir::parse::ele: Initial Close mapping support
Since the parsers produce streaming IRs, we need to be able to emit tokens
representing closing delimiters, where they are important.

This notably doesn't use spans; I'll add those next, since they're also
needed for the previous work.

DEV-7145
2022-07-13 15:02:46 -04:00
Mike Gerwitz c30c0e268d tamer: xir::parse::ele::test: TODO regarding recovery strategy
The comment explains the issue.  I don't think the strategy is going to be a
desirable one, but I want to move on and observe in retrospect how it ought
to be handled.

The important part right now is that recovery is accounted for and possible,
which was a long-standing concern.

DEV-7145
2022-07-13 14:25:25 -04:00
Mike Gerwitz 73efc59582 tamer: xir::parse::ele: Initial element parser generator concept
This begins generating parsers that are capable of parsing elements.  I need
to move on, so this abstraction isn't going to go as far as it could, but
let's see where it takes me.

This was the work that required the recent lookahead changes, which has been
detailed in previous commits.

This initial support is basic, but robust.  It supports parsing elements
with attributes and children, but it does not yet support the equivalent of
the Kleene star (`*`).  Such support will likely be added by supporting
parsers that are able to recurse on their own definition in tail position,
which will also require supporting parsers that do not add to the stack.

This generates parsers that, like all the other parsers, use enums to
provide a typed stack.  Stitched parsers produce a nested stack that is
always bounded in size.  Fortunately, expressions---which can nest
deeply---do not need to maintain ancestor context on the stack, and so this
should work fine; we can get away with this because XIRF ensures proper
nesting for us.  Statements that _do_ need to maintain such context are not
nested.
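
The "typed stack" amounts to each parser's state enum embedding the state
of whatever it delegates to, so the whole stack is a single statically
sized value (hypothetical names):

  enum AttrState { Expecting, Done }
  enum ClassifyState { Attrs(AttrState), Body, Closed }
  enum PackageState { Expecting, Classify(ClassifyState), Done }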

This also does not yet support emitting an object on closing tag, which
will be necessary for NIR, which will be a streaming IR that is "near" to
the source XML in structure.  This will then be used to lower into AIR for
the ASG, which gives structure needed for further analysis.

More information to come; I just want to get this committed to serve as a
mental synchronization point and clear my head, since I've been sitting on
these changes for so long and have to keep stashing them as I tumble down
rabbit holes covered in yak hair.

DEV-7145
2022-07-13 14:08:47 -04:00
Mike Gerwitz c9b3b84f90 tamer: parse::transition::Lookahead: ParseState=>Token type param
Having the lookahead token generic over the `ParseState` was a pain in the
ass for stitching, since they shared the same token type but not the same
parser.  I don't expect there to be any need to be able to infer other
parser-related types for a token of lookahead, so I'd rather just make my
life easier until such a thing is needed.

DEV-7145
2022-07-13 10:13:35 -04:00
Mike Gerwitz bd783ac08b tamer: Replace ParseStatus::Dead with generic lookahead
Oh what a tortured journey.  I had originally tried to avoid formalizing
lookahead for all parsers by pretending that it was only needed for dead
state transitions (that is---states that have no transitions for a given
input token), but then I needed to yield information for aggregation.  So I
added the ability to override the token for `Dead` to yield that, in
addition to the token.  But then I also needed to yield lookahead for error
conditions.  It was a mess that didn't make sense.

This eliminates `ParseStatus::Dead` entirely and fully integrates the
lookahead token in `Parser` that was previously implemented.

Notably, the lookahead token is encapsulated in `TransitionResult` and
unavailable to `ParseState` implementations, forcing them to rely on
`Parser` for recursion.  This not only prevents `ParseState` from recursing,
but also simplifies delegation by removing the need to manually handle
tokens of lookahead.
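
The shape of the change, roughly (simplified stand-in types):

  struct Lookahead<T>(T);

  struct TransitionResult<S, T> {
      // The state to transition to...
      next: S,
      // ...and an optional token of lookahead that only `Parser` ever
      // sees; `ParseState` impls cannot reach it, and so cannot recurse.
      lookahead: Option<Lookahead<T>>,
  }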

The awkward case here is XIRT, which does not follow the streaming parsing
convention, because it was conceived before the parsing framework.  It needs
to go away, but doing so right now would be a lot of work, so it has to
stick around for a little bit longer until the new parser generators can be
used instead.  It is a persistent thorn in my side, going against the grain.

`Parser` will immediately recurse if it sees a token of lookahead with an
incomplete parse.  This is because stitched parsers will frequently yield a
dead state indication when they're done parsing, and there's no use in
propagating an `Incomplete` status down the entire lowering pipeline.  But,
that does mean that the toplevel is not the only thing recursing.  _But_,
the behavior doesn't really change, in the sense that it would infinitely
recurse down the entire lowering stack (though there'd be an opportunity to
detect that).  This should never happen with a correct parser, but it's not
worth the effort right now to try to force such a thing with Rust's type
system.  Something like TLA+ is better suited here as an aid, but it
shouldn't be necessary with clear implementations and proper test
cases.  Parser generators will also ensure such a thing cannot occur.

I had hoped to remove ParseStatus entirely in favor of Parsed, but there's a
lot of type inference that happens based on the fact that `ParseStatus` has
a `ParseState` type parameter; `Parsed` has only `Object`.  It is desirable
for a public-facing `Parsed` to not be tied to `ParseState`, since consumers
need not be concerned with such a heavy type; however, we _do_ want that
heavy type internally, as it carries a lot of useful information that allows
for significant and powerful type inference, which in turn creates
expressive and convenient APIs.

DEV-7145
2022-07-12 00:11:45 -04:00
Mike Gerwitz 61ce7d3fc7 tamer: parse::state::transition: Extract module into own file
That's it.  Just preparing for changes that will change how lookaheads and
dead state transitions will work.

DEV-7145
2022-07-07 12:47:31 -04:00
Mike Gerwitz e54f93b30f tamer: parse: Introduce lookahead token in Parser
*NB: This is the initial change to introduce the token of lookahead, but this
does not fully integrate it.  In particular, this is missing from the
stitching/delegation layer.*

This has been a long time coming, I suppose, though I had tried to avoid it
with `Parser::delegate_lookahead`.  But the problem with doing that is that
it forced the ParserState to recurse, which both violates that I want no
looping constructs except for the toplevel, and performs additional stack
allocation as it is not in tail position.

The final straw was having to both return an error _and_ an aggregate object
for the attribute parser when an unexpected element is encountered (this
code is not yet committed).  One option was to add a recovery object to the
error object, and formalize that, but then we have other concerns; for
example, what if that recovery object triggered an error?  We'd have to mask
either the old or the new error.  But we wouldn't want to mask either,
because the object causing the error would be the aggregate attributes,
which is _not_ a recovery object, but actual data we want to emit.  And so
it's a kluge right off of the bat.

The use of a token of lookahead is a more traditional approach and has uses
outside of just this one scenario.  It'll also allow for the removal of
recursion from the existing ParserStates, and possibly the elimination of
dead state associated data, though I may end up leaving that; more to come.

Rust will also optimize away lookahead storage and processing in Parsers
that do not utilize it.

DEV-7145
2022-07-07 11:19:55 -04:00
Mike Gerwitz 6385270fe6 tamer: Ensure debug_assert! takes effect in test profile
I'd feel rather silly if I used `debug_assert!` for the sake of tests and
they weren't actually being run due to optimization settings.

This is just to catch potential future regressions; all is well today.

DEV-7145
2022-07-05 14:59:35 -04:00
Mike Gerwitz 40c68d3e1e tamer: parse::state::TransitionResult: Make opaque
There was only one test outside of the `parse` module using these
fields.  The next commit will be introducing lookahead, and I do not want to
have to trust callers to ensure invariants are met.

DEV-7145
2022-07-05 14:12:06 -04:00
Mike Gerwitz a16a0d9138 Revert "tamer: xir: Initial re-introduction of AttrEnd"
This reverts commit b973d36862.

Alright, I'm getting sick of fighting with myself on this.  But rather than
just removing the last commit, I'm going to keep it around, so that my
thoughts are clearly documented for my future quarrels with myself.

Firstly: this added more overhead than I wanted it to.  While it wasn't
significant, it did add 100--150ms to one of our largest systems, up from
~2.8s, which seems a bit much for a token that's really just meant to make
life easier for the parser.

Further, it seems that all I've managed to do is push my original problem to
a different layer---this started as a means to resolve having to emit both
an object and an error simultaneously in the case where aggregate attribute
parsing has completed, but we encounter an error on the next token (e.g. an
unexpected element).  But XIRF, if it's missing AttrEnd, should throw an
error, but should also recover.  Recovery is easy---just assume that it was
present---_but then we don't emit a XIRF `AttrEnd` token_, which is
necessary for downstream systems.  So we'd need to either:

  (a) emit both a token and an error; or
  (b) panic.

But if we're doing (a), then the need for `AttrEnd` goes away, because it
solves the original problem (though the other concerns of the previous
commit still stand).  (b) is not ideal at all, even though the missing token
does represent an internal system error; it's not something the user can
correct.  But, given that it's something that the user cannot correct,
doesn't that imply that it's an awkward thing to include in the token
stream?  So back to `AttrEnd` being an awkward PITA to have.

So, given (a), I'll just do that: errors will become more of a "hey, this
error just occurred, but I'm trying to recover---here's an object that you
should use if you choose to continue parsing, but it may or may not be what
you're looking for; proceed with caution".  That flips the original script:
I imagined having external systems feed recovery tokens, but this
encapsulates recovery within the parser, which really is more appropriate,
though less flexible than having an omniscient external recovery system;
such a monolith was always an awkward concept and would be difficult to
implement cleanly.

This can also potentially be implemented as a generalization of the Dead
state change that allowed an object to be emitted alongside the
lookahead/error.

Anyway, back to where I was...I'm sure I'll look back on this in the future
shaking my head, reflecting on how naive I was.

DEV-7145
2022-06-29 11:25:44 -04:00
Mike Gerwitz b973d36862 tamer: xir: Initial re-introduction of AttrEnd
AttrEnd was initially removed in
0cc0bc9d5a (and the commit prior), because
there was not a compelling reason to use it over a lookahead
operation (returning a token via the a dead state transition); `AttrEnd`
simply introduced inconsistencies between the XIR reader (which produced
AttrEnd) and internal XIR stream generators (e.g. the lowering operations
into XIR->XML, which do not).

But now that parsers are performing aggregation---in particular the
attribute parser-generator `xir::parse::attr`---this has become quite a
pain, because the dead state is an actionable token.  For example:

  1. Open
  2. Attr
  3. Attr
  4. Open
  5. ...

In the happy case, token #4 results in `Parsed::Incomplete`, and so can just
be transformed into the object representing the aggregated attributes.  But
even in this happy path, it's ugly, and it requires non-tail recursion on
the parser which requires a duplicate stack allocation for the
`ParserState`.  That violates a core principle of the system.

But if there is an error at #4---e.g. an unexpected element---then we no
longer have a `Parsed::Incomplete` to hijack for our own uses, and we'd have
to introduce the ability to return both an error and a token, or we'd have
to introduce the ability to keep a token of lookahead instead of reading
from the underlying token stream, but that's complicated with push parsers,
which are used for parser composition.  Yikes.

And furthermore, the aggregation has caused me to introduce the ability to
override the dead state type to introduce both a token of lookahead and
aggregation information.  This complicates the system and is going to be
confusing to others.

Given all of this, AttrEnd does now seem appropriate to reintroduce, since
it will allow processing of aggregate operations when encountering that
token without having to worry about the above scenario; without having to
duplicate a `ParseState` stack; without having to hijack dead state
transitions for producing our aggregate object; and everything else
mentioned above.

This commit does not modify those abstractions to use AttrEnd yet; it
re-introduces the token to the core system, not the parser-generators, and
it doesn't yet replace lookahead operations in the parsers that use
them.  That'll come next.  Unlike the commit that removed it, though, we are
now generating proper spans, so make note of that here.  This also does not
introduce the concept to XIRF yet, which did not exist at the time that it
was removed, so XIRF is filtering it out until a following commit.

DEV-7145
2022-06-29 11:02:02 -04:00
Mike Gerwitz 9276d00456 tamer: Cargo.toml: Remove lazy_static
This is no longer needed after the previous commit, with static spans
having been replaced by `const` spans.

This used to be required before Rust acquired better const features, and
before I had preinterned symbols.

DEV-7145
2022-06-24 14:18:04 -04:00
Mike Gerwitz c671bf6a9c tamer: xir: Introduce {Ele,Open,Close}Span
This isn't conceptually all that significant of a change, but there was a
lot to modify to get it working.  I would generally separate this into a commit
for the implementation and another commit for the integration, but I decided
to keep things together.

This serves a role similar to AttrSpan---this allows deriving a span
representing the element name from a span representing the entire XIR
token.  This will provide more useful context for errors---including the tag
delimiter(s) means that we care about the fact that an element is in that
position (as opposed to some other type of node) within the context of an
error.  However, if we are expecting an element but take issue with the
element name itself, we want to place emphasis on that instead.

This also starts to consider the issue of span contexts---a blob of detached
data that is `Span` is useful for error context, but it's not useful for
manipulation or deriving additional information.  For that, we need to
encode additional context, and this is an attempt at that.

I am interested in the concept of providing Spans that are guaranteed to
actually make sense---that are instantiated and manipulated with APIs that
ensure consistency.  But such a thing buys us very little, practically
speaking, over what I have now for TAMER, and so I don't expect to actually
implement that for this project; I'll leave that for a personal
project.  TAMER's already taken a lot of my personal interest and it can
cause me a lot of grief sometimes (with regards to letting my aspirations
cause me more work).

DEV-7145
2022-06-24 14:16:29 -04:00
Mike Gerwitz 873e5fc761 tamer: asg::ident: {prolog=>prologue} typo fix
Somewhat humorous.
2022-06-23 09:19:12 -04:00
Mike Gerwitz 2fafc331a1 tamer: xir::reader: Opening and closing tag whitespace
Non-attribute and non-empty start/end tags will have their whitespace
as part of the produced span.  This sets us up for a following change that
will allow for deriving the name span from this span given a QName, which
gives us a span that both represents the entire XIR token and allows
deriving the element name.

An accurate token span is necessary for parsing errors where an element was
not expected, while an element name span is more appropriate for issues of
grammar and semantic errors that deal not with the fact that an element was
encountered, but _what_ element was encountered.

DEV-7145
2022-06-22 15:10:49 -04:00
Mike Gerwitz e5c8a218c3 tamer: xir::reader: Correct empty element whitespace handling
This both adds clarifying tests and corrects the case of `<foo/>`, where the
offset was erroneously off by one---it saw that there were no attributes and
added a byte thinking it'd include `>`, as in `<foo>`.

DEV-7145
2022-06-22 10:28:44 -04:00
Mike Gerwitz adc45d90df tamer: xir::parse: Attribute parser generator
This is the first parser generator for the parsing framework.  I've been
waiting quite a while to do this because I wanted to be sure that I
understood how I intended to write the attribute parsers manually.  Now that
I'm about to start parsing source XML files, it is necessary to have a
parser generator.

Typically one thinks of a parser generator as a separate program that
generates code for some language, but that is not always the case---that
represents a lack of expressiveness in the language itself (e.g. C).  Here,
I simply use Rust's macro system, which should be a concept familiar to
someone coming from a language like Lisp.

This also resolves where I stand on parser combinators with respect to this
abstraction: they both accomplish the exact same thing (composition of
smaller parsers), but this abstraction doesn't do so in the typical
functional way.  But the end result is the same.

The parser generated by this abstraction will be optimized and inlined in the
same manner as the hand-written parsers.  Since they'll be tightly coupled
with an element parser (which too will have a parser generator), I expect
that most attribute parsers will simply be inlined; they exist as separate
parsers conceptually, for the same reason that you'd use parser combinators.

It's worth mentioning that this awkward reliance on dead state for a
lookahead token to determine when aggregation is complete rubs me the wrong
way, but resolving it would involve reintroducing the XIR AttrEnd that I had
previously removed.  I'll keep fighting with myself on this, but I want to
get a bit further before I determine if it's worth the tradeoff of
reintroducing (more complex IR but simplified parsing).

DEV-7145
2022-06-21 13:23:02 -04:00
Mike Gerwitz 9598532d8b tamer: xir::st: Add missing docs for generated QName constants
This was missed.  It was not possible, using the documentation
alone (without looking at the linked source), to tell what the QName actually
represented, though you could assume by the name.

DEV-7145
2022-06-21 13:23:01 -04:00
Mike Gerwitz 3f23bc5e33 tamer: fmt: New type-based formatting system
This is partly an experiment, but is designed to simplify producing English
sentences in various contexts.  It makes use of a not only unstable, but
incomplete, Rust feature---adt_const_params, for a static str const type
parameter.  Hopefully that ends up being stabilized.

This uses types, but it's the same as function composition due to Rust's
monomorphization.
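
A hedged sketch of the technique (nightly-only, and not the actual `fmt`
types): a `&'static str` const parameter turns a piece of wording into a
zero-sized type that composes like a function.

  #![feature(adt_const_params)]

  use std::fmt::{self, Display};

  // Prefixes whatever it wraps with a compile-time string.
  struct Prefixed<T, const P: &'static str>(T);

  impl<T: Display, const P: &'static str> Display for Prefixed<T, P> {
      fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
          write!(f, "{}{}", P, self.0)
      }
  }

  // e.g. `Prefixed::<_, "expected ">("`</rater>`")` displays as
  //      expected `</rater>`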

DEV-7145
2022-06-10 16:28:15 -04:00
Mike Gerwitz f7752436da tamer: parse::Parser: Add remaining field docs
DEV-7145
2022-06-07 15:23:20 -04:00
Mike Gerwitz 3c227e5a2d tamer: parse::ParseState: Remove Default trait bound
`ParseState` originally required `Default` for use with `mem::take` in
`Parser::feed_tok`.  This unfortunately cannot last, since more specialized
parsers require context during initialization in order to provide useful
diagnostic information.  (The other option is to require the caller to
augment errors with diagnostic information, but that would have to be
duplicated by every caller and complicates parser composition; I'd prefer
those diagnostic details remain encapsulated.)

Replacing `Default` with `Option` is uglier, but it ends up producing the
same assembly as `mem::take` did, at least at the time of writing.  Because
Rust is able to elide unnecessary moves using this implementation, there is
no need for `unwrap_unchecked` or other unsafe methods, which is great,
since it shows that this parsing methodology is viable entirely in safe
Rust.
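
The pattern, roughly (hypothetical field and method shapes):

  struct Parser<S> {
      state: Option<S>,
  }

  impl<S> Parser<S> {
      fn feed_tok(&mut self, step: impl FnOnce(S) -> S) {
          // Move the state out (leaving None behind) and put the new state
          // back immediately; the moves are elided, matching what
          // `mem::take` produced, without requiring `S: Default`.
          let st = self.state.take().expect("parser state must be present");
          self.state = Some(step(st));
      }
  }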

DEV-7145
2022-06-07 15:08:40 -04:00
Mike Gerwitz f14ffc87c2 tamer: parse::state::ParseState::DeadToken: New associated type
Previously, `ParseStatus::Dead` always yielded
`ParseState::Token`.  However, I'm working on introducing parsers that
aggregate (parsing XML attributes into structs), and those parsers do not
know that they have completed aggregation until they reach a dead state;
given that, I need to yield additional information at that time.

I played around with a number of alternative ideas, but this ended up being
the cleanest, relative to the effort involved.  For example, introducing
another parameter to `ParseStatus::Dead` was too burdensome on APIs that
ought not concern themselves with the possibility of receiving an object in
addition to a lookahead token, since many parsers are not capable of doing
so (given that they map M:(N<=M)).
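
The trait change itself is small (heavily simplified; the real `ParseState`
has many more members):

  trait ParseState {
      type Token;

      // What a dead-state transition yields in addition to the token of
      // lookahead; aggregating parsers can surface their aggregate here.
      type DeadToken;
  }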

Another option that I abandoned fairly quickly was having
`is_accepting` (potentially renamed) return an aggregate object, since
that's on the side and didn't feel like it was part of the parsing pipeline.

The intent is to abstract this some in a new `ParseState` method for
delegation + aggregation.

DEV-7145
2022-06-07 09:37:41 -04:00
Mike Gerwitz 495c1438fd tamer: Consistent span diagram representation
I'll document it more formally eventually, but this settles on a mix of the
two: square brackets and dashes for intervals, `+` for intersecting lines,
byte offsets below interval endpoints, and names below that.

The docblock for `Span` itself is still off; I'll probably just take one of
the test cases and paste it there at some point.

DEV-7145
2022-06-06 11:32:35 -04:00
Mike Gerwitz bba181f573 tamer: xir::attr::Attr: Introduce AttrSpan
This replaces a tuple with a tuple struct that allows for calculating more
complete span information, such as the span encompassing the entire
attribute and the value span including the surrounding quotes.

This includes logic that ought to be abstracted into `Span` itself, and it's
not as formal as I'd like it to be (e.g. not ensuring context), but this is
a good starting point.

Note that parsers call `Token::span`, which in turn calculates the attribute
span, each time an attribute is encountered during lowering.  But Rust does
a good job at optimizing away unnecessary operations, so this didn't have an
observable impact on time.

DEV-7145
2022-06-06 11:31:28 -04:00
Mike Gerwitz 2b8e7e6031 tamer: xir::st::qname: New module
This moves and deduplicates the static `QName`s into a common area.

DEV-7145
2022-06-06 11:31:27 -04:00
Mike Gerwitz 3da82b351e tamer: xir::flat::{State=>XirToXirf}: Rename
Like the previous two commits, this states the intent of this parser, which
results in more clear pipeline composition.

DEV-7145
2022-06-02 13:48:54 -04:00
Mike Gerwitz 91b55999e2 tamer: asg::air::{AirState=>AirAggregate}: Rename
Like the previous commit, this emphasizes what is happening.

DEV-7145
2022-06-02 13:26:46 -04:00
Mike Gerwitz 45bbf3879e tamer: obj::xmlo::{lower=>air}: Rename {LowerState=>XmloToAir}
This provides much more clarity as to what is going on.  Further, it's less
ambiguous, since I'm about to introduce a new type of xmlo lowering into XIR
for writing the actual xmlo files.

DEV-7145
2022-06-02 13:23:41 -04:00
Mike Gerwitz 8d92667388 tamer: Integrate xir::reader as a parser in the lowering pipeline
This allows `XmlXirReader` to be used in a `Lower` operation, just as
everything else, bringing me one step closer to a pipeline that can be
concisely represented; this is finally beginning to unify in a clear way,
though it is still a bit of a mess.

This causes `XmlXirReader` to _act_ like a `parse::Parser` in that it yields
a `ParsedResult`, but it does not use `parse::Parser` itself; that was the
_original_ plan: convert it into a `ParseState` where `XmlXirReader` became
a context, and force `Parser` to yield by feeding it a stream of tokens with
`repeat`, but that ended up performing poorly relative to this change.  I
did some investigation, which I might write about in the future, but for
now, this solution works just fine.

DEV-7145
2022-06-02 10:30:44 -04:00
Mike Gerwitz f8c28655dc tamer: parse: Split into multiple modules
This abstraction has grown quite a bit, and it's time to start formalizing
it a bit.  This split doesn't change any behavior, but it does start to make
it easier to reason about by clearly stating the broad components and how
they interact with one-another.

This doesn't yet move the tests; those will come next, but they are very
few.  The reason I gave previously for this was that (a) they're tested
indirectly via the systems that utilize them and (b) because the abstraction
was not yet settled on, the process was already very expensive.  No test
coverage was lost---it's only that failures were potentially harder to debug
on test failures, but in practice not even this was true, because the deeply
expressive types all but ensured that, if it compiles, it will function in a
way that is expected.  Unit tests and documentation for this system will be
added once I'm sure that this abstraction is in a proper state.

DEV-7145
2022-06-01 11:32:58 -04:00
Mike Gerwitz 63aa452197 tamer: parse: Move parse::lower into Lower
This also modifies `poc` such that `Lower` is invoked as an associated
function rather than a method to emphasize the pattern that is forming, so
that it can be later abstracted away.

DEV-11864
2022-06-01 11:15:43 -04:00
Mike Gerwitz f40f8bbafc tamer: parse: Rename {lower_*_while_ok=>lower_*}
The `while_ok` can just be implied with a lowering operation, and that
reduces the name complexity so that we can maybe introduce even more
specialized methods without resulting in a huge sentence as a name.

DEV-11864
2022-05-27 14:10:55 -04:00
Mike Gerwitz b084e23497 tamer: Refactor asg_builder into obj::xmlo::lower and asg::air
This finally uses `parse` all the way up to aggregation into the ASG, as can
be seen by the mess in `poc`.  This will be further simplified---I just need
to get this committed so that I can mentally get it off my plate.  I've been
separating this commit into smaller commits, but there's a point where it's
just not worth the effort anymore.  I don't like making large changes such
as this one.

There is still work to do here.  First, it's worth re-mentioning that
`poc` means "proof-of-concept", and represents things that still need a
proper home/abstraction.

Secondly, `poc` is retrieving the context of two parsers---`LowerContext`
and `Asg`.  The latter is desirable, since it's the final aggregation point,
but the former needs to be eliminated; in particular, packages need to be
worked into the ASG so that `found` can be removed.

Recursively loading `xmlo` files still happens in `poc`, but the compiler
will need this as well.  Once packages are on the ASG, along with their
state, that responsibility can be generalized as well.

That will then simplify lowering even further, to the point where hopefully
everything has the same shape (once final aggregation has an abstraction),
after which we can then create a final abstraction to concisely stitch
everything together.  Right now, Rust isn't able to infer `S` for
`Lower<S, LS>`, which is unfortunate, but we'll be able to help it along
with a more explicit abstraction.

DEV-11864
2022-05-27 13:51:29 -04:00
Mike Gerwitz 95229916ca current/compiler/worksheet: Generate lv:package/@name
This is present on all other packages.  Rather than complicating TAMER to
accommodate a missing name, it's trivial to just add it.

This will, unfortunately, invalidate and require rebuilding of all xmlo
files, based on the `.rev-xmlo` bump.

DEV-11864
2022-05-26 10:20:05 -04:00
Mike Gerwitz eafb3b2a1b tamer: Add Display impl for each ParseState for generic ParseErrors
This is intended to describe, to the user, the state that the parser is
in.  This will be used to convey additional information for general parser
errors, but it should also probably be integrated into parsers' individual
errors as well when appropriate.

This is something I expected to add at some point, but I wanted to add them
because, when dealing with lowering errors, it can be difficult to tell
what parser the error originated from.

DEV-11864
2022-05-25 15:26:02 -04:00
Mike Gerwitz 9edc32dd3b tamer: parse::LowerIter: Generic inner TripIter iterator
This commit is preparing to compose LowerIter directly.

DEV-11864
2022-05-24 10:27:14 -04:00
Mike Gerwitz f218c452b9 tamer: iter::trip: Flatten Result
The `*_iter_while_ok` functions now compose like monads, flattening `Result`
at each step and drastically simplifying handling of error types.  This also
removes the bunch of `?`s at the end of the expression, and allows me to use
`?` within the callback itself.
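
In miniature, the flattening is just collapsing the nested `Result`
produced by a fallible callback into a single layer (a generic
illustration, not the actual `*_iter_while_ok` signatures):

  // Before, a fallible callback left the caller with a nested Result to
  // unwrap (`expr??`); flattening at each step reduces it to one layer,
  // so a single `?` (or `and_then`) suffices.
  fn flattened(outer: Result<Result<u32, String>, String>) -> Result<u32, String> {
      outer.and_then(|inner| inner)
  }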

I had originally not used `Result` as the return type of the callback
because I was not entirely sure how I was going to use them, but it's now
clear that I _always_ use `Result` as the return type, and so there's no use
in trying to be too accommodating; it can always change in the future.

This is desirable not just for cleanup, but because trying to refactor
`asg_builder` into a pair of `Parser`s is really messy to chain without
flattening, especially given some state that has to leak temporarily to the
caller.  More on that in a future commit.

DEV-11864
2022-05-20 16:08:16 -04:00
Mike Gerwitz 958a707e02 tamer: asg: Hoist Root from Ident into Object
This was always the intent, but I didn't have a higher-level object
yet.  This removes all the awkwardness that existed with working the root
in as an identifier.

DEV-11864
2022-05-19 12:48:43 -04:00
Mike Gerwitz 6252758730 tamer: asg::Object: Introduce Object::Ident
This wraps `Ident` in a new `Object` variant and modifies `Asg` so that its
nodes are of type `Object`.

This unfortunately requires runtime type checking.  Whether or not that's
worth alleviating in the future depends on a lot of different things, since
it'll require my own graph implementation, and I have to focus on other
things right now.  Maybe it'll be worth it in the future.

Note that this also gets rid of some doc examples that simply aren't worth
maintaining as the API evolves.

DEV-11864
2022-05-19 12:33:59 -04:00
Mike Gerwitz f75f1b605e tamer: num: Header typo correction 2022-05-19 12:02:38 -04:00
Mike Gerwitz ebf1de5a60 tamer: asg::Ident{Object=>}: Rename
I think this may have been renamed _from_ `Ident` some time ago, but I'm too
lazy to check.  In any case, the name is redundant.

DEV-11864
2022-05-19 11:17:04 -04:00
Mike Gerwitz 7d76cb53f6 tamer: asg: Move SymAttrs conversion into asg_builder
This is a lowering operation and does not belong here.

What a tangled mess this all was (see recent commits); no wonder it was so
confusing.

DEV-11864
2022-05-19 11:07:15 -04:00
Mike Gerwitz eae194abc6 tamer: asg::object: Merge into asg::ident
Everything in this file relates to identifiers, and I'm about to introduce a
higher-level object, one of which may be an identifier.

DEV-11864
2022-05-19 11:05:20 -04:00
Mike Gerwitz 92dba0a28c tamer: obj::xmlo::asg_builder::IdentKindError: Merge into AsgBuilderError
Now that these are in the same module, there's no need for them to be
separate from one-another.

DEV-11864
2022-05-19 10:56:07 -04:00
Mike Gerwitz 07d2ec1ffb tamer: Move Dim and {Sym=>}Dtype into num module
A previous commit mentioned that there's not a place for `Dim`, and
duplicated it between `asg` and `xmlo`.  Well, `Dtype` is also needed in
both, and so here's a home for now.

`Dtype` has always been an inappropriate detail for the system and will one
day be removed entirely in favor of higher-level types; the machine
representation is up to the compiler to decide.

DEV-11864
2022-05-19 10:39:21 -04:00
Mike Gerwitz b2a79e930b tamer: Move SymAttrs lowering into asg_builder
asg_builder is about to be replaced, but in the process of simplifying the
destination IR (the ASG), I'm moving things into the proper place.  This
never belonged here---it belongs with the actual lowering operation.

Previously, this was not reasoned about in terms of a lowering operation,
and was written when I was first introducing myself to Rust and trying to
get a proof-of-concept linker working.

DEV-11864
2022-05-19 10:28:17 -04:00
Mike Gerwitz 8948452b71 tamer: asg::ident::Dim: Narrow type
This matches xmlo::Dim, and could be the same thing, if we can find a home
for it in the future; it's not worth creating such a home right now when I'm
not yet sure what else ought to live there; the duplication may be fine.

The conversion from xmlo needs to be moved, and `Dim` is going to be used
for more than just identifiers (expressions will have type inference
performed).

DEV-11864
2022-05-19 09:32:43 -04:00
Mike Gerwitz 263cb68380 tamer: parse: Persistent context
This allows retrieving and providing a context to a `Parser`.  This is
intended for use with an aggregating parser, in particular to construct the
ASG and return it.

This is a component of a change that replaces `asg_builder` with a
`Parser`-based lowering into the ASG, but there are still changes that need
to be made to simplify things and complete its integration.

DEV-11864
2022-05-18 16:15:09 -04:00
Mike Gerwitz 001499d921 tamer: parse::ParseError: Remove Eq trait bound
Just as in other commits, since it's an unnecessary limitation.

DEV-11864
2022-05-18 16:06:22 -04:00
Mike Gerwitz 3e277270a7 tamer: asg: Track roots on graph
Previously, since the graph contained only identifiers, discovered roots
were stored in a separate vector and exposed to the caller.  This not only
leaked details, but added complexity; this was left over from the
refactoring of the proof-of-concept linker some time ago.

This moves the root management into the ASG itself, mostly, with one item
being left over for now in the asg_builder (eligibility classifications).

There are two roots that were added automatically:

  - __yield
  - __worksheet

The former has been removed and is now expected to be explicitly mapped in
the return map, which is now enforced with an extern in `core/base`.  This
is still special, in the sense that it is explicitly referenced by the
generated code, but there's nothing inherently special about it and I'll
continue to generalize it into oblivion in the future, such that the final
yield is just a convention.

`__worksheet` is the only symbol of type `IdentKind::Worksheet`, and so that
was generalized just as the meta and map entries were.

The goal in the future will be to have this more under the control of the
source language, and to consolidate individual roots under packages, so that
the _actual_ roots are few.

As far as the actual ASG goes: this introduces a single root node that is
used as the sole reference for reachability analysis and topological
sorting.  The edges of that root node replace the vector that was removed.

DEV-11864
2022-05-17 10:42:05 -04:00
Mike Gerwitz 5a866f7735 core/base (___yield): New extern
Rather than having the linker add this symbol opaquely, let's remove the
special case and generalize it.  There's nothing special about yield, except
historical precedent.

Systems can explicitly add it as a root in a common return map.

DEV-11864
2022-05-16 15:07:37 -04:00
Mike Gerwitz 34eb994a0d tamer: asg::Asg::set_fragment: {ObjectRef=>SymbolId}
In the actual implementation (outside of tests), this is always looking up
before adding the symbol.  This will simplify the API, while still retaining
errors, since the identifier will fail the state transition if the
identifier did not exist before attempting to set a fragment.  So while this
is slower in microbenchmarks, this has no effect on real-world performance.

Further, I'm refactoring toward a streaming ASG aggregation, which is a lot
easier if we do not need to perform lookups in a separate step from the
ASG's primitives.

DEV-11864
2022-05-16 13:14:27 -04:00
Mike Gerwitz c49d87976d tamer: parse::Token: Remove Eq trait bound
`PartialEq` remains, and is all that is needed.  See previous commit
regarding the removal of this same bound from `Context`.

This can be re-added if it ends up actually being necessary.  But Tokens are
ephemeral and used only in lowering pipelines, using pattern matching.

DEV-11864
2022-05-16 10:05:14 -04:00
Mike Gerwitz d87006391e tamer: asg::object: Remove IdentObjectState, IdentObjectData
These traits are no longer necessary now that I'm using concrete types; they
just add unnecessary noise and confusion as I attempt to further refactor.

Don't abstract prematurely.

DEV-11864
2022-05-12 16:31:36 -04:00
Mike Gerwitz 3748762d31 tamer: asg::graph::Asg: Remove type parameter O
This removes the generic on the Asg (which was formerly BaseAsg),
hard-coding `IdentObject`, which will further evolve.  This makes the IR an
actual concrete IR rather than an abstract data structure.

These tests bring me back a bit, since they were written as I was still
becoming familiar with Rust.

DEV-11864
2022-05-12 15:46:17 -04:00
Mike Gerwitz 1114edbc6e rater/tame: Remove circular symlink
This was added long ago to maintain BC in some bizarre situation, and I had
forgotten about it, but it's causing problems with lsp-mode in Emacs.
2022-05-12 14:32:24 -04:00
Mike Gerwitz f2c5443176 tamer: asg: Remove generic Asg, rename {Base=>}Asg
This is the beginning of an incremental refactoring to remove generics, to
simplify the ASG.  When I initially wrote the linker, I wasn't sure what
direction I was going in, but I was also negatively influenced by more
traditional approaches to both design and unit testing.

If we're going to call the ASG an IR, then it needs to be one---if the core
of the IR is generic, then it's more like an abstract data structure than
anything.  We can abstract around the IR to slice it up into components that
are a little easier to reason about and understand how responsibilities are
segregated.

DEV-11864
2022-05-11 16:47:13 -04:00
Mike Gerwitz 0493e68cb3 tamer: parse::ParseState::Context: Add missing comment
DEV-11864
2022-05-10 11:06:22 -04:00
Mike Gerwitz 0ef0d2b553 tamer: parse::ParseState:Error: Relax Eq trait bound
This is unnecessarily restrictive, since we do not require anything further
than `PartialEq` for the situations where we care about equality (tests).

DEV-11864
2022-05-06 15:28:47 -04:00
Mike Gerwitz 9f990e19e9 tamer: parse::ParseState::Context: Remove Default trait bound
This is too restrictive, especially for parsers that fold into something,
like the ASG, which may exist prior to invoking the parser.

This moves the trait bound to the functions that actually need it.  Those
obviously cannot be used if the Context does not implement `Default`, but
I'll provide alternative conveniences.

DEV-11864
2022-05-05 15:55:04 -04:00
Mike Gerwitz ba9f429ee7 tamer: obj::xmlo::{XmloEvent=>XmloToken}
The original "event" name was based on quick-xml's `Event`.  This
terminology shift is more closely matched with the new parsing system.

DEV-11864
2022-05-05 12:25:59 -04:00
Mike Gerwitz 0d999b56cd src/current/summary.xsl: Correct invalid UTF-8 sequence
This broke when encoding was set to UTF-8 on this file.
2022-05-04 11:11:02 -04:00
Mike Gerwitz 2954c591a1 src/current/include/preproc/symtable: Remove extern @dtype check
I attempted to resolve an error previously, and I thought I had, but
apparently some symbols acquire a @dtype at some point in the process, or
lose it.  Regardless, I have no interest in debugging or resolving this
mess, since it's going away.

The linker ensures that externs match, so while this could potentially allow
conflicting imports within a package (unlikely, given that extern templates
are recommended), it still will not resolve with a conflicting concrete
implementation.  I'm not worried.

DEV-1036
2022-05-04 10:50:14 -04:00
Mike Gerwitz 0281dfdf0d tamer: Remove wip-frontends feature flag
We want the new system to be used so that we can start catching any problems
that may arise.  Further changes will be flagged as necessary.

DEV-10936
2022-05-04 09:37:10 -04:00
Mike Gerwitz 43c99cb61a src/current/include/preproc/symtable.xsl: Treat mutual missing extern @dtype as match
Extern resolution has apparently been failing for quite some time, resulting
in `preproc:error` nodes in the _symbol table_ of return maps.  This was
caught by the new xmlo parser, which does not ignore nodes it does not care
about.

The failure was caused by missing `@dtype`---the externs did in fact match,
and if they did not, then the linker would have failed.

This doesn't modify the map compiler to properly detect these, because
this compiler is going away in the hopefully-near future, and the problems
will now be caught, though in a very unideal way (as a parse error during
xmlo reading).

DEV-10936
2022-05-04 09:29:29 -04:00
Mike Gerwitz 602cec5560 src/current/compiler/map.xsl: Omit preproc:from from retmap symbols
preproc:sym/preproc:from is used for generating `knownFields` using the
_input_ map, so this has no use for return map values; the map still
produces edges to its dependencies.

The issue is that there are return map entries in some of our systems that
are producing multiple `preproc:from`, but I somewhat-recently modified the
system to support only a single map, to remove dynamic allocation.  This
resolves that problem.

With that said, `knownFields` was created for Liza to know when the
classifier ought to be invoked, to save time.  Back when it was first
introduced ~10y ago, this provided significant savings, however the
structure of our system now is such that nearly every single field invokes
the classifier.

Furthermore, these details should remain encapsulated; if we wanted to make
that determination, we should be provided with a delta, which we could also
use to do incremental classification in the future, if there's an ROI there
after other improvements have been made.

So, eventually, preproc:sym/preproc:from will go away entirely.

DEV-10936
2022-05-04 09:26:18 -04:00
Mike Gerwitz 1ad2fb1dc8 Copyright year update 2022
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
2022-05-03 14:14:29 -04:00
Mike Gerwitz 34fcd19cd0 tamer: obj::xmlo::reader: Replace todo! with error
These are no longer TODOs---they represent invalid tokens.

I'm not going to put effort into providing further context with the
diagnostic system [right now], because these are internal errors caused by
either miscompilation or an incomplete reader.

DEV-10936
2022-05-03 09:19:47 -04:00
Mike Gerwitz c4828e7e7a tame: src/current/compiler/worksheet: Place fragments in header
The new xmlo parser was failing on a worksheet xmlo file because fragments
were not properly placed within the header.

This was a change made when tameld was introduced so that we could stop
reading xmlo files early.

DEV-10936
2022-05-03 09:11:13 -04:00
Mike Gerwitz 5875477efa tamer: xir::Token: Remove span from Display
This was missed when the span was removed from other Display impls as the new
diagnostic system was introduced.  Raw `Span`s display byte offsets and the
context, which is no longer desirable as part of an error message.

DEV-10936
2022-05-03 09:09:55 -04:00
Mike Gerwitz 2ea66f4f97 tame: @encoding="{ISO-8859-1=>utf-8}" for all XML-based files
TAMER rejects this, because we shouldn't be using anything but UTF-8.  My
use of this encoding is ancient, from over a decade ago, and it was
apparently just copied around.

DEV-10936
2022-05-02 12:00:42 -04:00
Mike Gerwitz a2e6e37ed1 tamer: Bump nightly Rust version 1.{57=>62}
This removes a couple of feature flags that are no longer necessary.
2022-05-02 11:05:32 -04:00
Mike Gerwitz 7248ef77e4 tamer: diagnose::resolve{r=>}: Rename
Consistent with naming of other modules, which prefers to not needlessly
transform words.

DEV-12151
2022-05-02 09:49:22 -04:00
Mike Gerwitz 75b966c577 tamer: diagnose: Additional documentation
I had waited to provide more documentation until I was sure that the
abstraction was not going to change significantly; there was a lot of
refactoring in prior commits.

DEV-12151
2022-05-02 09:44:53 -04:00
Mike Gerwitz fc1dad8483 tamer: diagnose::report::Section: Further refactor resolved constructor
This speaks for itself.

DEV-12151
2022-04-29 15:54:38 -04:00
Mike Gerwitz ba0ceddd2d tamer: diagnose::report::Section: Constructor refactoring
This moves construction out of `From` and into separate associated
functions, which can be further simplified in a bit.

We also need unit tests for this, since this still relies on integration
tests due to the cost of the aggressive and tight refactoring iterations.

DEV-12151
2022-04-29 13:10:04 -04:00
Mike Gerwitz 3e04217741 tamer: diagnose::report::Section::maybe_squash_into: Remove syslabel TODO
Previously, when adjacent duplicate spans were both resolved, if one failed,
the other certainly would, which would result in duplicate labels on each
squash.  Elided spans do not have syslabels, and so this is no longer a
concern.

DEV-12151
2022-04-29 13:07:51 -04:00
Mike Gerwitz 2ae6df38e7 tamer: diagnose::report: Restore source line preview for invalid UTF-8
This was removed in a previous commit while working on simplifying the
implementation, with the hope of returning to it once things were in a
better place.  They are, so let's bring it back.

DEV-12151
2022-04-29 12:41:56 -04:00
Mike Gerwitz f8dda12fae tamer: diagnose::report: Remove TODOs that are no longer applicable
These relate to the most recent commits.

DEV-12151
2022-04-29 12:34:48 -04:00
Mike Gerwitz 2ce0dbdd84 tamer: diagnose::report::SpanLabel: Remove in favor of separate Level and Label
`SpanLabel` was created during a very early refactoring of this system, and
I've just been fighting with it ever since.  This removes it, and simplifies
some things in the process.

It also makes clear that `Level` is never optional and removes the awkward
`Level::default` that was there previously; the default is now the lowest
level, which will always be able to be escalated.

DEV-12151
2022-04-29 12:13:11 -04:00
Mike Gerwitz 9a5a2c4f3f tamer: diagnose::report: Avoid re-resolving adjacent identical spans
This does what the original proof-of-concept implementation did---skip a
span that was just processed, since it'll be squashed into the previous
anyway.  These duplicate spans originate from the diagnostic system when
producing supplemental help information.

DEV-12151
2022-04-29 11:57:50 -04:00
Mike Gerwitz a533244473 tamer: diagnose::report::VisualReporter::render: Avoid mspan collection
This used to be necessary when `Report` stored references to heap-allocated
strings, but `Report` now owns those values itself.

DEV-12151
2022-04-29 09:53:22 -04:00
Mike Gerwitz b0a5265ad3 tamer: diagnose::report::test: Extract into separate file
Tests are large and will be getting larger.  The source will also grow as
it's better documented and cleaned up.  It's getting more difficult to
navigate efficiently and concurrently modify implementation and tests, and
parsing via LSP is getting slower with certain types of changes.

DEV-12151
2022-04-29 09:23:06 -04:00
Mike Gerwitz 5c0e224d3c tamer: diagnose::report: Line numbers in gutter
Alright, starting to settle on an abstraction now, and things are coming
together.  This gives us line numbers in the previously-empty gutter, and
widens the gutter to accommodate.  Gutters are normalized across
sections.  Sections are not yet collapsed for sequential line numbers in the
same context.

Exciting!

Here's an example, on an xmlo file:

error: expected closing tag for `preproc:symtable`
     --> /home/.../foo.xmlo:16:4
      |
   16 |    <preproc:symtable xmlns:map="http://www.w3.org/2005/xpath-functions/map">
      |    ----------------- note: element `preproc:symtable` is opened here

     --> /home/.../foo.xmlo:11326:4
      |
11326 |    </preproc:wrong>
      |    ^^^^^^^^^^^^^^^^ error: expected `</preproc:symtable>`

DEV-12151
2022-04-28 23:53:38 -04:00
Mike Gerwitz 5744e08984 tamer: diagnostic::report: Hoist gutter output into Section
The `Section` itself is now responsible for outputting the gutter, which
puts us in a position to be able to apply consistent formatting without
having to propagate width data to every line variant.
2022-04-28 22:59:13 -04:00
Mike Gerwitz 4e03a367a5 tamer: diagnose::report::SourceLine: Separate variants for each line
Now `SourceLine` _does_ actually correspond to a line of output, which will
allow for better formatting (e.g. collapsing padding) and, importantly,
proper management of gutters.

Note that the seemingly unnecessary `SectionSourceLine` allows for a subtle
consistent formatting for all variants' gutters in `SectionLine`, which will
allow us to hoist that rendering out in the next commit.  The other option
was to include a trailing space for padding and marks, but that is not only
sloppy and undesirable, but asking for confusion, especially in editors (like
mine) that trim trailing whitespace.

DEV-12151
2022-04-28 22:49:35 -04:00
Mike Gerwitz fd1c6430a8 tamer: diagnose::report::SectionSourceLine: {Option<Column>=>Column}
If a column isn't present, it degrades to displaying labels like footnotes
anyway, so this simplifies the system rather than catering to a rare
case.  With that said, this does lose functionality, since it does not
render the source line at all, even though we _could_ do so.

I may re-introduce that rendering after some further refactoring,
specifically for gutters.

DEV-12151
2022-04-28 22:23:58 -04:00
Mike Gerwitz 3a5dcfc016 tamer: diagnose::resolver::SourceLine: {Vec<u8>=>String}
Using a byte vector just makes life more difficult with regard to preparing
the diagnostic reports.  We're already validating UTF-8 data for column
generation, which is necessary for a robust report, so let's just store it
as a String to begin with.

DEV-12151
2022-04-28 22:03:37 -04:00
Mike Gerwitz 838db689ad tamer: diagnose::report: Render labels on mark line
Note that, if a span is first encountered with a mark but with _no_ label,
the first label (if collapsed) will be on the next line.  This allows a span
to be marked without extra visual noise if it's not necessary, and to be
able to trust that it'll stay that way.

Until coloring is introduced, this may or may not be easier to read
depending on context.

This is also not yet taking into account where on the line it begins, and so
may render poorly if the span is at the end of a line.  That will be fixed
later on.

DEV-12151
2022-04-28 16:23:13 -04:00
Mike Gerwitz a197267a2d tamer: xir::flat: Remove closing tag name from label
This is now visible in the diagnostic output.  Example at this point in
time, on an xmlo file for one of our smallest systems:

error: expected closing tag for `preproc:symtable`
  --> /home/.../foo.xmlo:16:4
   |
   |    <preproc:symtable xmlns:map="http://www.w3.org/2005/xpath-functions/map">
   |    -----------------
   = note: element `preproc:symtable` is opened here

  --> /home/.../foo.xmlo:11326:4
   |
   |    </preproc:wrong>
   |    ^^^^^^^^^^^^^^^^
   = error: expected `</preproc:symtable>`

DEV-12151
2022-04-28 15:47:34 -04:00
Mike Gerwitz 33baca113a tamer: diagnose::report: Vary mark character depending on level
Looking more and more Rust-like.  Shameless copy.

TBH I forget what character it uses for help, but it's easy enough to
change.

Also, to be clear: this is modeled after Rust, but it's not a requirement of
mine that it look exactly like it.  I just like the general style; I'll
surely deviate over time, as appropriate (or as I feel like it).

DEV-12151
2022-04-28 15:44:50 -04:00
Mike Gerwitz 8119d1ca0d tamer: diagnose::report: Render span marks under lines
This has the effect of highlighting the columns of the source lines using
'^' as an underline.

The next step will be to have the underline character depend on the
`Level`.

If this commit message doesn't sound all that exciting, given what it
finally achieved after all this time, it's because I'm exhausted, and my
prototype has already taken my excitement.  But this is significant, given
all the work leading up to it.

There is some code cleanup needed and some unit tests that ought to be
written rather than relying on integration, but considering how much this is
being refactored, I don't want to add to that refactoring cost just yet
before gutters are introduced and I know things are settled for now.

DEV-12151
2022-04-28 15:44:49 -04:00
Mike Gerwitz 5db026ed76 tamer: diagnose::report: Initial display of source lines
This has been a lot of refactoring for something that I prototyped a week
ago, and the prototype is still further along in its output formatting (it
has line numbering in gutters and span markings).

But, this has come a long way, and I'm happy with it overall, though I'm not
happy with my slow pace and struggle to maintain focus.  But those are
personal issues.

This leaves a lot to be desired, but at the same time is still really
helpful.  There's a couple notable TODOs regarding pointless allocation and
UTF8 re-checking, but otherwise, the feature-related steps are:

  - Gutters with line numbers; and
  - Marking columns associated with the span.

DEV-12151
2022-04-28 14:33:08 -04:00
Mike Gerwitz 3e06c9aaf3 tamer: diagnose::report: Prepare Section for output of source lines
This lowers the resolved span data into `Section` for display.  The next
step is to actually output it.

DEV-12151
2022-04-28 13:34:05 -04:00
Mike Gerwitz 331aada2bd tamer: diagnose::report::MaybeResolvedSpan: Move up in file
Just rearranging, since this was awkwardly placed relative to where it's
used.

DEV-12151
2022-04-28 11:00:36 -04:00
Mike Gerwitz 6a5a29c2f5 tamer: diagnose::report: Remove Section variants and eagerly squash
Rather than squashing as a separate operation, and explicitly denoting when
it occurred, we'll just always squash, as was done before these changes.  It
doesn't really make sense to make this optional and there's not any value in
keeping the decision around.

This also sets us up favorably for future changes: it creates a vector of
labels, which can be analyzed later to determine how to best lay out marks
and labels.

DEV-12151
2022-04-28 10:30:04 -04:00
Mike Gerwitz c8d919d0cc tamer: diagnose::report: {'l=>'d}
Just renames the lifetime to refer to the `Diagnostic`, rather than a
`Label` returned by it, which was all `'l` was previously used for.

Note that many labels have a `'static` lifetime; this doesn't change that or
somehow cause it to reallocate; the label must live _for at least `'d`_.

DEV-12151
2022-04-27 15:20:16 -04:00
Mike Gerwitz e2c68c5e84 tamer: diagnose::report: Avoid message copy
Rather than rendering the diagnostic `Display` message to a string only to
copy it to yet another buffer later on, this simply stores a reference to
the `Diagnostic` that was provided.  This also adds a type to the `Report`
associating it with the provided `Diagnostic`, which does seem appropriate,
given that the report was produced for it.

I should probably rename '{l=>d} now.

DEV-12151
2022-04-27 15:20:14 -04:00
Mike Gerwitz 3dbab881da tamer: diagnose::report: Produce Report object
Rather than writing to the provided `Write` object, this produces a `Report`
object.  While a lifetime still exists for the diagnostic data (labels,
specifically), I was able to remove the other lifetime resulting from
`ResolvedSpan` by transferring ownership of the data to the `Report`
itself.  Once actual source lines are integrated shortly, `Report` will
include those as well.

This has been a tedious process, but it's coming together.  Hopefully these
commits documenting the progressive and ugly refactoring are found useful by
some reader in the future.

DEV-12151
2022-04-27 15:00:30 -04:00
Mike Gerwitz 3679ff590c tamer: diagnose::report: Remove `L` type parameter
The line number was getting special treatment that is simply not worth the
cost (with regards to how burdensome it is on the type definitions).  This
simplifies things quite a bit.

If we want header customization in the future, we can worry about that in a
different way, or allow the header as a whole to be swapped out, rather than
its constituents.

DEV-12151
2022-04-27 14:23:58 -04:00
Mike Gerwitz 589f5e8c58 tamer: diagnose::report::HeadingLineNum: Compose HeadingColNum
`HeadingColNum` is no longer constructed by `HeadingLineNum`.  This both
narrows the types and required data (e.g. removing dummy values in test
cases), and reduces the coupling (by favoring composition, but still coupled
with the concrete type).

DEV-12151
2022-04-27 11:43:46 -04:00
Mike Gerwitz 7dbe25be05 tamer: diagnose::report::HeadingLineNum: Lower MaybeResolvedSpan
Same as the previous commit with `HeadingColNum`---this removes the
dependency on `MaybeResolvedSpan`.

DEV-12151
2022-04-27 11:28:17 -04:00
Mike Gerwitz 68f9f4d241 tamer: diagnose::report::HeadingColNum: Lower MaybeResolvedSpan
This eliminates `MaybeResolvedSpan` from `HeadingColNum`, along with its
type parameters and lifetimes.

DEV-12151
2022-04-27 11:10:16 -04:00
Mike Gerwitz f29918b5a0 tamer: diagnose::report: Continue refactoring into report components
I'm unhappy with the current state of this, which is why I haven't settled
on docs or unit tests for these changes yet (though note that the
integration tests do cover these changes)---this is still a prototype
refactoring.

In particular, this needs to do more lowering---the `ResolvedSpan` and
`MaybeResolvedSpan` need to be eliminated and lowered into exactly what is
needed so that we can stop reasoning about them and propagating them.

Further, having lines and columns lazily evaluate themselves for
display---based on `MaybeResolvedSpan`---adds extra generics that shouldn't
be necessary; they should be pre-computed and store the concrete data they
need in variants.  Display shouldn't involve computation beyond formatting
of pre-computed data.

That was always the plan, but this refactoring has been incremental.

Anyway: this is in a working and integration-tested state, but it's going to
change.

DEV-12151
2022-04-27 10:48:41 -04:00
Mike Gerwitz e2f9d71c1f tamer: diagnose::report: Refined report components
This generalizes the types a bit more and introduces unit tests.  Note that
these are still also covered by integration tests.

The next step will be to finish generalizing
`<VisualReporter as Reporter>::render`, after which I'll get back to the
task of outputting the source line along with markings and labels.

DEV-12151
2022-04-26 13:26:52 -04:00
Mike Gerwitz d05bcaab03 tamer: {Resolved,Span}::{ctx=>context}: Rename
This is just to provide clarity.  `ctx` is not so widely used that we
benefit from such a short identifier, and it's not worth the cognitive
burden of people unfamiliar with what it may mean.

DEV-12151
2022-04-26 10:52:32 -04:00
Mike Gerwitz 16d76b95d0 tamer: diagnose::resolver::ResolvedSpanData: New trait
This provides the methods originally implemented on `ResolvedSpan` itself,
which will allow for mocking for unit testing.

DEV-12151
2022-04-26 10:46:47 -04:00
Mike Gerwitz 0928427116 tamer: diagnose::resolver::Column::At: Remove
This is redundant with the `Endpoints` variant, although it did read
better.  It's just another case to have to handle.

I was originally going to use `std::ops::RangeInclusive` for `Endpoints`,
however that struct also contains an extra bool indicating whether it was
exhausted (as an iterator), which isn't appropriate for this.

DEV-12151
2022-04-26 10:30:07 -04:00
Mike Gerwitz ec93488365 tamer: diagnose::resolver::ResolvedSpan: Clear methods for all data
This (a) makes it clear the intent of these methods and (b) will allow
introducing a trait for mocking it.

DEV-12151
2022-04-26 10:22:31 -04:00
Mike Gerwitz b9ff7770aa tamer: diagnose::report: Begin refactoring into Display impls
This logic is still covered by the integration tests; I'll be adding unit
tests once it's decoupled to the point where that's possible, which should
be shortly, and after I make sure this is the route I do want to go down.

DEV-12151
2022-04-26 10:14:51 -04:00
Mike Gerwitz c0ace258f0 tamer: diagnose::resolver::SourceLine: Guarantee non-empty lines
This simplifies types and error handling since we will always have at least
one line, provided that the span is within the range of the context.  To
ensure that, this patch introduces a new error.

DEV-12151
2022-04-22 16:50:16 -04:00
Mike Gerwitz 56b8aec9b7 tamer: diagnose::resolver::test: Extract into own file
There's just a lot here.

DEV-12151
2022-04-22 15:31:12 -04:00
Mike Gerwitz 2e0925627e tamer: diagnose::Label: Introduce lifetime and inner Cow
I did not initially introduce lifetimes because I wasn't sure how the system
was going to evolve, but now lifetimes are going to be needed in a number of
contexts.  The core of TAMER is able to avoid lifetimes in most instances
because of its internment system, but its use is not appropriate for the
diagnostic system's buffers (beyond sourcing strings from already-interned
data).

DEV-12151
2022-04-22 13:23:53 -04:00
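The commit above gives `Label` a lifetime and an inner `Cow`.  As a rough
sketch of why that shape is useful (the constructors and field below are
illustrative, not tamer's actual definitions), such a label can borrow
`'static` or interned text while still accepting owned, formatted strings:

  use std::borrow::Cow;

  // Illustrative only: a diagnostic label that may borrow or own its text.
  #[derive(Debug, PartialEq, Eq)]
  struct Label<'a>(Cow<'a, str>);

  impl<'a> From<&'a str> for Label<'a> {
      fn from(s: &'a str) -> Self {
          Label(Cow::Borrowed(s))      // no allocation for static/interned text
      }
  }

  impl<'a> From<String> for Label<'a> {
      fn from(s: String) -> Self {
          Label(Cow::Owned(s))         // owned buffer when formatting is needed
      }
  }

  fn main() {
      let fixed: Label<'static> = "expected closing tag".into();
      let formatted: Label<'static> =
          format!("expected `</{}>`", "preproc:symtable").into();

      assert_eq!(fixed, Label(Cow::Borrowed("expected closing tag")));
      assert_eq!(formatted.0, "expected `</preproc:symtable>`");
  }
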
Mike Gerwitz aeff7aeed3 tamer: diagnose::test: Extract into own file
This is going to get quite large over time.

DEV-12151
2022-04-22 09:21:18 -04:00
Mike Gerwitz 596c9de85e tamer: diagnose::resolver::SourceLine (line=>num): Rename
`line.line` was rather confounding.

DEV-12151
2022-04-21 15:47:15 -04:00
Mike Gerwitz 5b1f0ab6c6 tamer: diagnostic: Column resolution
Determining the column number is not as simple as performing byte
arithmetic, because certain characters have different widths.  Even if we
only accepted ASCII, control characters aren't visible to the user.

This uses the unicode-width crate as an alternative to POSIX wcwidth, to
determine (hopefully) the number of fixed-width cells that a unicode
character will take up on a terminal.  For example, control characters are
zero-width, while an emoji is likely double-width.  See test cases for more
information on that.

There is also the unicode-segmentation crate, which can handle extended
grapheme clusters and such, but (a) we'll be outputting the line to the
terminal and (b) there's no guarantee that the user's editor displays
grapheme clusters as a single column.  LSP measures in UTF-16,
apparently.  I use both Emacs and Vim from a terminal, so unicode-width
applies to me.  There's too much variation to try to solve that right now.

The columns can be considered a visual span---this gives us enough
information to draw line annotations, which will happen soon.

Here are some useful links:

  - https://hsivonen.fi/string-length/
  - https://unicode.org/reports/tr29/
  - https://github.com/rust-analyzer/rowan/issues/17
  - https://www.reddit.com/r/rust/comments/gpw2ra/how_is_the_rust_compiler_able_to_tell_the_visible/

DEV-10935
2022-04-21 14:27:36 -04:00
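The commit above computes columns from display width using the unicode-width
crate.  A small sketch of that kind of calculation (the function and the
surrounding code are illustrative, not tamer's actual resolver): the column
is derived from the display width of the text preceding the span, not from
its byte offset.

  // Requires the `unicode-width` crate (e.g. unicode-width = "0.1").
  use unicode_width::UnicodeWidthStr;

  /// Illustrative only: 1-based display column at which `byte_offset`
  /// begins, assuming `line` is the full source line containing the span.
  fn display_column(line: &str, byte_offset: usize) -> usize {
      // Width of everything before the offset, in terminal cells:
      // control characters count as 0, many emoji/CJK characters as 2.
      line[..byte_offset].width() + 1
  }

  fn main() {
      let line = "naïve = true";
      // "naïve " occupies 6 display cells even though "ï" is 2 bytes in
      // UTF-8, so byte offset 7 (start of `=`) maps to column 7, not 8.
      let eq_byte = line.find('=').unwrap();
      assert_eq!(eq_byte, 7);
      assert_eq!(display_column(line, eq_byte), 7);
  }
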
Mike Gerwitz e555955450 tamer: span::Span::endpoints_saturated: New method
This gets rid of the `Option` and is used in the diagnostic system (next
commit).

DEV-10935
2022-04-21 14:15:25 -04:00
Mike Gerwitz a22e8e79f7 tamer: diagnose: Integrate resolver for source lines
This does not yet resolve columns, and omits the length of the span, but
it's starting to come together.

This is particularly exciting for me to see because I've been wanting line
numbers in TAME error messages for over a decade.

DEV-10935
2022-04-21 12:34:17 -04:00
Mike Gerwitz 9b4c84de26 tamer: diagnose::resolver: Support rewinding
This adds support for rewinding the underlying buffer when necessary to
read a span that occurs earlier within the same context (which could also
include the same span read twice).

As part of this change, I cleaned up the code a bit.  Working with this
system can be confusing with the different meanings of the byte offsets and
the different ways of interpreting lines relative to the span that is
provided.  There's not a lot of code here, but it represents a lot of work
to get right.
2022-04-21 12:33:27 -04:00
Mike Gerwitz 1b02e77537 tamer: span (SpanOffsetSize, SpanLenSize): New type aliases
Callers can use these types instead of having to reference globals.

DEV-10935
2022-04-20 09:42:13 -04:00
Mike Gerwitz ab48d79e1f tamer: diagnose::resolver: Initial concept for line resolution
This works, but it's ugly and requires some cleanup.  It shows that there
are some interesting considerations when determining how to best represent
the location of spans to the user in a way that is intuitive.

This is not yet integrated with the reporter, which will require a layer to
load a `Context` from disk.

DEV-10935
2022-04-20 09:42:13 -04:00
Mike Gerwitz a77eb7d937 tamer: span: Minor test refactoring
Just some cleanup based on some new conventions, now that I'm about to make
some changes.

DEV-10935
2022-04-20 09:42:12 -04:00
Mike Gerwitz 725dc3fb54 tamer: tamec: Use diagnostic system for errors
This is a POC, minimal-effort integration that also creates the TamecError
sum type analogous to TameldError.

I'll work on reducing the boilerplate in the future.

A note regarding the type and boilerplate vs. dynamic dispatch, for any
future readers: the purpose of this is to be explicit about the error types
so that the system is self-documenting and it forces an understanding of
its error conditions.  `Box<dyn Error>` is basically "eh idk anything can
happen!", which is not what I'm interested in having.

DEV-10935
2022-04-20 09:42:11 -04:00
Mike Gerwitz eaa8133d21 tamer: diagnose: Introduction of diagnostic system
This is a working concept that will continue to evolve.  I wanted to start
with some basic output before getting too carried away, since there's a lot
of potential here.

This is heavily influenced by Rust's helpful diagnostic messages, but will
take some time to realize a lot of the things that Rust does.  The next step
will be to resolve line and column numbers, and then possibly include
snippets and underline spans, placing the labels alongside them.  I need to
balance this work with everything else I have going on.

This is a large commit, but it converts the existing Error Display impls
into Diagnostic.  This separation is a bit verbose, so I'll see how this
ends up evolving.

Diagnostics are tied to Error at the moment, but I imagine in the future
that any object would be able to describe itself, error or not, which would
be useful in the future both for the Summary Page and for query
functionality, to help developers understand the systems they are writing
using TAME.

Output is integrated into tameld only in this commit; I'll add tamec
next.  Examples of what this outputs are available in the test cases in this
commit.

DEV-10935
2022-04-13 15:22:46 -04:00
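To make the Display/Diagnostic separation described above a bit more
concrete, here is a minimal sketch; the trait shape, method names, and types
below are assumptions for illustration only, not tamer's actual interfaces.

  use std::fmt::{self, Display};

  // Illustrative stand-ins for span and label types.
  #[derive(Debug, Clone, Copy)]
  struct Span { offset: u32, len: u32 }

  struct Label { span: Span, text: String }

  // Hypothetical split: `Display` gives the one-line message, while a
  // separate trait supplies span-annotated detail for report rendering.
  trait Diagnostic: Display {
      fn labels(&self) -> Vec<Label>;
  }

  struct UnexpectedClose { open: Span, close: Span }

  impl Display for UnexpectedClose {
      fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
          write!(f, "expected closing tag for `preproc:symtable`")
      }
  }

  impl Diagnostic for UnexpectedClose {
      fn labels(&self) -> Vec<Label> {
          vec![
              Label { span: self.open, text: "element is opened here".into() },
              Label { span: self.close, text: "expected `</preproc:symtable>`".into() },
          ]
      }
  }

  fn main() {
      let err = UnexpectedClose {
          open: Span { offset: 100, len: 17 },
          close: Span { offset: 2000, len: 16 },
      };
      println!("error: {err}");
      for l in err.labels() {
          println!("  [{}+{}] {}", l.span.offset, l.span.len, l.text);
      }
  }
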
Mike Gerwitz 702b5ebb23 tamer: span: Remove PathIndex
We can just use PathSymbolId directly and simplify things.  Typing can (and
should) happen on the symbol itself, and if we want a separate symbol type,
it ought to have its own interner.

For now, it doesn't, and having this extra type is just a PITA.

DEV-10935
2022-04-13 09:59:11 -04:00
Mike Gerwitz c49510646b tamer: parse::Parser (last_span): Replace Option with UNKNOWN_SPAN
There's no use in complicating the error handling here when we'd just
default to `UNKNOWN_SPAN` anyway when trying to render it.  `UNKNOWN_SPAN`
didn't exist at the time of writing.

DEV-10935
2022-04-12 09:59:00 -04:00
Mike Gerwitz cfc7f45bc4 tamer: Remove wip-xmlo-xir-reader
This entirely removes the old XmloReader that has since been replaced with a
XIR-based reader.

I had been holding off on this because the new reader is slower, pending
performance optimizations (which I'll do a little later on), however the
performance loss is of no practical consideration and only affects the
linker, which is still fast.

Therefore, it's better to get this old code out of the way to simplify
refactoring going forward.  In particular, I'm working on the diagnostic
system.

This is a little sad, in a way---this is some of my first Rust code that I'm
deleting.

DEV-10935
2022-04-11 16:11:49 -04:00
Mike Gerwitz 4c69efd175 tamer: obj::xmlo::error: Remove XirfError
This does not deal directly with XIRF (that's composed into a pipeline
outside of this parser).

I'd like to clean up further...perhaps I should retire the
wip-xmlo-xir-reader flag now, despite the minor performance regression (see
previous recent commits for explanation).

DEV-10935
2022-04-11 15:52:40 -04:00
Mike Gerwitz f07c0e75be tamer: tameld (TameldError): Error sum type
This aggregates all non-panic errors that can occur during link time, making
`Box<dyn Error>` unnecessary.  I've been wanting to do this for a long time,
so it's nice seeing this come together.  This is a powerful tool, in that we
know, at compile time, all errors that can occur, and properly report on
them and compose them.  This method of error composition ensures that all
errors have a chance to be handled within their context, though it'll take
time to do so in a decent way.

This just maintains compatibility with the dynamic dispatch that was
previous occurring.  This work is being done to introduce the initial
diagnostic system, which was really difficult/confusing to do without proper
errors types at the top level, considering the toplevel is responsible for
triggering the diagnostic reporting.

The cycle error is in particular going to be interesting once the system is
in place, especially once it provides spans in the future, since it will
guide the user through the code to understand how the cycle formed.

More to come.

DEV-10935
2022-04-11 15:15:04 -04:00
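A minimal sketch of the sum-type approach described above, with invented
stage and variant names: each lowering stage keeps its own error type, and a
`From` impl per variant lets `?` compose them without `Box<dyn Error>`.

  use std::fmt::{self, Display};

  // Illustrative stage errors; the real system has one per lowering stage.
  #[derive(Debug)]
  struct XirError(String);
  #[derive(Debug)]
  struct XmloError(String);

  // Hypothetical top-level sum type in the spirit of TameldError.
  #[derive(Debug)]
  enum LinkError {
      Xir(XirError),
      Xmlo(XmloError),
  }

  impl From<XirError> for LinkError {
      fn from(e: XirError) -> Self { Self::Xir(e) }
  }
  impl From<XmloError> for LinkError {
      fn from(e: XmloError) -> Self { Self::Xmlo(e) }
  }

  impl Display for LinkError {
      fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
          match self {
              Self::Xir(XirError(msg)) => write!(f, "XIR error: {msg}"),
              Self::Xmlo(XmloError(msg)) => write!(f, "xmlo error: {msg}"),
          }
      }
  }

  fn read_xir() -> Result<(), XirError> { Err(XirError("bad token".into())) }

  // `?` converts each stage error into the sum type via `From`, and the
  // full set of failure modes is visible in the signature.
  fn link() -> Result<(), LinkError> {
      read_xir()?;
      Ok(())
  }

  fn main() {
      if let Err(e) = link() {
          eprintln!("error: {e}");
      }
  }
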
Mike Gerwitz a1a4ad3e8e tamer: Introduce context into XirReader
tamec and tameld will now both introduce a `Context` to XIR, which will use
it to create spans.

Here's an example of an error, now that it's all working well together:

  $ target/release/tameld --emit xmle -o /dev/null path/to/package.xmlo
  error: invalid preproc:sym/@dim `9` at [/../path/to/package.xmlo offset 1175451-1175452]

A future task will make this human-readable by producing line and column
numbers, and perhaps even a snippet (if not now, then eventually).

It's exciting to see this coming together finally.

DEV-10934
2022-04-08 16:16:23 -04:00
Mike Gerwitz 68223cb7d3 tamer: xir::reader: Additional quick-xml error spans
There's a bit to unpack here.  Some of the spans originate from quick-xml's
error handling, but in coming up with test cases to try to trigger errors, I
found that quick-xml is far too permissive in what it accepts, and
outright dangerous in some situations.

I feel like the writing is on the wall for quick-xml, but I'll probably wait
until replacing `xmlo` with a more efficient format before deciding whether
to use a different library or implement parsing ourselves.  There's a lot of
factors to consider, and a library would have to not only be correct and
performant, but provide useful information for span generation.

But for now, I have other more important things to work on, like a
functioning compiler.  So while quick-xml is around, I'll just have to do
the best I can to provide a correct parser with useful errors.

DEV-10934
2022-04-08 14:54:49 -04:00
Mike Gerwitz ab181670b5 tamer: xir::reader: Initial introduction of spans
This is a large change, and was a bit of a tedious one, given the
comprehensive tests.

This introduces proper offsets and lengths for spans, with the exception of
some quick-xml errors that still need proper mapping.  Further, this still
uses `UNKNOWN_CONTEXT`, which will be resolved shortly.

This also introduces `SpanlessError`, for which `Error` explicitly does
_not_ implement `From<SpanlessError>`---this forces the caller to provide a
span before the error is compatible with the return value, ensuring that
spans will actually be available rather than forgotten for errors.  This is
important, given that errors are generally less tested than the happy path,
and errors are when users need us the most (so, need span information).

Further, I had to use pointer arithmetic in order to calculate many of the
spans, because quick-xml does not provide enough information.  There are no
safety considerations here, and the comprehensive unit test will ensure
correct behavior if the implementation changes in the future.

I would like to introduce typed spans at some point---I made some
opinionated choices when it comes to what the spans ought to
represent.  Specifically, whether to include the `<` or `>` with the open
span (depends), whether to include quotes with attribute values (no),
and some other details highlighted in the test cases.  If we provide typed
spans, then we could, knowing the type of span, calculate other spans on
request, e.g. to include or omit quotes for attributes.  Different such
spans may be useful in different situations when presenting information to
the user.

This also highlights gaps in the tokens emitted by XIR, such as whitespace
between attributes, the `=` between name and value, and so on.  These are
important when it comes to code formatting, so that we can reliably
reconstruct the XML tree, but it's not important right now.  I anticipate
future changes would allow the XIR reader to be configured (perhaps via
generics, like a strategy-type pattern) to optionally omit these tokens if
desired.

Anyway, more to come.

DEV-10934
2022-04-08 13:59:37 -04:00
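The `SpanlessError` technique above can be sketched roughly as follows (all
names and shapes here are simplified stand-ins): because the toplevel error
intentionally has no `From<SpanlessError>` impl, `?` refuses to propagate the
error until the caller attaches a span.

  // Illustrative span and error types; not tamer's actual definitions.
  #[derive(Debug, Clone, Copy)]
  struct Span(u32);

  #[derive(Debug)]
  struct SpanlessError(String);

  #[derive(Debug)]
  struct Error {
      span: Span,
      msg: String,
  }

  impl SpanlessError {
      // The only way out of a SpanlessError is to provide a span.
      fn with_span(self, span: Span) -> Error {
          Error { span, msg: self.0 }
      }
  }

  // Note: intentionally *no* `impl From<SpanlessError> for Error`, so
  // `some_spanless_result?` fails to compile in a `Result<_, Error>` fn.

  fn low_level_read() -> Result<u8, SpanlessError> {
      Err(SpanlessError("unexpected byte".into()))
  }

  fn parse_at(span: Span) -> Result<u8, Error> {
      // The caller is forced to decide which span describes the failure.
      low_level_read().map_err(|e| e.with_span(span))
  }

  fn main() {
      let err = parse_at(Span(42)).unwrap_err();
      println!("error at offset {}: {}", err.span.0, err.msg);
  }
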
Mike Gerwitz 942bf66231 tamer: frontend: Clean up unused modules
These were part of a POC for frontends quite some time ago.  Some portions
of this concept may be reintroduced, but this was pre-XIR.

DEV-10413
2022-04-07 14:21:08 -04:00
Mike Gerwitz 99aacaf7ca tamer: tamec: Replace copy with XIR parsing/writing
When wip-frontends is on, this will parse the input file using XIR and then
immediately output it again.  This makes the necessary changes to be able to
read every source file we have in our largest project, such that the output
is identical after having been formatted with `xmllint --format -` (there
are differences because e.g. whitespace between attributes is not yet
maintained).

This is performant too, with times remaining essentially identical despite
the additional work.

DEV-10413
2022-04-07 12:13:49 -04:00
Mike Gerwitz b90bf9d8a8 tame: build-aux/{csv2xml,tdat2xml}: Remove xml-stylesheet XML PI
These declarations are relics from when all XML files could be loaded in the
browser to render the Summary Page.  Such a thing has not worked for many
years.

The previous commit will cause files produced by these scripts to be
regenerated.

I noticed this when reading source files using XIR.

DEV-10413
2022-04-07 09:32:00 -04:00
Mike Gerwitz 8e9b2a7211 tame: build-aux/Makefile.am: Generated sources depend on scripts that generate them
This ensures that, when changes are made to these scripts, the files that
are generated from them are re-generated.

Historically this probably was not noticed because (a) they seldom changed
and (b) we had a small team and I told people to re-run bootstrapping
scripts or clean files.  The team is much larger now, and regardless,
there's no reason not to have had this in place.

DEV-10413
2022-04-07 09:31:55 -04:00
Mike Gerwitz 2e386f1baf tamer: xir::reader::XmlXirReader::refill_buf: Clear read buffer
This was done in the old reader many months ago, but I somehow forgot to do
it here.  The new reader was using substantially more memory.

Here's how this change affects the memory profile for one of our
systems (output from `ms_print`):

Before:

    MB
79.75^                                                             #
     |                                                             #
     |                                                             #       @
     |                                               @@@@          #       @
     |                                               @@@           #      @@
     |                                               @@@        @@@#@   @@@@@
     |                                               @@@        @@ #@@@@@@@@@@
     |                                            @@@@@@      @@@@ #@@@@@@@@@@
     |                                         @@ @@ @@@   @@ @ @@ #@@@@@@@@@@
     |                                         @@ @@ @@@  @@@@@ @@ #@@@@@@@@@@
     |                                         @@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     |                                         @@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     |   @@                                    @@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     |   @        @@     @@          @        @@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     |   @        @     @@@         @@  @@@   @@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     |   @     @@@@ @@@@@@@@@@@@@@@@@@@@@ @@@@@@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     | @@@   @@@@@@ @@@@@@@@@ @@@@@ @@@@@ @@ @@@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     | @@@   @ @@@@ @@@@@@@@@ @@@@@ @@@@@ @@ @@@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     | @@@ @@@ @@@@ @@@@@@@@@ @@@@@ @@@@@ @@ @@@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
     | @@@ @@@ @@@@ @@@@@@@@@ @@@@@ @@@@@ @@ @@@@@@@ @@@ @@@@@@ @@ #@@@@@@@@@@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   15.20

After:

    MB
63.25^                                                                      #
     |                                                                      #
     |                                                             @@@@@@@@@#@
     |                                                             @@@@@@ @@#@
     |                                                             @@@@@@ @@#@
     |                                                             @@@@@@ @@#@
     |                                                             @@@@@@ @@#@
     |                                                       @@@@@@@@@@@@ @@#@
     |                                                @@@@@@@@@ @@ @@@@@@ @@#@
     |                                         @@@@@@@@ @@@ @@@ @@ @@@@@@ @@#@
     |                                         @@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |                                         @@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |                                        @@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |                                        @@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |           @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |        @@@@@@@@@@@@@ @@@@@@@@ @@@@@@@@@@@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |      @@@@@@@@@@@@@@@ @@@@@@@@ @@@@@@@@@@@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     |    @@@@@@@@@@@@@@@@@ @@@@@@@@ @@@@@@@@@@@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     | @@@@@@@@@@@@@@@@@@@@ @@@@@@@@ @@@@@@@@@@@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
     | @@@@@@@@@@@@@@@@@@@@ @@@@@@@@ @@@@@@@@@@@@@@@  @ @@@ @@@ @@ @@@@@@ @@#@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   15.20

The bottom graph is virtually identical to the memory profile of the old
reader, just with the exception that it's interning a bit more data than
before, because we're reading more comprehensively.

That's (potentially) the subject of future changes.

DEV-12038
2022-04-06 11:50:07 -04:00
Mike Gerwitz 6871a0cdc7 tamer: parse (ParseState): Doc correction regarding determinism
The pair is now a triple and parsers are often NFAs.
2022-04-05 15:55:58 -04:00
Mike Gerwitz e77bdaf19a tamer: parse: Introduce mutable Context
This resolves the performance issues caused by Rust's failure to elide the
ElementStack (ArrayVec) memcpys on move.

Since XIRF is invoked tens of millions of times in some cases for larger
systems, prior to this change, failure to optimize away moves for XIRF
resulted in tens of millions of memcpys.  This resulted in linking of one
program going from 1s -> ~15s.  This change reduces it to ~2.5s with the
wip-xmlo-xir-reader flag on, with the extra time coming from elsewhere (the
subject of future changes).

In particular, this change introduces a new mutable reference to
`ParseState::parse_token`, which is a reference to a `Context` owned by the
caller (e.g. `Parser`).  In the case of XIRF, this means that
`Parser<flat::State, _>` will own the `ElementStack`/`ArrayVec` instead of
`flat::State`; this allows the latter to remain pure and benefit from Rust's
move optimizations, without sacrificing the otherwise-pure implementation.

ParseStates that do not need a mutable context can use `NoContext` and
remain pure.

DEV-12024
2022-04-05 15:50:53 -04:00
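A rough sketch of the mutable-context idea, heavily simplified and with
invented names: the caller owns the heavyweight storage (the stack) and lends
it to `parse_token`, so the state that is passed by value stays trivially
small and cheap to move.

  // Heavily simplified illustration; not tamer's actual types.

  // Caller-owned context: the heavyweight storage (an element stack) that
  // would otherwise bloat the state value and defeat move elision.
  #[derive(Default)]
  struct Context {
      stack: Vec<&'static str>,
  }

  // The state itself stays a tiny value type, passed by value per token.
  #[derive(Debug, PartialEq)]
  enum State {
      Outside,
      Inside,
  }

  impl State {
      fn parse_token(self, tok: &'static str, ctx: &mut Context) -> State {
          if let Some(name) = tok.strip_prefix("open ") {
              ctx.stack.push(name);
              State::Inside
          } else {
              ctx.stack.pop();
              if ctx.stack.is_empty() { State::Outside } else { State::Inside }
          }
      }
  }

  fn main() {
      // The parser (caller) owns the context, analogous to `Parser` owning
      // the `ElementStack` rather than `flat::State` carrying it.
      let mut ctx = Context::default();
      let mut state = State::Outside;

      for tok in ["open root", "open child", "close", "close"] {
          state = state.parse_token(tok, &mut ctx);
      }

      assert_eq!(state, State::Outside);
      assert!(ctx.stack.is_empty());
  }
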
Mike Gerwitz 1a04d99f15 tamer: obj::xmlo::reader: Working xmlo reader
This makes the necessary tweaks to have the entire linker work end-to-end
and produce a compatible xmle file (that is, identical except for
nondeterministic topological ordering).  That's good, and finally that can
get off of my plate.

What's disappointing, and what I'll have more information on in future
commits, is how slow it is.

The linking of our largest package goes from ~1s -> ~15s with this
change.  The reason is because of tens of millions of `memcpy` calls.  Why?

The ParseState abstraction is pure and passes an owned `self` around, and
Parser replaces its own reference using this:

        let result;
        TransitionResult(Transition(self.state), result) =
            take(&mut self.state).parse_token(tok);

Naively, this would store a copy of the old state in `result`, allocate a
new ParseState for `self.state`, pass the original or a copy to
`parse_token`, and then overwrite `self.state` with the new ParseState that
is returned once it is all over.

Of course, that'd be devastating.  What we want to happen is for Rust to
realize that it can just pass a reference to `self.state` and perform no
copying at all.

For certain parsers, this is exactly what happens.  Great!

But for XIRF, we have this:

  /// Stack of element [`QName`] and [`Span`] pairs,
  ///   representing the current level of nesting.
  ///
  /// This storage is statically allocated,
  ///   allowing XIRF's parser to avoid memory allocation entirely.
  type ElementStack<const MAX_DEPTH: usize> = ArrayVec<(QName, Span), MAX_DEPTH>;

  /// XIRF document parser state.
  ///
  /// This parser is a pushdown automaton that parses a single XML document.
  #[derive(Debug, Default, PartialEq, Eq)]
  pub enum State<const MAX_DEPTH: usize, SA = AttrParseState>
  where
      SA: FlatAttrParseState,
  {
      /// Document parsing has not yet begun.
      #[default]
      PreRoot,

      /// Parsing nodes.
      NodeExpected(ElementStack<MAX_DEPTH>),

      /// Delegating to attribute parser.
      AttrExpected(ElementStack<MAX_DEPTH>, SA),

      /// End of document has been reached.
      Done,
  }

ParseState contains an ArrayVec, and its implementation details cause
LLVM _not_ to elide the `memcpy`.  And there's a lot of them.

Considering that ParseState is supposed to use only statically allocated
memory and be zero-copy, this is rather ironic.

Now, this _could_ be potentially fixed by not using ArrayVec; removing
it (and the corresponding checks for balanced tags) gets us down to
2s (which still needs improvement), but we can't have a core abstraction in
our system resting on a house of cards.  What if the optimization changes
between releases and suddenly linking / building becomes shit slow?  That's
too much of a risk.

Further, having to limit what abstractions we use just to appease the
compiler to optimize away moves is very restrictive.

The better option seems like to go back to what I used to do: pass around
`&mut self`.  I had moved to an owned `self` to force consideration of _all_
state transitions, but I can try to do the same thing in a different type of
way using mutable references, and then we avoid this problem.  The
abstraction isn't pure (in the functional sense) anymore, but it's safe and
isn't relying on delicate inlining and optimizer implementation details to
have a performant system.

More information to come.

DEV-10863
2022-04-01 16:31:14 -04:00
Mike Gerwitz 9eaebd576b tamer: obj::xmlo::reader: preproc:fragment parsing
This concludes the bulk of the header parsing, though there are surely going
to be other issues when I try to read a real xmlo file, such as
whitespace.  That is something I expect I'd rather handle as part of
XIRF, but maybe I'll initially ignore it here just to get it working.  We'll
see.

DEV-10863
2022-04-01 16:31:14 -04:00
Corey Vollmer f3545cf347 RELEASES.md: Update for v19.0.3 2022-04-01 15:06:07 -04:00
Corey Vollmer d46aebe4bd [DEV-11788] Add upper & lower abbreviation for states 2022-03-31 16:33:45 -04:00
Mike Gerwitz fb3da09fa4 tamer: obj::xmlo::reader: preproc:sym-deps processing
This parses the symbol dependency list (adjacency list).

I'm noticing some glaring issues in error handling, particularly that the
token being parsed while an error occurs is not returned and so recovery is
impossible.  I'll have to address that later on, after I get this parser
completed.

Another question that I had a hard time answering in prior months
was how I was going to compose boilerplate parsers, e.g. handling the
parsing of single-attribute elements and such.  A pattern is clearly taking
shape, and with the composition of parsers more formalized, that'll be able
to be abstracted away.  But again, that's going to wait until after this
parser is actually functioning.  Too many delays so far.

DEV-10863
2022-03-30 15:05:55 -04:00
Mike Gerwitz 3f8e397e57 tamer: obj::xmlo::reader: Parse preproc:sym/preproc:from
Ideally this would just be an attribute, but I guess I never got around to
making that change in the compiler and I don't want a detour right now.

DEV-10863
2022-03-30 12:06:38 -04:00
Mike Gerwitz 9b429b6fc3 tamer: obj::xmlo::reader::SymtableState: Correct object span
I clearly was not paying attention to what was correct behavior here, since
the tests also verified the wrong behavior: rather than taking the last
processed attribute span, we should be taking the span of the opening
tag for the `preproc:sym` node.

DEV-10863
2022-03-30 10:07:11 -04:00
Mike Gerwitz 5c16add95d tamer: parse (Transitionable): New
This simply removes boilerplate.

This will receive concrete examples once I come up with docs for the entire
module; there's boilerplate involved in testing and documenting this in
isolation and the time investment is not worth it yet until I'm certain that
this will not be changed.

DEV-10863
2022-03-30 10:03:14 -04:00
Mike Gerwitz 1e278cbe26 tamer: obj::xmlo::reader: preproc:symtable/preproc:sym parsing
This integrates much of the work done so far to parse into a
`XmloEvent::SymDecl`.  The attribute parsing _is_ verbose, and I do intend
to abstract it away later on, but I'm going to wait on that for now.

The new reader should be finishing up soon, which is really exciting, since
I started working on this months ago (before having to take a break on
TAMER); I'm anticipating strong performance gains in the reader, and this is
a test that will tell us how the compiler will perform moving forward with
the abstractions that I've spent so much time on.

DEV-10863
2022-03-30 09:09:48 -04:00
Mike Gerwitz 4cb478a42d tamer: parser::ParseState::delegate_lookahead: New concept
This introduces a new method similar to the previous `delegate`, but with
another closure that allows for handling lookahead tokens from the child
parser.

Admittedly, this isn't exactly what I was going for---a list of arguments
isn't exactly self-documenting, especially with the brevity when the
arguments line up---but this was easy to do and so I'll run with this for
now.

This also modified `delegate` to accept a context, even though it wasn't
necessary, both for consistency with its lookahead counterpart and for brevity
with the `into` argument (allowing, in our case, to just pass the name of
the variant, rather than a closure).

I'm not going to handle the actual starting and accepting state stitching
abstraction for now; I'd like to observe future boilerplate more before I
consider the best way to handle it, though I do have some ideas.

DEV-10863
2022-03-29 14:46:43 -04:00
Mike Gerwitz 2a3d5be159 tamer: parse::ParseState::delegate: Initial state stitching concept
This is the delegation portion of what I've come to call "state
stitching"---wiring together two state machines that recognize the same
input tokens.

This handles the delegation of tokens once the parser has been entered, but
does not yet handle the actual stitching part of it: wiring the start and
accepting states of the child parser to the parent.

This is indirectly tested by the XmloReader, but it will receive its own
tests once I further finalize this concept.  I'm playing around with some
ideas.  With that said, a quick visual inspection together with the
guarantees provided by the type system should convince any familiar reader
of its correctness.

DEV-10863
2022-03-29 14:12:26 -04:00
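As a rough illustration of the delegation half of "state stitching"
(everything below is invented and far simpler than the real parsers): while
the parent is in its child-delegating state, each token is passed to the
child's `parse_token` and the child's next state is wrapped back into the
parent's.

  // Invented, simplified parsers for the sketch.

  #[derive(Debug, PartialEq)]
  enum Child {
      Collecting(Vec<char>),
      Done(Vec<char>),
  }

  impl Child {
      fn parse_token(self, tok: char) -> Child {
          match self {
              Child::Collecting(mut seen) if tok != ';' => {
                  seen.push(tok);
                  Child::Collecting(seen)
              }
              Child::Collecting(seen) => Child::Done(seen),
              done => done,
          }
      }
  }

  #[derive(Debug, PartialEq)]
  enum Parent {
      Header,
      // While in this state, all tokens are delegated to the child.
      InChild(Child),
  }

  impl Parent {
      fn parse_token(self, tok: char) -> Parent {
          match self {
              // '{' enters the child parser.
              Parent::Header if tok == '{' => Parent::InChild(Child::Collecting(vec![])),
              Parent::Header => Parent::Header,
              // Delegation: feed the token to the child, re-wrap its state.
              Parent::InChild(child) => Parent::InChild(child.parse_token(tok)),
          }
      }
  }

  fn main() {
      let fin = "x{ab;".chars().fold(Parent::Header, Parent::parse_token);
      assert_eq!(fin, Parent::InChild(Child::Done(vec!['a', 'b'])));
  }
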
Mike Gerwitz df05a71508 tamer: obj::xmlo::reader: Emphasize generic SymtableState stitching for Object
This simply makes the block more generic to emphasize how it can be
abstracted away.

DEV-10863
2022-03-29 11:25:05 -04:00
Mike Gerwitz f42288f3a2 tamer: obj::xmlo::reader: Begin symbol table parsing
This wasn't the simplest thing to start with, but I wanted to explore
something with a higher level of complexity.  There is some boilerplate to
observe here, including:

  1. The state stitching (as I guess I'm calling it now) of SymtableState
     with XmloReaderState is all boilerplate and requires no lookahead,
     presenting an abstraction opportunity that I was holding off on
     previously (attr parsing for XIRF requires lookahead).
  2. This is simply collecting attributes into a struct.  This can be
     abstracted away in the future.
  3. Creating stub parsers to verify that generics are stitched rather than
     being tightly coupled with another state is boilerplate that maybe can
     be abstracted away after a pattern is observed in future tests.

DEV-10863
2022-03-29 11:14:47 -04:00
Mike Gerwitz f402e51d04 tamer: parse: More flexible Transition API
This does some cleanup and adds `parse::Object` for use in disambiguating
`From` for `ParseStatus`, allowing the `Transition` API to be much more
flexible in the data it accepts and automatically converts.  This allows us
to concisely provide raw output data to be wrapped, or provide `ParseStatus`
directly when more convenient.

There aren't yet examples in the docs; I'll do so once I make sure this API
is actually utilized as intended.

DEV-10863
2022-03-25 16:45:32 -04:00
Mike Gerwitz c0fa89222e tamer: obj::xmlo::ir::Dim: New enum
This replaces u8 and will be used for the new XmloReader.

Previously I wasn't sure what direction TAMER was going to go in with
regards to dimensionality, but I do not expect that higher dimensions will
be supported, and if they are, they'd very likely compile down to lower ones
and create an illusion of higher-dimensionality.

Whatever the future holds, it's not used today, and I'd rather these types
be correct.

ASG needs changing too, but one step at a time.

DEV-10863
2022-03-25 14:28:18 -04:00
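A sketch of what such a `Dim` enum might look like, together with a fallible
conversion from the raw `u8` it replaces; the variant names are assumptions
for illustration.

  // Illustrative variant names; the point is replacing a raw u8 with a
  // closed set of supported dimensionalities.
  #[derive(Debug, Clone, Copy, PartialEq, Eq)]
  enum Dim {
      Scalar = 0,
      Vector = 1,
      Matrix = 2,
  }

  #[derive(Debug, PartialEq, Eq)]
  struct InvalidDimError(u8);

  impl TryFrom<u8> for Dim {
      type Error = InvalidDimError;

      fn try_from(n: u8) -> Result<Self, Self::Error> {
          match n {
              0 => Ok(Dim::Scalar),
              1 => Ok(Dim::Vector),
              2 => Ok(Dim::Matrix),
              n => Err(InvalidDimError(n)),
          }
      }
  }

  fn main() {
      assert_eq!(Dim::try_from(1), Ok(Dim::Vector));
      // A bad `@dim` (like the `9` in the earlier error example) becomes a
      // parse error rather than silently-propagated data.
      assert_eq!(Dim::try_from(9), Err(InvalidDimError(9)));
  }
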
Mike Gerwitz 279ddc79d7 tamer: parse::TransitionResult: Alias=>newtype
This converts the tuple type alias into a newtype, so that we may provide
our own implementations.

This differs from a previous approach that I took, which involved making
this type `Result<(S, T), (S, E)>` so that the return values composed well
with other functions.  But the reality is that this is used only by other
`ParseState`s and `Parser`, so it's unnecessary.

However, this is also an attempt to utilize the new Try and FromResidual
traits; note how the Try associated types match precisely what I was trying
to do before, though they're used as intermediate types.  I'll see how this
evolves.

DEV-10863
2022-03-25 12:28:50 -04:00
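The alias-to-newtype change above can be sketched like this, with simplified
stand-ins for the member types: the wrapper still holds the same tuple, but
it now provides a place for inherent methods and trait impls.

  // Simplified stand-ins for the state and object types.
  #[derive(Debug, PartialEq)]
  struct State(u32);
  type Object = char;

  // Before: a bare alias, which cannot carry its own impls.
  //   type TransitionResult = (State, Result<Object, String>);

  // After: a newtype over the same tuple.
  #[derive(Debug, PartialEq)]
  struct TransitionResult(State, Result<Object, String>);

  impl TransitionResult {
      // Own impls become possible, e.g. a small helper for the happy path.
      fn ok(state: State, obj: Object) -> Self {
          TransitionResult(state, Ok(obj))
      }

      fn state(&self) -> &State {
          &self.0
      }
  }

  fn main() {
      let result = TransitionResult::ok(State(1), 'a');
      assert_eq!(result.state(), &State(1));
      assert_eq!(result, TransitionResult(State(1), Ok('a')));
  }
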
Mike Gerwitz 2e98a69d15 Revert "tamer: parse::TransitionResult: Move common Transition into Result"
This reverts commit bf5da75096.
2022-03-25 09:17:25 -04:00
Mike Gerwitz bf5da75096 tamer: parse::TransitionResult: Move common Transition into Result
This allows the Results to compose and, importantly, is compatible with
`?` without having to put in any extra effort.

This puts the caller in an awkward spot, so I introduced a utility
function `result_tup0_invert` for now; we'll see if that stays or evolves
differently.

DEV-10863
2022-03-24 23:48:30 -04:00
Mike Gerwitz 9d9b1f30a8 tamer: obj::xmlo::reader: Move XmloEvent to top of module
Since this is the object produced by this parser, this is likely the most
useful first thing to present as a summary of what `XmloReader` actually
does.

DEV-10863
2022-03-24 10:14:40 -04:00
Mike Gerwitz 2e3d94c3d6 tamer: obj::xmlo::reader: Simplify wip-xmlo-xir-reader flagging
This removes the flag from most of the code, which also resolves the
indentation.  Not only was it bothering me, but I don't want (a) every line
modified when the module body is hoisted and (b) `rustfmt` to reformat
everything when that happens.

This means that everything will be built, even though it's not used, when
the flag is off, but I see that as a good thing.

DEV-10863
2022-03-24 09:45:59 -04:00
Mike Gerwitz fab7b16ea0 tamer: obj::xmlo::reader: Parse package attributes
Finally we get to do some actual parsing with all of the preparatory work!

This means that we're finally ready to fully replace the old XmloReader,
provided that I'm okay with some boilerplate / lack of abstractions for
now (and I am, because all I've been doing is working on abstractions to
prepare lowering operations).

DEV-10863
2022-03-23 16:48:51 -04:00
Mike Gerwitz ad8616aaa1 tamer: xir::attr::Attr: Convert to tuple struct with public fields
This makes more sense for pattern matching.  Encapsulation of these fields
is not necessary, given that it's passed around as an owned value and its
`new` method constructs it verbatim; the individual fields are
self-validating.

DEV-10863
2022-03-23 16:41:28 -04:00
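A tiny sketch of the pattern-matching benefit described above; the field
types here are stand-ins (plain string slices and a numeric span) rather
than tamer's interned symbols.

  // Stand-ins for interned QName/value symbols and spans.
  type QName = &'static str;
  type Value = &'static str;
  #[derive(Debug, Clone, Copy)]
  struct Span(u32);

  // Public tuple fields allow direct destructuring at match sites.
  struct Attr(QName, Value, Span);

  fn describe(attr: &Attr) -> String {
      match attr {
          // No accessors needed; the pattern names the parts directly.
          Attr("preproc:sym", value, Span(at)) => {
              format!("symbol `{value}` declared at offset {at}")
          }
          Attr(name, value, _) => format!("{name}=\"{value}\""),
      }
  }

  fn main() {
      assert_eq!(
          describe(&Attr("preproc:sym", "foo", Span(1175451))),
          "symbol `foo` declared at offset 1175451"
      );
      assert_eq!(describe(&Attr("dim", "1", Span(9))), "dim=\"1\"");
  }
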
Mike Gerwitz fbf786086a tamer: parse::Parser (lower_while_ok): New method
This introduces a WIP lowering operation, abstracting away quite a bit of
the manual wiring work, which is really important to providing an API that
provides the proper level of abstraction for actually understanding what the
system is doing.

This does not yet have tests associated with it---I had started, but it's a
lot of work and boilerplate for something that is going to
evolve.  Generally, I wouldn't use that as an excuse, but the robust type
definitions in play, combined with the tiny amount of actual logic, provide
a pretty high level of confidence.  It's very difficult to wire these types
together and produce something incorrect without doing something obviously
bad.

Similarly, I'm holding off on proper docs too, though I did write some
information here.

More to come, after I actually get to work on the XmloReader.

On a side note: I'm happy to have made progress on this, since this wiring
is something I've been dreading and wondering about since before the Parser
abstraction even existed.

Note also that this makes parser::feed_toks private again---I don't intend
to support push parsers yet, since they're only needed internally.  Maybe
for error recovery, but I'll wait to decide until it's actually needed.

DEV-10863
2022-03-23 14:31:16 -04:00
Mike Gerwitz b4a7591357 tamer: obj::xmlo::reader: Begin conversion to ParseState
This begins to transition XmloReader into a ParseState.  Unlike previous
changes where ParseStates were composed into a single ParseState, this is
instead a lowering operation that will take the output of one Parser and
provide it to another.

The mess in ld::poc (...which still needs to be refactored and removed)
shows the concept, which will be abstracted away.  This won't actually get
to the ASG in order to test that this works with the
wip-xmlo-xir-reader flag on (development hasn't gotten that far yet), but
since it type-checks, it should conceptually work.

Wiring lowering operations together is something that I've been dreading for
months, but my approach of only abstracting after-the-fact has helped to
guide a sane approach for this.  For some definition of "sane".

It's also worth noting that AsgBuilder will too become a ParseState
implemented as another lowering operation, so:

  XIR -> XIRF -> XMLO -> ASG

These steps will all be streaming, with iteration happening only at the
topmost level.  For this reason, it's important that ASG not be responsible
for doing that pull, and further we should propagate Parsed::Incomplete
rather than filtering it out and looping an indeterminate number of times
outside of the toplevel.

One final note: the choice of 64 for the maximum depth is entirely
arbitrary and should be more than generous; it'll be finalized at some point
in the future once I actually evaluate what maximum depth is reasonable
based on how the system is used, with some added growing room.

DEV-10863
2022-03-22 14:06:52 -04:00
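A very rough sketch of streaming lowering between stages, with everything
invented for illustration: each stage is an iterator over the previous
stage's `Parsed` output, `Incomplete` results are propagated upward rather
than looped over internally, and only the toplevel drives iteration.

  // Invented, heavily simplified illustration of streaming lowering.

  #[derive(Debug, PartialEq)]
  enum Parsed<T> {
      Incomplete,   // consumed a token but produced nothing yet
      Object(T),
  }

  // Stage A: groups pairs of characters into two-char strings.
  struct PairStage<I: Iterator<Item = char>> {
      input: I,
      held: Option<char>,
  }

  impl<I: Iterator<Item = char>> Iterator for PairStage<I> {
      type Item = Parsed<String>;
      fn next(&mut self) -> Option<Self::Item> {
          let c = self.input.next()?;
          match self.held.take() {
              None => {
                  self.held = Some(c);
                  Some(Parsed::Incomplete)
              }
              Some(first) => Some(Parsed::Object(format!("{first}{c}"))),
          }
      }
  }

  // Stage B: counts objects from stage A, propagating Incomplete upward.
  struct CountStage<I: Iterator<Item = Parsed<String>>> {
      input: I,
      count: usize,
  }

  impl<I: Iterator<Item = Parsed<String>>> Iterator for CountStage<I> {
      type Item = Parsed<usize>;
      fn next(&mut self) -> Option<Self::Item> {
          match self.input.next()? {
              Parsed::Incomplete => Some(Parsed::Incomplete),
              Parsed::Object(_) => {
                  self.count += 1;
                  Some(Parsed::Object(self.count))
              }
          }
      }
  }

  fn main() {
      let tokens = "abcd".chars();
      let a = PairStage { input: tokens, held: None };
      let b = CountStage { input: a, count: 0 };
      // Only this toplevel loop drives the entire pipeline.
      let results: Vec<_> = b.collect();
      assert_eq!(
          results,
          vec![
              Parsed::Incomplete,
              Parsed::Object(1),
              Parsed::Incomplete,
              Parsed::Object(2),
          ]
      );
  }
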
Mike Gerwitz f6957ff028 tamer: parse::Parser: Extract logic from Iterator impl
This introduces a (still-private) way to _push_ tokens into the parser,
rather than relying purely on a pull-based interface.  Not only does this
simplify the iterator, but this is also preparing to make the new `feed_tok`
public so that parsers can be composed in more contexts.  I suspect that
this method may also be useful for error recovery, since it can be used to
inject tokens into arbitrary points of a token stream.

I kept the new method private for now so that I can introduce the new API
and docs separate from this refactoring.

DEV-10863
2022-03-22 10:10:59 -04:00
Mike Gerwitz ceb00c4df5 tamer: xir: Complete parse type migration
A previous commit moved the parser.  This updates the types so that they can
actually be utilized in that context.

DEV-10863
2022-03-21 15:50:43 -04:00
Mike Gerwitz 14638a612f tamer: {xir::=>}parse: Move parser out of XIR
The parsing framework originally created for XIR is now more general and
useful to other things.  We'll see how this evolves.

This needs additional documentation, but I'd like to see how it changes as
I implement XmloReader and then some of the source readers first.

DEV-10863
2022-03-18 16:24:53 -04:00
Mike Gerwitz 0360226caa tamer: xir::parse: Generalize input token type
This adds a `Token` type to `ParseState`.  Everything uses `xir::Token`
currently, but `XmloReader` will use `xir::flat::Object`.

Now that this has been generalized beyond XIR, the parser ought to be
hoisted up a level.

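In sketch form (simplified, hypothetical signatures), the trait now looks
something like this, with each implementation choosing its own input token:

  trait ParseState: Sized {
      type Token;  // xir::Token today; xir::flat::Object for XmloReader
      type Object; // what a completed parse yields
      type Error;

      // Owned state in, new state out, along with a possible object.
      fn parse_token(
          self,
          tok: Self::Token,
      ) -> (Self, Result<Option<Self::Object>, Self::Error>);
  }
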
DEV-10863
2022-03-18 15:26:05 -04:00
Mike Gerwitz 150b3b9aa4 tamer: xir::flat: Improve parser validation
This does a couple of things: it ensures that documents have one and only
one root node, and it properly handles dead transitions once parsing is
complete (allowing it to be composed).

This should make XIRF feature-complete for the time being.  It does rely on
the assumption that the reader is stripping out any trailing whitespace, so
I guess we'll see if that's true as we proceed.

DEV-10863
2022-03-17 23:22:38 -04:00
Mike Gerwitz f04d845452 tamer: xir::flat::parse_token: Remove now-unapplicable comment
Forgot to delete this in a previous commit.

DEV-10863
2022-03-17 21:37:05 -04:00
Mike Gerwitz aba89f809d tamer: xir::parse: UnexpectedEof Span at final offset
I'm not rendering errors yet in practice, so this wouldn't have been
noticed, but we want error messages to reference the final byte in a file on
EOF, not the offset of the last-encountered token, which would be confusing.

This doesn't _directly_ pertain to what I'm working on; I just happened to
notice it.

DEV-10863
2022-03-17 21:33:05 -04:00
Mike Gerwitz e18eb2a4ac tamer: xir::flat::State::parse_node: Use TransitionResult
This was simply missed in a previous commit.

DEV-10863
2022-03-17 16:30:35 -04:00
Mike Gerwitz 6b8f0663ea tamer: xir::{tree::=>}attr: Move
With the introduction of XIRF, attribute parsing is no longer a XIRT thing.

DEV-10863
2022-03-17 16:10:56 -04:00
Mike Gerwitz 7b6d68af85 tamer: xir::parse::Transition: Generalize flat::Transition
XIRF introduced the concept of `Transition` to help document code and
provide mental synchronization points that make it easier to reason about
the system.  I decided to hoist this into XIR's parser itself, and have
`parse_token` accept an owned state and require a new state to be returned,
utilizing `Transition`.

Together with the convenience methods introduced on `Transition` itself,
this produces much clearer code, as is evidenced by tree::Stack (XIRT's
parser).  Passing an owned state is something that I had wanted to do
originally, but I thought it'd lead to more concise code to use a mutable
reference.  Unfortunately, that concision led to code that was much more
difficult than necessary to understand, and ended up being a net negative
by leading to some more boilerplate for the nested types (granted,
that could have been alleviated in other ways).

This also opens up the possibility to do something that I wasn't able to
before, which was continue to abstract away parser composition by stitching
their state machines together.  I don't know if this'll be done immediately,
but because the actual parsing operations are now able to compose
functionally without mutability getting in the way, the previous state coupling
issues with the parent parser go away.

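A rough illustration of the pattern (hypothetical types, not the real
definitions), showing how owned state plus Transition keeps each match arm a
terse, self-contained description of a state change:

  struct Transition<S>(S);

  impl<S> Transition<S> {
      // Convenience methods keep state-machine arms short.
      fn incomplete<O, E>(self) -> (S, Result<Option<O>, E>) {
          (self.0, Ok(None))
      }
      fn ok<O, E>(self, obj: O) -> (S, Result<Option<O>, E>) {
          (self.0, Ok(Some(obj)))
      }
  }

  enum State {
      AwaitingRoot,
      InRoot,
  }

  impl State {
      // Owned `self` in, new state out: transitions are values rather than
      // mutations, so they compose without the parent coupling described
      // above.
      fn parse_token(self, tok: char) -> (State, Result<Option<char>, ()>) {
          match (self, tok) {
              (State::AwaitingRoot, '<') => Transition(State::InRoot).incomplete(),
              (st, t) => Transition(st).ok(t),
          }
      }
  }
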
DEV-10863
2022-03-17 16:02:05 -04:00
Mike Gerwitz 899fa79e59 tamer: xir::flat: Initial XIRF implementation
This introduces XIR Flat (XIRF), which is conceptually between XIR and
XIRT.  This provides a more appropriate level of abstraction for further
lowering operations to parse against, and removes the need for other parsers
to perform their own validations (inappropriately) to ensure well-formed
XML.

There is still some cleanup worth doing, including moving some of the
parsing responsibility up a level back into the XIR parser.

DEV-10863
2022-03-17 13:08:16 -04:00
Mike Gerwitz ce48a654b1 tamer: span::Span::offset_add: Make const
This behavior is unchanged, but it allows us to create more constant spans
for testing.  For example:

  const S: Span = DUMMY_SPAN.offset_add(1).unwrap();

This, in turn, will allow for removing lazy_static! for tests that use it
for span generation.

DEV-10863
2022-03-16 14:16:28 -04:00
Mike Gerwitz 18cb5e7b39 tamer: Update dependencies
Petgraph was previously held back due to petgraph-graphml.  I'd like to
transition away from that at some point, given that it's tied to petgraph
and also pulls in xmlns, on top of quick-xml and our XIR, but that can come
down the line.
2022-03-11 10:51:51 -05:00
Mike Gerwitz 2f703ab2df tamer: obj::xmlo: Remove PackageAttrs in favor of token stream
The Options here are awkward and will be able to go away in the new reader
and in AsgBuilder once it has a proper state machine.

This gets rid of some of the initial migratory work for the new reader,
because PackageAttrs is gone.  I'm going to wait to update this to the new
way until I get further into this.

DEV-11449
2022-03-10 15:44:54 -05:00
Mike Gerwitz d428755a2e tamer: obj::xmlo::XmloEvent::SymDeps: Remove
This is no longer needed after the previous commit.
2022-03-10 13:43:07 -05:00
Mike Gerwitz dcfae8a624 tamer: obj::xmlo: Begin transition to streaming quick-xml reader
I'm finally back to TAMER development.

The original plan, some time ago, was to gate an entirely new XmloReader
behind a feature flag (wip-xmlo-xir-reader), and go from there, leaving the
existing implementation untouched.  Unfortunately, it became too difficult
and confusing to marry the old aggregate API with the new streaming one.

AsgBuilder is the only system interacting with XmloReader, so I decided (see
previous commits) to just go the route of refactoring the existing
one.  I'm not yet sure if I'll continue to progressively refactor this one
and eliminate the two separate implementations behind the flag, or if I'll
get this API similar and then keep the flag and reimplement it.  But I'll
know soon.

DEV-11449
2022-03-10 13:31:24 -05:00
Mike Gerwitz 74ddc77adb tamer: xir::escape::CachingEscaper: allow(dead_code) for feature-flagged code
For now, until this feature flag is removed, so that we do not see warnings
when the flag is off.
2022-03-10 10:03:07 -05:00
Mike Gerwitz 76b16fed09 tamer: iter::collect::TryCollect::try_collect_ok: Disambiguate try_collect
The Rust team has begun to introduce try_collect.  I will keep an eye on
this implementation and revisit this, but for the time being, I'm going to
disambiguate this so that I can move on without worrying about a future
breakage.

  - https://github.com/rust-lang/rust/issues/94047
  - https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.try_collect
2022-03-08 12:55:54 -05:00
Mike Gerwitz 21770305f9 RELEASES.md: Update for v19.0.2 2022-03-07 12:24:31 -05:00
Mike Gerwitz 70d1ad17b8 map: Force param/@default in translation to be numeric
The default ought to be numeric, always, but until we have the compiler
checking for that, I'm going to leave the casting in place.

DEV-10484
2022-03-07 12:22:18 -05:00
Mike Gerwitz 054ad9b4c4 map: Properly apply param/@default for translation fallback
This was broken by the previous fix, because I had cast to a numeric value
before invoking `set_defaults`, which needs the empty string retained so
that it knows whether a default ought to be applied.

This also ensures that `set_values` will always return a numeric value when
that default is applied.

DEV-10484
2022-03-07 11:47:58 -05:00
Mike Gerwitz a49dd68cfd RELEASES.md: Update for v19.0.1 2022-03-03 13:47:37 -05:00
Mike Gerwitz 501a9441a5 map: Produce 0 instead of NaN for non-numeric string values
This has been a problem for...ever, but the old classification system (and
calculations) had `||0` for every variable reference, whereas the new one
does not; NaNs result in undefined behavior in the new classification
system, since those values are not expected to exist.

This ought to have automated tests, but it will be rewritten in TAMER.

DEV-10484
2022-03-03 13:22:24 -05:00
Mike Gerwitz fb5f38d14c RELEASES.md: Update for v19.0.0 2022-03-01 16:32:43 -05:00
Mike Gerwitz 297b88c3c1 x/0=0 with global flag for new classification system
This was originally my plan with the new classification system, but it was
undone because I had hoped to punt on the somewhat controversial
issue.  Unfortunately, I see no other way.  Here I attempt to summarize the
reasons why, many of which are specific to the design decisions of TAME.

Keep in mind that TAME is a domain-specific language (DSL) for writing
insurance rating systems.  It should act intuitively for our use case, while
still being mathematically sound.

If you still aren't convinced, please see the link at the bottom.

Target Language Semantics (ECMAScript)
--------------------------------------
First: let's establish what happens today.  TAME compiles into ECMAScript,
which uses IEEE 754-2008 floating-point arithmetic.  Here we have:

  x/0 = Infinity,  x > 0;
  x/0 = -Infinity, x < 0;
  0/0 = NaN,       x = 0.

This is immediately problematic: TAME's calculations must produce concrete
real numbers, always.  NaN is not valid in its domain, and Infinity is of no
practical use in our computational model (TAME is built for insurance rating
systems, and one will never have infinite premium).  Put plainly: the
behavior is undefined in TAME when any of these values are yielded by an
expression.

Furthermore, we have _three different possible situations_ depending on
whether the numerator is positive, negative, or zero.  This makes it more
difficult to reason about the behavior of the system, for values we do not
want in the first place.

We then have these issues in ECMAScript:

  Infinity  * 0 = NaN.
  -Infinity * 0 = NaN.
  NaN       * 0 = NaN.

These are of particular concern because of how predicates work in TAME,
which will be discussed further below.  But it is also problematic because
of how it propagates: once you have NaN, you'll always have NaN, unless you
break out of the situation with some control structure that avoids using it
in an expression at all.

Let's now consider predicates:

  NaN  >  0   = false.
  NaN  <  0   = false.
  NaN === 0   = false.
  NaN === NaN = false.

These will be discussed in terms of classification predicates (matches).

We also have issues of serialization:

  JSON.stringify(Infinity) = "null".
  JSON.stringify(NaN)      = "null".

This means that these values are difficult to transfer between systems,
even if we wanted them.

TAME's Predicates
-----------------
TAME has a classification system based on first-order logic, where ⊥ is
represented by 0 and ⊤ is represented by 1.  These classifications are used
as predicates to calculations via the @class attribute of a rate block.  For
example:

  <rate-each class="property" generates="propValue" index="k">
    <c:quotient>
      <c:value-of name="buildingTiv" index="k" />
      <c:value-of name="tivPropDivisor" index="k" />
    </c:quotient>
  </rate-each>

As can be observed via the Summary Page, this calculation compiles into the
following mathematical expression:

  ∑ₖ(pₖ(tₖ/dₖ)),

that is—the quotient is then multiplied by the value of the `property`
classification, which is a 0 or 1 respectively for that index.

Let's say that tivPropDivisor were defined in this way:

  <rate-each class="property" generates="tivPropDivisor" index="k">
    <!--- ... logic here ...  -->
  </rate-each>

It does not matter what the logic here is.  Observe that the predicate here
is `property` as well, which means that, if this risk is not a property
risk, then `tivPropDivisor` will be `0`.

Looking back at `propValue`, let's say that we do have a property risk, and
that `buildingTiv` is `[100_000, 200_000]` and `tivPropDivisor` is 1000.  We
then have:

  1(100,000 / 1000) + 1(200,000 / 1000) = 300.

Consider instead what happens if `property` is 0.  Since we have no property
locations, we have `[0, 0]` as `buildingTiv` and `tivPropDivisor` is 0.

  0(0/0) + 0(0/0) = 0(NaN + NaN) = NaN.

This is clearly not what was intended.  The predicate is expected to be
_strongly_ zero, as if using an Iverson bracket:

  ((0/0)[0] + (0/0)[0]) = 0.

Of course, one option is to redefine TAME such that we use Iverson's
convention in place of summation, however this is neither necessary nor
desirable given that

  (a) NaN is not valid within the domain of any TAME expression, and
  (b) Summation is elegantly generalized and efficiently computed using
      vector arithmetic and SIMD functions.

That is: there's no use in messing with TAME's computational model for a
value that should be impossible to represent.

Short-Circuiting Computation
----------------------------
There's another way to look at it, though: that we intended to skip the
computation entirely, and so it doesn't matter what the quotient is.  If the
compiler were smart enough (and maybe one day it will be), it would know
that the predicate of `tivPropDivisor` and `propValue` are the same and so
there is no circumstance under which we would compute `propValue` and have
`tivPropDivisor` be 0.

The problem is: that short-circuiting is employed as an _optimization_, and
is an implementation detail.  Mathematically, the expression is unchanged,
and is still invalid within TAME's domain.  It is unrepresentable, and so
this is not an out.

But let's pretend that it was defined that way, which would yield this:

              { ∑ₖ(pₖ(tₖ/dₖ)),  ∀x∈p(x = 1);
  propValue = <
              { 0,             otherwise.

This is the optimization that is employed, but it's still not mathematically
correct!  What happens if p₀ = 1, but p₁ = 0?  Then we have:

  1(100,000/1000) + 0(0/0) = 100 + NaN = NaN,

but the _intent_ was clearly to have 100 + 0 = 100, and so we return to the
original problem once again.

Classification Predicates and Intent
------------------------------------
Classifications are used as predicates for equations, but classifications
_themselves_ have predicates in the form of _matches_.  Consider, for
example, a classification that may be used in an assertion to prevent
negative premium from being generated:

  <t:assert failure="premBuilding must not be negative for any index">
    <t:match-gte on="premBuilding" value="#0" />
  </t:assert>

Simple enough—the system will fail if the premium for a given building is
below $0.

But what happens if premBuilding is calculated like so?

  <rate-each class="property" yields="premBuildingTotal"
             generates="premBuilding" index="k">
    <c:product>
      <c:value-of name="propValue" index="k" />
      <c:value-of name="propRate" index="k" />
    </c:product>
  </rate-each>

Alas, if `property` is false for any index, then we know that `propValue` is
NaN, and NaN * x = NaN, and so `premBuilding` is NaN.

The above assertion will compile the match into the first-order sentence

  ∀x∈b(x ≥ 0).

Unfortunately, NaN is not greater than, less than, equal to, or in any other
relation to 0, and so _this assertion will trigger_.  This causes
practical problems with the `_premium_` template, which has an
`@allow-zero@` argument to permit zero premium.

Consider this real-world case that I found (variables renamed), to avoid a
strawman:

  <t:premium class="loc" round="cent"
             yields="locInitialTotal"
             generates="locInitial" index="k"
             allow-zero="true"
             desc="...">
    <c:value-of name="premAdditional" />

    <c:quotient>
      <c:value-of name="premLoc" index="k" />
      <c:value-of name="premTotal" />
    </c:quotient>
  </t:premium>

This appears to be responsible for splitting up `premAdditional` relative to
the total premium contribution of each location.  It explicitly states that
it wants to permit a zero value.  The intent of this block is clear: a value
of 0 is explicitly permitted and _expected_.

But if `premTotal` is for whatever reason 0—whether it be due to a test
case or some unexpected input—then it'll yield a NaN and make the entire
expression NaN.  Or if `premAdditional` or `premLoc` are tainted by a NaN,
the same result will occur.  The assertion will trigger.  And, indeed, this
is what I'm seeing with test cases against the new classification system.

What about Infinity?  Is it intuitive that, should `propValue` in the
previous example be positive and `propRate` be 0, we would, rather than
producing a very small value, produce an infinitely large one?  Does that
match intuition?  Remember, this system is a domain-specific language for
_our_ purposes—it is not intended to be used to model infinities.

For example, say we had this submission because the premium exceeds our
authority to write with some carrier:

  <t:submit reason="Premium exceeds authority">
    <t:match-gt name="premBuilding" value="#100k" />
  </t:submit>

If we had

  (100,000 / 0) = ∞,

then this submit reason would trigger.  Surely that was not intended, since
we have `property` as a predicate and `propRate` with the same predicate,
implying that the answer we _actually_ want is 0!  In that case, what we
_probably_ want to trigger is something like

  <rate yields="premFinal">
    <t:maxreduce>
      <c:value-of name="premBuildingTotal" />
      <c:value-of name="#500" />
    </t:maxreduce>
  </rate>,

in order to apply a minimum premium of $500.  But if `premBuildingTotal` is
Infinity, then you won't get that—you'll get Infinity, which is of course
nonsense.

And nevermind -Infinity.

Why Wasn't This a Problem Before?
---------------------------------
So why bring this up now?  Why have we survived a decade without this?

We haven't, really—these bugs have been hidden.  But the old classification
system covered them up; predicates would implicitly treat missing values as
0 by enclosing them in `(x||0)` in the compiled code.  Observe this
ECMAScript code:

  NaN || 0 = 0.

Consequently, the old classification system absorbed bad values and treated
them implicitly as 0.  But that was a bug, and had to be removed; it meant
that missing indexes in classifications would trigger predicates that were
not intended to be triggered, if they matched against 0, or matched against
a value less than some number larger than zero.  (See
`core/test/core/class` for examples.)

The new classification system does not perform such defaulting.  _But it
also does not expect to receive values outside of its valid domain._
Consequently, _NaN and Infinity lead to undefined behavior_, and the
current implementation causes the predicate to match (NaN < 0) and therefore
fail.

The reason for this is that this implementation is intended to
convey precisely the computation necessary for the classification system, as
formally defined, so that it can be later optimized even further.  Checking
for values outside the domain not only should not be necessary, but it would
prevent such future optimizations.

Furthermore, parameters used to compile into (param||0), to account for
missing values or empty strings.  This changed somewhat recently with
5a816a4701, which pre-cast all inputs and
allowed relaxing many of those casts since they were both wasteful and no
longer necessary.

Given that, for all practical purposes, 0/0=0 in the system until less than
a year ago.

Infinity, of course, is a different story, since (Infinity||0)=Infinity;
this one has always been a problem.

Let's Just Fail
---------------
Okay, so we cannot have a valid expression, so let's just fail.

We could mean that in two different ways:

  1. Fail at runtime if we divide by 0; or
  2. Fail at compile-time if we _could_ divide by 0.

Both of these have their own challenges.

Let's dismiss #2 right off the bat for now, because until we have TAMER,
that's not really feasible.  We need something today.  We will discuss that
in the future.

For #1—we cannot just throw an error and halt computation, because if the
`canterm` flag passed into the system is `false`, then _computation must
proceed and return all results_.  Terminating classifications are checked
after returning rather than throwing errors.

Since we have to proceed with computation, then the computations have to be
valid, and so we're left with the same problem again—we cannot have
undefined behavior.

One could argue that, okay, we have undefined behavior, but we're going to
fail because of the assertion anyway!  That's potentially defensible, but it
is at the moment undesirable, because we get so many failures.  And,
relative to the section below, it's not clear to me what benefit we get from
that behavior other than making things more difficult for ourselves.

Furthermore, such an assertion would have to be defined for every
calculation that performs a quotient, and would have to set some
intermediate flag in the calculation which would then have to be checked for
after-the-fact.  This muddies the generated calculation, which causes
problems for optimizations, because it requires peering into state of the
calculation that may be hidden or optimized away.

If we decide that calculations must be valid because we cannot fail, and we
have to stick with the domain of calculations, then `x/0` must be
_something_ within that domain.

x/0=0 Makes Sense With the Current System
-----------------------------------------
Let's take a step back.  Consider a developer who is unaware that
NaN/Infinity are permitted in the system—they just know that division by
zero is a bad thing to do because that's what they learned, and they want to
avoid it in their code.

Consider that they started with this:

  <rate-each class="property" generates="propValue" index="k">
    <c:quotient>
      <c:value-of name="buildingTiv" index="k" />
      <c:value-of name="tivPropDivisor" index="k" />
    </c:quotient>
  </rate-each>

They have inspected the output of `tivPropDivisor` and see that it is
sometimes 0.  They understand that `property` is a predicate for the
calculation, and so reasonably think that they could do something like this:

  <classify as="nonzero-tiv-prop-divisor" ...>
    <t:match-ne on="tivPropDivisor" value="#0" />
  </classify>

and then change the rate-each to

  <rate-each class="property nonzero-tiv-prop-divisor" ...>.

Except that, of course, we know that will have no effect, because a NaN is a
NaN.  This is not intuitive.

So they'd have to do this:

  <rate-each class="property" generates="propValue" index="k">
    <c:cases>
      <c:case>
        <t:when-ne name="tivPropDivisor" value="#0" />

        <c:quotient>
          <c:value-of name="buildingTiv" index="k" />
          <c:value-of name="tivPropDivisor" index="k" />
        </c:quotient>
      </c:case>

      <c:otherwise>
        <c:value-of name="#0" />
      </c:otherwise>
    </c:cases>
  </rate-each>.

But for what purpose?  What have we gained over simply having x/0=0, which
does this for you?

The reason why this is so unintuitive is because 0 is the default case in
every other part of the system.  If something doesn't match a predicate, the
value becomes 0.  If a value at an index is not defined, it is implicitly
zero.  A non-matching predicate is 0.

This is exploited for reducing values using summation.  So the behavior of
the system with regards to 0 is always on the mind of the developer.  If we
add it in another spot, they would think nothing of it.

It would be nice if it acted as an identity in a monoidal operation,
e.g. as 0 for sums but as 1 for products, but that's not how the system
works at all today.  And indeed such a thing could be introduced using a
special template in place of `c:value-of` that copies the predicates of the
referenced value and does the right thing.

The _danger_, of course, is that this is _not_ how the system has worked, and
so changing the behavior has the risk of breaking something that has relied
on undefined behavior for so long.  This is indeed a risk, but I take some
confidence in (a) the fact that all the test cases for our system pass
despite a significant number of x/0=0 being triggered due to limited inputs,
and (b) the fact that these situations are _not correct today_, resulting in
`null` in serialized result data because
`JSON.stringify([NaN, Infinity]) === "[null,null]"`.

Given all of that, predictable incorrect behavior is better than undefined
behavior.

So x/0=0 Isn't Bad?
-------------------
No, and it's mathematically sound.  This decision isn't unprecedented—
Coq, Lean, Agda, and other theorem provers define x/0=0.  APL originally
defined x/0=1, but later switched to 0.  Other languages do their own thing
depending on what is right for their particular situation.

Division is normally derived from

  a × a⁻¹ = 1, a ≠ 0.

We're simply not using that definition—when we say "quotient", or use the
`/` symbol, we mean a _different_ function (`div`, in the compiled JS),
where we have an _additional_ axiom that

  a / 0 = 0.

And, similarly,

  0⁻¹ = 0.

So we've taken a _normally undefined_ case and given it a definition.  No
inconsistency arises.

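A minimal sketch of those semantics (in Rust purely for illustration; the
actual definition lives in the compiled ECMAScript `div`):

  // Total division under the additional axiom a/0 = 0.
  fn div(a: f64, b: f64) -> f64 {
      if b == 0.0 {
          0.0 // the added axiom: any quotient with a zero denominator is 0
      } else {
          a / b
      }
  }

  fn main() {
      assert_eq!(div(100_000.0, 1_000.0), 100.0);
      assert_eq!(div(100_000.0, 0.0), 0.0); // no Infinity
      assert_eq!(div(0.0, 0.0), 0.0);       // no NaN
  }
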
In fact, this makes _sense_ to do, because _this is what we want_.  The
alternative, as mentioned above, is a lot of boilerplate—checking for 0 any
time we want to do division.  Complicating the compiler to check for those
cases.  And so on.  It's easier to simply state that, in TAME, quotients
have this extra convenient feature whereby you don't have to worry about
your denominator being zero because it'll act as though you enclosed it in a
case statement, and because of that, all your code continues to operate in
an intuitive way.

I really recommend reading this blog post regarding the Lean theorem prover:

  https://xenaproject.wordpress.com/2020/07/05/division-by-zero-in-type-theory-a-faq/
2022-02-28 16:27:51 -05:00
Mike Gerwitz 9fa79ce5ea TAME_PARAMS: New Makefile var
This is intended to be set via the configure script, and is being added
primarily for the upcoming flag to enable the legacy classification
system.  This is only used for the XSLT-based compiler.
2022-02-28 12:35:17 -05:00
Mike Gerwitz ce0da76ccf Improve symbol table processing time
preproc:symtable-process-symbols is run on each pass (e.g. during initial
processing and after each template expansion) to introduce new symbols into
the symbol table from imports and newly discovered symbols.

This processing was previously optimized a bit using maps to reduce the cost
of symbol table lookups, but the processing was still inefficient, relying
on XSLT1-style processing (as originally written) for deduplication.  This
now uses `for-each-group` and `perform-sort` to offload the expensive
computation onto Saxon, which is much more efficient.

Symbol table processing has long been a culprit, but I hadn't attempted to
optimize further in recent months because of TAMER work.  Since TAMER has
been on pause for a few months with other things needing my attention, I
needed to provide a short-term performance improvement to keep up with
increasing build times.

DEV-11716
2022-02-22 22:05:07 -05:00
Mike Gerwitz 1796753940 core/vector: Remove aggregate package
Like core/numeric, this was to maintain BC and has not been used for many
years (it does not even build).
2022-01-28 12:01:18 -05:00
Mike Gerwitz a300842582 core/build.xml: Remove
This is no longer necessary (and probably never was).  I assume that this was
added when I was trying to get core to build independently.
2022-01-28 12:00:26 -05:00
Mike Gerwitz 40e2472fac core/numeric: Remove aggregate package
This package was originally added long ago when it was split into
multiple.  It is no longer used.
2022-01-28 11:56:06 -05:00
Mike Gerwitz cd13b80f31 build-aux/check-coupling: Prohibit supplier imports of UI packages
The reverse was checked, but apparently a check for suppliers importing the
UI was never added.
2022-01-28 10:50:27 -05:00
Mike Gerwitz 2a84e44a58 bin/tame: Fix runner output line clearing
The output was being omitted under certain conditions, meaning that users
would have to look in the runlogs for errors.
2022-01-28 09:21:34 -05:00
Mike Gerwitz 8b255c2251 tame: tamed --help: Add missing closing quote to awk example 2022-01-26 13:51:34 -05:00
Mike Gerwitz 8fbddfb3b3 tamed: Fix --help and add another reporting example
$2 was not escaped and would fail expansion.  I apparently did not run
--help before committing.  Shame on me.
2022-01-20 23:32:28 -05:00
Mike Gerwitz 6fd570477a tamed: Add runtab and TAMED_RUNTAB_OUT
This provides logging that can be used to analyze jobs.  See `tamed --help`
for some examples.  More to come.

You'll notice that one of the examples represents package build time in
_minutes_.  This is why TAMER is necessary; as of the time of writing, the
longest-building package is nearly five and a half minutes, and there are a
number of packages that take a minute or more.  But, there are potentially
other optimizations that can be done.  And this is _after_ many rounds of
optimizations over the years.  (TAME was not originally built for what it is
currently being used for.)
2022-01-19 16:47:12 -05:00
Mike Gerwitz 4a3b86f480 tamed: Ignore SIGUSR2
This was originally going to tell tamed to redraw the runner status line,
but a different approach was taken.
2022-01-19 15:41:28 -05:00
Mike Gerwitz c72d908a3f tamed: Add missing --report to help
Missing from previous commit.
2022-01-19 13:29:23 -05:00
Mike Gerwitz 756dcd7894 tamed --report and runner status line (TAMED_TUI)
This is something that I've wanted to do for quite some time, but for good
reason, have been avoiding.

`tamed --report` is fairly basic right now, but allows you to see what each
of the runners are doing.  This will be expanded further to gather data for
further analysis.

The thing that I was avoiding was a status line during the build to
summarize what the runners are doing, since it's nearly impossible to do so
from the build output with multiple runners.  This will not only allow me to
debug more easily, but will keep the output plainly visible to developers at
all times in the hope that it can help them improve the build times
themselves in certain cases.

It is currently gated behind TAMED_TUI, since, while it works well overall,
it is imperfect, and will cause artifacts from build output partly
overwriting the status line, and may even occasionally clobber the PS1 by
erasing the line.  This will be improved upon in the future; something is
better than nothing.
2022-01-19 11:51:48 -05:00
Mike Gerwitz 4c5b860195 tamer: Remove Ix generic from ASG
This is simply not worth it; the size is not going to be the bottleneck (at
least any time soon) and the generic not only pollutes all the things that
will use ASG in the near future, but is also incompatible with the SymbolId
default that is used everywhere; if we have to force it to 32 bits anyway,
then we may as well just default it right off the bat.

I thought that this seemed like a good idea at the time, and saving bits is
certainly tempting, but it was premature.
2022-01-14 10:21:49 -05:00
Mike Gerwitz 5af698d15c tamer: xir::{tree::=>}parse: Move module
It's a bit odd that I've done next to nothing with TAMER for the past week
or so, and decided to do this one small thing before I go on break for the
holidays, but I felt compelled to do _something_.  Besides, this gets me in
a better spot for the inevitable mental planning and writing I'll be doing
over the holidays.

This move was natural, given what this has evolved into---it has nothing to
do with the concept of a "tree", and the module's imports emphasized that
fact given the level of inappropriate nesting.
2021-12-23 13:17:18 -05:00
Mike Gerwitz 8221e3a011 tamer: xir::tree::Stack: Refactor transitions
Now that the parser has been simplified by removing attributes, we can
further simplify the state transitions to make it more clear what further
refactoring can be done.

DEV-11339
2021-12-17 11:40:30 -05:00
Mike Gerwitz d5a2d43526 tamer: xir::tree::attr::parse::AttrParse{r=>}State
Simply correcting a naming inconsistency between the trait and the concrete
type.

DEV-11339 / DEV-11268
2021-12-17 10:22:29 -05:00
Mike Gerwitz 0cc0bc9d5a tamer: xir::Token::AttrEnd: Remove
More information can be found in the prior commit message, but I'll
summarize here.

This token was introduced to create an LL(0) parser---no tokens of
lookahead.  This allowed the underlying TokenStream to be freely passed to
the next system that needed it.

Since then, Parser and ParseState were introduced, along with
ParseStatus::Dead, which introduces the concept of lookahead for a single
token---an LL(1) grammar.

I had always suspected that this would happen, given the awkwardness of
AttrEnd; it was just a matter of time before the right abstraction
manifested itself to handle lookahead.

DEV-11339
2021-12-17 10:14:31 -05:00
Mike Gerwitz 61f7a12975 tamer: xir::tree: Integrate AttrParserState into Stack
Note that AttrParse{r=>}State needs renaming, and Stack will get a better
name down the line too.  This commit message is accurate, but confusing.

This performs the long-awaited task of trying to observe, concretely, how to
combine two automata.  This has the effect of stitching together the state
machines, such that the union of the two is equivalent to the original
monolith.

The next step will be to abstract this away.

There are some important things to note here.  First, this introduces a new
"dead" state concept, where here a dead state is defined as an _accepting_
state that has no state transitions for the given input token.  This is more
strict than a dead state as defined in, for example, the Dragon Book, where
backtracking may occur.

The reason I chose for a Dead state to be accepting is simple: it represents
a lookahead situation.  It says, "I don't know what this token is, but I've
done my job, so it may be useful in a parent context".  The "I've done my
job" part is only applicable in an accepting state.

If the parser is _not_ in an accepting state, then an unknown token is
simply an error; we should _not_ try to backtrack or anything of the sort,
because we want only a single token of lookahead.

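In rough terms (illustrative types only), a dead transition hands the
unrecognized token back so that a parent parser can use it as its single
token of lookahead:

  enum Status<S, T, O> {
      Incomplete(S), // transitioned; nothing to yield yet
      Object(S, O),  // transitioned and yielded an object
      // Accepting state with no transition for this token: the parser is
      // done, and the token is returned unconsumed as lookahead for the
      // parent context.
      Dead(S, T),
  }
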
The reason this was done is because it's otherwise difficult to compose the
two parsers without requiring that AttrEnd exist in every XIR stream; this
has always been an awkward delimiter that was introduced to make the parser
LL(0), but I tried to compromise by saying that it was optional.  Of course,
I knew that decision caused awkward inconsistencies; I had just hoped that
those inconsistencies wouldn't manifest in practical issues.

Well, now it did, and the benefits of AttrEnd that we had in the previous
construction do not exist in this one.  Consequently, it makes more sense to
simply go from LL(0) to LL(1), which makes AttrEnd unnecessary, and a future
commit will remove it entirely.

All of this information will be documented, but I want to get further in
the implementation first to make sure I don't change course again and
therefore waste my time on docs.

DEV-11268
2021-12-16 09:44:02 -05:00
Mike Gerwitz 0c7f04e092 tamer: xir::tree: Simplify Stack and remove isolated attr remnants
These were missed from a couple of commits ago, after I recalled that I
could now simplify the Stack variants; they were made more complicated due
to isolated attribute parsing.

These progressive refactorings do a good job illustrating why composing
parsers is better than a monolith---the complexity of the parsers is
significantly reduced, and the number of combinations of states are also
greatly reduced, which allows us to reason about them in isolation.

DEV-11268
2021-12-14 12:49:06 -05:00
Mike Gerwitz 0061a13d63 tree: xir::tree::Object: Remove now-unneeded enum
This was added only for isolated attribute parsing.  Of course, this does
mean that a new union type will be needed when combining the two parsers,
depending on the desired resolution, but that'll come at a later time and
possibly in a more general way.

DEV-11268
2021-12-14 12:44:32 -05:00
Mike Gerwitz c7f846752d tamer: xir::tree: Remove now-unused isolated attribute parsing
This is handled by the new AttrState, so this is largely just removing
now-duplicate code.

DEV-11268
2021-12-14 12:42:02 -05:00
Mike Gerwitz 69acba3ec0 tamer: xir::tree: Use parse::Parser for parse
All tree module parsing functions now make use of parse::Parser.

This module will eventually be hoisted from tree.

DEV-11268
2021-12-14 12:36:35 -05:00
Mike Gerwitz b30d7dc84e tamer: xir::tree::parser_from: Use parse::Parser
This nearly completely integrates the new Parser with xir::tree, but does
not yet compose AttrParseState.  I also need to determine what to do with
`parse()` and, further, make `parser_from` generic as part of mod parse.

If we take a moment to reflect on all of the changes, this struggle has been
a roundabout way of converting tree's parser into parse::Parser; providing
a trait for Stack (as ParseState); beginning parser decomposition; and
moving some common logic into Parser.  The composition of parsers is the
final piece to be realized.

This could have been a lot less work if I really understood exactly what I
wanted to do up front, but as was mentioned in previous commits, I was
really confusing myself trying to maintain API BC in ways that I should not
have for XmloReader.  More on that will be coming soon as well.

DEV-11268
2021-12-13 16:57:04 -05:00
Mike Gerwitz 6e9d139373 tamer: xir::tree::parse::Parser: Remove lifetime
This will allow Parser to operate on both owned and &mut values, and is the
same approach that Rust's built-in iterators take.

This is at first quite surprising, and I often forget that this is a
feature, and, as a bonus, an attractive way to avoid lifetimes in struct
definitions when generics are used for the type that may become a
reference.

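This is the same mechanism that makes both of the calls below work for plain
iterators (shown here without Parser at all):

  // std provides `impl<I: Iterator + ?Sized> Iterator for &mut I`, so a
  // generic bound accepts either an owned iterator or a &mut borrow of one.
  fn count_toks(toks: impl Iterator<Item = char>) -> usize {
      toks.count()
  }

  fn main() {
      let mut stream = "abc".chars();

      let n = count_toks(&mut stream); // borrow; the stream remains ours
      let rest = count_toks(stream);   // then hand over ownership

      assert_eq!((n, rest), (3, 0));
  }
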
DEV-11268
2021-12-13 16:51:15 -05:00
Mike Gerwitz f09900b80c tamer: xir::tree: Remove isolated AttrList parsing
This isn't currently used by anything, and this is collecting, which does
not fit well with the streaming model.  AttrList was originally written for
Element parsing, and the isolated attr parser was written for test cases,
before it was fully decided how this system ought to work.

Instead, if AttrList is in fact needed, we can either collect (ideally not)
or implement Extend for AttrList.  (Or create TryExtend.)

DEV-11268
2021-12-13 16:20:50 -05:00
Mike Gerwitz 29fdf5428c tamer: xir::tree: {Parse=>Stack}Error
Prepare to adopt parse::ParseError, which will contain StackError.

DEV-11268
2021-12-13 15:27:20 -05:00
Mike Gerwitz faed32af7e tamer: xir::tree::ParserState: Remove and expose Stack directly
This removes the layer of encapsulation that was hiding Stack, which is the
actual parser.  The new layer of encapsulation is parse::Parser, which will
be introduced here soon.  Baby steps, so it's clear how this evolves.

DEV-11268
2021-12-13 15:02:08 -05:00
Mike Gerwitz 24e9b94b37 tamer: xir::tree::Parsed: Remove in favor of xir::tree::parse::Parsed
These were the same thing after the previous commit.  This moves toward
tree::Stack becoming a ParseState.

DEV-11268
2021-12-13 14:29:16 -05:00
Mike Gerwitz 48517502d9 tamer: xir::tree::Parsed: Mirror xir::tree::parse::Parsed
I think it's obvious where the next commit is going---replace
xir::tree::Parsed.

DEV-11268
2021-12-13 14:19:12 -05:00
Mike Gerwitz c6d6f44bcb tamer: xir::tree::parse: ParseStatus and Parsed
The old Parsed was renamed to ParseStatus to be used by Parser, and Parser
converts it into Parsed, which has the same variants as it did before except
for the Done variant, since it's not possible for Parser to yield it.

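Roughly, in sketch form (simplified; the real types carry more than this):

  // Yielded by ParseState and used internally by Parser.
  enum ParseStatus<O> {
      Incomplete,
      Object(O),
      Done, // only Parser ever sees this
  }

  // What Parser exposes through its Iterator impl.
  enum Parsed<O> {
      Incomplete,
      Object(O),
  }

  // Parser's conversion: Done halts iteration and never escapes.
  fn to_item<O>(status: ParseStatus<O>) -> Option<Parsed<O>> {
      match status {
          ParseStatus::Incomplete => Some(Parsed::Incomplete),
          ParseStatus::Object(obj) => Some(Parsed::Object(obj)),
          ParseStatus::Done => None,
      }
  }
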
DEV-11268
2021-12-10 16:51:53 -05:00
Mike Gerwitz 9facc26b4f tamer: xir::tree::parse: Use new Parsed::Done variant over None
This removes Option from ParseState, as mentioned in previous commits.

This is ideal because it not only removes a layer of abstraction, but also
makes the intent very clear; the use of None was too tied to the concept of
an Iterator, which is the concern of Parser, _not_ ParseState.

This is now similar to tree::Parsed, which will help with that refactoring
shortly.

The Done variant is not accessible outside of Parser, since it always
converts it to None (to halt iteration); given that, we should have another
public-facing type, as was also mentioned in a previous commit.

DEV-11268
2021-12-10 16:22:02 -05:00
Mike Gerwitz 38363da9ff tamer: xir::tree: {TokenStream=>ParseState}
This also renames related types.

See previous commits for more information.  In essence, this trait
represents the reification of all parser state.  The omission of "r" in the
name ParseState is intentional, since it indicates the state of a current
parse.  We'll see whether that naming ends up being too confusing; it's easy
enough to change.

DEV-11268
2021-12-10 15:42:01 -05:00
Mike Gerwitz 8eddf2f5ef tamer: xir::tree::parse: Remove TokenStreamParser trait
This just leaves Parser, which is what I started with, but I wasn't sure how
far I was going to take this.  I went against my usual judgment in creating
a trait that I may not need, in an attempt to try to reason about the API
that I wanted, because it wasn't yet clear at the time whether the Parser
ought to be generic.

Since then (as detailed in the last commit), this has become more of a
coordinator/mediator, and the real parser is actually TokenStreamState,
which will be renamed shortly.

DEV-11268
2021-12-10 14:58:44 -05:00
Mike Gerwitz bfe46be5bb tamer: xir::tree::attr_parser_from: Integrate AttrParser
This begins to integrate the isolated AttrParser.  The next step will be
integrating it into the larger XIRT parser.

There's been considerable delay in getting this committed, because I went
through quite the struggle with myself trying to determine what balance I
want to strike between Rust's type system; convenience with parser
combinators; iterators; and various other abstractions.  I ended up being
confounded by trying to maintain the current XmloReader abstraction, which
is fundamentally incompatible with the way the new parsing system
works (streaming iterators that do not collect or perform heap
allocations).

There'll be more information on this to come, but there are certain things
that will be changing.

There are a couple problems highlighted by this commit (not in code, but
conceptually):

  1. Introducing Option here for the TokenParserState doesn't feel right, in
     the sense that the abstraction is inappropriate.  We should perhaps
     introduce a new variant Parsed::Done or something to indicate intent,
     rather than leaving the reader to have to read about what None actually
     means.
  2. This turns Parsed into more of a statement influencing control
     flow/logic, and so should be encapsulated, with an external equivalent
     of Parsed that omits variants that ought to remain encapsulated.
  3. The name TokenStreamState is accurate, but these really are the actual
     parsers;
     TokenStreamParser is more of a coordinator, and helps to abstract away
     some of the common logic so lower-level parsers do not have to worry
     about it.  But calling it TokenStreamState is both a bit
     confusing and is an understatement---it _does_ hold the state, but it
     also holds the current parsing stack in its variants.

Another thing that is not yet entirely clear is whether this AttrParser
ought to care about detection of duplicate attributes, or if that should be
done in a separate parser, perhaps even at the XIR level.  The same can be
said for checking for balanced tags.  By pushing it to TokenStream in XIR,
we would get a guaranteed check regardless of what parsers are used, which
is attractive because it reduces the (almost certain-to-otherwise-occur)
risk that individual parsers will not sufficiently check for semantically
valid XML.  But it does _potentially_ make error recovery more
complicated.  But at the same time, perhaps more specific parsers ought not
care about recovery at that level.

Anyway, point being, more to come, but I am disappointed by how much time I'm
spending considering parsing, given that there are so many things I need to
move on to.  I just want this done right and in a way that feels like it's
working well with Rust while it's all in working memory, otherwise it's
going to be a significant effort to get back into.

DEV-11268
2021-12-10 14:25:08 -05:00
Mike Gerwitz 0e08cf3efe tamer: xir::tree::parse: EOF span
This stores the last seen Span and uses that when reporting EOF, so that the
user will be able to be notified of where exactly the problem occurred.

When I get into creating combinators, it'll be the responsibility of those
combinators to ensure that any None return value will be supplemented by its
own last span.

DEV-11268
2021-12-06 15:34:29 -05:00
Mike Gerwitz 325c3167ee tamer: xir::Token::span: New method
This permits retrieving a Span from any Token variant.  To support this,
rather than having this return an Option, Token::AttrEnd was augmented with
a Span; this results in a much simpler and friendlier API.

DEV-11268
2021-12-06 14:48:55 -05:00
Mike Gerwitz 77c18d0615 tamer: xir: Remove Attr::Extensible
This removes XIRT support for attribute fragments.  The reason is that
this is a write-only operation---fragments are used to concatenate
SymbolIds without reallocation, which can only happen if we are generating
XIR internally.

Given that this cannot happen during read, it was a mistake to complicate
the parsers.  But it makes sense why I did originally, given that the XIRT
parser was written for simplifying test cases.  But now that we want parsers
for real, and are writing production-quality parsers, this extra complexity
is very undesirable.

As a bonus, we also avoid any potential for heap allocations related to
attributes.  Granted, they didn't _really_ exist to begin with, but it was
part of XIRT, and was ugly.

DEV-11268
2021-12-06 14:26:58 -05:00
Mike Gerwitz 42b5007402 tamer: xir:tree: Begin work on composable XIRT parser
The XIRT parser was initially written for test cases, so that unit tests
could assert more easily on generated token streams (XIR).  While it was
planned, it wasn't clear what the eventual needs would be, which were
expected to differ.  Indeed, loading everything into a generic tree
representation in memory is not appropriate---we should prefer streaming and
avoiding heap allocations when they’re not necessary, and we should parse
into an IR rather than a generic format, which ensures that the data follow
a proper grammar and are semantically valid.

When parsing attributes in an isolated context became necessary for the
aforementioned task, the state machine of the XIRT parser was modified to
accommodate.  The opposite approach should have been taken---instead of
adding complexity and special cases to the parser, and from a complex parser
extracting a simple one (an attribute parser), we should be composing the
larger (full XIRT) parser from smaller ones (e.g. attribute, child
elements).

A combinator, when used in a functional sense, refers not to combinatory
logic but to the composition of more complex systems from smaller ones.  The
changes made as part of this commit begin to work toward combinators, though
it's not necessarily evident yet (to you, the reader) how that'll work,
since the code for it hasn't yet been written; this commit is simply
getting my work thus far introduced so I can do some light refactoring before
continuing on it.

TAMER does not aim to introduce a parser combinator framework in its usual
sense---it favors, instead, striking a proper balance with Rust’s type
system that permits the convenience of combinators only in situations where
they are needed, to avoid having to write new parser
boilerplate.  Specifically:

  1. Rust’s type system should be used as combinators, so that parsers are
     automatically constructed from the type definition.

  2. Primitive parsers are written as explicit automata, not as primitive
     combinators.

  3. Parsing should directly produce IRs as a lowering operation below XIRT,
     rather than producing XIRT itself.  That is, target IRs should consume
     XIRT and parse themselves immediately, during streaming.

In the future, if more combinators are needed, they will be added; maybe
this will eventually evolve into a more generic parser combinator framework
for TAME, but that is certainly a waste of time right now.  And, to be
honest, I’m hoping that won’t be necessary.
2021-12-06 11:27:39 -05:00
Mike Gerwitz fd1b1527d6 tamer: Remove tests invoking cargo and associated libs
There are a number of reasons for this, where the benefits do not make up
for the losses.

First: this is actually invoking cargo.  Not only is this not necessary, but
it's not desirable: cargo by default hits the network and does all sorts of
other stuff, when all we want to do is invoke the executable.  So the tests
aren't really testing the right thing in that sense.  See the previous
commit for more information.

The way it invokes cargo is different than the way the Makefile invokes
cargo, so on my system, it's actually invoking a _different cargo_!  This is
causing problems, in particular with lock files, which causes my tests to
fail.

Importantly, this also removes a _lot_ of dependencies, which removes a lot
of supply chain risk and a lot of code to audit.  This provides
significant security benefits, especially given that what was being tested
was rather small, and could be done in a shell script.

TAMER will receive significant system testing later on.  But for now, none
of this was worth it.

Further audits of dependencies will come later on.  I've always been fairly
insistent on keeping the dependency graph small and auditable, but recent
supply chain attacks have given me a better way to rationalize the security
risk.  Further, I'm the only one on this project right now.
2021-12-02 12:38:06 -05:00
Mike Gerwitz 87c457ba41 tamer: cargo --frozen --offline
Cargo's default behavior is unfortunately to issue network calls each time
it is invoked in order to check for dependency updates.  This is not only
bad for reproducibility and privacy, but it's also a concern for supply
chain attacks, since most developers are unaware that this is occurring.

Instead, we pin to the lockfile.  Installing dependencies can be done with
`cargo fetch` and updating dependencies must be explicitly done by the
developer, with the lockfile updated.
2021-12-02 11:49:51 -05:00
Mike Gerwitz 54531e2284 tamer: xir::tree::attr: Display impls 2021-11-23 13:05:10 -05:00
Mike Gerwitz ba7ebad930 tamer: obj::xmlo::reader::test: {DUMMY_SPAN=>DS} for brevity
There's a lot of boilerplate that can be reduced in general, but I _really_
want to focus on getting this thing done; I can clean up later.
2021-11-22 11:16:43 -05:00
Mike Gerwitz ba4c32383f tamer: obj::xmlo::reader: Parse root package node attributes
Well, parse to the extent that it was being parsed before, anyway.

The core of this change demonstrates how well TAMER's abstractions work
together.  (As long as you have an e.g. LSP to help you make sense of all of
the inference, I suppose.)

  Token::Open(QN_LV_PACKAGE | QN_PACKAGE, _) => {
      return Ok(XmloEvent::Package(
          attr_parser_from(&mut self.reader)
              .try_collect_ok()??,
      ));
  }

This finally makes use of `attr_parser_from` and `try_collect_ok`.  All of
the types are inferred---from the iterator transformations, to the error
conversions, to the destination PackageAttrs type.

DEV-10863
2021-11-18 00:59:10 -05:00
Mike Gerwitz d421112f35 tamer: xir::tree::ParserState::store_or_emit: Properly emit Parsed::Done
This was forgotten when the attribute parser was introduced, and led to the
parser continuing to the token following AttrEnd, which properly caused a
failure given that the parser was in the Done state.

There is a future task I have in my backlog to properly address the Done
state, but this is sufficient for now.
2021-11-17 00:13:07 -05:00
Mike Gerwitz e0811589fa tamer: xir::tree::attr::value_atom: Doc typo fix 2021-11-16 15:48:59 -05:00
Mike Gerwitz 7367e20c01 tamer: obj::xmlo: Extract error types into own module 2021-11-16 15:47:52 -05:00
Mike Gerwitz f519dab2b6 tamer: xir::tree::attr::Attr::value_atom: Option<SymbolId>=>SymbolId
To maintain a proper abstraction, this cannot be the responsibility of the
caller; most callers should not know that fragments exist, let alone how to
handle them.
2021-11-16 12:41:03 -05:00
Mike Gerwitz c9be1d613d tamer: iter::collect::TryCollect::try_collect_ok: Doc fix
This was copied from another docblock and I messed it up.
2021-11-16 12:26:05 -05:00
Mike Gerwitz 5233822322 tamer: xir: Remove Text enum
Like previous commits, this replaces the explicit escaping context with the
convention that all values retrieved from `xir` are unescaped on read and
escaped on write.

Comments are a notable TODO, since we must escape only `--`.

CData is also an issue.  I had _expected_ to use it as a means to avoid
unescaping fragments, but I had forgotten that quick_xml hard-codes escaping
on read, so that it can re-use BytesStart!  That is terribly unfortunate,
and may result in us having to re-implement our own read method in the
future to avoid this nonsense.  So I'm just leaving it as a TODO for now.

DEV-11081
2021-11-15 23:47:14 -05:00
Mike Gerwitz 8723ca154d tamer: xir::escape::CachingEscaper: Use new sym::st::ST_COUNT
This adds a constant `ST_COUNT` representing the number of statically
allocated symbols, and uses that to estimate an initial capacity for the
`CachingEscaper`.

This is just a guess (and is certainly too low), but we can adjust later on
after profiling, if it ever comes up.
2021-11-15 21:46:57 -05:00
Mike Gerwitz d710437ee4 tamer: xir::escape::CachingEscaper: New Escaper
As promised, this will cache previously seen escaped/unescaped values by
creating a two-way mapping between them.

DEV-11081
2021-11-15 16:44:24 -05:00
Mike Gerwitz 27ba03b59b tamer: xir::escape: Remove XirString in favor of Escaper
This rewrites a good portion of the previous commit.

Rather than explicitly storing whether a given string has been escaped, we
can instead assume that all SymbolIds leaving or entering XIR are unescaped,
because there is no reason for any other part of the system to deal with
such details of XML documents.

Given that, we need only unescape on read and escape on write.  This is
customary, so why didn't I do that to begin with?

The previous commit outlines the reason, mainly being an optimization for
the echo writer that is upcoming.  However, this solution will end up being
better---it's not implemented yet, but we can have a caching layer, such
that the Escaper records a mapping between escaped and unescaped SymbolIds
to avoid work the next time around.  If we share the Escaper between _all_
readers and the writer, the result is that

  1. Duplicate strings between source files and object files (many of which
     are read by both the linker and compiler) avoid re-unescaping; and
  2. Writers can use this cache to avoid re-escaping when we've already seen
     the escaped variant of the string during read.

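A minimal sketch of that two-way cache (plain strings here rather than
SymbolIds, and a stand-in escape routine, purely for illustration):

  use std::collections::HashMap;

  #[derive(Default)]
  struct CachingEscaper {
      to_escaped: HashMap<String, String>,
      to_unescaped: HashMap<String, String>,
  }

  impl CachingEscaper {
      fn escape(&mut self, unescaped: &str) -> String {
          if let Some(esc) = self.to_escaped.get(unescaped) {
              return esc.clone(); // already seen: no re-escaping needed
          }
          let esc = unescaped.replace('&', "&amp;").replace('<', "&lt;");
          // Record both directions so a later unescape of the same value
          // is also free.
          self.to_escaped.insert(unescaped.to_owned(), esc.clone());
          self.to_unescaped.insert(esc.clone(), unescaped.to_owned());
          esc
      }

      fn unescape(&mut self, escaped: &str) -> String {
          // Cache hit avoids re-unescaping strings shared between readers.
          self.to_unescaped.get(escaped).cloned().unwrap_or_else(|| {
              escaped.replace("&lt;", "<").replace("&amp;", "&")
          })
      }
  }
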
The alternative would be a global cache, like the internment system, but I
did not find that to be appropriate here, since this is far less
fundamental and is much easier to compose.

DEV-11081
2021-11-12 14:03:23 -05:00
Mike Gerwitz b1c0783c75 tamer: xir::XirString: WIP implementation (likely going away)
I'm not fond of this implementation, which is why it's not fully
completed.  I wanted to commit this for future reference, and take the
opportunity to explain why I don't like it.

First: this task started as an idea to implement a third variant to
AttrValue and friends that indicates that a value is fixed, in the sense of
a fixed-point function: escaped or unescaped, its value is the same.  This
would allow us to skip wasteful escape/unescape operations.

In doing so, it became obvious that there's no need to leak this information
through the API, and indeed, no part of the system should care.  When we
read XML, it should be unescaped, and when we write, it should be
escaped.  The reason that this didn't quite happen to begin with was an
optimization: I'll be creating an echo writer in place of the current
filesystem-based copy in tamec shortly, and this would allow streaming XIR
directly from the reader to the writer without any unescaping or
re-escaping.

When we unescape, we know the value that it came from, so we could simply
store both symbols---they're 32-bit, so it results in a nicely compressed
64-bit value, so it's essentially cost-free, as long as we accept the
expense of internment.  This is `XirString`.  Then, when we want to escape
or unescape, we first check to see whether a symbol already exists and, if
so, use it.

While this works well for echoing streams, it won't work all that well in
practice: the unescaped SymbolId will be taken and the XirString discarded,
since nothing after XIR should be coupled with it.  Then, when we later
construct a XIR stream for writing, XirString will no longer be available
and our previously known escape is lost, so the writer will have to
re-escape.

Further, if we look at XirString's generic for the XirStringEscaper---it
uses phantom, which hints that maybe it's not in the best place.  Indeed,
I've already acknowledged that only a reader unescapes and only a writer
escapes, and that the rest of the system works with normal (unescaped)
values, so only readers and writers should be part of this process.  I also
already acknowledged that XirString would be lost and only the unescaped
SymbolId would be used.

So what's the point of XirString, then, if it won't be a useful optimization
beyond the temporary echo writer?

Instead, we can take the XirStringWriter and implement two caches on that:
mapping SymbolId from escaped->unescaped and vice-versa.  These can be
simple vectors; since SymbolId is a 32-bit value, we will not have much
wasted space for symbols that never get read or written.  We could even
optimize for preinterned symbols using markers, though I'll probably not do
so, and I'll explain why later.

If we do _that_, we get even _better_ optimizations through caching that
_will_ apply in the general case (so, not just for echo), and we're able to
ditch XirString entirely and simply use a SymbolId.  This makes for a much
more friendly API that isn't leaking implementation details, though it
_does_ put an onus on the caller to pass the encoder to both the reader and
the writer, _if_ it wants to take advantage of a cache.  But that burden is
not significant (and is, again, optional if we don't want it).

So, that'll be the next step.
2021-11-10 12:22:10 -05:00
Mike Gerwitz c57aa7fb53 tamer: iter::TryCollect::try_collect_ok: New method
This is intended to alleviate what will be some common boilerplate because
of the Rust compiler error described therein.

This will evolve over time, I'm sure.

DEV-10863
2021-11-10 09:09:07 -05:00
Mike Gerwitz 3140279f04 tamer: iter::trip::TrippableIterator: New trait
This provides convenience methods atop of the already-existing
functions.  These are a bit more ergonomic since they (a) remove a variable
and its generics and (b) are conveniently suggested via LSP (with
e.g. rust-analyzer) if the iterator is of the right type, even if the trait
is not yet imported.  This should help with discoverability as well.
2021-11-05 16:55:46 -04:00
Mike Gerwitz 90e3e94c0a tamer: iter::{TryCollect, TryFromIter}: New traits
These traits augment Rust's built-in traits to handle failure scenarios,
which will allow us to encapsulate lowering logic into discrete,
self-parsing units that enforce e.g. schemas (the example alludes to my
intentions).
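
As a rough sketch of the general shape (not the actual definitions, nor
the example referenced above), a failure-aware analogue of FromIterator
might look like:

  // A fallible counterpart to FromIterator, so a collection can enforce
  // a schema-like invariant while it is being built.
  trait TryFromIter<T>: Sized {
      type Error;

      fn try_from_iter<I: IntoIterator<Item = T>>(
          iter: I,
      ) -> Result<Self, Self::Error>;
  }

  // Example: a list that refuses to be empty.
  struct NonEmptyList<T>(Vec<T>);

  impl<T> TryFromIter<T> for NonEmptyList<T> {
      type Error = &'static str;

      fn try_from_iter<I: IntoIterator<Item = T>>(
          iter: I,
      ) -> Result<Self, Self::Error> {
          let items: Vec<T> = iter.into_iter().collect();

          if items.is_empty() {
              Err("expected at least one item")
          } else {
              Ok(NonEmptyList(items))
          }
      }
  }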
2021-11-05 16:33:16 -04:00
Mike Gerwitz 1f01833d30 tamer: xir::tree::attr_parser_from: Do not take ownership over iter
The previous implementation took ownership over the provided iterator, which
was an oversight, considering that this is intended to be used in contexts
where doing so is not possible.  It is a good example of how isolated test
cases don't necessarily paint the correct picture.

`scan` takes owned values, so this instead uses the same parsing method as
`parse_attrs`, but using a `FromFn` iterator to avoid having to create a
whole new iterator type.  This will work well so long as we don't need to
store the type returned by this (while also wanting to avoid boxing).
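
For illustration, the general pattern of wrapping a mutable borrow of the
token stream in `from_fn` looks like the following (made-up integer tokens
stand in for XIR tokens here):

  use std::iter;

  // Build a sub-parser over a borrowed stream without taking ownership,
  // so the caller can keep consuming the same iterator afterwards.
  fn attrs_from<'a, I: Iterator<Item = u32>>(
      toks: &'a mut I,
  ) -> impl Iterator<Item = u32> + 'a {
      iter::from_fn(move || match toks.next() {
          Some(0) | None => None, // 0 stands in for an end-of-attrs token
          Some(tok) => Some(tok),
      })
  }

  fn main() {
      let mut toks = vec![10u32, 20, 0, 30].into_iter();

      let attrs: Vec<_> = attrs_from(&mut toks).collect();
      assert_eq!(attrs, vec![10, 20]);

      // The caller still owns the stream and can continue with it.
      assert_eq!(toks.next(), Some(30));
  }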

DEV-11062
2021-11-05 10:54:05 -04:00
Mike Gerwitz 428d508be4 tamer: {ir::=>}{asg, xir}
See the previous commit.  There is no sense in some common "IR" namespace,
since those IRs should live close to the system whose data they
represent.

In the case of these, they are general IRs that can apply to many different
parts of the system.  If that proves to be a false statement, they'll be
moved.

DEV-10863
2021-11-04 16:13:27 -04:00
Mike Gerwitz 5a91db6d54 tamer: obj::xmlo::{legacy=>}ir
Calling it "legacyir" is just confusing.  The original hope, when beginning
TAMER, was that I'd be able to use a new object format in the near future to
help speed up the compilation process.  But that's far from our list of
priorities now, and so seeing "legacy" all over the place is really
confusing considering that it implies that perhaps it shouldn't be used for
new code.

This helps to clear up that cognitive dissonance by remaining neutral on the
topic.  And the reality is that it won't be "legacy" for some time.

DEV-10863
2021-11-04 13:23:38 -04:00
Mike Gerwitz cee6402f8b tamer: Move {ir::legacyir=>obj::xmlo::legacyir}
The IRs really ought to live where they are owned, especially given that
"IR" is so generic that it makes no sense for there to be a single location
for them; they're just data structures coupled with different phases of
compilation.

This will be renamed next commit; see that for details.

This also removes some documentation describing the lowering process,
because it's undergone a number of changes and needs to be accurately
re-summarized in another location.  That will come at a later time after the
work is further along so that I don't have to keep spending the time
rewriting it.

DEV-10863
2021-11-04 13:20:38 -04:00
Mike Gerwitz d06f31b4d3 tamer: obj::xmlo: Compile quickxml even with flag off
This was previously gated behind the negation of the wip-xmlo-xir-reader flag,
which meant that it was not being compiled or picked up by LSP.  Both of
those things are inconvenient and not ideal.

DEV-10863
2021-11-04 12:35:08 -04:00
Mike Gerwitz e494f3fdfd tamer: ir::xir::tree::attr_parser_from: New parser iterator
This allows for the lazy parsing of attributes, and makes the necessary
changes to the parser to be able to do so safely without getting into a bad
context.

When XIRT was originally conceived, this concept existed somewhat, but it
was done in a way that would allow the parser to accept invalid input.  This
avoids that problem.

This also introduces the concept of "Done", primarily because we had to for
the AttrEnd token.  This will evolve in following commit(s), which will
allow carrying out the important check of ensuring that the parser has ended
parsing in a valid accepting state (in terms of a state machine).

DEV-11062
2021-11-04 11:04:42 -04:00
Mike Gerwitz 3ba478b09b tamer: ir::xir::tree::ParseError::AttrNameExpected: Display typo fix
We do not want to put backticks around a token display.
2021-11-03 15:07:52 -04:00
Mike Gerwitz adc939d779 tamer: ir::xir::Token: Implement Display
This also modifies xir::tree errors to use Display instead of Debug when
rendering error output.

DEV-10863
2021-11-03 14:54:37 -04:00
Mike Gerwitz c7eb50b636 tamer: xir::xir::tree::parse_attrs: Isolated attribute parsing
This produces an `AttrList` independent from a containing
`Element`.  Upcoming changes may further permit the parser to yield smaller
components that are not part of an aggregate.

DEV-10863
2021-11-03 14:39:03 -04:00
Mike Gerwitz 54e1877d20 tamer: ir::xir::tree: Isolate AttrList parsing
This maintains existing functionality but prepares for an isolated context
for AttrList parsing.

DEV-10863
2021-11-02 14:07:20 -04:00
Mike Gerwitz 6eed728756 tamer: ir::xir::tree: Explicitly list unhandled tokens for exhaustiveness
This allows Rust to carry out its exhaustiveness check for when we add new
tokens.  It further ensures that we understand what we missed, or chose not
to handle.

DEV-10863
2021-11-02 14:07:05 -04:00
Mike Gerwitz edf9a75575 tamer: ir::xir::{QName, Prefix, LocalName}: Implement Display
These will be shown in error messages and need user-friendly
representations.

DEV-10863
2021-11-02 13:55:33 -04:00
Mike Gerwitz d045786cfb tamer: ir::xir::tree::Element::attrs: Wrap in Option
This allows AttrList not only to be lazily initialized (which is less of a
problem at the moment with Vec, but may become one in the future), but also
leaves a space open for attributes to be added _after_ having been
parsed.  It further leaves room to _take_ attributes from their `Element`.

This is important because the next commit will re-introduce the ability to
parse attributes independently, allowing us to put the parser in a state
where we can parse AttrList without an Element context.  To re-use that
parsing under an Element context, we can simply attach an AttrList after it
has been parsed.

Option adds no additional size cost to Vec, so we get this for free (except
for the tiny change that initializes the attribute list when we try to push
to it).
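
That claim is easy to verify, since `Vec`'s pointer is never null and so
provides a niche for `Option` to store its `None` case:

  use std::mem::size_of;

  fn main() {
      // No extra tag byte is needed for the Option wrapper.
      assert_eq!(size_of::<Option<Vec<u32>>>(), size_of::<Vec<u32>>());
  }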

I also think this reads better ("attrs: None").  Though it makes the API
slightly more of a pain to work with.

DEV-10863
2021-10-29 16:34:05 -04:00
Mike Gerwitz a9fd1c7557 tamer: Use TokenStream trait alias where applicable
Simple replacement to improve readability.
2021-10-29 14:39:40 -04:00
Mike Gerwitz 7e6cb2c948 tamer: ir::xir::Token::AttrEnd: New token type
The purpose of this token is to implement a lazy streaming attribute
collection operation without a token of lookahead, which would complicate
parsing or require that a TokenStream provide a `peek` method.

This is only required for readers to produce, since readers will be feeding
data to parsers.  I have the writer ignoring it.  If you're looking back at
this commit, the question is whether this was a bad idea: it introduces
inconsistencies into the token stream depending on the context, which can be
confusing and error-prone.

The intent is to have the parser throw an explicit error if the new token is
missing in the context in which it is required, which will safely handle the
issue, but does defer it to runtime.  But only readers need auditing, and
there's only one XIR reader at the moment.

DEV-10863
2021-10-29 13:06:27 -04:00
Mike Gerwitz 18ab032ba0 tamer: Begin XIR-based xmlo reader impl
There isn't a whole lot here, but there is additional work needed in various
places to support upcoming changes, and so I want to get this committed to
ease the cognitive burden of what I have thus far.  And to stop stashing.  We
have a feature flag for a reason.

DEV-10863
2021-10-28 21:21:30 -04:00
Mike Gerwitz ba3b576c93 tamer: ir::xir::qname_const_inner: Fully qualified QName paths
This macro was previously using the path of wherever the template expanded
into, which I found to be unexpected considering that I thought the macros
were hygienic and the names bound to the environment in which they were
defined.

In any case, this solves the problem in all cases.

DEV-10863
2021-10-28 21:19:11 -04:00
Mike Gerwitz f0f58a6e16 tamer: obj::xmlo::asg_builder: Remove example for now
Just until the new xmlo reader is ready, since it will be changing slightly
and fails to compile with the feature flag on now.

DEV-10863
2021-10-28 21:17:53 -04:00
Mike Gerwitz e9871541a8 tamer: benches/iter.rs: Basic benchmark
This was forgotten in the previous commit and exists simply to ensure that
the TripIter doesn't add any significant overhead.  The tests are
a handful of nanoseconds apart, on my machine.
2021-10-28 21:17:41 -04:00
Mike Gerwitz f6c5a224c8 tamer: iter::trip: Introduce initial TripIter concept
See the documentation in this commit for more information.

This is pretty significant, in that it's been a long-standing question for
me how I'd like to join together `Result` iterators without having
unnecessarily complex APIs, and also allow for error recovery.  This solves
both of those problems.

It should be noted, however, that this does not yet explicitly implement
error recovery, beyond being able to observe the failure as the result of
the provided callback function.  Proper recovery will be implemented once
there's a use-case.
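
The spirit of the idea, sketched here without TripIter's actual API, is an
adapter that hands the Ok values to a callback and trips on the first Err:

  // Run `f` over the Ok values of a Result iterator, stopping
  // ("tripping") at the first Err and reporting it to the caller.
  fn with_ok_values<T, E, R>(
      results: impl IntoIterator<Item = Result<T, E>>,
      f: impl FnOnce(&mut dyn Iterator<Item = T>) -> R,
  ) -> Result<R, E> {
      let mut err = None;

      let out = {
          let mut ok_iter = results.into_iter().map_while(|item| match item {
              Ok(value) => Some(value),
              Err(e) => {
                  err = Some(e); // trip: record the failure and stop
                  None
              }
          });

          f(&mut ok_iter)
      };

      match err {
          Some(e) => Err(e),
          None => Ok(out),
      }
  }

  fn main() {
      let toks = vec![Ok(1), Ok(2), Err("bad input"), Ok(3)];

      // The callback sees only 1 and 2; the error surfaces afterwards.
      assert_eq!(with_ok_values(toks, |it| it.sum::<i32>()), Err("bad input"));
  }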

DEV-11006
2021-10-28 14:50:41 -04:00
Mike Gerwitz 18cadb9c7d tamer: obj::xmlo::reader: Better organize flagged code
This moves the Iterator impl and From<B> back into `quickxml`.  The type of
the new reader is different, taking an iterator instead of a BufRead.  This
will allow us to easily mock for unit tests, without the clusterfuckery that
has ensued previously with quick-xml mocking.

DEV-10863
2021-10-25 13:47:26 -04:00
Mike Gerwitz c76fe87acd tamer: obj::xmlo::reader: Move Xmlo{Result,Error,Event}
These will need an API change, but are otherwise shared.  This means that
only the XmloReader is gated.
2021-10-25 12:26:25 -04:00
Mike Gerwitz f7d8aa1e4f tamer: wip-xml-xir-reader flag and setup
The original plan was to modify the existing reader to use the new
XmlXirReader, but that's going to be a lot of ongoing uncommitted work, with
both tests and implementation.  The better option seems to be to reimplement
it, since so many things are changing.

This flag will be short-lived and removed as soon as the implementation is
complete.

DEV-10863
2021-10-25 12:02:46 -04:00
Mike Gerwitz e6f53c20fd tamer: ir::xir::reader: Disable quick-xml check_end_names
XIR must support tag mismatches; XIRT will validate them.

This is currently disabled in the linker's xmlo reader as well.

DEV-10863
2021-10-25 10:58:19 -04:00
Mike Gerwitz d72ab3675c tamer: ir::xir::reader: Comment parsing
Comments re-use Text, but they are _not_ escaped, so we need to take care
with the type to ensure that, if the value were ever used with a
Token::Text, we don't end up injecting XML.
2021-10-21 22:04:45 -04:00
Mike Gerwitz fdb8e5998c tamer: ir::xir::reader: CData parsing
quick_xml provides us the value escaped, so we can just handle this the same
way as Text for now.

In the future, we may want to distinguish between the two so that we can
reconstruct an identical XML document, but at the moment CData isn't used at
all in TAME sources or outputs, and so I'm not going to worry about it for
now.

DEV-10863
2021-10-21 21:55:15 -04:00
Mike Gerwitz 8b212959c8 tamer: ir::xir::reader: Text and mixed content
It's nice being able to breeze through changes, since that's been a pretty
rare thing so far, given all the foundational work that has been needed.

This should get us pretty damn close to being able to parse the `xmlo` files
for the linker's reader, if we're not there already.

DEV-10863
2021-10-21 21:44:04 -04:00
Mike Gerwitz 13a779ec9c tamer: ir::xir::reader: Remove namespace TODO
This isn't XIR's responsibility, and so there's nothing to do here.
2021-10-21 16:52:58 -04:00
Mike Gerwitz 6d25be0ec7 tamer: ir::xir::reader: Refactor common element open parsing
As mentioned in the previous commit, this is just minor cleanup.
2021-10-21 16:51:47 -04:00
Mike Gerwitz e18aeeffac tamer: ir::xir::reader: Parsing of child nodes
This is quick-and-dirty; refactoring can be done later on.  This is also
intended to demonstrate the ease with which additional events can be
added---the hard work is done.
2021-10-21 16:32:19 -04:00
Mike Gerwitz 4c4d89f84f tamer: ir::xir::reader: Initial concept
This is an initial working concept for the reader which handles, so far,
just a single attribute.  But extending it to completion will not be all
that much more work.

This does not have namespace support---that will be added later as part of
XIRT, which is responsible for semantic analysis.  This allows XIR to stay
wonderfully simple, and won't have any impact on the writer (which expects
that QNames are unresolved and contain the namespace prefix to be written).
2021-10-21 16:23:11 -04:00
Mike Gerwitz fc3953e90e tamer: benches/sym.rs: Interner::intern_utf8 benchmarks
These were forgotten in the previous commit.
2021-10-19 13:42:26 -04:00
Mike Gerwitz b8d0da9095 tamer: sym::Interner::intern_utf8
This is the safe version of the existing intern_utf8_unchecked, and exists
as a performance optimization.

We're about to introduce a XIR reader, which is going to intern a _lot_ of
duplicate strings, since it will intern node and attribute names as
well.  Given that, we do not want to spend a lot of time performing UTF-8
checks that have already been performed.

We know that, if an intern is in the pool, it's either already UTF-8 or that
check was bypassed when it was initially interned.  Therefore, if we find an
existing symbol, that can be returned without having to perform any
check.  Otherwise, we intern as we usually would after attempting to convert
the byte slice into a string.
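
The shape of that check-before-validate flow, with a toy pool standing in
for the real interner, is roughly:

  use std::collections::HashMap;
  use std::str::Utf8Error;

  #[derive(Default)]
  struct Interner {
      map: HashMap<Vec<u8>, u32>,
      next: u32,
  }

  impl Interner {
      fn intern_utf8(&mut self, bytes: &[u8]) -> Result<u32, Utf8Error> {
          // Already interned?  Then it was validated (or knowingly
          // bypassed) the first time, so skip the UTF-8 check entirely.
          if let Some(&sym) = self.map.get(bytes) {
              return Ok(sym);
          }

          // Only never-before-seen byte strings pay for validation.
          std::str::from_utf8(bytes)?;

          let sym = self.next;
          self.next += 1;
          self.map.insert(bytes.to_vec(), sym);
          Ok(sym)
      }
  }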

This allows us to continue to have good performance for interning without
sacrificing safety for strings.
2021-10-19 12:56:57 -04:00
Mike Gerwitz 63e5a0d441 tamer: benches/sym.rs: Add additional UTF-8-related tests
The intent of this is to demonstrate how significant of an impact checking
byte arrays for UTF-8 validity will have, since the existing tests do not
make that clear (a static string in Rust is always valid UTF-8).

These benchmarks show that the cost when re-interning an already existing
value is +50%.

This is important, because the new reader will be interning a _lot_ of
duplicate strings, whereas the existing reader operates on byte arrays
without interning unless necessary.  And, when it does, it does so
unchecked.  But we'd rather not do that, since we cannot guarantee that
those XML files are valid (and not modified in some way).

Upcoming commits will have what I think is a reasonable compromise to this,
based on the fact that we'll be encountering _many_ duplicate strings in
parsing XML files.

DEV-10920
2021-10-18 21:35:32 -04:00
Mike Gerwitz 2715f3e845 tamer: sym: Expose raw SymbolId for static symbols
This provides a child `raw` module that exposes a SymbolId representing the
inner value of each of the static newtypes.  This is needed in situations
where the type must match and the type of the static symbol is not
important.

In particular, when comparing against runtime-allocated symbols in `match`
expressions.

It is also worth noting that this commit managed to hit a bug in Rustc that
was fixed on 10/1/2021.  We use nightly, and it doesn't seem that this
occurred in stable, from bug reports.

  - https://github.com/rust-lang/rust/issues/89393
  - 5ab1245303
  - Original issue: https://github.com/rust-lang/rust/issues/72476

The error was:

  compiler/rustc_mir_build/src/thir/pattern/deconstruct_pat.rs:1191:22:
  Unexpected type for `Single` constructor: <u32 as sym::symbol::SymbolIndexSize>::NonZero

  thread 'rustc' panicked at 'Box<dyn Any>', compiler/rustc_errors/src/lib.rs:1146:9

This occurred because we were trying to use `SymbolId` as the type, which
uses a projected type as its inner value: `SymbolId<Ix: SymbolIndexSize>(Ix::NonZero)`.
This was not a problem with the static newtypes because their inner type was
simply `SymbolId<Ix>`, which is not projected.

This is one of the risks of using nightly.

But, the point is: if you receive this error, upgrade your toolchain.
2021-10-18 10:53:53 -04:00
Mike Gerwitz 581b9d4e65 tamer: Use `..` for tuple unimportant variant matches
Tbh, I was unaware that this was supported by tuple variants until reading
over the Rustc source code for something.  (Which I had previously read, but
I must have missed it.)

This is more proper, in the sense that in a lot of cases we not only don't
care how many values a tuple has, but if we explicitly match on them using
`_`, then any time we modify the number of values, it would _break_ any code
doing so.  Using this method, we improve maintainability by not causing
breakages under those circumstances.

But, consequently, it's important that we use this only when we _really_
don't care and don't want to be notified by the compiler.

I did not use `..` as a prefix, even where supported, because the intent is
to append additional information to tuples.  Consequently, I also used `..`
in places where no additional fields currently exist, since they may in the
future (e.g. introducing `Span` for `IdentObject`).
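
A small example of the difference, using a made-up variant:

  enum IdentObject {
      // If a trailing field (say, a Span) is appended to this variant
      // later, matches written with `..` keep compiling unchanged.
      Ident(&'static str, u32),
  }

  fn name_of(obj: IdentObject) -> &'static str {
      match obj {
          // With `(name, _)` instead, adding a field would break this arm.
          IdentObject::Ident(name, ..) => name,
      }
  }

  fn main() {
      assert_eq!(name_of(IdentObject::Ident("foo", 1)), "foo");
  }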
2021-10-15 12:28:59 -04:00
Mike Gerwitz 739cf7e6eb tamer: ir::asg::object::IdentObject: Define methods from IdentObjectData
In particular, `name` needn't return an `Option`.  `fragment` also returns a
copy, since it's just a `SymbolId`.  (It really ought to be a newtype rather
than an alias, but we'll worry about that some other time.)

These changes allow us to remove some runtime panics.

DEV-10859
2021-10-14 14:38:02 -04:00
Mike Gerwitz f055cb77c2 tamer: ld::xmle: Narrow Sections types
This moves the logic that sorts identifiers into sections into Sections
itself, and introduces XmleSections to allow for mocking for testing.

This then allows us to narrow the types significantly, eliminating some
runtime checks.  The types can be narrowed further, but I'll be limiting the
work I'll be doing now; this'll be inevitably addressed as we use the ASG
for the compiler.

This also handles moving Sections tests, which was a TODO from the previous
commit.

DEV-10859
2021-10-14 12:40:13 -04:00
Mike Gerwitz ea11cf1416 tamer: ld::xmle::lower: Extract sectioning into Sections
This is the appropriate place to be, now that we've begun narrowing the
types.  We'll be able to do so further; this is just the first step.

This does not yet move the tests, but the code is still tested because it's
tightly coupled with `sort`.  Those will move in the next commit(s).

DEV-10859
2021-10-12 12:15:11 -04:00
Mike Gerwitz 08d92ca663 tamer: ld::xmle::sections: Remove generic object type
xmle sections will only ever contain an object of one type, so there is no
use in making this generic.

I think the original plan was to have this represent, generically, sections
of some object file (like ELF), but doing so would require a significant
redesign anyway, so it makes no sense.  This is easier to reason about.

DEV-10859
2021-10-12 10:35:14 -04:00
Mike Gerwitz 31144d0c9a tamer: benches/asg_lower.rs: Add missing file from previous commit
This was missed in the `lower` module move.
2021-10-12 10:30:35 -04:00
Mike Gerwitz 27480229df tamer: ld (Linking Process): Minor doc update to reflect changes
DEV-10859
2021-10-12 09:49:40 -04:00
Mike Gerwitz df328da71f tamer: ir::asg::SortableAsg: Move into ld::xmle::lower
This has always been a lowering operation, but it was not phrased in terms
of it, which made the process a bit more confusing to understand.

The implementation hasn't changed, but this is an incremental refactoring
and so exposes BaseAsg and its `graph` field temporarily.

DEV-10859
2021-10-12 09:49:33 -04:00
Mike Gerwitz 81ec65742a tamer: {ir::asg=>ld::xmle}::section
Sections, as written, are specific to xmle files.

I think the intent originally was to have this be more generic, but that
doesn't really make sense.

By explicitly coupling it with `xmle` files, that will allow us to turn this
into a proper lowering operation with its own validations that will allow
`xmle::xir` to do its job without having to validate anything itself.
2021-10-12 00:05:44 -04:00
Mike Gerwitz 1c181b568d tamer: ld::poc: Update comment reflecting current state
The linker is feature-complete, but this file has lived on because the
project was on pause for quite some time.
2021-10-11 23:54:24 -04:00
Mike Gerwitz f899ac898e tamer: {obj=>ld}::xmle
This is a linker-specific module.
2021-10-11 23:52:59 -04:00
Mike Gerwitz 5ea5cffd09 tamer: relroot String->SymbolId
This was [one of] the last remaining Strings; SymbolId should be used across
the board.
2021-10-11 16:00:19 -04:00
Mike Gerwitz 7873d46afb tamer: Replace all &'static str in errors with SymbolId
Now that SymbolId implements Display and resolves, this works out well.
2021-10-11 15:39:53 -04:00
Mike Gerwitz 7e9271e189 tamer: span: Primitive Display impl
This outputs enough information to be a little bit useful in the event of an
error.  In the future, we'll want to provide a (likely non-Display)
implementation that provides line number and source file context with
the problem characters indicated, like Rust.
2021-10-11 14:14:43 -04:00
Mike Gerwitz a9140730d9 tamer: sym: Implement Display for SymbolId
This is a significant departure from my original plans---this makes it
_easy_ to display symbol values, despite me not wanting that to occur unless
absolutely necessary.

The reality is, based on the design of the system, they will only occur in
these situations:

  1. Writing to files;
  2. Displaying errors;
  3. Tests; or
  4. People not following the design of the system.

The fourth one is the most risky as people begin to contribute in the
future, but the reality is that those can be fixed as they are encountered,
since if they're not showing up in a profiler, then they must not be causing
much of a problem.
2021-10-11 13:52:35 -04:00
Mike Gerwitz 85909f1590 tamer: sym::SymbolStr: Remove
This removes `SymbolStr` in favor of, simply, `&'static str`.

The abstraction provided no additional safety since the slice was trivially
extracted (and commonly, in practice), and was inconvenient to work with.

This is part of a process of relaxing lookups so that symbols can be
conveniently displayed in errors; rather than trying to prevent the
developer from doing something bad, we'll just rely on conventions, hope
that it doesn't happen, and if it does, address it either at that time or
when it shows up in the profiler.
2021-10-11 12:58:48 -04:00
Mike Gerwitz 68397f1413 tamer: ir::xir: Add missing docs for QName, Prefix, LocalName
The docs still need to be improved, but they can be touched as we go.

This concludes the initial development of XIR.  That was much more involved
than I had originally intended, but the result is good.

DEV-10561
2021-10-11 11:56:03 -04:00
Mike Gerwitz bc5091d2a7 tamer: ir::xir (newtype_symbol!): Remove for now
This does not belong here and was more of a POC at the time.  It can be
added later on when I have the time; I have to move on.
2021-10-11 11:51:51 -04:00
Mike Gerwitz f65ec818ab tamer: obj::xmle::xir: Correct doc typos Xml{e=>}Writer 2021-10-11 11:51:32 -04:00
Mike Gerwitz 3e385d1a1b tamer: obj::xmle::xir: Finalize docs
This could be improved upon, but there will be more work coming up for this
to finalize Sections.

DEV-10561
2021-10-11 11:43:49 -04:00
Mike Gerwitz bc5e8ebe75 tamer: obj::xmle::xir: Extract ElemWrap into ir::xir::iter
This generalizes it a bit and provides tests, which was always the intent;
the existing code was POC to determine if this could be done without
performance degradation (see that commit for more information).
2021-10-11 10:33:24 -04:00
Mike Gerwitz cde08b125c tamer: span (DUMMY_SPAN): New constant
Rather than having to use lazy_static! in all these tests, we can derive an
unlimited number of dummy spans from this one using e.g. `offset_add`.
2021-10-11 10:29:58 -04:00
Mike Gerwitz cf239531e0 tamer: span (offset_add): New method
More will come in the future, including the ability to add two spans.
2021-10-11 10:28:47 -04:00
Mike Gerwitz de3d7ef393 tamer: span: Introduce twospan
The intent is to support the composition and decomposition of spans such
that (A, B) is as documented here.  This only performs the trivial case for
the sake of providing a convenient API when the developer would otherwise
just type (S, S).
2021-10-11 09:56:48 -04:00
Mike Gerwitz 1a2f6bd209 tamer: obj::xmle::xir: Extract ElemWrap into ir::xir::iter 2021-10-11 09:34:17 -04:00
Mike Gerwitz de62a2acbc tamer: ir::asg::section: Reduce fields
This is intended to represent the sections written to the final xmle file,
and there was unnecessary complexity in separating everything.

By reducing this IR further, we can begin to constrain its types to
eliminate some of the runtime panics and error checking we have/had in the
writer.
2021-10-11 09:07:48 -04:00
Mike Gerwitz f70f5653b2 tamer: ir::asg::section: Head and tail can have only one object
This is the beginning of a refactoring to simplify this implementation a
little bit.
2021-10-09 00:27:03 -04:00
Mike Gerwitz 0626629cb3 tamer: Remove old xmle writer and wip-xir-xmle-writer flag
The new writer has reached parity with the old, with the exception of some
edge case explicit error handling that should never occur (which will be
added), and cleanup/docs.

Removing this flag now allows me to perform that cleanup without having to
worry about updating the now-old implementation.

I ran `tameld` with the new writer against our production system with
numerous programs and a significant number of test cases, and diff'd the old
and new xmle files, and everything looks good.
2021-10-08 22:04:42 -04:00
Mike Gerwitz 82727a5d66 tamer: obj::xmle::xir::header: Remove Rust 2018 comment
We're on 2021 now.
2021-10-08 21:43:28 -04:00
Mike Gerwitz d616d9475c tamer: obj::xmle::xir: Complete writer functionality
This is a significant milestone, in the sense that it is the culmination of
the past month or so of work to prove that an Iterator-based XIR will be
viable for the system.

This barely had any impact on the performance from the previous commit
reporting the profiling.  This performs at least as well as the quick-xml
based writer.  In isolated benchmarks, it performs better, but in the real
world, the linker spends most of its time reading xmlo files, and so minor
differences in writing do not have a significant overall impact.

With that said, a lot of cleanup and documentation is still needed.  That is
the subject of the upcoming commits, before this writer can be finalized.
2021-10-08 16:37:46 -04:00
Mike Gerwitz 929a6c9815 tamer: obj::xmle::xir::tree: Parse Text into Element
This simply adds support for Text nodes as a child of Element.  This
supports unit tests for the upcoming change for xmle fragments.
2021-10-08 16:16:33 -04:00
Mike Gerwitz f0f6f89745 tamer: Makefile.am (bench-build): New target, default for all
Build the benchmarks by default to catch breakages without having to incur
the cost of actually running them.
2021-10-08 09:27:56 -04:00
Mike Gerwitz 75d2ecf4dd tamer: obj::xmle::xir: Consideration of simplified iterators
The previous iterators had to be used in a certain order because they mixed
concerns, out of concern for performance.  This attempts to chain even more
iterators to see how it may perform.

To be clear: this will be cleaned up.  This was just an experiment.

Here were profiles on the average of 50 runs of linking our largest program:

  Baseline, pre-XIR (with fragments removed from output)               0.8082s
  XIR writer, pre-ElemWrap, no #[inline]                               0.7844s
  XIR writer, ElemWrap, no #[inline]                                   0.7918s
  XIR writer, ElemWrap, inlines in obj::xmle::xir                      0.7892s
  XIR writer, ElemWrap, inlines in obj::xmle::xir and ir::asg::section 0.7858s
  XIR writer, ElemWrap, inline in only ir::asg::section                0.781s
  Pre-ElemWrap, inlines in ir::asg::section                            0.7772s

These profiles are difficult, because they hit the filesystem so much.  I
write to /dev/null, but it reads 100s of xmlo files from disk.

It's clear that the impact is fairly modest and within a margin of error; as
such, I will continue down the path of writing code that's easier to grok
and maintain, since not doing so would be a micro-optimization relative to
the concerns of the rest of the system at this point.

But the purpose of all of this work was to determine whether an
iterator-based XIR would be viable.  It seems to be competitive.  I'll
finish up the writer reimplementation and move on.
2021-10-07 16:48:58 -04:00
Mike Gerwitz 2821098b40 .gitlab-ci.yml: Skip main build after stage build
Two reasons for this:

  1. It's unnecessary, since it's the same ref, so long as we actually build
     everything as part of the stage job; and
  2. In our environment, the token used doesn't have access to pull from the
     registry.

Fixing the latter item can be done at another time.
2021-10-07 15:55:22 -04:00
Mike Gerwitz f0ca3e88b7 .gitlab-ci.yml: Remove unnecessary SSH auth
We're using token-based auth now, using a project access token.
2021-10-07 13:25:55 -04:00
Mike Gerwitz 072a501ed5 .gitlab-ci.yml: Merge to main after successful stage pipeline 2021-10-07 13:09:40 -04:00
Mike Gerwitz 7f5064c665 tamer: obj::xmle::xir: Write l:map-from
This contains some awkward coupling for opening and closing tags to reduce
the complexity of the `Iterator` types that must be manually
specified.  That may be addressed shortly.
2021-10-05 16:13:47 -04:00
Austin Schaffer d54ef62a0d Fix import ordering 2021-10-04 17:15:02 -04:00
Mike Gerwitz 1a44e04333 tamer: ld: Write is unused outside of flag 2021-10-04 16:34:25 -04:00
Mike Gerwitz e2c9944f1b tamer: Move Sections map from unique from writer into Sections
We're implementing a new XIR-based writer and don't want to have to
duplicate this; it didn't really belong there to begin with.
2021-10-04 16:31:30 -04:00
Mike Gerwitz 004f5dc312 tamer: Read only a single map preproc:from from xmlo files
This was creating a heap-allocated `Vec` for each map symbol despite not
actually needing it.  We do have multiple froms for return map values.

But by the time we may want this type of thing, we'll have a different IR
for it anyway.
2021-10-04 14:59:33 -04:00
Mike Gerwitz 772619f6f0 tamer: Replace explicit array::IntoIter::new with IntoIter
Now that we're on 2021 Edition, the default behavior has changed to be
consistent.
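
Concretely:

  fn main() {
      // 2021 Edition: into_iter() on an array yields owned values, so no
      // explicit constructor is needed.
      let owned: Vec<u32> = [1u32, 2, 3].into_iter().collect();

      // On earlier editions the same call yielded references, and
      // `std::array::IntoIter::new([1, 2, 3])` was the by-value spelling.
      assert_eq!(owned, vec![1, 2, 3]);
  }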
2021-10-02 01:03:19 -04:00
Mike Gerwitz f9c9c95516 tamer: sym::prefill: Static symbol polymorphism
See the docs for a much deeper discussion.  In summary: traits do not
support static methods, and this is the workaround, which relies on unstable
nightly constant function features.

This implementation is tested using `qname_const!`, and will be utilized
with a new static type in a following commit.
2021-10-02 00:58:14 -04:00
Mike Gerwitz 9d87962e96 tamer: Use Rust 2021 Edition
This will be stable Oct 21; this uses nightly for now.
2021-10-02 00:58:14 -04:00
Mike Gerwitz 885d5e4d8f tamer: Switch back to nightly toolchain
This is to support two things:
  1. Early switch to 2021 Edition, which is stable Oct 21; and
  2. To make use of unstable const features.

The rationale is that switching to nightly does not really have any
significant downside for us, given that TAMER is used only by us and
the only risk is that unstable features may change a bit, which can be
mitigated with certain precautions.

The rationale for each unstable feature will be documented as they are used,
including documentation on what would be required to remove it and what
functionality would be lost / need to change in doing so.
2021-10-02 00:58:14 -04:00
Mike Gerwitz 7c61a92d30 tamer: obj::xmle::xir: Minor clean and docs
This is far from fully documented; it's just a start.  I'll document fully
once the implementation is done, to ensure I don't waste time documenting
things that may change.
2021-10-02 00:58:14 -04:00
Mike Gerwitz 42188e80e7 tamer: obj::xmle::xir::test: Extract into own file
These are getting large and messy.

And I now notice that I never completed the header test after
prototyping.  Shame on me.

Also, errata from the previous commit message: the diffs are identical
_except for attribute escaping_ that is unnecessary; we're outputting data
read directly from existing XML files (output by Saxon), so characters are
already escaped as needed.

DEV-10561
2021-10-02 00:58:13 -04:00
Mike Gerwitz 7269e68b00 tamer: obj::xmle::xir: Complete l:dep
The `l:dep` section of the `xmle` file, after formatting (since XIR writes
without newlines and indentation), is now identical to the existing xmle
writer.  I can now move on to the other sections.

Note that the attribute movement in this commit is simply to get the diff to
properly align.  Once the current xmle writer is removed, I'll organize them
a bit more sensibly.

`obj::xmle::xir` also needs documentation, now that it's shown to be viable.
2021-09-30 13:06:30 -04:00
Mike Gerwitz acf55fad81 tamer: Intern desc from xmle on read
The new xmle writer was having to intern before write, which did not make
sense.

This continues with consistently using symbols throughout the system, and
is a smaller size than `String` as a bonus.
2021-09-29 23:31:07 -04:00
Mike Gerwitz 5250571f15 tamer: ir::asg::ident: Use symbols in place of string slice mapping
`IdentKind` needs to be written to `xmle` files and displayed in error
messages.  String slices were used when quick-xml was used for writing,
which will be going away with the new writer.
2021-09-29 23:18:23 -04:00
Mike Gerwitz fa4181770f tamer: src::ir::asg::ident::Dim: Assert n<10
This replaces a TODO with an assertion.
2021-09-29 16:26:41 -04:00
Mike Gerwitz 6864fbc1cd tamer: Start of XIR-based xmle writer
This has been a long time coming, and has been repeatedly stashed as other
parts of the system have evolved to support it.  The introduction of the XIR
tree was to write tests for this (which are sloppy atm).

This currently writes out the `xmle` header and _most_ of the `l:dep`
section; it's missing the object-type-specific attributes.  There is,
relatively speaking, not much more work to do here.

The feature flag `wip-xir-xmle-writer` was introduced to toggle this system
in place of `XmleWriter`.  Initial benchmarks show that it will be
competitive with the quick-xml-based writer, but remember that is not the
goal: the purpose of this is to test XIR in a production system before we
continue to implement it for a frontend, and to refactor so that we do not
have multiple implementations writing XML files (once we echo the source XML
files).

I'm excited to get this done with so that I can move on.  This has been
rather exhausting.
2021-09-28 14:52:53 -04:00
Mike Gerwitz 863d990cbd tamer: sym: 16-bit static symbol prefill
The 16-bit interner at present will be used only for span contexts.  In the
future, this interner may become specialized specifically for that, but for
now let's just re-use what we already have so that I can move on.

DEV-10733
2021-09-28 10:39:46 -04:00
Mike Gerwitz 96b16c6de9 tamer: sym::prefill::test::global_sanity_check: Note duplicate strings
I want to make it clear in the assertion that the problem could be caused by
duplicate strings.  We do not sort by string, because in part we may in the
future want to group certain symbols together in some arbitrary way so we
can compare ranges (using the markers).

If that doesn't end up happening, it may be better to just sort by string
to obviate the problem.
2021-09-24 16:25:29 -04:00
Mike Gerwitz db8a098452 tamer: sym: Minor documentation refinement
Mostly rewording.
2021-09-24 10:11:19 -04:00
Mike Gerwitz c71d36b154 tamer: sym::prefill: All-caps constants for static symbols
It's really awkward not having them in caps: not only are constants
expected to be, but we also cannot maintain consistency between the
string and the identifier name in even the simplest of cases.

(We could use `r#`, but that's too cumbersome.)
2021-09-23 23:48:28 -04:00
Mike Gerwitz 785ca0fe9e tamer: sym::prefill: Remove StaticSymbolId in favor of refined types
`StaticSymbolId` was created before the more specific types, which render it
unnecessary.  If we need a generic type, it can be re-introduced, but using
`static_symbol_newtypes!`.
2021-09-23 23:35:45 -04:00
Mike Gerwitz 15ff00b3cf tamer: sym: Only prefill 32-bit global interner
This is the interner that is intended to be used with the majority of the
system; the 16-bit interner is left around for the moment, but will likely
later become specialized.
2021-09-23 16:11:17 -04:00
Mike Gerwitz e91aeef478 tamer: Remove Ix generalization throughout system
This had the writing on the wall all the same as the `'i` interner lifetime
that came before it.  It was too much of a maintenance burden trying to
accommodate both 16-bit and 32-bit symbols generically.

There is a situation where we do still want 16-bit symbols---the
`Span`.  Therefore, I have left generic support for symbol sizes, as well as
the different global interners, but `SymbolId` now defaults to 32-bit, as
does `Asg`.  Further, the size parameter has been removed from the rest of
the code, with the exception of `Span`.

This cleans things up quite a bit, and is much nicer to work with.  If we
want 16-bit symbols in the future for packing to increase CPU cache
performance, we can handle that situation then in that specific case; it's a
premature optimization that's not at all worth the effort here.
2021-09-23 14:52:54 -04:00
Mike Gerwitz ed245bb099 tamer: sym::prefill: Initial typed static symbol concept
We'll see how the syntax evolves over time.  It's not ideal to have to
specify the type, rather than having the compiler infer it, but I don't much
feel like getting into my first procedural macro right now, so we'll stick
with this approach for the time being.

This will set the stage to be able to safely e.g. create QNames statically
at compile-time and would allow us to make any attempts to bypass it
unsafe.
2021-09-23 00:37:39 -04:00
Mike Gerwitz b972b0b202 tamer: sym::StaticSymbolId: Introduce
Previously, we were allocating only u32 versions of `SymbolId` for the
statically allocated symbols.  This introduces a new symbol type with a very
small datatype (8 bits) that is able to cast into any `SymbolId`.  This is
explained in the docs.

We'll be taking this typing further in future commits so that static symbols
are better-suited for compile-time guarantees for static newtype
construction.

DEV-10710
2021-09-22 21:37:06 -04:00
Mike Gerwitz c87147c277 configure.ac: Bump Rust 1.{53=>54} for using macros in attribute values
The previous commit uses `concat!` for doc generation.  I forgot that this
was only recently stabalized.
2021-09-22 16:47:17 -04:00
Mike Gerwitz 366fef714b tamer: sym::prefill: Introduce static symbols
This is the beginning of static symbols, which are becoming increasingly
necessary as it's quite a pain to have to deal with interning static strings
any place they're used.

It's _more_ of a pain to do that in conjunction with newtypes (e.g. `QName`,
`AttValue`, etc) that make use of `SymbolId`; this will allow us to
construct _those_ statically as well, and additional work to support that
will be coming up.

DEV-10701
2021-09-22 16:08:40 -04:00
Mike Gerwitz e0a209d417 tamer: bench: xir: Reduce writer benchmark memory usage
These were using GiB of memory, which is ...unnecessary.

I reduced the iteration count significantly, but it was still wasting a lot
of time and memory and needed `with_capacity` to reduce the number of copies
after reallocation.

It is not typical that a buffer would contain this much information.
2021-09-21 16:21:32 -04:00
Mike Gerwitz aee781a6fb tamer: bench: xir: Fix broken benchmark
This broke when I removed `SelfClose`.  I used to run
`make all fmt check bench` before every push, but they take a while to run,
in part because it uses nightly and has to recompile too.

But it looks like I need to be more diligent again.
2021-09-21 16:09:50 -04:00
Mike Gerwitz b348892276 tamer: ir::xir::tree: Introduce attribute fragment parsing
This is exactly what I said I was _not_ going to do in the previous commit,
but apparently hacking late at night had me forget the whole reason that
XIRT is being introduced now---unit tests.  I'll be emitting a XIR stream
and I need to parse it for convenience in the tests.

So, here's a good start.  Next will be some generalizations that are useful
for the tests as well.  This is pretty bare, but accomplishes the task.

See docs for more info.
2021-09-21 16:07:38 -04:00
Mike Gerwitz a5afc76568 tamer: ir::xir::tree: Extract Attr{,List} into new module
The `tree` module is getting more difficult to navigate.  The tests still
remain where they were, since a bunch of concerns are mixed together.  Any
tests specific only to this module will be added here.
2021-09-21 10:43:23 -04:00
Mike Gerwitz fe7b64fe62 tamer: ir::xir::tree::AttrName: Remove unused, rename {Ele=>}AttrName
Attributes used to be able to be emitted standalone, but that was abandoned
a while back to clean things up a bit.  This cleanup was missed.
2021-09-21 09:29:56 -04:00
Mike Gerwitz c6a7988bc8 tamer: ir::xir: Add Token::AttrValueFragment with writer support
This is implemented only for the writer, since its use case is to be able to
concatenate strings without copying during writing.

It doesn't really make sense to support this in XIR Tree, since a reader
should never produce this.  But if we ever run into this (e.g. due to some
internal processing pipeline), we'll address it then; XIR Tree might have to
do copying, then, but should probably wait until encountering all fragments
before interning.  That'd be a distraction right now.
2021-09-21 00:16:30 -04:00
Mike Gerwitz e95afe2658 tamer: ir::xir::tree::Element::open: Fix doc typo 2021-09-21 00:16:30 -04:00
Mike Gerwitz 3bb6f0cf35 tamer: ir::asg::ident: AsRef impls for SymbolId types
This commit will make more sense once the broader context is committed, but
it's needed for lowering from `Sections` into a XIR stream.

This will also change once we pre-allocate symbols, like rustc, when the
interner is initialized.

This is my first use of the `paste` crate, which is used to generate
identifiers.  So this is partly an experiment, and it seems much better than
having to write a proc macro, at least at this point in time.  If this code
stays around, it'll probably be generalized further and used elsewhere, but
I'd prefer not to go this route long-term.
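
For anyone unfamiliar with `paste`, it splices identifier fragments
together inside macros; a minimal example unrelated to the actual impls
here:

  use paste::paste;

  // `[<...>]` within paste! concatenates its pieces into one identifier,
  // so this macro defines a function named `raw_uri`.
  macro_rules! define_raw_getter {
      ($name:ident, $value:expr) => {
          paste! {
              fn [<raw_ $name>]() -> &'static str {
                  $value
              }
          }
      };
  }

  define_raw_getter!(uri, "https://example.com");

  fn main() {
      assert_eq!(raw_uri(), "https://example.com");
  }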
2021-09-20 16:50:40 -04:00
Mike Gerwitz 12daddcc2d tamer: ir::xir::tree::Element: Open element constructor
This simply moves the construction into `Element`.
2021-09-16 10:52:00 -04:00
Mike Gerwitz ea50e1112a tamer: ir::xir::tree: Extract tests into own file
This file's getting large, and will only grow more complex.
2021-09-16 10:18:02 -04:00
Mike Gerwitz 3484336b1d tamer: ir::xir::tree::Stack: Encapsulate ElementStack manipulation
This moves some logic into `ElementStack` (which would be part of `Stack` if
variants were their own types), rather than peering so deeply into its
data.
2021-09-16 10:07:37 -04:00
Mike Gerwitz a49ac23aeb tamer: ir::xir::tree: Child element attribute parsing
This correctly retains and restores the parent stack after processing an
attribute for a child element.

This does increase the size of [`Stack`] a bit, but we can evaluate whether
it's too large at a later time.  It's currently 832 bits with `Ix=u32`,
which is large, but the question is whether it matters; we'll see as we
begin to use it.
2021-09-15 16:46:15 -04:00
Mike Gerwitz 61e493066c tamer: ir::xir::tree: Clean up parser implementation
This moves most of the parsing logic into `Stack`, which rightfully owns the
stack manipulation and state transitions.  `ParserState` becomes exactly
what it says it is---a management of the persistent state of the parser, and
is also responsible for digesting tokens and dispatching their data to the
proper event.

This approach has a number of benefits over the old design: it's
self-documenting, making the intent clear; and it is easier to reason about
the subset of states (for both humans and Rust) than a large match of
transitions.

This contains a number of TODO items that will be addressed shortly.  It
also made obvious that the previous commit was incomplete---it doesn't persist
`pstack` for attributes on child elements!  That'll be fixed too.
2021-09-15 16:33:08 -04:00
Mike Gerwitz 366ecca8ea tamer: ir::xir::tree: Initial child element parsing
This modifies the tree parser to handle child elements.  It's mostly
proof-of-concept code; the next commit will clean it up a bit so that it's
largely self-documenting.
2021-09-15 11:19:08 -04:00
Mike Gerwitz 51507ccdad tamer: ir::xir: Combine Token::{SelfClose, Close} variants
This removes `SelfClose` and merges it with `Close` by making the first
parameter an `Option`.  This isn't really ideal, but it really simplifies
pattern matching, especially for the next commit.  I'll have more details
there.

The primary motivation was lack of stabilization for binding after `@` in
matches, e.g. `Foo(name, ele) | ele @ Element { name, .. }`.  It looks like
it's ready, though; maybe next Rust release?

  https://github.com/rust-lang/rust/issues/65490

I don't know if I'll revert this change after then.  This seems plenty
clear, albeit more verbose.
2021-09-13 13:06:20 -04:00
Mike Gerwitz 1c40b9c504 tamer: ir::xir::tree: Closing element parsing with balance check
This introduces parser errors, but does not yet support error recovery; that
problem will be discussed in a commit in the near future, after the writer
is sorted out a bit more.

DEV-10561
2021-09-13 10:45:38 -04:00
Mike Gerwitz 5979e1fb90 tamer: ir::xir::tree: Correct italic formatting in docs
I was using an Org mode format.
2021-09-13 09:47:39 -04:00
Mike Gerwitz fd8a05164d tamer: ir::xir::tree: Remove Tree::Attr, add AttrList
The idea, previously, was that parsing could begin at attributes selectively
and be parsed independently.  But that's really awkward with `Tree`, since
it effectively allows orphan attributes as children of an
`Element`.  Nonsense.

Instead, if we truly only want an attribute list, we can offer a function to
create a parser with an empty `Stack::BuddingElement` that can accumulate
them.
2021-09-09 14:40:58 -04:00
Mike Gerwitz 4987bc39b0 tamer: ir::xir::tree::parser_from: Yield parsed trees
Previously, `parser_from` was a simple wrapper around `parse`; now, this
provides a more convenient API where `next` will yield the next parsed
object.

See docs for much more information and rationale.
2021-09-09 13:05:11 -04:00
Mike Gerwitz 1452a4186a tamer: convert: Add missing method-level docs 2021-09-08 16:12:53 -04:00
Mike Gerwitz 2586827d64 tamer: convert::{ExpectFrom, ExpectInto}: New traits
These traits are intended to eliminate boilerplate, primarily in tests, in
situations where from/into is not expected to fail.

Given that TAMER must only panic for internal compiler errors, this should
not often be used outside of test cases.  Further, there may be better
options in the future (e.g. QNames could be statically compiled rather than
trying to convert at runtime, in this case).
2021-09-08 16:03:44 -04:00
Mike Gerwitz 12bb88e4b5 tamer: ir::xir::tree: Introduce XIR tree
This begins to introduce the XIR tree.  I was originally going to wait on
this until after implementing the xmle writer in terms of XIR, but writing
unit tests is too much of a pain on the stream, so now is as good of a time
as any.

This has very limited support so far; it'll be added to as time goes on.
2021-09-08 13:56:04 -04:00
Mike Gerwitz ab093046e9 tamer: ir::asg::section: Provide iterators for major section groups
These groups happen to correspond with the sections of the xmle file, which
suggests again that this lives in the wrong place.  But I should really have
my focus elsewhere right now, so I don't know if I'll go any further right
now.  I guess we'll see as the writer is reimplemented.
2021-09-01 11:21:44 -04:00
Mike Gerwitz 1fa9614698 tamer: ir::asg::section: Improve iteration
`SectionsIter` was introduced to remove that responsibility from xmle
writer, since that's currently being reimplemented using XIR.

The existing iterator has been renamed SectionIter{ator=>} for a more
idiomatic name for iterator structs, and now has a static type rather than
relying on dynamic dispatch.  The author of that code wasn't sure how to
handle it otherwise.  (Which is understandable, since we were both still
getting acquainted with Rust.)  There's no notable change in performance in
my benchmarking.

This abstraction is a bit awkward, in that it's named for object file
sections, but they aren't.  Further, it's coupled with the ASG via
`SortableAsg` and perhaps should be generalized into a sorting routine that
takes a function for sorting, so that `Sections` can be moved into xmle's
packages.
2021-09-01 09:14:51 -04:00
Mike Gerwitz b80064f59e tamer: configure: Check for Rust 1.{52=>53}.
Or-pattern syntax is used; I had forgotten to bump this version.

For example, match on `Foo(Bar | Baz)` vs. `Foo(Bar) | Foo(Baz)`.
2021-08-30 15:19:14 -04:00
Mike Gerwitz 9331858c6d doc: Give @mdash macro an argument
This macro is used to consume whitespace so that the following sentence can
start on the next line without producing any whitespace in the output.  Its
argument is, therefore, whitespace.

This used to work in earlier versions of Texinfo, but around 6.{6,7} it
began failing because an argument was provided when it wasn't defined with
one.
2021-08-30 10:41:49 -04:00
Mike Gerwitz 0a8fb71c1b tamer: tameld: Use buffered writes
This was an oversight.  The difference is significant.  I had my suspicions
about this when I noticed the huge difference in time between writing to
/dev/null vs. an actual file during profiling.

On one of our systems, here's the number of syscalls _before_ this change:

  $ strace -c target/release/tameld --emit xmle -o foo foo.xmlo
  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   85.05    4.966192          16    318473           write
    7.23    0.421977          13     32298           lstat
    6.53    0.381424          15     25113           read
    0.75    0.043691          13      3350           readlink
    0.25    0.014713          61       241           close
    0.12    0.007167          30       241           openat
    0.05    0.003175         151        21           munmap
    0.01    0.000488          14        35           brk
    0.01    0.000292           9        33           mmap
    0.00    0.000266          38         7           mremap
    0.00    0.000004           1         3           sigaltstack
    0.00    0.000000           0         6           fstat
    0.00    0.000000           0         1           poll
    0.00    0.000000           0        11           mprotect
    0.00    0.000000           0         7           rt_sigaction
    0.00    0.000000           0         1           rt_sigprocmask
    0.00    0.000000           0         6         6 access
    0.00    0.000000           0         1           execve
    0.00    0.000000           0         1           arch_prctl
    0.00    0.000000           0         1           sched_getaffinity
    0.00    0.000000           0         1           set_tid_address
    0.00    0.000000           0         1           set_robust_list
    0.00    0.000000           0         2           prlimit64
  ------ ----------- ----------- --------- --------- ----------------
  100.00    5.839389                379854         6 total

And _after_:

  $ strace -c target/release/tameld --emit xmle -o foo foo.xmlo
  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   45.21    0.435010          13     32298           lstat
   40.09    0.385752          15     25113           read
    6.14    0.059113          21      2809           write
    4.75    0.045687          14      3350           readlink
    2.51    0.024115         100       241           close
    0.84    0.008045          33       241           openat
    0.26    0.002468         118        21           munmap
    0.06    0.000580          17        35           brk
    0.06    0.000566          17        33           mmap
    0.03    0.000279          40         7           mremap
    0.02    0.000181          16        11           mprotect
    0.01    0.000087          15         6         6 access
    0.01    0.000082          12         7           rt_sigaction
    0.01    0.000075          13         6           fstat
    0.00    0.000027           9         3           sigaltstack
    0.00    0.000024          12         2           prlimit64
    0.00    0.000018          18         1           execve
    0.00    0.000016          16         1           poll
    0.00    0.000013          13         1           sched_getaffinity
    0.00    0.000012          12         1           rt_sigprocmask
    0.00    0.000012          12         1           arch_prctl
    0.00    0.000012          12         1           set_robust_list
    0.00    0.000011          11         1           set_tid_address
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.962185                 64190         6 total

What a difference!

There are still a lot of other red flags in there; those can be addressed
separately.

This was originally written as I was learning Rust, and I suspect that I
didn't realize that File wasn't buffered at the time.
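
For reference, the change amounts to wrapping the handle (a sketch, not
the exact tameld code; the filename here is made up):

  use std::fs::File;
  use std::io::{BufWriter, Write};

  fn main() -> std::io::Result<()> {
      // Unbuffered: each write_all() below would be its own write syscall.
      //let mut out = File::create("out.xmle")?;

      // Buffered: writes are batched and flushed in large chunks.
      let mut out = BufWriter::new(File::create("out.xmle")?);

      for _ in 0..1000 {
          out.write_all(b"<some-node />")?;
      }

      // BufWriter flushes on drop, but flushing explicitly surfaces errors.
      out.flush()?;
      Ok(())
  }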

For the above link: times go from 1.23s pre-change to 0.85s after:

  0.77user 0.44system 0:01.23elapsed 99%CPU (0avgtext+0avgdata 48520maxresident)k
  0inputs+43952outputs (0major+12825minor)pagefaults 0swaps

  0.69user 0.15system 0:00.85elapsed 98%CPU (0avgtext+0avgdata 48396maxresident)k
  0inputs+43952outputs (0major+12823minor)pagefaults 0swaps
2021-08-20 12:14:42 -04:00
Mike Gerwitz c9a2ae533f tamer: xir (XmlWriter)[write_new]: Correct #[must_use] declaration
The return value has no meaningful side-effects at all; the write operation
failing isn't worth pointing out, since it has to be used regardless.

The normal `write` does have useful side-effects, of course.
2021-08-20 11:38:58 -04:00
Mike Gerwitz 59d578e669 tamer: xir (XmlWriter)[write_new]: New method
This change was primarily intended to clean up unit tests.  Since it
allocates and returns a new buffer, I do not expect this to have much use
within TAMER itself in the near future.  Maybe in later tooling.

If this is abused, person from the future: add `#[cfg(test)]` to its
definition.
2021-08-20 11:37:01 -04:00
Mike Gerwitz cd1eae95ca tamer: xir: {NodeStream=>Token}
I decided not to do this in a previous commit because I had documented
"NodeStream" elsewhere, so I'd like it to be in the Git history to
understand its evolution.

This never was a "Node" stream beyond the initial concept phase, because it
represents tokens that aren't themselves nodes.  It is intended to generate
XML nodes, but may need to accommodate non-nodes (e.g. XML declarations) in
the future.

The name originated from `Node`, which was a tree-based IR that was
initially conceived, but removed because it's not yet needed.  What we need
is a streaming IR for xmle writing, and then for reading and echoing back
out XML for the new frontend.
2021-08-20 10:30:27 -04:00
Mike Gerwitz a23bae5e4d tamer: XIR: Working concept
This is a working streaming IR for XML.  I want to get this committed before
I go further cleaning it up and integrating it into the xmle writer.

This is lacking detailed documentation, and the names of things may end up
changing.

Initial benchmarks do show that it has a ~2x performance improvement over
quick-xml when dealing with two attributes on a node, and I suspect that
improvement will increase with the number of attributes.  We will see how it
compares in real-world benchmarks once the linker has been modified to use
it.

The goal isn't to _avoid_ quick-xml---it'll be used in the future for things
like escaping that would be a huge waste to implement ourselves.  It just so
happened that quick-xml was not beneficial for these changes; indeed, its
own writer is fairly simple for the portions that were implemented here, so
there's no use in fighting with its API, particularly around attributes and
our need to explicitly control whitespace (with the intent of handling code
formatters in the future).

To put this into perspective: the reason this work is being done isn't to
refactor the linker, or to speed it up, but to generalize XML writing and
provide a suitable IR for use in the compiler.  The first step of the
frontend is to essentially echo the XML token stream back out so we can
incrementally parse it and do something useful, to incrementally rewrite the
compiler in Rust.
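XIR's actual types are not shown in this log; the following is only a rough
sketch, with invented names, of what a streaming XML token IR looks like,
as opposed to a tree that must be fully built before anything can be
written:

  // All names here are illustrative; this is not XIR's actual API.
  #[allow(dead_code)]
  #[derive(Debug)]
  enum XmlToken {
      Open(String),         // opening tag: `<name`
      Attr(String, String), // attribute: ` name="value"`
      SelfClose,            // `/>`
      Close(String),        // `</name>`
      Text(String),         // character data
  }

  fn main() {
      // `<preproc:sym name="foo" type="rate"/>` as a flat token stream;
      // a writer can emit each token as it arrives, with no tree built.
      let toks = vec![
          XmlToken::Open("preproc:sym".into()),
          XmlToken::Attr("name".into(), "foo".into()),
          XmlToken::Attr("type".into(), "rate".into()),
          XmlToken::SelfClose,
      ];

      for tok in &toks {
          println!("{:?}", tok);
      }
  }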
2021-08-20 10:16:36 -04:00
Mike Gerwitz c211ada89b tamer: benches (memchr): Add missing bench attr
This benchmark was not being run.
2021-08-19 23:14:33 -04:00
Mike Gerwitz e217478a46 tamer: Makefile.am (CARGO_BENCH_FLAGS): New env var 2021-08-19 16:43:14 -04:00
Mike Gerwitz fc235b7ecc tamer: memchr benches
This adds benchmarking for the memchr crate.  It is used primarily by
quick-xml at the moment, but the question is whether to rely on it for
certain operations for XIR.

The benchmarking on an Intel Xeon system shows that memchr and Rust's
contains() perform very similarly on small inputs, matching against a single
character, and so Rust's built-in should be preferred in that case so that
we're using APIs that are familiar to most people.

When larger inputs are compared against, there's a greater benefit (a little
under ~2x).

When comparing against two characters, they are again very close.  But look
at when we compare two characters against _multiple_ inputs:

  running 24 tests
  test large_str::one::memchr_early_match                 ... bench:       4,938 ns/iter (+/- 124)
  test large_str::one::memchr_late_match                  ... bench:      81,807 ns/iter (+/- 1,153)
  test large_str::one::memchr_non_match                   ... bench:      82,074 ns/iter (+/- 1,062)
  test large_str::one::rust_contains_one_byte_early_match ... bench:       9,425 ns/iter (+/- 167)
  test large_str::one::rust_contains_one_byte_late_match  ... bench:     123,685 ns/iter (+/- 3,728)
  test large_str::one::rust_contains_one_byte_non_match   ... bench:     123,117 ns/iter (+/- 2,200)
  test large_str::one::rust_contains_one_char_early_match ... bench:       9,561 ns/iter (+/- 507)
  test large_str::one::rust_contains_one_char_late_match  ... bench:     123,929 ns/iter (+/- 2,377)
  test large_str::one::rust_contains_one_char_non_match   ... bench:     122,989 ns/iter (+/- 2,788)
  test large_str::two::memchr2_early_match                ... bench:       5,704 ns/iter (+/- 91)
  test large_str::two::memchr2_late_match                 ... bench:      89,194 ns/iter (+/- 8,546)
  test large_str::two::memchr2_non_match                  ... bench:      85,649 ns/iter (+/- 3,879)
  test large_str::two::rust_contains_two_char_early_match ... bench:      66,785 ns/iter (+/- 3,385)
  test large_str::two::rust_contains_two_char_late_match  ... bench:   2,148,064 ns/iter (+/- 21,812)
  test large_str::two::rust_contains_two_char_non_match   ... bench:   2,322,082 ns/iter (+/- 22,947)
  test small_str::one::memchr_mid_match                   ... bench:       4,737 ns/iter (+/- 842)
  test small_str::one::memchr_non_match                   ... bench:       5,160 ns/iter (+/- 62)
  test small_str::one::rust_contains_one_byte_non_match   ... bench:       3,930 ns/iter (+/- 35)
  test small_str::one::rust_contains_one_char_mid_match   ... bench:       3,677 ns/iter (+/- 618)
  test small_str::one::rust_contains_one_char_non_match   ... bench:       5,415 ns/iter (+/- 221)
  test small_str::two::memchr2_mid_match                  ... bench:       5,488 ns/iter (+/- 888)
  test small_str::two::memchr2_non_match                  ... bench:       6,788 ns/iter (+/- 134)
  test small_str::two::rust_contains_two_char_mid_match   ... bench:       6,203 ns/iter (+/- 170)
  test small_str::two::rust_contains_two_char_non_match   ... bench:       7,853 ns/iter (+/- 713)

Yikes.

With that said, we won't be comparing against such large inputs
short-term.  The larger strings (fragments) are copied verbatim, and not
compared against---but they _were_ prior to the previous commit that stopped
unencoding and re-encoding.

So: Rust built-ins for inputs that are expected to be small.
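For reference, a sketch of the two approaches being compared (this is not
the benchmark code itself, and it assumes the `memchr` crate as a
dependency):

  use memchr::{memchr, memchr2};

  fn main() {
      let haystack = "<preproc:sym name=\"foo\"/>";
      let bytes = haystack.as_bytes();

      // Single needle: effectively a wash on small inputs, so prefer the
      // familiar built-in there.
      assert_eq!(memchr(b'"', bytes).is_some(), haystack.contains('"'));

      // Two needles: memchr2 scans once, while the built-in equivalent
      // matches a predicate per character (or scans twice).
      let found = memchr2(b'<', b'&', bytes).is_some();
      let found_std = haystack.contains(|c: char| c == '<' || c == '&');
      assert_eq!(found, found_std);
  }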
2021-08-18 14:23:03 -04:00
Mike Gerwitz 1cdb3fbbc5 tamer: tameld: Skip fragment unescaping only to re-escape on write
Fragments' text was unescaped on reading, producing an owned String and
spending time parsing the text to unescape.  We were then copying that into
an internment pool (so, copying twice, effectively).

Further, we were then _re-escaping_ on write.

This was all wasteful, since we do not do any manipulation of the fragment
before outputting to the xmle file; we know that Saxon produced properly
escaped XML to begin with, and can trust to propagate it.

This also introduces a new global `clone_uninterned_utf8_unchecked` method.

In profiling this change, I tested (a) before this change, (b) after writing
without escaping, and (c) after both reading escaped and writing without
escaping.

     (a)              (b)              (c)
  sec   mem (KiB)  sec    KiB      sec    KiB
0:00.95 47896 -> 0:00.91 47988 -> 0:00.87 48288
0:00.40 30176 -> 0:00.37 25656 -> 0:00.36 25788
0:00.39 45672 -> 0:00.37 45756 -> 0:00.35 34952
0:00.39 20716 -> 0:00.38 19604 -> 0:00.36 19956
0:00.33 16836 -> 0:00.32 16988 -> 0:00.31 16892
0:00.23 15268 -> 0:00.23 15236 -> 0:00.22 15312
0:00.44 20780 -> 0:00.44 20048 -> 0:00.41 20148
0:00.54 44516 -> 0:00.50 36964 -> 0:00.49 36728
0:00.62 55976 -> 0:00.57 46204 -> 0:00.54 41468
0:00.31 28016 -> 0:00.30 27308 -> 0:00.28 23844
0:00.23 15388 -> 0:00.22 15316 -> 0:00.21 15304
0:00.05 4888  -> 0:00.05 4760  -> 0:00.05 4948
0:00.41 19756 -> 0:00.41 19852 -> 0:00.40 19992
0:00.47 20828 -> 0:00.46 20844 -> 0:00.44 20968
0:00.27 18152 -> 0:00.26 18184 -> 0:00.25 18312

Interestingly, the peak memory usage increases very slightly between the
second and third steps (though decreases from the first), likely because the
raw (encoded) text is larger than the unencoded text (e.g. `&gt;` takes more
space than `>`).
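A conceptual sketch of the change (the helper names are hypothetical, not
TAMER's API, and escaping is reduced to a single entity to keep it small);
the point is that the already-escaped fragment bytes can be trusted and
propagated verbatim:

  // Hypothetical helpers; real escaping covers more entities than this.
  fn unescape(escaped: &str) -> String {
      escaped.replace("&gt;", ">")
  }

  fn escape(text: &str) -> String {
      text.replace('>', "&gt;")
  }

  /// Before: round-trip through unescape/escape, parsing and allocating
  /// twice just to reproduce the input.
  fn fragment_roundtrip(escaped: &str) -> String {
      escape(&unescape(escaped))
  }

  /// After: the xmlo was produced by Saxon and is trusted to be properly
  /// escaped, so the fragment is propagated to the xmle output verbatim.
  fn fragment_verbatim(escaped: &str) -> &str {
      escaped
  }

  fn main() {
      let frag = "function() { return a &gt; b; }";
      assert_eq!(fragment_roundtrip(frag), fragment_verbatim(frag));
  }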
2021-08-18 11:39:06 -04:00
Mike Gerwitz f97141f5c5 tamer: tameld: Use uninterned symbols for reader
Fragments were previously represented by `String` to avoid the cost of
interning (hashing and copying).  This change modifies it to use uninterned
symbols, which does still have a copy overhead but it does not hash.

Initial tests show a small performance decrease of about 15% and a small
memory increase of similar proportion.  However, once I realized that I was
not clearing buffers from quick_xml events and implemented that change in a
previous commit, this change ended up being approximately on par with
`String`, despite the copying of some pretty large fragments.

YMMV, though, and perhaps on less powerful systems time may increase
slightly.

The upcoming XIR (XML IR) was originally going to support both owned strings
and symbols, but now we'll just use uninterned symbols; I can't rationalize
complicating the API at this time when it will provide an almost
imperceptible performance benefit.  If ever that changes in the future,
that change will be entertained.

The end result is that the fate of a fragment's underlying memory is
determined by whatever is processing the data, _not_ by the API itself---the
API was previously forcing use of a String, whereas now it's up to the
caller to determine whether we want comparable interns.  For fragments,
that's not likely ever to be the case, especially considering that the
representation will change so drastically in the future.
2021-08-16 14:05:32 -04:00
Mike Gerwitz d96dcad7d8 tamer: tameld: Reduce peak memory usage
This clears the buffers used by quick_xml, which was apparently forgotten
during initial development (I think I expected it to re-use the previously
allocated space automatically).

This has significant effects in some cases.  For example, one of our UI
builds drops from ~9KiB to ~5KiB peak memory usage.  Other builds for larger
suppliers are only slightly affected because of some of their massive
fragments.
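The pattern in question, sketched against a quick-xml of the 0.22 era
(where `read_event` fills a caller-supplied buffer); the `buf.clear()` at
the end of the loop is the change:

  use quick_xml::events::Event;
  use quick_xml::Reader;

  fn count_elements(xml: &str) -> usize {
      let mut reader = Reader::from_str(xml);
      let mut buf = Vec::new();
      let mut count = 0;

      loop {
          match reader.read_event(&mut buf) {
              Ok(Event::Start(_)) | Ok(Event::Empty(_)) => count += 1,
              Ok(Event::Eof) => break,
              Err(e) => panic!("parse error: {}", e),
              _ => (),
          }

          // Without this, each event's bytes accumulate in `buf` for the
          // duration of the parse, inflating peak memory usage.
          buf.clear();
      }

      count
  }

  fn main() {
      assert_eq!(count_elements("<pkg><sym/><sym/></pkg>"), 3);
  }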
2021-08-16 13:38:14 -04:00
Mike Gerwitz ce233ac01d tamer: sym: Uninterned symbols
This adds support for uninterned symbols.  This came about as I was creating
Xir (not yet committed) where I had to decide if I wanted `SymbolId` for all
values, even though some values (e.g. large text blocks like compiled code
fragments for xmle files) will never be compared, and so would be wastefully
hashed.

Previous IRs used `String`, but that was clumsy; see documentation in this
commit for rationale.
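The cost model is easy to sketch; the types below are illustrative only and
not TAMER's sym API.  Interning hashes and deduplicates so that symbols can
be compared by id alone, while an uninterned symbol only copies into the
pool:

  use std::collections::HashMap;

  #[derive(Clone, Copy, PartialEq, Eq, Debug)]
  struct SymbolId(u32);

  #[derive(Default)]
  struct Interner {
      lookup: HashMap<String, SymbolId>,
      strings: Vec<String>,
  }

  impl Interner {
      /// Hash and deduplicate: equal strings yield equal ids.
      fn intern(&mut self, s: &str) -> SymbolId {
          if let Some(&id) = self.lookup.get(s) {
              return id;
          }
          let id = SymbolId(self.strings.len() as u32);
          self.strings.push(s.to_owned());
          self.lookup.insert(s.to_owned(), id);
          id
      }

      /// Copy into the pool without hashing or deduplicating; suitable
      /// for large fragments that will never be compared.
      fn clone_uninterned(&mut self, s: &str) -> SymbolId {
          let id = SymbolId(self.strings.len() as u32);
          self.strings.push(s.to_owned());
          id
      }
  }

  fn main() {
      let mut pool = Interner::default();

      let a = pool.intern("foo");
      let b = pool.intern("foo");
      assert_eq!(a, b);

      let x = pool.clone_uninterned("large fragment");
      let y = pool.clone_uninterned("large fragment");
      assert_ne!(x, y);
  }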
2021-08-13 22:54:04 -04:00
Mike Gerwitz a008d11fb3 .gitlab-ci.yml (deploy): Deploy on main branch
The switch to the `main` branch follows our conventions for other
repositories as we switch to trunk-based development.

Given that main will always be in a deployable state, there's no use in
waiting for tags.
2021-08-13 15:16:40 -04:00
Mike Gerwitz 0ff0f88e5f tamer: Introduce span
This is an initial implementation optimized for expected use
cases.  Hopefully that pans out and doesn't come back to bite me.

Regarding the context: it only allows for interned paths atm, which are
strings (and so much be valid UTF-8, which is fine for us, but sucks for
something more general-purpose).  I'll be curious if the context needs
extension later on, or if different contexts will be stored in IRs (e.g. to
store a template application site as well as the location of the expansion
within the template body).
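A rough sketch of the packing idea; the field widths here are assumptions
for illustration and not necessarily Span's actual layout:

  // Illustrative only.  The goal is a Copy type that fits in 8 bytes so
  // spans can be passed around freely without lifetimes.
  #[repr(C)]
  #[derive(Clone, Copy, PartialEq, Eq, Debug)]
  struct Span {
      offset: u32, // byte offset into the source file
      ctx: u16,    // interned path of that file (a small symbol id)
      len: u16,    // length of the region, in bytes
  }

  fn main() {
      // 4 + 2 + 2 bytes with 4-byte alignment: a single machine word.
      assert_eq!(std::mem::size_of::<Span>(), 8);

      let sp = Span { offset: 1024, ctx: 1, len: 12 };
      println!("{:?}", sp);
  }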
2021-08-13 15:16:39 -04:00
Mike Gerwitz 29ab4b9bfc tamer: sym: Disallow SymbolId construction outside of module
SymbolIds must only be constructed by interners, otherwise we lose
confidence in the type.

This offers an associated function to construct raw SymbolIds from integers
for testing purposes.
2021-08-13 11:54:11 -04:00
Mike Gerwitz d11b4220b2 Revert "tamer: Cargo.toml (dependencies)[lazy_static]: Remove (now unused)"
This reverts commit 4fd6313cd2.

...and now I need it for tests.
2021-08-12 16:08:34 -04:00
Mike Gerwitz 4fd6313cd2 tamer: Cargo.toml (dependencies)[lazy_static]: Remove (now unused)
The previous commit removed all uses.
2021-08-11 16:26:36 -04:00
Mike Gerwitz 9deb393bfd tamer: Global interners
This is a major change, and I apologize for it all being in one commit.  I
had wanted to break it up, but doing so would have required a significant
amount of temporary work that was not worth doing while I'm the only one
working on this project at the moment.

This accomplishes a number of important things, now that I'm preparing to
write the first compiler frontend for TAMER:

  1. `Symbol` has been removed; `SymbolId` is used in its place.
  2. Consequently, symbols use 16 or 32 bits, rather than a 64-bit pointer.
  3. Using symbols no longer requires dereferencing.
  4. **Lifetimes no longer pollute the entire system! (`'i`)**
  5. Two global interners are offered to produce `SymbolStr` with `'static`
     lifetimes, simplifying lifetime management and borrowing where strings
     are still needed.
  6. A nice API is provided for interning and lookups (e.g. "foo".intern())
     which makes this look like a core feature of Rust.

Unfortunately, making this change required modifications to...virtually
everything.  And that serves to emphasize why this change was needed:
_everything_ used symbols, and so there's no use in not providing globals.

I implemented this in a way that still provides for loose coupling through
Rust's trait system.  Indeed, Rustc offers a global interner, and I decided
not to go that route initially because it wasn't clear to me that such a
thing was desirable.  It didn't become apparent to me, in fact, until the
recent commit where I introduced `SymbolIndexSize` and saw how many things
had to be touched; the linker evolved so rapidly as I was trying to learn
Rust that I lost track of how bad it got.

Further, this shows how the design of the internment system was a bit
naive---I assumed certain requirements that never panned out.  In
particular, everything using symbols stored `&'i Symbol<'i>`---that is, a
reference (usize) to an object containing an index (32-bit) and a string
slice (128-bit).  So it was a reference to a pretty large value, which was
allocated in the arena alongside the interned string itself.

But, that was assuming that something would need both the symbol index _and_
a readily available string.  That's not the case.  In fact, it's pretty
clear that interning happens at the beginning of execution, that `SymbolId`
is all that's needed during processing (unless an error occurs; more on that
below); and it's not until _the very end_ that we need to retrieve interned
strings from the pool to write either to a file or to display to the
user.  It was horribly wasteful!

So `SymbolId` solves the lifetime issue in itself for most systems, but it
still requires that an interner be available for anything that needs to
create or resolve symbols, which, as it turns out, is still a lot of
things.  Therefore, I decided to implement them as thread-local static
variables, which is very similar to what Rustc does itself (Rustc's are
scoped).  TAMER does not use threads, so the resulting `'static` lifetime
should be just fine for now.  Eventually I'd like to implement `!Send` and
`!Sync`, though, to prevent references from escaping the thread (as noted in
the patch); I can't do that yet, since the feature has not yet been
stabilized.

In the end, this leaves us with a system that's much easier to use and
maintain; hopefully easier for newcomers to get into without having to deal
with so many complex lifetimes; and a nice API that makes it a pleasure to
work with symbols.

Admittedly, the `SymbolIndexSize` adds some complexity, and we'll see if I
end up regretting that down the line, but it exists for an important reason:
the `Span` and other structures that'll be introduced need to pack a lot of
data into 64 bits so they can be freely copied around to keep lifetimes
simple without wreaking havoc in other ways, but a 32-bit symbol size needed
by the linker is too large for that.  (Actually, the linker doesn't yet need
32 bits for our systems, but it's going to in the somewhat near future
unless we optimize away a bunch of symbols...but I'd really rather not have
the linker hit a limit that requires a lot of code changes to resolve).

Rustc uses interned spans when they exceed 8 bytes, but I'd prefer to avoid
that for now.  Most systems can just use one of the `PkgSymbolId` or
`ProgSymbolId` type aliases and not have to worry about it.  Systems that
are actually shared between the compiler and the linker do, though, but it's
not like we don't already have a bunch of trait bounds.

Of course, as we implement link-time optimizations (LTO) in the future, it's
possible most things will need the size and I'll grow frustrated with that
and possibly revisit this.  We shall see.

Anyway, this was exhausting...and...onward to the first frontend!
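A rough sketch of the thread-local interner idea and the `"foo".intern()`
extension-trait API; names and representation here are illustrative, not
TAMER's actual implementation:

  use std::cell::RefCell;
  use std::collections::HashMap;

  #[derive(Clone, Copy, PartialEq, Eq, Debug)]
  struct SymbolId(u32);

  thread_local! {
      // TAMER is single-threaded, so a thread-local static acts like a
      // global without threading interner lifetimes through every system.
      static INTERNER: RefCell<HashMap<String, SymbolId>> =
          RefCell::new(HashMap::new());
  }

  trait Intern {
      fn intern(&self) -> SymbolId;
  }

  impl Intern for str {
      fn intern(&self) -> SymbolId {
          INTERNER.with(|pool| {
              let mut pool = pool.borrow_mut();
              let next = SymbolId(pool.len() as u32);
              *pool.entry(self.to_owned()).or_insert(next)
          })
      }
  }

  fn main() {
      // Reads like a core feature of the language:
      assert_eq!("foo".intern(), "foo".intern());
      assert_ne!("foo".intern(), "bar".intern());
  }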
2021-08-11 14:24:55 -04:00
Mike Gerwitz 71011f5724 tamer: sym: Split into multiple modules
This helps to organize a bit better as I prepare to introduce singleton
interners.
2021-08-02 23:54:37 -04:00
Mike Gerwitz 01722c9c3b tamer: Symbol{Index=>Id}
The former was a misnomer (it represents an index _entry_).  This name is
also shorter, which is nice, considering how often it'll be used.
2021-07-30 13:32:32 -04:00
Mike Gerwitz 0fc8a1a4df tamer: Remove default SymbolIndex (et al) index type
Oh boy.  What a mess of a change.

This demonstrates some significant issues we have with Symbol.  I had
originally modelled the system a bit after Rustc's, but deviated in certain
regards:

  1. This has a configurable base type to enable better packing without bit
     twiddling and potentially unsafe tricks I'd rather avoid unless
     necessary; and
  2. The lifetime is not static, and there is no global, singleton interner;
     and
  3. I pass around references to a Symbol rather than passing around an
     index into an interner.

For #3---this is done because there's no singleton interner and therefore
resolving a symbol requires a direct reference to an available interner.  It
also wasn't clear to me (and still isn't, in fact) whether more than one
interner may be used for different contexts.

But, that doesn't preclude removing lifetimes and just passing around
indexes; in fact, I plan to do this in the frontend where the parser and
such will have direct interner access and can therefore just look up based
on a symbol index.  We could reserve references for situations where
exposing an interner would be undesirable.

Anyway, more to come...
2021-07-29 14:26:40 -04:00
Mike Gerwitz e6ad2be5b9 tamer: sym: Primitive-based SupportedSymbolIndex
As mentioned in the previous commit, this flips the types such that the base
type is the primitive and the associated type is the `NonZero*` type; this
is much more natural, concise, and allows Rust to infer the proper type in
most every situation.

The next step will be to stop defaulting the index type for SymbolIndex and
related, since we are about to care very much what size it is (compiler
vs. linker).
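Sketched out, the flipped design looks roughly like this (the names follow
the commit message loosely and are not necessarily TAMER's exact API):

  use std::num::{NonZeroU16, NonZeroU32};

  trait SupportedSymbolIndex {
      type NonZero: Copy + std::fmt::Debug;
      fn to_nonzero(self) -> Option<Self::NonZero>;
  }

  impl SupportedSymbolIndex for u16 {
      type NonZero = NonZeroU16;
      fn to_nonzero(self) -> Option<NonZeroU16> {
          NonZeroU16::new(self)
      }
  }

  impl SupportedSymbolIndex for u32 {
      type NonZero = NonZeroU32;
      fn to_nonzero(self) -> Option<NonZeroU32> {
          NonZeroU32::new(self)
      }
  }

  // The base type is the familiar primitive; the NonZero representation is
  // an internal detail (it buys niche optimization, so Option<SymbolIndex>
  // is no larger than SymbolIndex itself).
  #[derive(Clone, Copy, Debug)]
  struct SymbolIndex<T: SupportedSymbolIndex>(T::NonZero);

  impl<T: SupportedSymbolIndex> SymbolIndex<T> {
      fn from_int(n: T) -> Option<Self> {
          n.to_nonzero().map(SymbolIndex)
      }
  }

  fn main() {
      // Inference picks the index type up from the primitive argument:
      let small = SymbolIndex::from_int(1u16);
      let large = SymbolIndex::<u32>::from_int(42);
      println!("{:?} {:?}", small, large);
  }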
2021-07-28 15:21:24 -04:00
Mike Gerwitz e562d7fcc8 tamer: sym: Begin SymbolIndex base data generalization
This was previously a NonZeroU32, but it was intended to support NonZeroU16
as well for packages, so that we can fit symbols into smaller spaces.  In
particular, the upcoming Span wants to fit within 8 bytes, and so requires a
smaller SymbolIndex type.

I'm unhappy with this current implementation, and so comments are unfinished
and there are a couple ignores for dead code warnings.  I want to flip the
`SupportedSymbolIndex` trait so that users can specify the primitive rather
than the NonZero* type, which is really awkward-looking and verbose,
especially if you have to do `SymbolIndex::<NonZeroU32>::from_int` or
something.  It also prevents (at least in the cases I've observed) Rust from
inferring the proper type for you based on the argument you provide.

So, the goal will be `SymbolIndex::<u32>::from_int(n)`, for example.
2021-07-28 15:21:15 -04:00
Mike Gerwitz ca6ef3ed36 tamer: frontend: Begin basic XML parsing
The first step in the process is to emit the raw XML events that can then be
immediately output again to echo the results into another file.  This will
then allow us to begin parsing the input incrementally, and begin to morph
the output into a real `xmlo` file.
2021-07-27 00:37:13 -04:00
Mike Gerwitz d9dcfe8777 tamer: Introduce tpwrap module to contain quick_xml::Error adapter
This adapter exists to implement PartialEq so that it can be derived on
Error objects.  This is used primarily (well, exclusively atm) for tests.
2021-07-23 23:23:55 -04:00
Mike Gerwitz fb8422d670 tamer: Initial frontend concept
This introduces the beginnings of frontends for TAMER, gated behind a
`wip-features` flag.

This will be introduced in stages:

  1. Replace the existing copy with a parser-based copy (echo back out the
     tokens), when the flag is on.
  2. Begin to parse portions of the source, augmenting the output xmlo (xmli
     at the moment).  The XSLT-based compiler will be modified to skip
     compilation steps as necessary.

As portions of the compilation are implemented in TAMER, they'll be placed
behind their own feature flags and stabilized, which will incrementally
remove the compilation steps from the XSLT-based system.  The result should
be substantial incremental performance improvements.

Short-term, the priorities for loading identifiers into an IR
are (though the order may change):

  1. Echo
  2. Imports
  3. Extern declarations.
  4. Simple identifiers (e.g. param, const, template, etc).
  5. Classifications.
  6. Documentation expressions.
  7. Calculation expressions.
  8. Template applications.
  9. Template definitions.
  10. Inline templates.

After each of those are done, the resulting xmlo (xmli) will have fully
reconstructed the source document from the IR produced during parsing.
2021-07-23 22:24:08 -04:00
Mike Gerwitz 60372d2960 tamer: Makefile.am (all): Binaries and doc
`all` was previously the target for binaries only.
2021-07-23 22:23:10 -04:00
Mike Gerwitz 6ec1a49506 tamer: Makefile.am: Include feature flags for doc generation and tests
This was forgotten in the previous commit.
2021-07-23 15:56:33 -04:00
Mike Gerwitz f1a3273ee3 tamer: configure.ac: Configure-time feature flags (via Cargo) 2021-07-23 10:16:44 -04:00
Mike Gerwitz 5aaa1106cb tamer: obj::xmlo::reader::mock: Extract into crate::test::quick_xml
Other mocks exist here, and here it can be re-used for the upcoming XML
frontend.
2021-07-22 15:32:30 -04:00
Mike Gerwitz 2e50af1220 Copyright year update 2021 2021-07-22 15:00:15 -04:00
Mike Gerwitz e5bbd49166 tamer: obj::xmlo::reader: Extract tests into separate file
The file's getting a bit large and the tests are rather complex.  Further,
LSP does better on smaller, less complex files.
2021-07-22 14:39:06 -04:00
Mike Gerwitz 1f24cfdf25 Remove :map: sym-dep generation
This was incorrect to begin with---it does not make sense that an input
mapping should depend upon the identifier that it maps to, in the sense that
we make use of these dependencies.  If we add weak symbol references in the
future, then this can be reintroduced.

By removing this, we free tameld from having to perform the check itself.

.rev-xmlo bumped to force rebuilding of object files since the linker now
expects that no such dependencies will exist within them.
2021-07-22 14:27:15 -04:00
Mike Gerwitz 8a2cc28ddb RELEASES.md: Update for v18.0.3 2021-07-21 15:05:52 -04:00
Mike Gerwitz c90566056d RELEASES.md: NEXT summary 2021-07-21 15:04:59 -04:00
Mike Gerwitz 90c6b51fd5 tamer: tameld: Place constants into static section in executable
This is something that changed when the TAMER POC was initially created, as
I was learning Rust.  I don't recall the original reason why this was moved,
but it could have been moved back long ago.

In our systems, constants can hold tables (as matrices) with tens or
hundreds of thousands of rows, and there are a number of them in certain
projects.  As an example, the YAML-based test cases for one of our systems
went from ~2m30s to ~45s after this change was made.  Much of the cost
savings comes from saving GC.
2021-07-21 14:53:15 -04:00
Mike Gerwitz 53360548da tame: Ignore duplicate conjunctive predicates in value list optimization error
This can occur in generated code (e.g. from proguic if a question-based
predicate inherits a predicate already specified).  This commit does not
change anything that's emitted; it merely allows proceeding.

TAMER can be smarter about this; I don't want to invest more time into
generalizing deduplication of predicates.
2021-07-19 14:53:25 -04:00
Mike Gerwitz 5dab913ecb RELEASES.md: Update for v18.0.2 2021-07-15 23:50:53 -04:00
Mike Gerwitz b2323e80ef RELEASES.md: Summary of NEXT 2021-07-15 23:50:00 -04:00
Mike Gerwitz 2ad0d1425a compiler: Correct handling of TRUE matches
There was a bug whereby TRUE matches would keep whatever value was being
matched on, even if it was not a boolean.  That was an oversight from the
proof-of-concept code, and this fixes it; that's why this is behind a flag!

This also adjusts the class aliasing optimization so that it doesn't check
for a `TRUE` symbol name, which was a bad idea to begin with.

This change also ends up expanding `lv:match[@value="TRUE"]` into the long
form, where it didn't previously; this will result in slightly larger xmlo
files in some cases, but it's nothing significant, and it does not impact
compilation times.
2021-07-15 14:55:32 -04:00
Mike Gerwitz 37977a8816 entry-form.xsl: Correctly generate HTML for params with imported types
This is a nearly-10-year-old bug that was introduced when the Summary Page
was modified to use the then-new symbol table.  The compiler previously
concatenated all packages into a single XML tree and processed that, so no
package resolution was necessary here before.
2021-07-14 09:59:45 -04:00
Mike Gerwitz 513b8d7b86 worksheet.xsl: Allow package name to auto-generate
A long time ago (about a decade), package names were required, but they are
now generated by the compiler relative to the root path.  The name here was
incorrect, which was generating an incorrect path for the linked symbols,
which was causing problems with the Summary Page.
2021-07-14 09:51:08 -04:00
Mike Gerwitz f5ba4b013b summary: Make Summary Page compiler less chatty
It produces a lot of output that either results in spam (internal errors) or
pollutes the log with unnecessary information.
2021-07-01 13:54:34 -04:00
Mike Gerwitz bc9c667c9d RELEASES.md: Update for v18.0.1 2021-06-24 10:37:25 -04:00
Mike Gerwitz d0e3a5622c Remove class-level notice for new system
This was not intentionally committed.
2021-06-24 09:59:00 -04:00
Mike Gerwitz 9a62bb2ace RELEASES.md: Update for v18.0.0 2021-06-23 12:54:25 -04:00
Mike Gerwitz eef2a5d4bc Compiler runtime optimizations with classification system rewrite
See RELEASES.md for a list of changes.

This was a significant effort that began about six months ago, but was
paused at a number of points.  Rather than risking further pauses from
interruptions, the new classification system has been gated behind a
package-level feature flag, since it causes BC breaks in certain buggy
situations.

Since this flag was introduced late, there is the potential that it causes
bugs when new optimizations are mixed with the old system.
2021-06-23 12:48:42 -04:00
Mike Gerwitz dd432d249d RELEASES.md: Update with compiler optimizations 2021-06-23 12:46:37 -04:00
Mike Gerwitz 4e859148c0 tools/pkg-graph: Debugging tool to output graph of package dependencies 2021-06-23 11:44:36 -04:00
Mike Gerwitz e9598b7cb5 Correct short runtime var declarations
They were not actually defined before being aliased.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 6f2b4090cd Correct behavior of matrix matching with separate index sets in new system
This behavior was largely correct, but was not commutative if the size of
the matrices (rows or columns) was smaller than a following match.
2021-06-23 11:44:36 -04:00
Mike Gerwitz e90ebd226c Remove arrow functions from classifier runtime
We need to support as far back as IE11, unfortunately, which is ES5.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 934824b2ee Reintroduce legacy classification system, place new behind flag
This largely reintroduces the legacy classification system, but there are a
number of things that are not affected by the flag.  For example:

  1. Alias classifications are still optimized when the flag is off;
  2. Classifications without predicates emit slightly different code than
     before, though their functionality has not changed;
  3. There's been a lot of refactoring and minor optimizations that are
     unaffected by the flag;
  4. lv:match/@pattern will now emit a warning; and
  5. Cleaning and casting of input data is not gated.

This allows us to incrementally migrate to the new system where behavior may
be different, but this is admittedly a bit dangerous in that the new system
was aggressively tested and reasoned about, so reintroducing the legacy
system may combine in unexpected ways.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 5f6cb4cf51 .rev-xmlo: Bump version
The old and new classification systems are currently incompatible, but if the
old is reintroduced, this commit can go away.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 7dbb653624 Inline intermediate any/all classifications
This is another significant milestone.

The next logical step with classification optimization is to inline all of
those intermediate classifications generated from any and all blocks, since
there are so many of them.  This means having the parent classification
absorb all dependencies; not output dependencies for the classification; not
compile the assignments for those classifications; and to inline them at the
match site.  They’re used only once, since they’re generated for each
individual block.

We need to keep the actual classification generation around (and just inline
them) for now, probably until TAMER, because we depend upon their symbol for
determining their dimensionality, which we need for the optimization work we
just did---we must inline them into the proper group (matrix, vector, or
scalar).

The optimization work done up to this point had inlining in mind---only a
little bit of work was needed to make sure that every classification can
simply be stripped of its assignment and be a valid expression that can be
inlined in place of the original reference.

The result of that was predictably significant for the `ui/package` program
that I've been testing with:

  - 4,514 classifications were inlined;
  - The file size dropped to 7.5MiB (from 8.2MiB previously---remember that
    we started at 16MiB); and
  - GC ticks were cut in half, from 67->31.

Unfortunately, this optimization added nearly 1m of time to the compilation
of that program.  Speaking from the future: the UI build optimizations in
liza-proguic were introduced to offset this difference (and provide a net
gain in performance).
2021-06-23 11:44:36 -04:00
Mike Gerwitz 97caefab1b Extract classify/@terminate into own template
Note that next-match does not cause a return from the template, as odd as it
looks.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 1517a03994 Combine all class optimizations into one 2021-06-23 11:44:36 -04:00
Mike Gerwitz d1dae3e1b1 Explicit types for match raising 2021-06-23 11:44:36 -04:00
Mike Gerwitz 5adf1b7589 Combine all m* optimizations
With the recent refactoring, it's clear that these are the same thing.
2021-06-23 11:44:35 -04:00
Mike Gerwitz a563c3ce62 Remove lv:match checks from class optimization checks
We handle all cases now, and have prohibited @pattern, which wasn't handled.
2021-06-23 11:44:35 -04:00
Mike Gerwitz e3fd9388bb Abstract function wrapping for class type raising
This will let us clean up the implementation a bit more.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 10089659b1 Extract lv:classify compilation into function
To support following commits for inlining.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 0cd6d40dd9 compiler: Remove whitespace from vector/matrix constants 2021-06-23 11:44:35 -04:00
Mike Gerwitz 9dbda93b4f {precision=>p} to reduce byte count 2021-06-23 11:44:35 -04:00
Mike Gerwitz f14417f32a Remove unused domains var 2021-06-23 11:44:35 -04:00
Mike Gerwitz e0907c6db2 compiler: Do not output whitespace between nodes 2021-06-23 11:44:35 -04:00
Mike Gerwitz 4ee050323a Apply hoisting optimization to classify/@any
This converts disjunctive classifications into conjunctive and places an
<any> within it.

This ends up handling all the generated qwhen classifications from proguic,
which were probably converted into <any> by a previous optimization pass.

The UI program I've been using to test these compiler optimizations has
decreased in size down to 8.2MiB since the beginning of this branch; we
started at ~16MiB.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 658e55f2fa Hoist any-all common predicate for binary conjunctive classifications
See comments.  This is meant to help mitigate the damage done by one of our
code generation systems.  The benefit is significant, allowing the code
generator to remain simple.  By placing this optimization within the
compiler, hand-written and template-generated code also benefit.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 25d500fec5 Generalized value list optimization
Note that this was also broken for vectors and scalars by the commit that
expanded non-TRUE @value.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8e457dab34 Strip single-predicate any/all instead of extracting
Rather than extracting every any/all into their own classifications,
eliminate them (and replace them with their body) if they contain only one
predicate.  This is most likely to happen after template expansion, and
there were an alarming number of them in our system.

Stripping them out of one of our programs saved ~0.2MiB of output, and
removed many intermediate classifications.  It removed ~1,075 lines, which
should correspond closely to the actual number of classifications.

Discovering this required stripping the template barriers, which was done in
a previous commit.

Unfortunately, the performance improvement from this wasn't significant,
largely because of the nondeterminism of GC, which can easily mask the
gains.  But a new line `v8::internal::FixedArray::set(int,
v8::internal::Object)` appeared in the profiler output, making me wonder
whether the JIT is starting to understand more interesting properties of the
system.

`mprotect` and `v8::internal::heap_internals::GenerationalBarrier` also
appeared, which are related to GC.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 2d519947f7 Strip template barriers from expanded classifications
The barriers deeply frustrate static analysis.
2021-06-23 11:44:35 -04:00
Mike Gerwitz f8b166a42d Remove lv:join
This is a long-forgotten and long-unused feature that has been
long-superseded by symbol table introspection in inline-template.
2021-06-23 11:44:35 -04:00
Mike Gerwitz c191af8d53 Remove anyValue and related code
!!!

(Message from the future: this ends up being reintroduced and the new
classification system being placed behind a feature toggle.  But it will be
eliminated eventually.)
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8147bec24f m*v*s*!
This is a major milestone for class optimization---the old anyValue-based
system is no longer in use; the classification system has been wholly
rewritten.

The ticks in the sampling profiler are now where they should be, open to
further optimization with a much more solid foundation.

  [JavaScript]:
     ticks  total  nonlib   name
        5    0.6%    3.0%  LazyCompile: *vu [...]/ui/package.strip.js:25191:16
        5    0.6%    3.0%  LazyCompile: *M [...]/ui/package.strip.js:25267:15
        3    0.4%    1.8%  LazyCompile: *vmu [...]/ui/package.strip.js:25144:17
        3    0.4%    1.8%  LazyCompile: *ve [...]/ui/package.strip.js:25204:16
        2    0.2%    1.2%  LazyCompile: *precision [...]/ui/package.strip.js:25137:23
        2    0.2%    1.2%  LazyCompile: *me [...]/ui/package.strip.js:25178:16
        2    0.2%    1.2%  LazyCompile: *cmatch [...]/ui/package.strip.js:25495:20
        2    0.2%    1.2%  LazyCompile: *ceq [...]/ui/package.strip.js:25273:17
        1    0.1%    0.6%  LazyCompile: *init_defaults [...]/ui/package.strip.js:25624:27
        1    0.1%    0.6%  LazyCompile: *MM [...]/ui/package.strip.js:25268:16
        1    0.1%    0.6%  LazyCompile: *E [...]/ui/package.strip.js:25239:15
        1    0.1%    0.6%  LazyCompile: *<anonymous> [...]/ui/package.strip.js:25184:13
        1    0.1%    0.6%  LazyCompile: *<anonymous> [...]/ui/package.strip.js:25171:13

Much better than the 102 ticks that anyValue was taking some time ago!

A lot of time used to be spent compiling functions as well, a lot of which
was removed by previous commits, bringing us to:

 [C++]:
   ticks  total  nonlib   name
     50    5.9%   30.5%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
     20    2.4%   12.2%  write
      9    1.1%    5.5%  node::native_module::NativeModuleEnv::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
      6    0.7%    3.7%  __pthread_cond_timedwait
      4    0.5%    2.4%  mmap

All of this work has simplified the output enough that it's obviated a slew
of other optimizations that can be done in future work, though a lot of that
may wait for TAMER, since performing them in XSLT will be difficult and not
performant; the compiler is slow enough as it is.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 542ff46b6d m*v*s0 optimization
...getting close!
2021-06-23 11:44:35 -04:00
Mike Gerwitz 606a3fe987 m1v*s0 optimization 2021-06-23 11:44:35 -04:00
Mike Gerwitz a63eb4c5e6 m1v1s0*: Remove cmp args and support c:*/@anyOf
This supports all currently-optimized transformations, whereas previously we
were permitting only lv:match[@value].
2021-06-23 11:44:35 -04:00
Mike Gerwitz b735d91955 m0v*s* optimization
Building up to finalizing m*v*s*!

For context, here is the survey prior to this commit:

   3476 m0v1s1
   3385 m1v2s0
    582 m1v1s0
    531 m2v0s0
    225 m3v0s0
    171 m0v2s1
    169 m4v0s0
    135 m5v0s0
    102 m1v0s0
     85 m0v1s5
     71 m6v0s0
     67 m14v0s0
     57 m7v0s0
     50 m0v2s5
     48 m8v0s0
     41 m9v0s0
     39 m1v0s1
     39 m10v0s0
     34 m0v1s2
     26 m0v1s9
     22 m12v0s0
     20 m11v0s0
     20 m0v2s4
     20 m0v1s3
     19 m15v0s0
     19 m0v4s7
     17 m0v5s7
     17 m0v3s1
     17 m0v1s6
     16 m13v0s0
     16 m0v4s9
     16 m0v2s8
     16 m0v1s4
     15 m0v5s4
     15 m0v4s3
     15 m0v3s9
     15 m0v3s5
     15 m0v2s6
     14 m0v12s10
     13 m0v5s14
     13 m0v3s8
     12 m18v0s0
     12 m0v4s4
     12 m0v4s2
     12 m0v3s7
     12 m0v3s2
     12 m0v2s2
     12 m0v12s6
     11 m0v7s7
     11 m0v6s2
     11 m0v5s2
     11 m0v53s9
     11 m0v2s60
     11 m0v28s1
     11 m0v23s8
     11 m0v13s6
     10 m17v0s0
      9 m0v2s3
      8 m0v11s10
      7 m85v0s0
      7 m20v0s0
      7 m0v4s5
      7 m0v1s8
      6 m87v0s0
      6 m35v0s0
      6 m33v0s0
      6 m30v0s0
      6 m19v0s0
      6 m16v0s0
      6 m0v5s6
      5 m21v0s0
      5 m0v7s9
      5 m0v3s10
      4 m53v0s0
      4 m50v0s0
      4 m28v0s0
      4 m114v0s0
      4 m0v6s10
      4 m0v5s8
      4 m0v3s6
      4 m0v3s3
      4 m0v1s7
      4 m0v13s10
      3 m86v0s0
      3 m24v0s0
      3 m23v0s0
      3 m0v6s4
      3 m0v5s5
      3 m0v4s6
      3 m0v3s19
      3 m0v2s12
      3 m0v1s11
      3 m0v11s9
      3 m0v11s1
      2 m99v0s0
      2 m97v0s0
      2 m95v0s0
      2 m79v0s0
      2 m74v0s0
      2 m71v0s0
      2 m60v0s0
      2 m5v18s7
      2 m55v0s0
      2 m49v0s0
      2 m419v0s0
      2 m374v0s0
      2 m34v0s0
      2 m32v0s0
      2 m31v0s0
      2 m27v0s0
      2 m201v0s0
      2 m1v1s1
      2 m1v13s3
      2 m161v0s0
      2 m159v0s0
      2 m157v0s0
      2 m151v0s0
      2 m144v0s0
      2 m142v0s0
      2 m0v9s7
      2 m0v9s11
      2 m0v8s9
      2 m0v8s7
      2 m0v8s19
      2 m0v7s12
      2 m0v6s6
      2 m0v6s11
      2 m0v5s9
      2 m0v5s3
      2 m0v5s11
      2 m0v5s1
      2 m0v4s8
      2 m0v4s11
      2 m0v3s4
      2 m0v3s20
      2 m0v3s15
      2 m0v3s12
      2 m0v2s7
      2 m0v2s16
      2 m0v2s11
      2 m0v29s20
      2 m0v19s7
      2 m0v19s3
      2 m0v17s12
      2 m0v16s16
      2 m0v15s23
      2 m0v15s10
      2 m0v13s9
      2 m0v13s15
      2 m0v11s8
      2 m0v10s15
      1 m94v0s0
      1 m93v0s0
      1 m92v0s0
      1 m90v0s0
      1 m81v0s0
      1 m76v7s0
      1 m76v0s0
      1 m70v0s0
      1 m68v0s0
      1 m66v11s11
      1 m64v0s0
      1 m58v0s0
      1 m54v0s0
      1 m51v0s0
      1 m514v20s19
      1 m4v4s7
      1 m48v0s0
      1 m481v20s14
      1 m451v0s0
      1 m44v0s0
      1 m43v0s0
      1 m42v0s0
      1 m3v16s7
      1 m38v4s6
      1 m38v0s0
      1 m370v0s0
      1 m2v2s3
      1 m2v2s0
      1 m2v25s25
      1 m29v0s0
      1 m26v0s0
      1 m25v0s0
      1 m22v0s0
      1 m213v0s0
      1 m1v3s0
      1 m1454v3215s1422
      1 m13v11s37
      1 m1374v1s0
      1 m131v0s0
      1 m10v30s23
      1 m102v0s0
      1 m0v9s9
      1 m0v9s8
      1 m0v9s12
      1 m0v8s12
      1 m0v7s4
      1 m0v7s15
      1 m0v7s11
      1 m0v6s9
      1 m0v6s8
      1 m0v6s5
      1 m0v6s20
      1 m0v6s16
      1 m0v6s12
      1 m0v4s17
      1 m0v4s10
      1 m0v4s1
      1 m0v46s23
      1 m0v3s17
      1 m0v3s16
      1 m0v33s21
      1 m0v32s38
      1 m0v2s9
      1 m0v2s10
      1 m0v23s30
      1 m0v22s9
      1 m0v22s31
      1 m0v20s29
      1 m0v18s24
      1 m0v18s10
      1 m0v17s26
      1 m0v17s14
      1 m0v16s9
      1 m0v16s27
      1 m0v15s20
      1 m0v15s14
      1 m0v15s11
      1 m0v14s6
      1 m0v14s5
      1 m0v14s13
      1 m0v13s7
      1 m0v13s20
      1 m0v12s9
      1 m0v12s8
      1 m0v11s11
      1 m0v10s17
      1 m0v10s14
      1 m0v10s11
      1 m0v10s10

There are some horridly large ones in there!  They were missing from output
in previous commits because of how I was gathering information.

Those large ones come from liza-proguic's __proguiClasses.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8045c9d99a Remove v{u,e} second argument; always match truthful
Add optimization notes, note the impact on FALSE with implicit 0 (see mega
commit).
2021-06-23 11:44:35 -04:00
Mike Gerwitz 5ae5c226f9 lv:match/c:* optimizations for v* and s*
This will make m1v*s0 worth doing now.
2021-06-23 11:44:35 -04:00
Mike Gerwitz db88f6aba5 div function 2021-06-23 11:44:33 -04:00
Mike Gerwitz fc96880b85 lv:match/c:* optimizations
A large number of classification optimizations were being thwarted by my not
handling this case.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 73696657fc Optimize @anyOf m0v0s* 2021-06-22 15:00:58 -04:00
Mike Gerwitz 5d9970c853 Optimize @anyOf m0v*s0
This sets the foundation to applying this optimization to the others as
well.
2021-06-22 15:00:58 -04:00
Mike Gerwitz f86eaf6aa2 More concise anyOf checks
These also use unary functions, which will be able to be composed
for upcoming changes.
2021-06-22 15:00:58 -04:00
Mike Gerwitz e59a3b3ff5 Remove unnecessary debug output (writes are very slow)
This shaves ~1m off of the total build time for our largest system.  Output
is impressively slow.

Around this point in time, we have the following profile from V8's sampling
profiler:

  [JavaScript]:
     ticks  total  nonlib   name
       36    2.8%   10.7%  LazyCompile: *anyValue [...]/ui/package.strip.new.js:31020:22
        3    0.2%    0.9%  LazyCompile: *m1v1u [...]/ui/package.strip.new.js:30941:19
        2    0.2%    0.6%  LazyCompile: *precision [...]/ui/package.strip.new.js:30934:23
        1    0.1%    0.3%  LazyCompile: *vu [...]/ui/package.strip.new.js:30964:16
        1    0.1%    0.3%  LazyCompile: *init_defaults [...]/ui/package.strip.new.js:31341:27
2021-06-22 15:00:58 -04:00
Mike Gerwitz d828ad6a1f Extract optimized vec and scalar matches into functions
The vector one will be reused by m1v1 to become m1v*.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 917977effc Use Em instead of destructuring for m1v1
Similar to previous commit.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 3a6695c873 Use E instead of destructuring for v{u,e} functions
This also has an added benefit: it's ES5-compatible, aside from the arrow
functions that need to be removed in future commits.
2021-06-22 15:00:58 -04:00
Mike Gerwitz cfbdc35a55 m0v*s0 single-distinct-@on optimization
I have been wanting to do this for many years.  This is quite
gratifying.  Here is some example output:

  c['foo']=E(A['fooState']=A['state'].map(s => +[2,7,8,9,10,11,19,20,21,22,26,28,31,32,35,39,40,41,46,47,44].includes(s)));

Previously, it looked like this:

  classes['foo'] = (function(){var result,tmp;  tmp = anyValue(
  args['state'], 2, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1924'] || ( debug['d1124644e1924'] = [] ) ).push( tmp
  );/*!-*/ result = tmp; tmp = anyValue( args['state'], 7, args['fooState'],
  false, false ) ;/*!+*/( debug['d1124644e1925'] || ( debug['d1124644e1925'] =
  [] ) ).push( tmp );/*!-*/ result = result || tmp; tmp = anyValue(
  args['state'], 8, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1926'] || ( debug['d1124644e1926'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 9,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1927'] || (
  debug['d1124644e1927'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 10, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1928'] || ( debug['d1124644e1928'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 11,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1929'] || (
  debug['d1124644e1929'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 19, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1930'] || ( debug['d1124644e1930'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 20,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1931'] || (
  debug['d1124644e1931'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 21, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1932'] || ( debug['d1124644e1932'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 22,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1933'] || (
  debug['d1124644e1933'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 26, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1934'] || ( debug['d1124644e1934'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 28,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1936'] || (
  debug['d1124644e1936'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 31, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1937'] || ( debug['d1124644e1937'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 32,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1938'] || (
  debug['d1124644e1938'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 35, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1939'] || ( debug['d1124644e1939'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 40,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1940'] || (
  debug['d1124644e1940'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 41, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1941'] || ( debug['d1124644e1941'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 46,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1942'] || (
  debug['d1124644e1942'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 44, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1943'] || ( debug['d1124644e1943'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; return tmp;})();

The source XML is:

  <classify as="foo" yields="fooState"
            desc="Foo">
    <any>
      <match on="state" value="STATE_AL" />
      <match on="state" value="STATE_CT" />
      <match on="state" value="STATE_DC" />
      <match on="state" value="STATE_DE" />
      <match on="state" value="STATE_FL" />
      <match on="state" value="STATE_GA" />
      <match on="state" value="STATE_LA" />
      <match on="state" value="STATE_MA" />
      <match on="state" value="STATE_MD" />
      <match on="state" value="STATE_ME" />
      <match on="state" value="STATE_MS" />
      <match on="state" value="STATE_NC" />
      <match on="state" value="STATE_NH" />
      <match on="state" value="STATE_NJ" />
      <match on="state" value="STATE_NY" />
      <match on="state" value="STATE_PA" />
      <match on="state" value="STATE_RI" />
      <match on="state" value="STATE_SC" />
      <match on="state" value="STATE_VA" />
      <match on="state" value="STATE_VT" />
      <match on="state" value="STATE_TX" />
    </any>
  </classify>
2021-06-22 15:00:58 -04:00
Mike Gerwitz a2f846f9c4 {gen,}classes name reduction to reduce byte count 2021-06-22 15:00:58 -04:00
Mike Gerwitz a880605511 Optimal m0v0s* single-distinct-@on scalar match
See comments for more information.

This will require a polyfill for Array.prototype.includes for IE11, if we
stick with it.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 1f72f756ca m0v0s* optimization 2021-06-22 15:00:57 -04:00
Mike Gerwitz d352919807 m0v*s0 optimization 2021-06-22 15:00:57 -04:00
Mike Gerwitz 736d9278bf Temporarily output mvs lengths for unoptimized classifications
This allows us to easily see their shape looking at the compiled code.  See
the previous commit for more of an explanation and examples.  And future
commits.

This allows us to analyze the compiler runlog and determine the frequency of
certain shapes to prioritize optimization efforts.
2021-06-22 15:00:57 -04:00
Mike Gerwitz d9bbf0282e m1v1 classification optimizations
This is a proof-of-concept.  It also contains arrow functions, which do not
exist in ES5.

The notation m#v#s# refers to matrix, vector, and scalar counts of a
classification.  This optimization therefore focuses on classifications with
a single vector and a single matrix.

I'd like to note that this commit message was written in retrospect, months
later, after I returned to these proof-of-concept commits to finalize
them.  I'll try my best to have things make sense in a historical context
based on my notes.

The choice to focus on m1v1 was based on taking a survey of the shape of
classifications in our largest rating system.  m1v*, and specifically m1v1,
was the largest by far, followed by v1s1.  Here's an example program used
for a UI:

  $ grep -h 'internal: [svm][0-9]\+[svm][0-9]\+ ' run*.log > result
  $ cut -d' ' -f2 result | sort | uniq -c | sort -rn
    10056 m1v1
     1788 m1v2
      473 v1s1
       18 v2s1
       13 v1s5
        8 v1s3
        7 v1s2
        4 v2s5
        2 v4s4
        2 v4s2
        2 v2s8
        2 v2s6
        2 v1s9
        2 v1s4
        1 v7s7
        1 v6s2
        1 v5s7
        1 v5s5
        1 v5s4
        1 v5s2
        1 v4s9
        1 v4s7
        1 v4s3
        1 v3s9
        1 v3s7
        1 v3s5
        1 v3s2
        1 v3s1
        1 v33s21
        1 v2s60
        1 v2s4
        1 v2s3
        1 v2s2
        1 v28s1
        1 v23s8
        1 v22s9
        1 v1s8
        1 v1s6
        1 v18s24
        1 v15s14
        1 v14s6
        1 v14s5
        1 v13s7
        1 v13s6
        1 v12s6
        1 v11s1
        1 m76v7
        1 m3v1
        1 m1v3
        1 m1374v1

The excessively large ones (like the last one) are aggregate classifications
that are generated by a template.  But note the first count.

Here's another example, one of the raters:

   8812 m1v1
    311 v1s1
     17 v2s1
     14 v1s5
      4 v2s5
      4 v1s6
      4 v11s10
      3 v3s1
      3 v1s8
      2 v5s14
      2 v4s7
      2 v3s9
      2 v3s5
      2 v2s4
      2 v1s9
      2 v1s4
      2 v1s2
      1 v8s7
      1 v7s7
      1 v7s15
      1 v6s4
      1 v6s2
      1 v6s10
      1 v5s8
      1 v5s7
      1 v5s4
      1 v5s2
      1 v53s9
      1 v4s9
      1 v4s4
      1 v4s3
      1 v4s2
      1 v4s11
      1 v3s8
      1 v3s7
      1 v3s20
      1 v3s2
      1 v3s19
      1 v3s15
      1 v2s8
      1 v2s60
      1 v2s6
      1 v2s2
      1 v2s12
      1 v29s20
      1 v28s1
      1 v23s8
      1 v1s3
      1 v15s23
      1 v13s6
      1 v13s20
      1 v12s6
      1 v12s10
      1 v11s1
      1 m1v2
      1 m1s1

Given these examples, m1v1 is an easy first choice for this commit.

The general pattern for this commit and those that follow is to match on a
specific shape of classification that we're optimizing for, falling back to
the old anyValue-based system for all other cases, with the intent of
eventually removing it.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 5a816a4701 Ensure all params are numeric
This has long been a curse, and I don't know why I didn't resolve it sooner.

This makes explicit some of the odd things that this is doing, to maintain
the previous behavior.  Changing that behavior would be ideal, but ought to
be done separately and put behind a feature flag.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 250c230d94 Revert "REMOVE ME: Use variables in place of object for generated class yields"
This reverts commit e2d9467633bb75d79dbc8fe9f8971bfa412ea59f.

BUT: it does cause more data to be returned, perhaps unnecessarily.  See if
that may offset the slight increase in GC cost.

Further, we may end up getting rid of some of these generated values; check
after we do some class optimizations.
2021-06-22 15:00:57 -04:00
Mike Gerwitz ec196146e2 REMOVE ME: Use variables in place of object for generated class yields
This was a waste of time; it actually reduced performance slightly and increased
GC, unintuitively enough.

Leaving commit here and reverting to keep it for reference.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 9784ef9326 Remove unused lv:assuming
This was going to be a feature to permit testing (I think?), but it has
never been used and was abandoned long ago.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 05736abe23 compiler/js (lv:classify): Extract @yields dest name into function
I will be changing how this works shortly.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 6512ea245a Omit _CMATCH_ generation if no predicates, alias if one
I would like for _CMATCH_ to eventually go away entirely, but this is an
improvement in the meantime.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 17db2d0df8 Use more concise var refs in generated code to reduce byte count 2021-06-22 15:00:57 -04:00
Mike Gerwitz 3c47858c73 Extract empty classify into own template
Simplify main template.
2021-06-22 15:00:57 -04:00
Mike Gerwitz ce0f51db2f compiler/js-calc: Make unknown calculation type a compile-time error
When the Summary Page was _first written_ (the first part of TAME), it was
compiled in the browser---development consisted of refreshing the page,
which was similar to how we wrote PHP at the time.  No compile process.

In that situation, we couldn't have the XSLT stylesheet failing to
translate.  But of course those days are long since gone, and this must be a
compile-time error.

It shouldn't ever get to this point, granted.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 9ed6d40386 compiler/js (lv:classify): Remove unused noclass
This existed back when the classifier was compiled separately from the
rate function; they are now one and the same.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 894f7ffab8 compiler/js (lv:classify): Remove unused $ignores 2021-06-22 15:00:57 -04:00
Mike Gerwitz d0532fe75a Simplify predmatch and eliminate when no predicate 2021-06-22 15:00:57 -04:00
Mike Gerwitz 8d25d60c60 Significantly reduce parenthesis and whitespace in output
The intent here is simply to reduce byte count, as well as make the
generated code easier to read and find patterns in for future
optimizations.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 3434efcdef Remove unnecessary ||0 defaults 2021-06-22 15:00:57 -04:00
Mike Gerwitz 1c07968375 Remove unused result intermediate value 2021-06-22 15:00:57 -04:00
Mike Gerwitz 80e3029fa0 Remove function wrapper from c:when 2021-06-22 15:00:57 -04:00
Mike Gerwitz 603e9fb342 Optimize single-true-match classes into aliases
Single-predicate classifications matching on TRUE can be optimized into
aliases.  These sometimes occur in hand-written code, but can also be
generated by templates.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 3eca3cf8dc Modernization of some runtime JS functions
We still can't use arrow functions, since the output must be ES5-compatible.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 525d138d33 Remove function wrapper from generic c:* 2021-06-22 15:00:57 -04:00
Mike Gerwitz 459a25e943 Replace toFixed to truncate rate blocks
toFixed required converting to a string and back, which had miserable
performance.  This avoids that cost.
2021-06-22 15:00:57 -04:00
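
A rough sketch of the kind of replacement this describes (hypothetical
helper, not the actual generated code): truncate arithmetically rather than
round-tripping through a string with toFixed.

  // truncate toward zero to `digits` decimal places without toFixed
  function truncate( value, digits )
  {
      var mult   = Math.pow( 10, digits );
      var scaled = value * mult;

      // subtracting the signed fractional part truncates toward zero
      return ( scaled - ( scaled % 1 ) ) / mult;
  }

  truncate( 1.2399, 2 );   // 1.23
  truncate( -1.2399, 2 );  // -1.23, whereas (-1.2399).toFixed(2) rounds to "-1.24"
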
Mike Gerwitz d27cedc70c Remove lv:rate function wrapper 2021-06-22 15:00:57 -04:00
Mike Gerwitz ef5a7c58d8 Remove function wrapper from class blocks
These are unneeded.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 74f2849e8b Do not execute unnecessary code paths
Benchmarking showed virtually no benefit, surprisingly.  But this can be
used in conjunction with other optimizations in the future.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 9c63337fc6 README.md: Mention Rust in upper paragraph alongside XSLT 2021-06-22 12:17:33 -04:00
Mike Gerwitz 8b5053d475 README.md: Mention TAMER 2021-06-22 12:15:50 -04:00
Mike Gerwitz 93fb1f1bdd tamer: Rust v1.{48=>53}.0 for rustdoc tool lints
A previous commit used a rustdoc tool lint, but that support wasn't added
until 1.52.0 (2021-05-06).

Note that this represents the minimum _required_ version to build TAMER; you
can use a later version.
2021-06-22 09:07:53 -04:00
Mike Gerwitz 716556c39f tamer: Rust 1.{42=>48}.0 for stable intra-doc links without nightly 2021-06-21 13:10:00 -04:00
Mike Gerwitz 96ea0302cc tamer: Cargo.lock: Dependency updates
This project has been on pause for over a year.
2021-06-21 12:46:38 -04:00
Mike Gerwitz 416676f1ab build-aux/progtest-runner: Deterministically concatenate files by name 2021-06-09 16:10:52 -04:00
Mike Gerwitz 645c4da541 RELEASES.md: Add _use-new-classification-system_ mention 2021-06-09 16:09:09 -04:00
Mike Gerwitz cdb2e876ab core/test/class: Begin classification system test cases
These are incomplete, but a start.
2021-06-09 13:33:11 -04:00
Mike Gerwitz 1e620e1e96 core/base (_use-new-classification-system): New template
This template prepares for the introduction of the new classification
system, which is a full rewrite that is both more performant and more
correct in its behavior.  Unfortunately, the corrections will cause problems
with old code that may be relying on certain cases, particularly where
undefined values are implicitly treated as zero.

Consequently, the legacy and new systems will exist side-by-side, able to be
toggled on as desired so people can verify that behavior is correct before
we switch it on by default.  This template allows switching on the system
for an entire package (if it's placed at the toplevel), or portions of a
package, though the latter should only be used in exceptional circumstances.

See the test cases in commits to follow for more information.
2021-06-09 13:32:46 -04:00
Mike Gerwitz d4dc1e651b core/base: Section _yield_ and _rate-each_ 2021-06-08 13:26:49 -04:00
Mike Gerwitz bf399c0370 core/aggregate: Remove package
This package is not used today.  See RELEASES.md for more information; this
is a dangerous package that never should have existed.

This also fixes the test suite.
2021-06-08 12:00:45 -04:00
Mike Gerwitz 66e95fe9c4 src/current/summary: classify breakdown: Show lv:match/@on values
The classification system rewrite removed the debug value collection that
previously existed.  It didn't make a whole lot of sense anyway, given that
the compiler rearranges matches.

This falls back to showing the value of the @on, which should be good
enough, and is honestly better than what we had before.
2021-06-08 11:43:35 -04:00
Mike Gerwitz 702ba3f0c7 RELEASES.md: Sectioning tweak for recent release 2021-06-08 11:43:35 -04:00
Austin Schaffer f637b161b7 RELEASES.md: Update for v17.9.0 2021-05-27 13:22:00 -04:00
Mike Gerwitz e3a583624c design/tpl (Matches): Refine matrix visualization figure
This provides an element-level rather than row-level focus, which I feel is
more appropriate.

One could draw lines to connect each of the elements, but that'd likely be
too noisy and it'd be a lot of work.
2021-05-27 10:59:52 -04:00
Schaffer, Austin 9e9d5fc16b [DEV-9769] Allow feature flag mappings
See merge request floss/tame!52
2021-05-27 14:10:46 +00:00
Austin Schaffer d447e8107f [DEV-9769] Allow feature flag mappings 2021-05-27 14:06:56 +00:00
Mike Gerwitz 3cf859e72a design/tpl (Matches): Clean up matrix visualization
The previous design was a tad bit too noisy and I think undermined the
whole point of the visualization: to help grok the matching logic.
2021-05-26 14:33:31 -04:00
Mike Gerwitz 9c72d397d4 design/tpl: Introduce bibliography
This starts with the Hadamard Product as an example.  It also:

  - Configures BibLaTeX with biber.
  - Renames \undef, since BibLaTeX apparently defines it.
  - Redefines the citation and url colors, since they're bright and ugly.
2021-05-26 13:07:46 -04:00
Mike Gerwitz e07887f8b5 design/tpl: Remove now-unused listing package
All XML appears within the context of equations now.
2021-05-26 12:30:03 -04:00
Mike Gerwitz 9b0e97d0b9 design/tpl (Matches): Add light clarifying text
This is just some plain English to go along with and help rationalize the
text.  Further rationale will be provided in a dedicated section in the
future; such information is vitally important to understand why the system
evolved as it did.
2021-05-26 11:32:31 -04:00
Mike Gerwitz 972ea13623 design/tpl: {\bullet=>\monoidop}
This abstraction was introduced in the previous commit.
2021-05-26 10:38:36 -04:00
Mike Gerwitz ddcdb8d9c6 design/tpl (Classification System): Introduce linear algebra notation
I find this provides a visualization that is likely to be significantly more
intuitive for others.  It even holds when the matrix is not
rectangular (yes, I know, it's not really a matrix then), so long as all
matrices share the same respective K_j.
2021-05-26 10:33:01 -04:00
Mike Gerwitz cc057e8178 design/tpl (tpl.sty)[newtheoremwithlabel]: Use spref for *pref 2021-05-26 10:32:50 -04:00
Mike Gerwitz 72b6f95d4e design/tpl (Classification System): \odot=>\Classify-based monoid notation
This reads better, IMO.
2021-05-25 10:50:55 -04:00
Mike Gerwitz aeb862032d design/tpl (Makefile)[clean]: Delete tpl.pdf 2021-05-24 13:29:34 -04:00
Mike Gerwitz 848a415ab2 design/tpl (\equivish): Symbol change
This uses the same variable subscript on \equiv itself to define the symbol,
rather than the previous symbol which looked like equiv rotated, but also
looked too much like a turnstile used for "infer", a metalanguage construct
that is not appropriate here.  It kept bothering me.
2021-05-24 13:24:53 -04:00
Mike Gerwitz 9611dfc3fc design/tpl (Classification System): Correct vertical spacing on match ex
This is hinting that the xmlnl{,l} abstraction may not be working out.
2021-05-24 13:14:16 -04:00
Mike Gerwitz f9fc33944c design/tpl (Classification System): Improve page breaks (miscellaneous) 2021-05-24 12:57:09 -04:00
Mike Gerwitz 7a2f40e455 design/tpl (Classification System): Adjust always/never figure and use spref 2021-05-24 12:56:24 -04:00
Mike Gerwitz 9e8b4d0cb6 design/tpl (Classification System): Match example with all ranks
This does not yet provide the visualization using linear algebra notation;
that'll be coming soon.
2021-05-24 12:56:24 -04:00
Mike Gerwitz caafecdbda design/tpl: Add \spref
This is a bit opinionated, but I think it reads better, and it's certainly
better than manually having to proofread references frequently.
2021-05-24 12:56:23 -04:00
Mike Gerwitz e70933eef8 design/tpl (Vectors and Index Sets): Refinement and rectangular matrix intro
This refines the section a bit and introduces the familiar notation for
rectangular matrices (which normal people just call "matrices").
2021-05-21 16:10:44 -04:00
Mike Gerwitz 8edef8a8c8 design/tpl (\rank): Add macro 2021-05-20 15:28:10 -04:00
Mike Gerwitz 50a6ccf4ec design/tpl (Classification System): Equation number refinement
Just clean up a little bit using more proper AMS environments.
2021-05-20 15:22:06 -04:00
Mike Gerwitz 98d724a7d7 design/tpl (Vectors and Index Sets): Remove unique value set example
This ended up not being needed for the definition of the classification
system and just adds noise.
2021-05-20 12:42:51 -04:00
Mike Gerwitz 7d0402d350 src/current/doc: Remove
This represents the old cmatch system (which is in use today, but the
classification system has since been rewritten, though it has not yet been
merged).  It was my attempt over a decade ago to reason about how this
system ought to work.

I think it's fair to say that this is absolute insanity and that the new
formulation is significantly better.
2021-05-20 11:25:32 -04:00
Mike Gerwitz 9ad144d3d4 design/tpl (Classification System): Initial match definition
This is a bit raw; it needs some explanation and examples.
2021-05-20 10:45:44 -04:00
Mike Gerwitz 0d59ff607e design/tpl (XML Notation): Add additional example for literal binding 2021-05-19 13:06:40 -04:00
Mike Gerwitz f71c58b1da design/tpl (Vectors and Index Sets) [Rank]: New definition 2021-05-19 12:58:20 -04:00
Mike Gerwitz aafebbc716 design/tpl: Classification index entries 2021-05-19 10:48:35 -04:00
Mike Gerwitz 5fdaecf765 design/tpl: Adjust \footheight
KOMA-Script is complaining, because of the multi-line Copyright notice on
the first page.  This resolves that warning.
2021-05-19 10:05:36 -04:00
Mike Gerwitz 4a6f44ff90 design/tpl: Formatting of source files
This introduces missing license headers in files and better organizes
tpl.sty into sections.
2021-05-19 10:05:29 -04:00
Mike Gerwitz dfdf92848a design/tpl (Classification Introduction): Codomain {Real=>Bool}
Whoops.  The values of the params are reals, but predicates are all
booleans, having been first transformed by the yet-to-be-formally-defined
matches.
2021-05-19 09:02:48 -04:00
Mike Gerwitz f5b2261f0d design/tpl: Euler font
I removed this when I added concmath, thinking that it would include it for
me, and apparently I never re-added it after realizing that it didn't.

I'm a big fan of the typography of Concrete Mathematics.
2021-05-18 16:28:43 -04:00
Mike Gerwitz a488e3549d design/tpl (Classification System): Intro src formatting
Just aligning.  Meant to do this in the previous commit.
2021-05-18 16:15:05 -04:00
Mike Gerwitz bf6cf96169 design/tpl (Classification System): Reduce height for intro vdots
The subscript of the matrix family adds too much vertical space.  This
offsets that to restore it to about what it otherwise would be, since the
second subscript does not get in the way.
2021-05-18 16:14:10 -04:00
Mike Gerwitz 158e045762 design/tpl: geometry package for letterpaper
Was not working properly in vanilla TeX Live image.
2021-05-18 16:03:06 -04:00
Mike Gerwitz d65c061b29 design/tpl: configure script (for appendix)
It's not required; TPL will fall back when conf.tex is missing.
2021-05-18 15:56:25 -04:00
Mike Gerwitz 80fd239d95 design/tpl: Letter paper
This is what we'll usually be printing on, after all.
2021-05-18 15:06:35 -04:00
Mike Gerwitz 83e3ade149 design/tpl (Classification System)[Classification Yield]: Allow break after "then"
This was rendering poorly, breaking instead in the middle of "Axiom".
2021-05-18 14:15:52 -04:00
Mike Gerwitz ef231f89fa design/tpl (Classification System): Remove TODO for always/never
This was done in the previous commit.
2021-05-18 14:11:40 -04:00
Mike Gerwitz 8957e6caf0 design/tpl (Classification System): Add always and never figure
This demonstrates the vacuity lemma.
2021-05-18 14:09:27 -04:00
Mike Gerwitz dfa37f5b77 design/tpl: Use \{emph=>dfn} for term introductions
This uses \textsl rather than \emph.
2021-05-18 12:16:11 -04:00
Mike Gerwitz 8a2407d66f design/tpl: Emphasize commutativity of monoids in classification system 2021-05-18 12:13:58 -04:00
Mike Gerwitz 1ec0fc0c7b design/tpl (Notational Conventions): Clean up unneeded bicompi
This was originally going to be used to define @yields for the classifier,
but I took a very different approach which doesn't require reasoning about
the system in terms of recursion.
2021-05-18 10:16:06 -04:00
Mike Gerwitz 2d268f2a55 design/tpl: Initial definition of classifications
This defines @as and @yields, but does not yet define matches formally.
It's also missing index entries, which I'll take the time to add after I'm
sure things are staying as they are.

This was quite a bit of work, and the approach I took is different than I
originally expected, so Section 0 can use some cleanup.

There is more to come from here.
2021-05-18 10:09:29 -04:00
Mike Gerwitz 4ea2574a8c design/tpl/tpl.sty: Use autoref for theorem macros 2021-05-17 13:21:56 -04:00
Mike Gerwitz fbe76a5616 design/tpl: Beginnings of classifications in terms of first-order logic
This is going to evolve a great deal, and note that the yield definition is
completely absent.

It may be time to switch to natural deduction (Gentzen-style).
2021-05-14 12:14:11 -04:00
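
As a rough illustration of the sort of formulation being developed here (my
sketch, not the document's eventual definitions): a conjunctive
classification asserts all of its predicates, while a disjunctive one
asserts at least one.

  % sketch only; the actual TPL definitions are more refined
  c_\wedge \;\equiv\; \bigwedge_j p_j
  \qquad
  c_\vee   \;\equiv\; \bigvee_j p_j
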
Mike Gerwitz 9bcd7e1d7e design/tpl: Abstract theorem env + label creation 2021-05-14 10:50:49 -04:00
Mike Gerwitz 9fd57872ed design/tpl (Monoids and Sequences): Add missing index entries
Forgot in previous commit.
2021-05-14 10:38:17 -04:00
Mike Gerwitz 8d54420656 design/tpl (Monoids and Sequences): New section 2021-05-14 10:34:13 -04:00
Mike Gerwitz 060a7d0e6c design/tpl: \{log=>l}{and,or}
Don't I feel silly.
2021-05-14 10:34:09 -04:00
Mike Gerwitz 63b502b7df design/tpl: Add ccicons for Copyright line 2021-05-12 10:37:20 -04:00
Mike Gerwitz 1ad6bc93d7 design/tpl (Meta: Typesetting): Correct use of \tameclass
That macro previously expanded into \Classify, but that was undone before
committing to make it clear when one is referring to a variable vs. a
classification as a definition.
2021-05-11 16:52:03 -04:00
Mike Gerwitz 3cb2726737 design/tpl: Appendices begin on a new page 2021-05-11 16:50:23 -04:00
Mike Gerwitz bff2b32a52 design/tpl: More TODOs and margin remarks 2021-05-11 16:50:11 -04:00
Mike Gerwitz 9d9535a1b8 design/tpl: Introduce \Classify mathop
This will be used as an IR of sorts to eliminate the XML, which will be far
too verbose to use in proofs.  It also allows us to attach behavior to the
operator, which will end up defining two values for @as and @yields.
2021-05-11 16:37:08 -04:00
Mike Gerwitz d78186461f design/tpl: Add \todo margin notes 2021-05-11 16:36:43 -04:00
Mike Gerwitz 08109bc35d design/tpl: Vector addition example: \bicomp{=>i}
The example as presented was incorrect, since \bicomp is undefined for
scalar values.
2021-05-11 13:49:18 -04:00
Mike Gerwitz 7735ba1f29 design/tpl: Remove \exists from classification definitions
The previously-existing notation for this has been removed.  These will be
updated soon to account for vectors and matrices, but until then, this is
simply nonsense.
2021-05-11 13:28:02 -04:00
Mike Gerwitz d49133e8e9 design/tpl: (Disjunctive Classification): Footnote formatting correction
Missing `,' in the \forall set.
2021-05-11 13:24:20 -04:00
Mike Gerwitz 51c87f9938 design/tpl: Reposition disjunctive classification footnote
The original position made it look too much like d^2.
2021-05-11 13:21:05 -04:00
Mike Gerwitz 6dc0ca2454 design/tpl: Show listings in draft mode
What an odd default.
2021-05-11 13:08:25 -04:00
Mike Gerwitz 5808afc8a2 design/tpl: Universal=>conjunctive, existential=>disjunctive classification 2021-05-11 13:06:06 -04:00
Mike Gerwitz 2407af56e4 design/tpl: _-notation clarification (wildcard/hole) 2021-05-11 12:47:28 -04:00
Mike Gerwitz 13317aac6c design/tpl: Vectors and Index Sets \goodbreak
This fits nicely on a single page.  At the time of writing, the previous
section is near the end of the page, so this works reasonably well.
2021-05-11 11:39:23 -04:00
Mike Gerwitz 7f4fc8e3b7 design/tpl: Mostly-complex symbol index entries for Chapter 0 2021-05-11 11:33:12 -04:00
Mike Gerwitz 7624bd2958 design/tpl: TAMER case fix
'T' was lowercased.
2021-05-11 11:32:19 -04:00
Mike Gerwitz 02335f9a4a design/tpl: Clear copyright on Index pages
Apparently index page output uses a different even/odd determination than
the normal article page output.
2021-05-11 11:31:27 -04:00
Mike Gerwitz dfb013ca74 design/tpl: Corrected conjunction/disjunction index placement
They were incorrectly placed at the quantifiers.
2021-05-11 09:59:03 -04:00
Mike Gerwitz cb9ccfe5f3 design/tpl: \vdash=>\infer and index entry 2021-05-10 16:54:19 -04:00
Mike Gerwitz 4e7b882aed design/tpl: Begin symbol list at beginning of index 2021-05-10 16:50:30 -04:00
Mike Gerwitz 176c7785e9 design/tpl: Remove stackengine import
This is no longer needed after conversion of \bicomp to superscript.
2021-05-10 14:29:42 -04:00
Mike Gerwitz c371d12a02 design/tpl: Remove glossary
This is an unnecessary feature to maintain right now.  I will include
symbols at the very beginning of the index, which is common in mathematics
texts, and may well add a table of common symbols in the future.
2021-05-10 14:28:37 -04:00
Mike Gerwitz cacb72b2bd RELEASES.md: Entry for TPL 2021-05-10 14:21:24 -04:00
Mike Gerwitz 8d1c29b4cc design/tpl: bicomp: Use superscript instead of stacking
Stacking originally seemed like a good idea, but perhaps this does read a
bit better (and looks more like the composition operation being applied),
and composes a bit better if we needed e.g. \bicomp\bicomp{R}.

It's also less ambiguous when it's over a larger expression.  For example,
\bicomp{[A]} places \circ over top of the A, which looks as if it's
[\bicomp{A}].  It's obvious what the intention is in that context, since
\bicomp{A} makes no sense, but there could be other situations where it
doesn't.  With this change, it results in {[A]}^\circ.
2021-05-10 14:19:49 -04:00
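
A hypothetical macro definition matching the rendering described above (the
actual tpl.sty definition may differ):

  % render the composition mark as a superscript, e.g. {[A]}^\circ
  \newcommand{\bicomp}[1]{{#1}^{\circ}}
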
Mike Gerwitz 0a16808542 design/tpl: Subscript notation for function application
This is convenient and visually appealing in certain circumstances.  That's
highly subjective.
2021-05-10 14:08:11 -04:00
Mike Gerwitz bd454f7a7c design/tpl: The Tame Programming Language initial concept
There's a lot of change that's likely going to take place with this thing,
but it's a start.  The abstract summarizes the purpose of this---to formally
define TAME in terms of algebra, first-order logic, and [ZFC] set theory.

This came about while working on compiler changes and optimizations, since
it's difficult to ensure correctness (and discover further optimizations)
without being able to formally define the language.  The focus at the moment
is the classification system rewrite, which can be expressed in terms of
first order logic and set theory.

This commit contains essentially a POC with some carefully chosen
mathematical foundations (abstractions of which are subject to change) and a
basic representation of a subset of the classification system for scalars.
2021-05-10 13:46:49 -04:00
Mike Gerwitz 685549f06b RELEASES.md: Update for v17.8.1 2021-03-18 09:56:02 -04:00
Mike Gerwitz 56210497ad RELEASES.md: Summary for next release 2021-03-18 09:55:36 -04:00
Mike Gerwitz 43204d1dd5 build-aux/Makefile.am: Lookup table dependency fix
%.xml{=>o}: %csvo rater/core/vector/table.xmlo

That is: we'll only build an object file when we try to build another object
file.  This was causing problems with dependency generation, because it was
triggering compilation early.
2021-03-17 17:02:58 -04:00
Mike Gerwitz 6b35812405 RELEASES.md: Mention recent tame and tamed changes 2021-03-15 09:49:57 -04:00
Mike Gerwitz 6221ce5fee bin/tame (verify-runner): Add missing id param
This was referencing a global $id, which is not the value we are interested
in (and may not exist at all).  Add the missing param.
2021-03-12 14:14:42 -05:00
Mike Gerwitz 7325578624 bin/tame{,d}: Fix assignments that lose exit code
Ensure that we fail if the command in the assignment fails.
2021-03-12 14:14:40 -05:00
Mike Gerwitz e7e4a61cf4 bin/tame: read {=>-r} where missing
While these should not be necessary in practice, there's no reason _not_ to
do this.
2021-03-12 09:47:38 -05:00
Mike Gerwitz 2a56c75345 bin/tame: Remove unnecessary trailing backslashes
This was originally in a Makefile, long ago, where backslashes were
actually needed.
2021-03-12 09:47:34 -05:00
Mike Gerwitz dfce9a89d8 RELEASES.md: Update for v17.8.0 2021-02-23 10:51:59 -05:00
Mike Gerwitz 566d9f6536 build-aux/Makefile.am (suppliers.mk): Regenerate when any sources change
This should have been done many years ago.  This will determine if any of
the dependencies have changed for the included suppliers.mk and regenerate
it as needed, without the developer having to do so manually when imports
change.
2021-02-23 10:48:21 -05:00
Mike Gerwitz cda3e845b8 Remove verbose messages from suppliers.mk generation
* build-aux/Makefile.am (suppliers.mk): Invoke ant with `-q` to eliminate
"processing" messages for each and every file.  This also speeds up
operation slightly.
* build-aux/gen-make: Remove information echos for each file.

These changes will allow for suppliers.mk to be regenerated automatically
without being so invasive.
2021-02-23 10:47:40 -05:00
Mike Gerwitz c319719065 src/current/rater.xsd (yieldsNameType): Remove length checks
The intent originally was to try to keep developers to a reasonable name
length, but generated identifiers can easily exceed this, and we further do
not support namespacing.

This can be handled at a template level instead for enforcing naming
conventions.
2021-02-23 10:46:58 -05:00
Mike Gerwitz 8651f683f6 src/current/rater.xsd: Update
This had gotten quite out of date from the actual rater.xsd, which existed
outside of this repository and is used during our build process.  That was
an unintended artifact from moving files around.

That file has been removed and symlinked to this one.
2021-02-23 10:46:03 -05:00
Mike Gerwitz 6f67a4d6fa build-aux/Makefile.am: Accommodate step-level packages from proguic
Note: this really belongs in liza-proguic, and should be moved in the near
future.

liza-proguic is being modified to generate step-level packages, which are
significantly faster to build than larger ones (XSLT TAME scales
terribly).  These changes handle those new dependencies.

One important thing to note with this change is that suppliers.mk now
requires proguic to have run before generation so that those generated
dependencies can be properly examined.  This is a quick operation, so that
is not problematic.

This also depends on the .version.xml change that was previously made: when
the timestamp changed every time, we got into an infinite build loop.
2021-02-23 10:44:50 -05:00
Mike Gerwitz 9f5517f0d9 src/current/pkg-dep.xsl: Recognize step-level imports
First thing to note: this belongs in liza-proguic, not here.  But it's here
right now, so for now I'm making the change.  The relationship between TAME
and proguic is awkward and will hopefully be improved upon in the near
future.

As for this actual change: step-level fragments will be concatenated such
that the imports will appear at the step level rather than the root.
2021-02-23 10:44:03 -05:00
Mike Gerwitz 80a61986bd build-aux/m4/calcdsl.m4: Do not generate suppliers.mk
This will be generated automatically by the Makefile.  It's not appropriate
to generate in the configure script, and I do not recall why I did
so---possibly to work around the issue of delayed tab completion when it
needs regeneration?

This removes suppmk-gen in favor of more generic Makefile targets---in this
case, having `%.tdat` depend upon `rater/core/tdat.xml`, even though that's
not quite true (the %.xml file generated from it needs it).  But these files
are going away soon; a pending TAME optimization branch removes support for
the underlying pattern primitive entirely; CSVMs should be used instead.
2021-02-23 10:43:09 -05:00
Mike Gerwitz 698ddcdd06 build-aux/Makefile.am (.version.xml): Only change timestamp on hash change
The timestamp of the file will now only be updated if the hash (version)
_actually_ changes.  This allows this to be used as a target dependency
without forcing a rebuild each and every time.
2021-02-23 10:41:20 -05:00
Mike Gerwitz 3b1b894dab RELEASES.md: Update for v17.7.0 2020-12-09 09:59:09 -05:00
Mike Gerwitz e27423f909 Fully tail-recursive mrange
This solves issues of hitting stack limits, particularly in browsers, when
querying matrices that return a large number of rows for one or more
predicates.
2020-12-09 09:57:08 -05:00
Mike Gerwitz 8ce217f779 [DEV-8947] Make mrange fully tail-recursive and enable TCO
We were still having issues with this function when taking the positive
branch, when predicates cause many matches within tables.  This was causing
us to hit stack limits in certain browsers on the Summary Page.

This converts it to an iterator so that all branches are tail-recursive, and
then enables TCO on them.

I was disappointed to find that there's little performance or memory benefit
in running our test suite.
2020-12-09 09:56:43 -05:00
Mike Gerwitz cb93f4c02a [DEV-8947] Guided TCO: Reassign argument values after processing all expressions
I did say it was _experimental_ guided TCO.

This waits to perform the actual argument reassignment until after
processing the expressions associated with the new arguments, since they
will otherwise be replaced when their original values are still needed.
2020-12-09 09:56:40 -05:00
Mike Gerwitz f175042f41 RELEASES.md (v17.6.5): Add missing subheading
I also rephrased it a bit.  The original phrasing was not incorrect.
2020-12-09 09:56:32 -05:00
Corey Vollmer 3913ed9d81 RELEASES.md: Update for v17.6.5 2020-12-03 14:09:37 -05:00
Corey Vollmer eb2951d8ba [DEV-8927] Improve summary page performance with new element queries in TAME 2020-11-30 16:18:26 -05:00
Corey Vollmer 38f4d52e32 [DEV-8927] Improve summary page performance with new element queries 2020-11-30 16:06:36 -05:00
Mike Gerwitz 3df31d0ffc RELEASES.md: Update for v17.6.4 2020-11-23 15:26:54 -05:00
Mike Gerwitz 79e2583ca1 map: Tolerate non-string inputs for `uppercase` and `hash` methods
This change simply prevents failure in such situations (e.g. on invalidated
fields in Liza).  We'll worry about proper errors and correctness, which
ought to be compile-time, in TAMER.
2020-11-23 15:24:08 -05:00
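
A minimal sketch of the kind of guard this implies (hypothetical helper,
not the actual map code): coerce to a string before applying string methods
so non-string inputs no longer throw.

  // tolerate non-string inputs (e.g. undefined from invalidated fields)
  function mapUppercase( value )
  {
      return String( value ).toUpperCase();
  }

  mapUppercase( "abc" );      // "ABC"
  mapUppercase( undefined );  // "UNDEFINED" rather than a TypeError
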
Joseph Frazer 32f0245038 RELEASES.md: Update for v17.6.3 2020-11-03 13:24:36 -05:00
Joseph Frazer c55564e076 [DEV-8571] Update the MathJax CDN
Merge branch 'jira-8571' into master

* jira-8571:
  [DEV-8571] Update the MathJax CDN
2020-11-03 13:16:01 -05:00
Joseph Frazer 18731c9c62 [DEV-8571] Update the MathJax CDN
The MathJax CDN stopped working in April 2017. I updated it to the
recommended CDN with the last version from April 2017 to ensure it works
like it used to work before the CDN stopped.

I added the checksum to ensure the content of the script.
2020-11-03 12:37:38 -05:00
Joseph Frazer a3d47321d8 RELEASES.md: Update for v17.6.2 2020-10-01 10:22:17 -04:00
Joseph Frazer cc19e0065f [DEV-8362] Include program.mk in project root
Merge branch 'jira-8362' into master

* jira-8362:
  [DEV-8362] Include program.mk in project root
2020-10-01 10:12:16 -04:00
Joseph Frazer f7968c0513 [DEV-8362] Include program.mk in project root
If a "program.mk" exists in a project's root, it should be included in
the Makefile.

Co-Authored-By: Anthony Dalfonso <anthony.dalfonso@ryansg.com>
2020-10-01 09:58:45 -04:00
Mike Gerwitz 37a8fb29f0 RELEASES.md: Update for v17.6.1 2020-09-23 16:32:19 -04:00
Mike Gerwitz fb5e7c68df Fail lv:param-class-to-yields without relying on propagation 2020-09-23 16:29:53 -04:00
Mike Gerwitz 89d3494c57 [DEV-8492] Fail lv:param-class-to-yields rather than awaiting propagation
This problem manifested when the name of the attempted classification was the
same as that of another object.  For example, if we have `t:match-class
name="foo"`, and `foo` is a param instead of a class, then `@yields` will
fail, and it'd fall back to matching on the param.

This is absolutely not what we want.

The error message in this context is ugly, but it does work.

Example:

  !!! Unknown match @on (/lv:package/lv:classify/match): `error: unable to
  determine @yields for class `scheduled_ai' (has the class been imported?)'
  is unknown for classification --vis-scheduled-ai-type
2020-09-23 16:29:37 -04:00
Schaffer, Austin c02a32f22e Stop using accumulate in tdat template
See merge request floss/tame!46
2020-08-21 10:06:06 -04:00
Austin Schaffer d8651cfb95 [DEV-8081] Stop using accumulate in tdat template 2020-08-20 18:18:46 -04:00
Mike Gerwitz 6743bfff4a package{,-lock}.json additions
These were being changed every time npm was run.
2020-08-19 15:39:50 -04:00
Mike Gerwitz 9111c3373e RELEASES.md: Update for v17.6.0 2020-08-19 15:30:00 -04:00
Mike Gerwitz da7a2c71c7 tamed: TAMED_JAVA_OPTS: New environment variable
This will be passed to dslc and then to the JVM.  The intent is to permit
fine-grained heap ratio tuning.
2020-08-19 10:19:04 -04:00
Mike Gerwitz 680691c4cf bootstrap: Permit directory for hoxsl
Now I recall the reason I had -e: we clone hoxsl in the pipeline.
2020-07-27 12:51:10 -04:00
Mike Gerwitz 2627b8eef5 bootstrap: Check explicitly for hoxsl symbolic link
Using `-e` resulted in a situation where a broken symbolic link would cause
`ln` to be executed in error.
2020-07-24 15:33:17 -04:00
Mike Gerwitz 13cd372eb7 RELEASES.md: Include mention of lsimports fix 2020-07-23 14:33:06 -04:00
Mike Gerwitz 59b2b32756 build-aux/lsimports: Fix awk gensub warning
Third argument must be numeric indicating which match to replace.
This error did not exist in previous versions.
2020-07-23 14:31:54 -04:00
Mike Gerwitz 4610f5d4a4 RELEASES.md (17.4.3): Fix {product=>produce} typo 2020-07-15 14:38:07 -04:00
Mike Gerwitz d4d412f20a RELEASES.md: Update for v17.5.0 2020-07-15 11:16:14 -04:00
Mike Gerwitz 6baa88136a RELEASES.md: Remove heading underline artifact 2020-07-15 11:15:40 -04:00
Mike Gerwitz 6784090cf0 Use experimental TCO for heavily recursive portion of table lookup
This was urgently needed for a project using TAME.  Somehow, we've gone
all of these years without hitting a table in which the first predicate fails
to filter out enough results to keep us within stack limits.

Each recursive step of mrange before inlining and TCO, at the time of
writing, was adding eight stack frames.  This is because each let (and many
other things) compiles into self-applying functions.  Since mrange is invoked
once for every single row for a given value, we quickly run out of stack
space.

For example, consider this table:

  1, $a, $b
  2, $a, $b
  2, $b, $c
  2, $c, $d
  3, $a, $b

If we were to filter the first column on the value 2, it would first bisect
to find the middle row, backtrack to the first, and then move forward to the
last, producing:

  2, $a, $b
  2, $b, $c
  2, $c, $d

This is at least three mrange calls, for a potential total of 8*3=24 stack
frames, depending on implementation details I don't quite recall at the
moment about how the query system works.

We had over 1000 rows after applying the first predicate; the stack was
exhausted before it could even reach the last row.

Tail call optimization (TCO) is the process of turning recursive calls in
tail position into jumps.  So, rather than the stack growing on a recursive
call, it stays constant.  A common way to accomplish this in stack-based
languages is using a trampoline.

In our case, we enclose the entirety of the function in a `do` loop, and
clear a flag indicating that a tail call took place.  When we reach a
recursive tail call, we set that flag.  Then, instead of invoking the
function again, we _overwrite the original arguments_ with their new
values, and simply return 0.  When the function hits the end of the loop, it
will see that the flag is set, and jump back to the beginning of the
function, starting all over with the new values.
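
A minimal ES5-style sketch of that mechanism (illustrative only; the actual
generated code, with its self-applying functions for lets, is considerably
more involved):

  function sumTo( n, accum )
  {
      var __tco = false;

      do
      {
          __tco = false;

          if ( n === 0 )
          {
              return accum;
          }

          // recursive tail call: overwrite the arguments, set the flag,
          // and fall through so the loop jumps back to the top
          accum = accum + n;
          n     = n - 1;
          __tco = true;
      } while ( __tco );
  }

  sumTo( 100000, 0 );  // no stack growth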

Compiling in this functionality is not difficult.  Tracking whether a given
call is in tail position, however, is a bit of a pain given how the XSLT
code is currently written.  Given that this is all being replaced with
TAMER, it's difficult to stomach making too many changes to the compiler,
when we can do it properly in the future with TAMER.  But we need the
feature now.

As a compromise, I call this implementation "guided" TCO---we rely on a
human to indicate that a call is in tail position by setting an experimental
flag manually.  That frees us from having to have the compiler do it, but
does create some nasty problems if the human is wrong.  Consequently, this
should only be used in core, and people should not use it unless they know
what they're doing.

Using this feature currently outputs a warning---that way, if there are
problems, people have some idea of where they might look.  The warning
will be removed in the future after this has been in production for some
time (granted, our test suite passes).

Once again: TAMER will implement proper tail calls automatically, without
the need for a human to intervene.

For more information on tail calls:

  - https://en.wikipedia.org/wiki/Tail_call
2020-07-15 10:55:38 -04:00
Mike Gerwitz 61aec5f714 [DEV-8130] core: mrange: Use experimental guided TCO
This alleviates stack exhaustion issues with large rate tables where the
predicates fail to reduce the search space to a reasonable size.
2020-07-15 10:33:05 -04:00
Mike Gerwitz 3418023269 core: mrange: Inline _mrange_cmp
This will permit the use of TCO in the following commit.
2020-07-15 10:33:05 -04:00
Mike Gerwitz 26b1bdacec Experimental guided TCO
This implements TCO in the XSLT compiler by requiring a human to manually
indicate when a recursive call is in tail position.  This was somewhat
urgently needed to resolve stack exhaustion on large rate tables.

TAMER will do this properly by determining itself whether a call is in tail
position.  Until then, this will serve as a test for this type of feature.
2020-07-15 10:33:04 -04:00
Mike Gerwitz 79c4116190 src/current/Makefile (all, html): Phony targets
Standardization for recursive make.  This Makefile will
go away at some point anyway.
2020-07-14 10:56:24 -04:00
Mike Gerwitz 1fbcededa8 build-aux/Makefile.am (ui/Program.js): include-path with arbitrary parent
This handles moving to another repository structure (our gigarepo) where
this relative path is no longer true.  The absolute path generated by this
is okay since it's ephemeral and only used for this build invocation.
2020-07-07 16:41:35 -04:00
Mike Gerwitz b5cfc11c34 RELEASES.md: Update for v17.4.3 2020-07-02 12:48:16 -04:00
Mike Gerwitz af222e2777 RELEASES.md: Summary for NEXT release 2020-07-02 12:47:12 -04:00
Mike Gerwitz e0356324bd TAMER: Linker: Human-readable unresolved object errors
Example output:

  error: unresolved extern `taxHomeState` of type `rate[float; 0]`, declared
  in `common/tax/state/dc`
2020-07-02 12:43:17 -04:00
Mike Gerwitz 96ffd5f6e5 [DEV-8000] ir::asg: Error types for unresolved identifiers during sorting
This checks explicitly for unresolved objects while sorting and provides an
explicit error for them.  For example, this will catch externs that have no
concrete resolution.

This previously fell all the way through to the unreachable! block.  The old
POC implementation was catching unresolved objects, albeit with a debug
error.
2020-07-02 01:38:32 -04:00
Mike Gerwitz a2415c8c6f [DEV-8000] ir::asg::base: Replace Symbol::new_dummy
Use symbol_dummy!.
2020-07-01 15:53:56 -04:00
Mike Gerwitz 0d4bbe5e4e [DEV-8000] ir::asg: Introduce SortableAsgError
This will be used for the next commit, but this change has been isolated
both because it distracts from the implementation change in the next commit,
and because it cleans up the code by removing the need for a type parameter
on `AsgError`.

Note that the sort test cases now use `unwrap` instead of having
`{,Sortable}AsgError` support one or the other---this is because that does
not currently happen in practice, and there is not supposed to be a
hierarchy; they are siblings (though perhaps their names may imply otherwise).
2020-07-01 13:42:14 -04:00
Mike Gerwitz f832feb3fa [DEV-8000] ir::asg::base::BaseAsg::check_cycles: Extract into function
The only reason this function was a method of `BaseAsg` was because of
`self.graph`, which is accessible within the scope of this
module.  `check_cycles` is logically associated with `SortableAsg`, and so
should exist alongside it (though it can't exist as an associated function
of that trait).
2020-07-01 11:02:20 -04:00
Joseph Frazer a5b3730410 RELEASES.md: Update for v17.4.2 2020-05-13 08:09:48 -04:00
Joseph Frazer 45b7adbf54 [DEV-7504] Add GraphML generation
Merge branch 'jira-7504'

* jira-7504:
  [DEV-7504] Update RELEASES.md to make it less technical
  [DEV-7504] Add cypher script for post-graph import
  [DEV-7504] Add make target for "graphml"
  [DEV-7504] Add GraphML generation
2020-05-13 08:05:57 -04:00
Joseph Frazer c72409d617 [DEV-7504] Update RELEASES.md to make it less technical 2020-05-13 08:04:48 -04:00
Joseph Frazer 71ba6a95eb [DEV-7504] Add cypher script for post-graph import
After we import the GraphML output into Neo4J, we need to make a few
modifications to make the data easier to work with.
2020-05-13 08:04:48 -04:00
Joseph Frazer 09350d0ada [DEV-7504] Add make target for "graphml" 2020-05-13 08:04:48 -04:00
Joseph Frazer 43d00a8268 [DEV-7504] Add GraphML generation
We want to be able to build a representation of the dependency graph so
we can easily inspect it.

We do not want to make GraphML by default. It is better to use a tool.
We use "petgraph-graphml".
2020-05-13 08:04:48 -04:00
Mike Gerwitz 18d87a6b00 Merge old Neo4j graph generation POC
This was never completed and will be able to be deleted entirely, but I
didn't want to lose this history by having it sit out in a branch.  Joe is
working on something better.
2020-05-13 00:45:01 -04:00
Mike Gerwitz b8d9128f18 POC for full graph 2020-05-13 00:44:13 -04:00
Mike Gerwitz be61d41ca5 RELEASES.md: Clarification and formatting fix of intro 2020-04-30 13:06:43 -04:00
Mike Gerwitz d53ee68e6d RELEASES.md: Fix tame-core heading 2020-04-29 15:49:57 -04:00
Mike Gerwitz 6f61ae7788 RELEASES.md: Add missing semver link 2020-04-29 15:48:21 -04:00
Mike Gerwitz 35010f6882 build-aux/release-check: Fix readonly variable issue 2020-04-29 15:38:22 -04:00
Mike Gerwitz baaecc7181 RELEASES.md: Update for v17.4.1 2020-04-29 15:34:29 -04:00
Mike Gerwitz 99f2d5054e Release notes and associated scripts
This begins providing release notes for changes and provides scripts to
facilitate this:

 - tools/mkrelease will update RELEASES.md and run some checks.
 - build-aux/release-check is intended for use in pipelines (e.g. see
   .gitlab-ci.yml) to verify that releases were done properly.
2020-04-29 15:33:46 -04:00
Mike Gerwitz 0127d4b698 TAMER: sym::Interner::index_lookup
This was originally omitted because there wasn't a use case for it.  Now
that we're adding context to errors, however, an owned value is highly
desirable.

This adds almost no measurable overhead to the internment system in
benchmarks (largely within the margin of error).
2020-04-29 11:33:41 -04:00
Mike Gerwitz 4b643385c8 TAMER: Update Cargo dependencies 2020-04-29 11:33:38 -04:00
Mike Gerwitz 1ddd9cceea TAMER: Finalize dependency graph construction
There is more refactoring that can be done, but this extracts it out of POC.
2020-04-29 11:14:51 -04:00
Mike Gerwitz bcca5f7c49 [DEV-7084] TAMER: AsgBuilder and IR lowering docs 2020-04-28 13:39:55 -04:00
Mike Gerwitz 0f4b2d75f8 [DEV-7084] TAMER: obj::xmlo: Private inner modules 2020-04-28 11:08:05 -04:00
Mike Gerwitz 549e9ca23b [DEV-7084] TAMER: AsgBuilderState:🆕 New constructor 2020-04-28 09:06:25 -04:00
Mike Gerwitz 9893d56775 [DEV-7084] TAMER: Finalize AsgBuilder 2020-04-28 09:06:25 -04:00
Mike Gerwitz 32abc7dce2 [DEV-7084] TAMER: impl PartialEq for XmloError
This cannot be derived because XmlError does not implement PartialEq,
which is quite the annoyance in tests.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 21a0bdcce1 [DEV-7084] TAMER: AsgBuilderError: Introduce proper error variants
This is a union (sum type) of three other errors types, plus errors specific
to this builder.

This commit does a good job demonstrating the boilerplate, as well as a need
for additional context (in the case of `IdentKindError`), that we'll want to
work on abstracting away.
2020-04-28 09:06:25 -04:00
Mike Gerwitz ef79a763ac [DEV-7084] TAMER: Correct Ix trait bound for AsgError
The `Debug` bound is inconvenient and requires propagation to any types that
use it.  Further, it's really awkward having `Display` depend on `Debug`; if
we want to render a useful display here, we can write one.

To be clear: IndexType implements Debug.

For now, this is pretty-printed by another part of the code, which we don't
want to implement in `Display` because it requires looking things up from
the graph.
2020-04-28 09:06:25 -04:00
Mike Gerwitz cfc13f9016 [DEV-7084] TAMER: ir::asg::IdentKindError: Replace string with enum 2020-04-28 09:06:25 -04:00
Mike Gerwitz 0a9a3214b7 [DEV-7084] TAMER: ir::asg::BaseAsg:🆕 New associated function
Profiling showed that creating an initial capacity of 0 did not have a
notable affect on performance.
2020-04-28 09:06:25 -04:00
Mike Gerwitz ecc2e33ba7 [DEV-7084] TAMER: xmlo::AsgBuilder: Accept XmloResult iterator
This flips the API from using XmloWriter as the context to using Asg and
consuming anything that can produce XmloResults.  This not only makes more
sense, but avoids having to create a trait for XmloReader, and simplifies
the trait bounds we have to concern ourselves with.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 323ea79bf8 [DEV-7084] TAMER: Basic AsgBuilder cleanup
This just tidies things up a little bit before I get into some further
refactoring.  I wrote the original code when I was just learning Rust not
too long ago, so it's interesting to see how my understanding has changed
over that relatively short period of time.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 9220de4769 [DEV-7084] TAMER: Finish encapsulating petgraph
This will allow us to migrate away from Petgraph in the future should we
choose to do so.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 0f423f3b24 [DEV-7084] TAMER: Simplify path canonicalization
This abstracts away the canonicalizer and solves the problem whereby
canonicalization was not being performed prior to recording whether a path
has been visited.  This ensures that multiple relative paths to the same
file will be properly recognized as visited.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 4a7e00c404 [DEV-7084] TAMER: ld::poc: Remove unused fragments arg 2020-04-28 09:06:25 -04:00
Mike Gerwitz c94120335f [DEV-7084] TAMER: ld::poc: Remove unnecessary initial path canonicalization
Less to refactor and test.
2020-04-28 09:06:25 -04:00
Mike Gerwitz da69118592 [DEV-7084] TAMER: AsgBuilderState
This completes the POC extraction for AsgBuilder, but is still POC
code.  The commits that follow will clean it up and provide tests.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 3f46917da9 [DEV-7084] TAMER: AsgBuilder extracted from POC
This extracts the changes nearly verbatim before doing refactoring so that
it's easier to observe what changes have been made.
2020-04-28 09:06:25 -04:00
Mike Gerwitz 7ed0691c45 [DEV-7084] TAMER: fs: impl File for BufReader
This further simplifies the POC linker.
2020-04-28 09:06:25 -04:00
Mike Gerwitz fbfb3c4ba2 [DEV-7084] TAMER: CanonicalFile
This will be entirely replaced in an upcoming commit.  See that for
details.  I don't feel like dealing with the conflicts for rearranging and
squashing these commits.
2020-04-28 09:06:25 -04:00
Mike Gerwitz d97e53a835 [DEV-7084] TAMER: fs: Basic filesystem abstraction
This also includes an implementation to visit paths only once.  Note that it
does not yet canonicalize the path before visiting, so relative paths to the
same file can slip through, and relative paths to _different_ files could be
erroneously considered to have been visited.

This will be fixed in an upcoming commit.
2020-04-28 09:06:19 -04:00
Mike Gerwitz 90ed4e9bd6 [DEV-7084] TAMER: From<B, &I> for XmloReader
This serves as a constructor for the time being, decoupling from POC.  We
may do something better once we have a better idea of how the various
abstractions around this will evolve.
2020-04-20 10:53:51 -04:00
Joseph Frazer 3ba587c9f9 [DEV-7198] Replace macros with a templates
Merge branch 'jira-7198'

* jira-7198:
  [DEV-7198] Replace `rate-each` macro with a template
  [DEV-7198] Create a "yield" template
2020-04-17 12:15:43 -04:00
Joseph Frazer 15f5867508 [DEV-7198] Replace `rate-each` macro with a template
Replacing the existing macros with templates will allow us to not have
to deal with macros in the new compiler.

The `indexNameType` pattern needed to change to allow for variables. I
also had to remove the prefix for the `gentle-no` option of `rate`.
2020-04-17 11:35:10 -04:00
Joseph Frazer aa2bc6eedf [DEV-7198] Create a "yield" template
Create a "yield" and add backwards compatibility for the macro of the
same name. This is one of 2 macros that need to be replaced so we do not
have to worry about them with the new compiler.
2020-04-17 07:42:09 -04:00
Joseph Frazer 3dabc126f2 [DEV-7147] Add "tamec" executable
Merge branch 'jira-7147'

* jira-7147:
  [DEV-7147] Build "xmli" files using "tamec"
  [DEV-7147] Add "tamec" executable
2020-04-10 08:47:55 -04:00
Joseph Frazer b52b5825e6 [DEV-7147] Build "xmli" files using "tamec"
Rather than copying the files, we want to start using "tamec" to make
the "xmli" files, even if right now all it does is copy the file.
2020-04-09 09:46:46 -04:00
Joseph Frazer 2c587e2d9d [DEV-7147] Add "tamec" executable
Add a stub executable that will eventually become a full-featured TAME
compiler. The first implementation will only copy the source file to an
intermediary file that will be compiled by the XSLT compiler.
2020-04-09 09:46:46 -04:00
Joseph Frazer bb0c748672 [DEV-7136] Add xmli files
Merge branch 'jira-7136'

* jira-7136:
  [DEV-7136] Add xmli files
2020-04-08 09:10:18 -04:00
Joseph Frazer f6bf042505 [DEV-7136] Add xmli files
Add a new step to the build process that copies the `xml` file to an
`xmli` file. Eventually, the new compiler will create the `xmli` file
and the old compiler will convert it to an `amle` file during the
transition.
2020-04-08 08:27:47 -04:00
Mike Gerwitz 587241bf9b TAMER: Finalize object state transitions
In particular, this finalizes overrides and redeclarations.  The linker
should now be feature-complete.
2020-04-06 10:30:33 -04:00
Mike Gerwitz 8385b64e1d [DEV-7086] TAMER: Remove WIP linker warning
While it is true that this is still being finalized, the warnings originally
existed because tameld was not feature complete.  It is now.
2020-04-06 10:04:19 -04:00
Mike Gerwitz 68c7636be8 [DEV-7086] TAMER: ir::asg::base::test Add missing set_fragment failure test
Resolves the last remaining BaseAsg test TODO.
2020-04-06 09:56:13 -04:00
Mike Gerwitz b870480944 [DEV-7086] TAMER: ir::asg::TransitionError::BadFragmentDest tuple=>struct
Consistency.
2020-04-06 09:56:13 -04:00
Mike Gerwitz da5057058d [DEV-7086] TAMER: Disallow IdentObject::resolve redeclarations
Except under well-defined circumstances.
2020-04-06 09:56:12 -04:00
Mike Gerwitz 0868453dab [DEV-7086] Proper handling of identifier overrides
This is an awkward system that I'd like to remove at some point.  It adds
complexity.  For the meantime, overrides have been arbitrarily restricted to
a single override (no override-override).  But it's needed for the time being, until we
rework maps and can handle the illusion of overrides using the template
system.
2020-04-06 09:55:54 -04:00
Mike Gerwitz a4657580ca [DEV-7086] TAMER: TransitionError::Incompatible: Remove unused 2020-04-01 15:56:33 -04:00
Mike Gerwitz eab47783ab [DEV-7086] .gitignore (a.out, perf.data): Ignore 2020-03-31 15:17:49 -04:00
Mike Gerwitz 0f9acd16cd [DEV-7086] TAMER: BaseAsg::set_fragment: Remove duplicate code
Benchmark performance for this method is still substantially slower.  And
oddly, this nearly doubled the speed of the other two calls (granted, at
that speed, it doesn't matter).
2020-03-31 14:56:34 -04:00
Mike Gerwitz f7ed0dbff3 [DEV-7086] ASG benchmarks 2020-03-31 14:18:26 -04:00
Mike Gerwitz 7c65d729aa TAMER: BaseAsg test: Remove fulfilled stub TODO 2020-03-26 16:16:51 -04:00
Mike Gerwitz d39ec84399 Proper extern resolution
This properly checks identifier types when resolving externs. It also
includes a bit of refactoring. Note that some of that refactoring was
already merged into master.

The old linker was missing some things, so there are template changes in
here as well.

An example of an error currently:

  error: extern `__retry` of type `cgen[boolean; 1]` is incompatible with
  type `cgen[boolean; 0]`
2020-03-26 09:24:02 -04:00
Mike Gerwitz 4051debad2 [DEV-7087] TAMER: Add Source to IdentObject::Extern
All of these refactoring commits to arrive at this one final change: the
ability to store the source location for externs so that we can report on
what package is expecting an identifier to be defined.

Phew.  Goodnight.
2020-03-26 09:22:21 -04:00
Mike Gerwitz f44549d730 [DEV-7087] TAMER: Object{State,Data}: API representative of state transitions
The API now enforces beginning at Missing and transitioning through
states.  Methods have been renamed to reflect this.
2020-03-26 09:22:17 -04:00
Mike Gerwitz d3ecd7b228 [DEV-7087] TAMER: BaseAsg: Refactor duplicate declare{,_extern} code 2020-03-26 09:21:50 -04:00
Mike Gerwitz 40eaeb3dc8 [DEV-7087] TAMER: Remove optional Source from ASG and Object
This undoes work I did earlier today...but now we'll be able to support a
Source on an extern.

There is duplicate code between `BaseAsg::declare{,_extern}` that will be
resolved in an upcoming commit.  Upcoming commits will also simplify
terminology and clean up methods on ObjectState.
2020-03-26 09:18:08 -04:00
Mike Gerwitz 7dd8717f2f [DEV-7087] TAMER: Asg: Reintroduce declare_extern
There is some duplication here with `declare` that will be cleared up in a
following commit.  Reintroducing this method is necessary so that Source can
be used to represent the source location of the extern itself; it's
currently None to indicate an extern in `declare`.
2020-03-26 09:15:59 -04:00
Mike Gerwitz 537d9e64af [DEV-7087] TAMER: ObjectState: Introduce extern transition
This is the first step in a more incremental refactoring than the previous
commits that undid the optional Source in `ObjectState::ident`.  This provides
an explicit transition to an extern, with the intent of requiring an initial
missing state.  This will simplify logic on the ASG.

Note that the Source provided to this new method is not yet used.  That too
will come in a following commit and will represent the source of the defined
extern rather than the concrete identifier.
2020-03-26 09:14:29 -04:00
Mike Gerwitz d6762ab547 [DEV-7087] TAMER: Type compatability check during extern resolution
This properly verifies extern types, and cleans up Asg's API a little so
that externs aren't handled much differently than other declarations.

With that said, after making src optional, I realized that we will indeed
want source information for externs themselves so we can direct the user to
what package is expecting that symbol (as the old linker does).  So this
approach will not work, and I'll have to undo some of those changes.
2020-03-26 09:14:26 -04:00
Mike Gerwitz 8de174d6a2 [DEV-7087] core: Fix extern dim defaults 2020-03-26 09:08:13 -04:00
Mike Gerwitz ee077e8f12 [DEV-7087] core/retry (__retry): dim=0
Now that we will be doing extern type checks, this
must be properly set.
2020-03-26 09:08:13 -04:00
Mike Gerwitz 7a972465ea [DEV-7087] TAMER: tameld: Format error output
We will want an option for verbose debug output in the future.
2020-03-26 09:08:13 -04:00
Mike Gerwitz 05d03dc4bb [DEV-7087] Beginning of extern type verification and reporting
This only verifies when externs are defined _before_ they need to be
resolved.  See a future commit for the rest of this.
2020-03-26 09:08:13 -04:00
Mike Gerwitz b35dd4f4dd [DEV-7087] TAMER: AsgError: Wrap TransitionError
See next commit.
2020-03-26 09:08:10 -04:00
Joseph Frazer 03fa2ffc0b [DEV-7133] Check for cyclic dependencies
Merge branch 'jira-7133'

* jira-7133:
  [DEV-7133] Clearly show the cycles in the output
  [DEV-7133] Check for cyclic dependencies
  [DEV-7133] Remove dependency from "lv:function/lv:param"
  [DEV-7133] Add AsgError::Cycle
2020-03-26 08:48:56 -04:00
Joseph Frazer 6386e096b4 [DEV-7133] Clearly show the cycles in the output 2020-03-26 08:48:43 -04:00
Joseph Frazer 8af93d9339 [DEV-7133] Check for cyclic dependencies
We want the linker to show an error when a cyclic dependency is
encountered.

Co-authored-by: Mike Gerwitz <mike.gerwitz@ryansg.com>
2020-03-26 08:48:43 -04:00
Joseph Frazer add610b7df [DEV-7133] Remove dependency from "lv:function/lv:param"
These dependencies do not matter and can be safely ignored. The linker
will catch these cycles in future versions so we need to remove the deps
now.
2020-03-26 08:48:43 -04:00
Joseph Frazer 59f194a46a [DEV-7133] Add AsgError::Cycle
We want a special error type when we detect cyclic dependencies.
2020-03-26 08:48:43 -04:00
Mike Gerwitz 7a4f6cf9f2 [DEV-7087] TAMER: symbol_dummy! macro 2020-03-24 14:14:05 -04:00
Mike Gerwitz f969877324 [DEV-7087] TAMER: {=>Ident}Object{,State,Data}
This is essential to clarify what exactly the different object types
represent with the new generic abstractions.  For example, we will have
expressions as an object type.
2020-03-24 09:56:25 -04:00
Mike Gerwitz 5fb68f9b67 TAMER: Make Asg generic over object
There's a lot here to make the object stored on the `Asg` generic.  This
introduces `ObjectState` for state transitions and `ObjectData` for pure
data retrieval.  This will allow not only for mocking, but will be useful to
enforce compile-time restrictions on the type of objects expected by the
linker vs. the compiler (e.g. the linker will not have expressions).

This commit intentionally leaves the corresponding tests in their original
location to prove that the functionality has not changed; they'll be moved
in a future commit.

This also leaves the names as "Object" to reduce the cognitive
overhead of this commit.  It will be renamed to something like "IdentObject"
in the near future to clarify the intent of the current object type and to
open the way for expressions and a type that marries both of them in the
future.

Once all of this is done, we'll finally be able to make changes to the
compatibility logic in state transitions to implement extern compatibility
checks during resolution.

DEV-7087
2020-03-24 09:56:20 -04:00
Mike Gerwitz f20120787f TAMER: Extract identifier transitions into Object
The next commit will generalize this further.  This moves logic out of
BaseAsg so that we can implement more sophisticated transitions for
compatability checks.

The logic is still tested as part of BaseAsg; the next commit will change
that as it's generalized further.

* tamer/src/ir/asg/base.rs: Extract object transitions.
* tamer/src/ir/asg/graph.rs (AsgError)[IncompatibleIdent]: New variant.
  (From<TransitionError> for AsgError): Basic type translation.
* tamer/src/ir/asg/object.rs (TransitionResult): New type.
  (impl Object): Transition methods.
  (TransitionError): New enum.
2020-03-19 15:42:06 -04:00
Mike Gerwitz 3fe3fc4b84 TAMER: ld/poc: Simplify {get_interner_value=>get_ident} 2020-03-19 15:42:06 -04:00
Mike Gerwitz 400d5b25a1 ir::asg::Object::Empty: Remove variant
This variant is unnecessary, as it was used only by the indexer to represent
the absence of a node, for which we can simply use `None` in the containing
`Option`.

* tamer/Cargo.toml: Add `lazy_static`.
* tamer/Cargo.lock: Update.
* tamer/src/ir/asg/base.rs (with_capacity): Use `None` in place of
    `Some(Object::Empty)`.
* tamer/src/ir/asg/object.rs: Adjust state machine graphic.
  (Empty): Remove variant.
  (Missing): Remove reference to variance.
* tamer/src/lib.rs: Import `lazy_static` for test builds.
* tamer/obj/xmle/writer/writer.rs (Section::iter): Remove `Object::Empty`
    from documentation.
  (test::): Remove references to `Object::Missing`.  `lazy_static!` used
    here.
* tamer/obj/xmle/writer/xmle.rs (test::write_section_catch_missing): Replace
    reference to `Object::Missing`.
2020-03-19 15:42:06 -04:00
Joseph Frazer bc976b43cd [DEV-7085] Create `SortableAsg` trait
Merge branch 'jira-7085'

* jira-7085:
  TAMER: Tidy up graph_sort test
  [DEV-7085] Create `SortableAsg` trait
  [DEV-7085] Implement `PartialEq` for `Sections`
  [DEV-7085] Move sections to IR module
2020-03-16 11:12:57 -04:00
Mike Gerwitz 0a135ad707 TAMER: Tidy up graph_sort test
This still isn't comprehensive.  Further, it won't be able to be, because
we'd have to rely on Petgraph implementation details: there are potentially
many acceptable orderings for a given graph.
2020-03-13 11:51:59 -04:00
Joseph Frazer 7e95394076 [DEV-7085] Create `SortableAsg` trait
Create a trait that sorts a graph into `Sections` that can then be used
as an IR. The `BaseAsg` should implement the trait using what was
originally in the POC.
2020-03-13 11:51:59 -04:00
Joseph Frazer bc760387f6 [DEV-7085] Implement `PartialEq` for `Sections`
We want to be able to easily compare `Sections` in tests, so
implementing `PartialEq` (and `Debug`) for both `Sections` and `Section`
is required.
2020-03-13 11:51:59 -04:00
Joseph Frazer 59a0c382af [DEV-7085] Move sections to IR module
We need to use `Sections` in both the writer and the ASG so it needs to
be in a place that makes sense.
2020-03-13 11:51:59 -04:00
Austin Schaffer 5f3ccc6894 Allow yaml tests to evaluate assertions and prohibits 2020-03-12 13:04:43 -04:00
Austin Schaffer 433fc01e77 [DEV-7160] Do not allow terminating classifications for test runner 2020-03-12 13:00:33 -04:00
Austin Schaffer 940d41817f [DEV-7160] Set neg-yields param in retry template 2020-03-12 13:00:33 -04:00
Joseph Frazer 2434e138b8 [DEV-7134] Propagate errors
Merge branch 'jira-7134'

* jira-7134:
  [DEV-7134] Remove unnecessary node replacement
  [DEV-7134] Propagate errors from the writer
  [DEV-7134] Propagate sorting errors
  [DEV-7134] Propagate errors setting fragments
  [DEV-7134] Pass read event errors up the stack
  [DEV-7134] Return error for XmloEvent::SymDecl
  [DEV-7134] Add alias for LoadResult
  [DEV-7134] Remove unwrap so we can bubble up error messages
  [DEV-7134] Escalate the error from finding the absolute path
2020-03-09 13:41:26 -04:00
Joseph Frazer b5f6a082dd [DEV-7134] Remove unnecessary node replacement
The node was being replaced before we were catching errors properly. Now
that they are propagated, we should not need the replacement.
2020-03-09 11:41:11 -04:00
Joseph Frazer 01e7d3e560 [DEV-7134] Propagate errors from the writer
When an error occurs during the XML writing, they should be shown to the
user.
2020-03-09 08:23:13 -04:00
Joseph Frazer f373a00a80 [DEV-7134] Propagate sorting errors
If a node is found while sorting that is not expected, we should show
the error to the user.
2020-03-09 08:23:13 -04:00
Joseph Frazer 2a5551a04a [DEV-7134] Propagate errors setting fragments
If we cannot set a fragment, we need to display the error to the user.

We are currently ignoring "___head", "___tail", and objects that are
both virtual and overridden.  Those will be corrected with future
changes.
2020-03-09 08:23:13 -04:00
Joseph Frazer 06bc89a9ce [DEV-7134] Pass read event errors up the stack 2020-03-06 14:08:55 -05:00
Joseph Frazer 246a40a047 [DEV-7134] Return error for XmloEvent::SymDecl
We want more than warnings when an XmloEvent::SymDecl symbol has an
unknown "kind".
2020-03-06 13:41:32 -05:00
Joseph Frazer 2228a6158a [DEV-7134] Add alias for LoadResult
It looks better and was recommended by Rust's linter.
2020-03-06 12:44:22 -05:00
Joseph Frazer 4810e7a099 [DEV-7134] Remove unwrap so we can bubble up error messages 2020-03-06 12:32:42 -05:00
Joseph Frazer 590245e191 [DEV-7134] Escalate the error from finding the absolute path
We do not want to have a panic here. The error should be displayed
properly.
2020-03-06 12:24:45 -05:00
Mike Gerwitz bfea768f89 Copyright year 2020 update 2020-03-06 11:05:18 -05:00
Joseph Frazer 4941a7602f [DEV-7081] Add options to tameld
Merge branch 'jira-7081'

* jira-7081:
  [DEV-7081] Add options to tameld
2020-03-06 10:04:48 -05:00
Joseph Frazer e613bd8a8c [DEV-7081] Add options to tameld
We want to add an option to set the output file to the linker so we do
not need to redirect output to awk any longer.

This also adds integration tests for tameld.
2020-03-06 09:41:55 -05:00
Mike Gerwitz 8555cf1e4a configure.ac: Missing cargo-doc error=>warning
Documentation does not need to be built by most users,
who are simply trying to bootstrap the system.
2020-03-05 11:16:15 -05:00
Mike Gerwitz 777494a602 TAMER linker (still partly proof-of-concept)
We will continue to finalize this as we go.  It is currently used in
production, both for performance and because it fixes a bug in the
XSLT-based linker.
2020-03-03 11:32:49 -05:00
Joseph Frazer 6ac7641087 [DEV-7083] TAMER: xmle writer
This introduces the writer for xmle files.
2020-03-03 11:21:18 -05:00
Mike Gerwitz c2e6efc0b5 TAMER: Additional crate::ld documentation 2020-03-02 15:54:36 -05:00
Mike Gerwitz 310ddb7ea8 Replace XSLT-based linker with error
All systems should be using the provided Makefile, so this shouldn't be
invoked anymore.  The new linker is still considered a proof-of-concept, but
bugs have been encountered in the old one that are not worth investing the
time into fixing.

The new linker has been used in production for nearly a couple months and is
functioning properly.
2020-03-02 15:54:32 -05:00
Mike Gerwitz b89408e5bb TAMER: Extract quick_xml event-related mocks 2020-02-26 10:49:01 -05:00
Mike Gerwitz 19a6d67dc4 TAMER: Separate static xmle section 2020-02-26 10:49:01 -05:00
Mike Gerwitz 7c60b53de8 TAMER: Virtual symbol override 2020-02-26 10:49:01 -05:00
Mike Gerwitz ab3aec980d TAMER: POC: Use FxHash to remove nondeterminism
The default SipHash is a cryptographic hash and causes ordering to change
between runs.
2020-02-26 10:49:00 -05:00
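
To illustrate the nondeterminism being removed (a sketch assuming the
fxhash crate; the POC may use a different crate or a wrapper): std's
HashMap seeds SipHash randomly per process, so iteration order changes
from run to run, while Fx hashing has no random seed and therefore yields
a stable, if arbitrary, order.

    // Cargo.toml (assumed): fxhash = "0.2"
    use fxhash::FxHashMap;
    use std::collections::HashMap;

    fn main() {
        let syms = ["___head", "rate_foo", "class_bar", "___tail"];

        // SipHash with random keys: key order may differ on every run,
        // making derived output (like xmle symbol order) nondeterministic.
        let sip: HashMap<&str, usize> =
            syms.iter().copied().zip(0..).collect();

        // Fx hash: no random seed, so the order is stable across runs
        // (though still arbitrary with respect to insertion order).
        let fx: FxHashMap<&str, usize> =
            syms.iter().copied().zip(0..).collect();

        println!("sip: {:?}", sip.keys().collect::<Vec<_>>());
        println!("fx:  {:?}", fx.keys().collect::<Vec<_>>());
    }
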
Mike Gerwitz 645908e258 TAMER: xmle output changes to support Summary Page
Co-Authored-By: Joseph Frazer <joseph.frazer@ryansg.com>
2020-02-26 10:49:00 -05:00
Mike Gerwitz 6939753ca0 TAMER: POC: Output xmle
This is a working proof-of-concept that will be finalized in future commits.
2020-02-26 10:49:00 -05:00
Mike Gerwitz 85a4934db5 TAMER: Symbol source data and metadata 2020-02-26 10:49:00 -05:00
Mike Gerwitz bcc2ab1221 TAMER: Initial abstract semantic graph (ASG)
This begins to introduce the ASG, backed by Petgraph.  The API will continue
to evolve, and Petgraph will likely be encapsulated so that our
implementation can vary independently from it (or even remove it in the
future).
2020-02-26 10:48:59 -05:00
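
A hedged sketch of that encapsulation, assuming petgraph (names are
illustrative): the DiGraph stays a private field and callers only see the
ASG's own methods and an opaque index type, so the backing graph library
can vary, or be removed, without touching callers.

    // Cargo.toml (assumed): petgraph = "0.6"
    use petgraph::graph::{DiGraph, NodeIndex};

    /// Opaque handle handed back to callers; wraps Petgraph's index so
    /// the backing graph implementation never leaks into the public API.
    #[derive(Clone, Copy, Debug)]
    pub struct ObjectRef(NodeIndex);

    /// The ASG owns the graph; Petgraph is an implementation detail.
    pub struct Asg {
        graph: DiGraph<String, ()>,
    }

    impl Asg {
        pub fn new() -> Self {
            Self { graph: DiGraph::new() }
        }

        /// Declare an identifier, returning an opaque reference to it.
        pub fn declare(&mut self, name: &str) -> ObjectRef {
            ObjectRef(self.graph.add_node(name.to_owned()))
        }

        /// Record that `from` depends on `to`.
        pub fn add_dep(&mut self, from: ObjectRef, to: ObjectRef) {
            self.graph.add_edge(from.0, to.0, ());
        }
    }

    fn main() {
        let mut asg = Asg::new();
        let a = asg.declare("rate_total");
        let b = asg.declare("premium");
        asg.add_dep(a, b);
    }
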
Mike Gerwitz f177b6ae5d configure.ac: Rust 1.{39>41}.0 version bump
Relaxes orphan rules for foreign traits.

This also modifies the error to suggest how to update using rustup.
2020-02-25 16:46:28 -05:00
Mike Gerwitz 10b9caa7ad TAMER: Fail on empty fragment ids (and fix underlying problem) 2020-02-25 16:46:28 -05:00
Mike Gerwitz a0893da577 TAMER: xmlo: Add Package event 2020-02-25 16:46:27 -05:00
Mike Gerwitz a8726918f7 TAMER: poc: Use xmlo reader
TODO: More information
2020-02-25 16:46:27 -05:00
Mike Gerwitz a929c8cae4 TAMER: xmlo reader
This introduces the reader for xmlo files produced by the XSLT-based
compiler.  It is an initial implementation but is not complete; see future
commits.
2020-02-25 16:46:25 -05:00
Mike Gerwitz db52fcdb30 Makefile.am (html-am): Add --document-private-items
This generated documentation is only going to be read by developers,
and the private information is very useful to them.
2020-02-25 16:10:57 -05:00
Mike Gerwitz 6aae741162 TAMER (sym::Interner::intern_utf8_unchecked): New function
This removes boilerplate for reading xmlo files.  See next commit.
2020-02-25 16:10:55 -05:00
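
The idea in sketch form (a hypothetical interner and signature, not the
real one): the xmlo reader hands back raw bytes that are already known to
be valid UTF-8, so an unsafe unchecked variant can skip per-call
validation, with the safety obligation pushed onto the caller.

    use std::collections::HashMap;

    /// Toy interner mapping strings to integer symbol ids.
    #[derive(Default)]
    struct Interner {
        map: HashMap<String, u32>,
    }

    impl Interner {
        /// Checked interface: the caller hands us `&str`, so UTF-8
        /// validity has already been proven (or checked) upstream.
        fn intern(&mut self, s: &str) -> u32 {
            let next = self.map.len() as u32;
            *self.map.entry(s.to_owned()).or_insert(next)
        }

        /// Unchecked variant for byte-oriented readers: skips UTF-8
        /// validation entirely.
        ///
        /// # Safety
        /// `bytes` must be valid UTF-8; that is the caller's responsibility.
        unsafe fn intern_utf8_unchecked(&mut self, bytes: &[u8]) -> u32 {
            self.intern(std::str::from_utf8_unchecked(bytes))
        }
    }

    fn main() {
        let mut interner = Interner::default();
        // The reader hands back &[u8]; if we trust the xmlo file to be
        // UTF-8, we avoid re-validating every name and attribute.
        let a = unsafe { interner.intern_utf8_unchecked(b"premium") };
        let b = unsafe { interner.intern_utf8_unchecked(b"premium") };
        assert_eq!(a, b);
    }
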
Mike Gerwitz e8cd378d59 TAMER: Display for Symbol
One of the benefits of storing a reference to the interned string on the
symbol itself is that we get its underlying value essentially for
free.
2020-02-24 14:56:28 -05:00
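
In sketch form (illustrative types): because the interned &str rides along
on the Symbol, Display is just a borrow of that reference, with no
interner lookup required.

    use std::fmt::{self, Display};

    /// Interned symbol: an integer id plus a reference to the interned
    /// string it identifies (the arena outlives every Symbol).
    #[derive(Clone, Copy, Debug)]
    struct Symbol<'i> {
        index: usize,
        name: &'i str,
    }

    impl<'i> Display for Symbol<'i> {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // No interner lookup: the string reference is already here.
            self.name.fmt(f)
        }
    }

    fn main() {
        let sym = Symbol { index: 0, name: "premium" };
        assert_eq!(sym.index, 0);
        assert_eq!(sym.to_string(), "premium");
    }
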
Mike Gerwitz ff0c8bb34f Order symtable, sym-dep, fragments
This ordering will simplify streaming processing of xmlo files in
TAMER.  Specifically, we know that symbols will have been declared by the
time dependencies are added to the graph (and so we should only be creating
edges to existing nodes); and we can halt reading as soon as the closing
fragments tag is encountered, avoiding parsing the entirety of these massive
XML files.

On one particularly large program, this cuts time down from ~0.333s to
~0.300s in the POC linker.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 1f4db84f24 TAMER: Arena-based string interner
Contrary to what I said previously, this replaces the previous
implementation with an arena-backed internment system.  The motivation for
this change was investigating how Rustc performed its string interning, and
why they chose to associate integer identifiers with symbols.

The intent was originally to use Rustc's arena allocator directly, but that
crate pulled in far too many dependencies and depended on nightly
Rust.  Bumpalo provides a very similar implementation to Rustc's
DroplessArena, so I went with that instead.

Rustc also relies on a global, singleton interner.  I do not do that
here.  Instead, the returned Symbol carries a lifetime of the underlying
arena, as well as a pointer to the interned string.

Now that this is put to rest, it's time to move on.
2020-02-24 14:56:28 -05:00
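
A minimal sketch of such an arena-backed interner, assuming the bumpalo
crate: each distinct string is copied into the arena once, a set of &str
borrowed from the arena deduplicates lookups, and the returned reference
carries the arena's lifetime, as described above.

    // Cargo.toml (assumed): bumpalo = "3"
    use bumpalo::Bump;
    use std::collections::HashSet;

    /// Arena-backed interner: each distinct string is copied into the
    /// arena exactly once; interned references live as long as the arena.
    struct Interner<'a> {
        arena: &'a Bump,
        set: HashSet<&'a str>,
    }

    impl<'a> Interner<'a> {
        fn new(arena: &'a Bump) -> Self {
            Self { arena, set: HashSet::new() }
        }

        fn intern(&mut self, s: &str) -> &'a str {
            if let Some(&interned) = self.set.get(s) {
                return interned;
            }
            // Copy into the arena; no per-string bookkeeping, since
            // everything is freed at once when the arena is dropped.
            let interned: &'a str = self.arena.alloc_str(s);
            self.set.insert(interned);
            interned
        }
    }

    fn main() {
        let arena = Bump::new();
        let mut interner = Interner::new(&arena);
        let a = interner.intern("premium");
        let b = interner.intern("premium");
        // Same arena address: the string was only stored once.
        assert!(std::ptr::eq(a, b));
    }
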
Mike Gerwitz 176d099fb6 tamer::sym: FNV => Fx Hash
For strings of any notable length, Fx Hash outperforms FNV.  Rustc also
moved to this hash function and noticed performance
improvements.  Fortunately, as was accounted for in the design, this was a
trivial switch.

Here are some benchmarks to back up that claim:

test hash_set::fnv::with_all_new_1000                 ... bench:     133,096 ns/iter (+/- 1,430)
test hash_set::fnv::with_all_new_1000_with_capacity   ... bench:      82,591 ns/iter (+/- 592)
test hash_set::fnv::with_all_new_rc_str_1000_baseline ... bench:     162,073 ns/iter (+/- 1,277)
test hash_set::fnv::with_one_new_1000                 ... bench:      37,334 ns/iter (+/- 256)
test hash_set::fnv::with_one_new_rc_str_1000_baseline ... bench:      18,263 ns/iter (+/- 261)
test hash_set::fx::with_all_new_1000                  ... bench:      85,217 ns/iter (+/- 1,111)
test hash_set::fx::with_all_new_1000_with_capacity    ... bench:      59,383 ns/iter (+/- 752)
test hash_set::fx::with_all_new_rc_str_1000_baseline  ... bench:      98,802 ns/iter (+/- 1,117)
test hash_set::fx::with_one_new_1000                  ... bench:      42,484 ns/iter (+/- 1,239)
test hash_set::fx::with_one_new_rc_str_1000_baseline  ... bench:      15,000 ns/iter (+/- 233)
test hash_set::with_all_new_1000                      ... bench:     137,645 ns/iter (+/- 1,186)
test hash_set::with_all_new_rc_str_1000_baseline      ... bench:     163,129 ns/iter (+/- 1,725)
test hash_set::with_one_new_1000                      ... bench:      59,051 ns/iter (+/- 1,202)
test hash_set::with_one_new_rc_str_1000_baseline      ... bench:      37,986 ns/iter (+/- 771)
2020-02-24 14:56:28 -05:00
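
Why the switch was trivial, sketched under the assumption that the
interner is generic over its hasher (the crate names are illustrative):
moving from FNV to Fx is a one-line change to a type alias rather than a
change to the implementation.

    // Cargo.toml (assumed): fnv = "1", fxhash = "0.2"
    use std::collections::HashSet;
    use std::hash::BuildHasher;

    /// Interner generic over its hasher: the hashing strategy is a type
    /// parameter, so benchmarking alternatives means changing an alias.
    struct Interner<S: BuildHasher> {
        set: HashSet<String, S>,
    }

    impl<S: BuildHasher + Default> Interner<S> {
        fn new() -> Self {
            Self { set: HashSet::default() }
        }

        /// Returns true if the string was newly interned.
        fn intern(&mut self, s: &str) -> bool {
            self.set.insert(s.to_owned())
        }
    }

    /// The only line that changed when moving from FNV to Fx:
    type DefaultInterner = Interner<fxhash::FxBuildHasher>;
    // Previously: type DefaultInterner = Interner<fnv::FnvBuildHasher>;

    fn main() {
        let mut interner = DefaultInterner::new();
        assert!(interner.intern("premium"));
        assert!(!interner.intern("premium"));
    }
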
Mike Gerwitz 0d2bb5de59 Makefile.am (clean): New target
Not sure how I missed this one.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 541fbffc2e tameld: Move documentation to tamer::ld 2020-02-24 14:56:28 -05:00
Mike Gerwitz f2b24e6505 HashMapInterner: New interner, docs, and benchmarks
This interner will be suitable for providing an index to look up nodes in
the ASG.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 9a98644213 TAMER: sym::tests: Generate with macro
This will be used for generating the common tests between HashSet and
HashMap implementations.

This is my first macro in Rust.  There does not seem to be a way to
concatenate identifiers (!), so I'm placing them within modules
instead.  That ended up working out just fine, since then I can use a type
to provide the SUT.
2020-02-24 14:56:28 -05:00
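
A rough sketch of the module-per-expansion pattern described above (the
macro and trait names are hypothetical): since declarative macros cannot
concatenate identifiers, each expansion declares its own module, and a
type alias inside that module supplies the SUT to the shared tests.

    use std::collections::{HashMap, HashSet};

    /// Minimal trait both containers can satisfy for the shared tests.
    trait SymSet: Default {
        fn add(&mut self, s: &str) -> bool;
    }

    impl SymSet for HashSet<String> {
        fn add(&mut self, s: &str) -> bool {
            self.insert(s.to_owned())
        }
    }

    impl SymSet for HashMap<String, usize> {
        fn add(&mut self, s: &str) -> bool {
            let next = self.len();
            self.insert(s.to_owned(), next).is_none()
        }
    }

    /// One expansion per module: no identifier concatenation needed.
    macro_rules! common_tests {
        ($mod:ident, $sut:ty) => {
            mod $mod {
                use super::*;
                type Sut = $sut;

                #[test]
                fn add_is_idempotent() {
                    let mut sut = Sut::default();
                    assert!(sut.add("premium"));
                    assert!(!sut.add("premium"));
                }
            }
        };
    }

    common_tests!(hash_set, HashSet<String>);
    common_tests!(hash_map, HashMap<String, usize>);
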
Mike Gerwitz e4e0089815 TAMER: Initial string interning abstraction
This is missing two key things that I'll add shortly: a HashMap-based one
for use in the ASG for node mapping, and an entry-based system for
manipulations.

This has been a nice start for exploring various aspects of Rust
development, as well as conventions that I'd like to implement.  In
particular:

  - Robust documentation intended to guide people through learning the
    necessary material about the compiler, as well as related work to
    rationalize design decisions;
  - Benchmarks;
  - TDD;
  - And just getting used to Rust in general.

I've beat this one to death, so I'll commit this and make smaller changes
going forward to show how easily it can evolve.

(This module was originally named `intern` but this commit and those that
follow rewrote it to `sym`.)
2020-02-24 14:56:28 -05:00
Mike Gerwitz 593faa3491 Makefile.am (html-am): Run doc tests
Ensure that we have good examples before generating docs.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 3248c429fe Makefile.am (doc, html): Use intra_rustdoc_links
This is enabled by default in nightly, and is not available at all in
stable.  Considering the PITA that it will be to go back and rewrite docs to
use the new format, and how important a feature this is, we will just
make use of it now.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 0147cb7cb4 Makefile.am (bench): New target
The configure script will determine if nightly is required for running
benchmarks, because `test` is currently an unstable feature.
2020-02-24 14:56:28 -05:00
Mike Gerwitz 0acc21f16f Makefile.am (check): Check whether formatting is required
Given that developers should be doing TDD and therefore running this target
frequently, this has the effect of providing immediate feedback when
formatting is needed and outputting a diff.  Developers will then quickly
understand what changes need to be made to avoid future issues (and can run
`cargo fmt` to fix it), at which point they'll rarely ever encounter
formatting errors.

The original purpose was to ensure pipelines fail when the formatter has not
been run.
2020-02-24 14:56:28 -05:00
Austin Schaffer e1076ce388 [DEV-6306] Add retry template
[DEV-6306] Add testing instructions to README.md

[DEV-6306] Add assertion to retry template
2020-02-20 08:36:39 -05:00
Joseph Frazer 9f02781a5b [DEV-6947] Add template to match UI values
The UI values need to match AND the question needs to be
visible. We do not have the visibility classifications yet, so we need to
define externs to allow this to build.
2020-01-31 16:27:04 -05:00
Joseph Frazer f2cbc5f8ad [DEV-6947] Allow param values to remove underscores 2020-01-31 16:27:04 -05:00
Austin Schaffer 0db05c442c Pass canterm flag to raters 2020-01-29 11:14:13 -05:00
Mike Gerwitz bc28af0145 Reduce size of compiler output
This both reduces some of the output and permits it to be run through
Google's Closure compiler.  Combined, this has the potential to halve the
size of classification-heavy executables, like the UI's classifier.
2020-01-23 15:01:19 -05:00
Mike Gerwitz 696a5ab371 build-aux/closure-externs.js: New file 2020-01-22 16:31:57 -05:00
Mike Gerwitz 7a2ce00ed5 src/current/compiler/js.xsl: Remove inline defaults for anyValue
This not only reduces file size, but also has a significant performance
benefit for the UI, which is almost entirely classifications.  A run for one
of our systems was reduced from 1m30s to 11s from this change.
2020-01-22 16:31:16 -05:00
Mike Gerwitz 46d5ed286c src/current/compiler/js.xsl: Strip unused result-set (@yields alt) 2020-01-22 16:31:16 -05:00
Mike Gerwitz 661684f1e4 src/current/compiler/js.xsl: Remove last anyValue arg by default
This was used to provide additional information on the stack for debugging
the compiled code.  Since this is very rarely needed, and is only needed by
someone debugging the compiler, it can be manually enabled if desired.

This also wraps it so that it'll be stripped if it is included.
2020-01-22 16:31:16 -05:00
Mike Gerwitz b51f7fa042 src/current/compiler/js.xsl: {._CMATCH_=>[_CMATCH_]}
This was confusing Closure Compiler.
2020-01-22 16:31:16 -05:00
Mike Gerwitz 47d5dc238c src/current/compiler/js.xsl: @expose Closure Compiler annotations
This is deprecated, but neither of the recommended @export or @nocollapse
work the same way.
2020-01-22 16:31:09 -05:00
Mike Gerwitz 97806d5602 src/current/compiler/js.xsl: Remove dead arg check code
This was removed during The Great Refactoring.
It will be replaced with a better system in TAMER.
2020-01-22 16:30:53 -05:00
Mike Gerwitz e0a78c2ed6 src/current/compiler/js-calc.xsl (compile-calc)[c:let]: Remove global assignment
The previous code was unintentionally assigning to an undefined global
variable.
2020-01-22 16:30:53 -05:00
Mike Gerwitz 0718d80257 bin/tamed: Fail without explicit DONE
We want to fail e.g. on a JVM crash.
2020-01-21 11:40:14 -05:00
Mike Gerwitz 61fe1af1cb build: Add revision files for xml{o,e}
This will force a rebuild and will be useful for upcoming changes.
2020-01-14 01:13:51 -05:00
Mike Gerwitz be296a241a build-aux/Makefile.am: Optional timestamping
Note that, because of the way this is implemented, the timestamps may become
mangled (multiple per line) for parallel builds.

Output can be prettied up in the future.
2020-01-02 10:42:08 -05:00
Mike Gerwitz 3cb67109ec Cargo.toml (profile.release)[lto]: Enable 2020-01-02 10:40:52 -05:00
Joseph Frazer 2a91d7680d [DEV-6595] Loosen the way we find classification matches
Merge branch 'jira-6595'

* jira-6595:
  [DEV-6595] Loosen the way we find classification matches
2019-12-10 09:42:11 -05:00
Joseph Frazer cdacd1d93d [DEV-6595] Loosen the way we find classification matches
The `<t:match-class-code-lookup />` matches were not showing in the
summary pages. I loosened the selector so it is able to find the matches
when it generates the summary pages.
2019-12-10 08:08:52 -05:00
Mike Gerwitz 2c1ff90d0a TAMER: tameld: Proof-of-concept
This is a POC playing around with Rust to demonstrate how the linker could
be approached and to gather benchmarks.
2019-12-02 15:21:46 -05:00
Mike Gerwitz 8455a38a1d Graph-based POC
This makes use of Petgraph for representing the dependency graph and uses a
separate data structure for both string interning and indexing by symbol
name.
2019-12-02 10:05:48 -05:00
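
A rough sketch of that arrangement, assuming petgraph (the symbol names
are made up): the dependency graph lives in a DiGraph, a separate map
indexes nodes by symbol name, and a topological sort yields an order in
which dependencies precede their dependents.

    // Cargo.toml (assumed): petgraph = "0.6"
    use petgraph::algo::toposort;
    use petgraph::graph::{DiGraph, NodeIndex};
    use std::collections::HashMap;

    fn main() {
        let mut graph: DiGraph<&str, ()> = DiGraph::new();
        // Separate index by symbol name, as described above.
        let mut by_name: HashMap<&str, NodeIndex> = HashMap::new();

        for name in ["premium", "rate_total", "class_eligible"] {
            by_name.insert(name, graph.add_node(name));
        }

        // rate_total depends on premium; premium on class_eligible.
        graph.add_edge(by_name["rate_total"], by_name["premium"], ());
        graph.add_edge(by_name["premium"], by_name["class_eligible"], ());

        // Topological order; reversed so dependencies are emitted first.
        let mut order = toposort(&graph, None).expect("dependency cycle");
        order.reverse();

        for idx in order {
            println!("{}", graph[idx]);
        }
    }
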
Mike Gerwitz d78d81d721 Cargo.toml: Add petgraph
This will be used to represent the dependency graph.
2019-12-02 10:00:53 -05:00
Mike Gerwitz 717375a84a Cargo.toml: Tame {on=>in} Rust
Changed to match README.md.  This makes more sense too.
2019-12-02 10:00:53 -05:00
Mike Gerwitz 8374541965 tamer: Initial basic POC with no XML output
This is garbage code.  Do not use it.  It is intentionally throwaway.

While I've researched Rust, I haven't actually _used_ it for a project, so
this is a combination of me exploring various ways of accomplishing the
problem and forcing myself to learn certain aspects of the language.

I'll likely be using petgraph, and this also currently lacks symbol
abstractions.  This commit also performs far too much heap allocation
copying strings around.  But it _does_ perform the topological sort.

Since this only stores the symbol name, it lacks enough information about
the symbol to perform a proper linking.
2019-12-02 10:00:53 -05:00
Mike Gerwitz e53482f2a3 Introduce CARGO_BUILD_FLAGS
This is intended to permit passing `--release`, since dev builds are
terribly slow (e.g. 6s -> 0.2s).  See README.md for more information.
2019-12-02 10:00:49 -05:00
Mike Gerwitz d96f090337 core/ui: Correct vector/cmatch import path 2019-11-27 09:17:04 -05:00
Mike Gerwitz 695077d27b core/states.xml: Remove old transition file
Everything should use core/state.
2019-11-27 09:16:47 -05:00
Mike Gerwitz 01e3c33b58 tamer/Cargo.toml: Add quick_xml 2019-11-27 09:16:00 -05:00
Mike Gerwitz e52dd45872 tamer/rustfmt (max_width): Set to 80 2019-11-27 09:15:15 -05:00
Mike Gerwitz c4a8eac59e Makefile.am: Clean up currently-unused path_ vars
Cargo handles it for us.
2019-11-20 10:11:00 -05:00
Mike Gerwitz 7412a8934c tameld: Placeholder binary 2019-11-20 10:11:00 -05:00
Mike Gerwitz f72ff973a7 Makefile.am (all): {cargo=>@CARGO@}
Typo.
2019-11-20 10:11:00 -05:00
Mike Gerwitz f0ca5c60c9 Makefile.am (doc, html): New documentation target 2019-11-20 10:11:00 -05:00
Mike Gerwitz 0ab2b09dc6 Scaffolding for TAMER 2019-11-19 15:30:10 -05:00
Mike Gerwitz aff58dab63 .gitlab-ci.yml (build): Use bootstrap script
No use in maintaining this stuff in two places.
2019-11-18 14:15:21 -05:00
Mike Gerwitz d20e2bc78a tamer: Integrate into normal build process
Rust is now expected to be installed in the base image.
2019-11-18 14:15:07 -05:00
Mike Gerwitz a2478938b8 .gitlab-ci.yml (build): Clean up script
This has since been moved into the Docker image.
2019-11-18 14:06:55 -05:00
Mike Gerwitz d0208bf89b .gitlab-ci.yml (image): Make variable (BUILD_IMAGE)
We moved to an internal container registry so that we do not have to rely on
DockerHub.  Since TAME is a public project, this will allow our
configuration internally to vary from a public configuration.
2019-11-18 14:06:55 -05:00
Mike Gerwitz 8e241218b7 tamer build as part of pipeline and bootstrap 2019-11-18 14:06:54 -05:00
Mike Gerwitz fd1a5837ba TAMER: Initial commit 2019-11-18 14:05:47 -05:00
Joseph Frazer d9ecbd4e2c [DEV-6370] Package changes
Merge branch 'jira-6370'

* jira-6370:
  [DEV-6370] Allow recursive conditionals
  [DEV-6370] Pass in the $line_code rather than using it from the contract
2019-11-04 11:35:23 -05:00
Joseph Frazer 0fadbe8e8a [DEV-6370] Allow recursive conditionals
If an `lvm:if` is immediately followed by another `lvm:if`, both should
be used to create the conditional.  The existing code would only "select
the nearest condition".
2019-11-01 10:23:19 -04:00
Joseph Frazer cbe32aff72 [DEV-6370] Pass in the $line_code rather than using it from the contract
The LOB being passed into the function was being ignored and instead it
was pulling it from the contract object. With Package, this caused all 3
LOB to be "COMMPKGE" rather than the correct LOB being processed at the
time.

Going forward, one cannot `map` or `pass` to "line_code" as it will be
considered a reserved word.

Co-Authored-By: Jim Grundner <james.grundner@rtspecialty.com>
2019-11-01 10:23:19 -04:00
Mike Gerwitz ad7d585d37 TAME: {Adaptive=>Algebraic}
Just adapting the backronym a bit to be more accurate.  I've been meaning to
do this for over a year.
2019-10-28 13:15:37 -04:00
Mike Gerwitz 3ef6571922 Provide friendly lv:param-typedef-lookup failure for duplicate item values
The real solution is to disallow typedefs from getting into this state to
begin with, but I don't have time for that right now.
2019-10-25 13:56:47 -04:00
Mike Gerwitz e97f7a75c9 core/test/vector (_define-vector_): Require description
We want things to require documentation when possible.
2019-10-17 09:20:15 -04:00
Mike Gerwitz 39c7161cca vector/define: New package introducing _define-vector_
This is intended to work around the issue of defining arbitrary vectors
outside of a c:let.
2019-10-17 09:16:45 -04:00
Mike Gerwitz ce0e31032b core: ui: Add _match-ui-*_ templates
These are analogous to the _match-*_ counterparts, but they convert @on@
into the question param and check against the applicability of the question.
2019-08-13 16:46:06 -04:00
Mike Gerwitz 8005268a1b lv:param-typedef-lookup: New preprocessor directive 2019-08-06 15:31:48 -04:00
Mike Gerwitz a58243c403 core/ui (_match-ui-applicable_): Account for applicability
It doesn't make sense to consider a question to be set if it's not even
applicable.  This also helps to remove a bunch of duplicate code where these
templates are being used.
2019-07-30 14:35:05 -04:00
Mike Gerwitz 3e13b733c4 core/vector/cmatch.xml (_classify-vector_): New template
This is the analog of _classify-scalar_.
2019-07-29 14:51:38 -04:00
Mike Gerwitz 90bedc20f8 map: Nested value support for input map
For example: meta:foo.bar.baz.

DEV-3871
2019-06-14 11:02:18 -04:00
Mike Gerwitz 13ed4cd7dc Clean up extclass remnants
This is left over from f2db9f1268, in which I
should have cleaned all of this up.  One of our developers was hitting the
removed warning, which isn't necessary since the concept of a separate
"classifier" is no longer a thing after the aforementioned commit.

* rater/rater.xsd (no-extclass, no-extclass-keeps): Remove.
* src/current/rater.xsd: Likewise.  (I really need to deduplicate these.)
* src/current/compiler/js.xsl (compiler:entry-rater): Remove inaccurate
    comment (genclasses is used for other things).
* src/current/include/depgen.xsl (preproc:depgen-match): Remove error
    checking for pulling in non-external classes (this is the error that the
    developer hit that is no longer needed).
* src/current/include/preproc/eligclass.xsl (preproc:sym): Remove
    `@extclass' predicate.  Remove portion of comment.
* src/current/include/preproc/expand.xsl: Remove ancient footnote that
    even references an old internal rater!
* src/current/include/preproc/macros.xsl (preproc:class-groupgen): Remove
    external propagation.
* src/current/include/preproc/symtable.xsl (preproc:symimport): Remove
    extclass checks and propagation.
  (preproc:symtable)[lv:rate]: Remove external propagation.
    [lv:classify]: Likewise.
* src/current/include/preproc/template.xsl (preproc:inline-apply): Remove
    external sym metadata support.
2019-05-22 12:57:35 -04:00
Mike Gerwitz c888e17e97 Revert "build-aux/Makefile.am: (program.expanded.xml): .version.xml dependency"
Now I remember why I didn't do this: it forces a rebuild of
program.expanded.xml every build.

This reverts commit 4f3dfc3bc7.
2019-05-07 14:04:11 -04:00
Mike Gerwitz 4f3dfc3bc7 build-aux/Makefile.am: (program.expanded.xml): .version.xml dependency 2019-05-07 12:03:30 -04:00
Mike Gerwitz c54a87e3e3 progtest (HtmlConsoleOutput): Correct indexing
Caused by previous commit.
2019-05-06 16:59:41 -04:00
Mike Gerwitz aaf7b47e9e progtest (AsyncTestRunner): Fix reporter line breaking
It was adding a count and a line break after the first test run.
2019-05-06 15:53:20 -04:00
Mike Gerwitz 5706ab4bef summary: Fill timestamp_* param values automatically
These exist because TAME is nondeterministic, so all state must be passed
into it.  But it's inconvenient to make users manually fill in
dates, so we derive them from the environment unless they are set.

* src/current/scripts/entry-form.js (fillTimeValues): New function.
  (rater): Use it.
2019-04-19 11:46:29 -04:00
Mike Gerwitz f270220b11 build-aux/csvm2csv: Propagate csvm-expand exit status
csvm2csv was not failing when csvm-expand exited with a non-zero
status.  Further, the tests were written incorrectly to account for this.

* build-aux/csvm2csv: Set `pipefail' option.
* build-aux/test/test-csvm2csv: Fix tests.
2019-04-09 10:57:59 -04:00
Mike Gerwitz 4c61dfb1cc csvm: Permit all whitespace (including tabs)
While tabs aren't desirable, users who are not developers will be modifying
these files, and so we need to be permissive in what we want to
accept.  That doesn't mean that we need to forgo occasional formatting, though.
2019-04-08 15:17:00 -04:00
Mike Gerwitz 6f5796238a tame: Create guard parent directory
It may not exist on certain systems (e.g. build containers).
2019-04-04 14:52:36 -04:00
Mike Gerwitz 1a35232bd8 Parallel build support
tamed was originally designed with support for parallel builds in mind, but
I hadn't completed that work because we didn't have enough hardware to
benefit strongly from it.  That has since changed.

tamed will now spawn additional runners as needed to fulfill requests, which
works around the issue of not knowing how many jobs GNU Make is going to try
to do at once.

There were a couple minor dependency fixes/workarounds for now in the
Makefile, but otherwise everything appears to be working great.
2019-04-04 14:41:07 -04:00
Mike Gerwitz 7b7cf13607 build-aux/csvm-expand: {orig=>src} local arg typo fix
This does not affect its functionality.
2019-04-02 11:05:03 -04:00
Mike Gerwitz 9a1f916486 build-aux/csvm-expand: Spawn only one date and memoize
A table with a couple hundred thousand rows was taking minutes to
generate.  This gets it down to a few seconds.

* build-aux/csvm-expand (parse_date): New function.
  (parseline): use it.
2019-04-02 10:58:12 -04:00
Mike Gerwitz 3d07597f7c core/insurance (_credit_, _debit_)[@allow-zero@]: Add 2019-03-19 13:19:33 -04:00
Mike Gerwitz 6f9a95d306 progtest: Fail on unknown param as input 2019-02-27 11:44:35 -05:00
Mike Gerwitz c062cc5a5c progtest: Check inputs against known params
This aims to prevent needlessly wasted time debugging a non-working test
case, and to avoid writing incorrect test cases that happen to succeed even
though their inputs aren't properly defined.

For example, a common error is to use the name of a bucket field rather than
the name of the param that it maps to.

* progtest/src/TestRunner.js (_verifyKnownParams): New method.
  (_tryRun): Use it.
* progtest/test/TestRunnerTest.js: New test case.  Modify existing test
   cases to define used params.
* progtest/test/_stub/program.js (exports.rater.params): Declare used param.
2019-02-26 11:10:25 -05:00
Mike Gerwitz 602a77443f compiler: Expose params via compiled rater function
* src/current/compiler/js.xsl (compiler:exit-rater)[lv:package]: Expose
    `params' publicly on the rater function.
2019-02-26 11:09:23 -05:00
Mike Gerwitz c5a99e594d Another round of xmlo compilation performance enhancements
This reduces overall build times for one of our systems by ~50% by
addressing a lot of the low-hanging fruit for compilation of object
files.  There is much more work to be done, and the addition of maps added a
little bit of a mess that will be abstracted in future commits once I'm done
surveying the possible improvements that can be done.
2019-02-20 09:42:19 -05:00
Mike Gerwitz 5714bfb96b symtable: Substantial performance improvement in processing
This further improves performance of the symbol table processing.  The next
step will be to address how symbols are handled on a more intimate level,
since it's a huge mess atm.  But I'll save that for later, after the
low-hanging fruit has been resolved.

* src/current/include/preproc/symtable.xsl (preproc:sym-discover): Use
    `for-each-group' in place of `preceding-sibling'.  Aggressive use of
    maps for generating the `dedup' sequence, which is a mess.
  (preproc:symtable-process-symbols): Additional maps to avoid
    preceding-sibling and following-sibling selectors (O(n²)=>O(n)).
2019-02-20 02:03:20 -05:00
Mike Gerwitz 16749a9a45 fragment: Iterate over document and use symtable map
Same concept as previous commits: rather than iterating over the symbol
table and scanning the tree for the matching node, iterate over the document
and look up from a symbol map: O(n²) => O(n).

This gives a respectable performance boost to compilation of certain
packages (best improving packages with many classifications or rate blocks).

* src/current/compiler/fragments.xsl (@xmlns:xs, @xmlns:map): New namespace
    declarations.
  (preproc:compile-fragments): Generate `preproc:fragment' nodes and match
    on document rather than symbols.
    [lv:package]: Generate map and tunnel it.
* src/current/compiler/js.xsl (compile)[lv:classify, lv:match]: Use
    symtable-map.
  (compile-class-condition)[lv:rate]: Likewise.
  (compile-cmatch)[lv:rate]: Likewise.
2019-02-20 00:26:32 -05:00
Mike Gerwitz dae1990a00 symtable: Speed up processing a bit
This uses the same map strategy (and same duplicate code) as previous
commits, but this one generates a map for two separate tables.

There is more room for improvement, but this cuts down on the time a
lot.  Also keep in mind that this is performed multiple times (once per
pass), so it's still worth revisiting.  Performance is still very poor for
very large (many thousands of symbols) symbol tables.

The next slowest part appears to be the fragment compilation.  I'm nearing
the end of the low-low-hanging fruit for maps.  The /common/gl package
mentioned in previous commits that previously took over a minute to compile
now compiles in 20s as of this commit on equivalent hardware.

* src/current/include/preproc/symtable.xsl (@xmlns:map): New namespace
    declaration.
  (preproc:symtable-process-symbols): Create map for `cursym' and
    `extresults'.  Use it.  Remove unused `dup'.  Output message when
    done (another is output slightly later on in the process).
2019-02-20 00:10:42 -05:00
Mike Gerwitz 063e68b3d0 validate: Use map for symbol table
This is the first step to improving the map.  Note that this duplicates the
symbol table generation code that's used in a few other places
now---that'll be cleaned up in future commits once I have a better idea of
all the places this will be used and try to move it to a higher level.

* src/current/compiler/validate.xsl (@xmlns:xs, @xmlns:map): New namespace
    definitions.
  (lvv:validate)[lv:package]: Generate symbol table map.  Tunnel to
    templates.
    [c:apply[@name], lv:classify[@as]//lv:match, lv:match[@value]]
    [c:*[@name or @of], c:apply/c:arg[@name], lv:rate/lv:class]: Use it.
2019-02-18 15:42:22 -05:00
Mike Gerwitz c077a71402 resolv-syms: Simplify dimension resolution
The existing code was not only complex (because of XSLT 1), but mostly
unnecessary.  We don't need to consult remote symbol tables at all anymore.

This shaves off an additional few seconds on large packages.

* src/current/include/preproc/package.xsl (preproc:resolv-syms)[preproc:sym]:
  Only consult local symbol table.  Simplify max dimension calculation.
2019-02-18 11:56:17 -05:00
Mike Gerwitz d2ab6e1149 resolv-syms: Generate maps for symtable and dependency lists
This is a first step (low-hanging-fruit kinda thing) for improving the
performance of symbol resolution, where the compiler has to figure out the
dimensions of a symbol by first resolving its dependencies,
recursively.  This is approximately an O(n³) polynomial-time algorithm _per
recursive step_.  Yikes.

This is traditionally where dynamic programming methods would be used, but
that's considerably more difficult in an immutable language like XSLT, so
I'll do my best without.  (Saxon does offer some support for mutability, but
I'd prefer to avoid it if possible.)

This first change improves performance 30--40%.  For example, on two large
packages we have, build times drop from 55s to 35s and from 1m42s to 1m13s
respectively.

Good start, but much more to be done!

* src/current/include/preproc/package.xsl (preproc:resolv-syms)[lv:package]:
    Compute maps for preproc:symtable and preproc:sym-deps at each recursive
    step.  Pass along via tunneling.
  (preproc:resolv-syms)[preproc:sym]: Use them.

DEV-4354
2019-02-18 11:56:17 -05:00
Mike Gerwitz 09eb442c63 linker: Index root package symbol table
This only saves 1--2s on a 30s run, but I want to move in this direction,
so it'll simplify future refactoring if I just add it.  Small changes like
these will accumulate, too.

* src/current/compiler/linker.xsl (l:orig-package, l:root-symtable-map): New
    variables.
  (l:resov-extern): Use it.
2019-02-18 11:56:17 -05:00
Mike Gerwitz 14dc534709 test/symtable/symbols.xspec: Fix failing test
A bunch of failing pipelines apparently wasn't obvious to me.  And shame on
me for not running these locally; I forgot that the part of the system that
I touched had tests.

This was broken by b6cfdb4221.
2019-02-18 11:29:02 -05:00
Mike Gerwitz 26249f8dbb core: Add _vfilter-mask_
* core/test/core/suite.xml: Import `vector/filter'.
* core/test/core/vector/filter.xml: New package.
* core/vector/filter.xml (_vfilter-mask_, _vfilter_mask): New template, function.
2019-02-14 15:05:49 -05:00
Mike Gerwitz 279245d168 core: Add missing _minreduce_ @isvector@ test
core/test/core/vector/minmax.xml: Add missing @isvector@ test.
2019-02-13 16:10:26 -05:00
Mike Gerwitz 46b7c234dd core: vector/minmax/_minreduce_: New template
* test/core/suite.xml: Import `test/core/vector/minmax'.
* test/core/vector/minmax.xml: New package.
* vector/minmax.xml (_minreduce_, _minreduce): New template, function.
2019-02-13 14:38:08 -05:00
Mike Gerwitz bd73ea0121 Strip xsl: namespace prefix from most files
Still left are files that I don't want to deal with testing right now.
2019-02-07 14:36:18 -05:00
Mike Gerwitz e022a3133d Copyright year simplification and update to Ryan Specialty Group
This now uses year ranges, which I'll update annually.

This also renames "R-T Specialty" to "Ryan Specialty Group".  The latter is
the parent company of the former.  I was originally employed under the
former when LoVullo Associates was purchased, but I now work for the parent
company.
2019-02-07 13:23:09 -05:00
Mike Gerwitz 7862eef62e summary: Accommodate now-missing dependency lists
The previous commit made dependency lists optional for certain symbols.  The
Summary Page needs to be updated to permit such a thing.

The whole Summary Page needs aggressive refactoring, though, so this doesn't
bother checking for `no-deps' to see if this is a bad thing.

* src/current/summary.xsl (typeset-final)[preproc:sym-ref]: Permit missing
    symbol dependencies.
  (lv:param|lv:const|lv:item): Likewise.
2019-02-07 13:11:31 -05:00
Mike Gerwitz b6cfdb4221 depgen: Quadratic=>linear-time algorithm
This is a significant performance improvement for dependency
generation (which is responsible for building the dependency graph for a
package).

The previous algorithm ran in O(n²) time: it would iterate over the given
symbol table, and for _each_ symbol, do a linear scan of the entire document
to search for the corresponding source block.  This resulted in explosive
depgen time for larger packages.

This makes the algorithm run in O(n) by:
  - Using an XSLT 3 map for the symbol table for O(1) lookups; and
  - Iterating over the _document_ a single time rather than the symbol
    table, referencing the symbol table as needed (in O(1) time).

There are other parts of the system that can benefit from these same
improvements.  This is important, since we need to be able to handle many
thousands of symbols efficiently.

* src/current/compiler/linker.xsl (l:depgen-sym): Recognize symbol `no-deps'
    property, permitting missing dependencies.  This allows us to avoid
    creating nonsense nodes just to satisfy the linker, while still allowing
    the linker to perform essential checks to defend against compiler bugs.
* src/current/compiler/map.xsl (lvmc:stub-symtable): Set @no-deps on
    `___head' and `___tail' symbols.
  (lvmc:mapsym): Set `no-deps' as appropriate on map symbols.
  (preproc:depgen)[lvm:map[@from]]: Generate `preproc:sym-dep' node, which
    is now expected by the depgen process.
  (preproc:depgen)[lvm:map[*]]: Likewise.
  (preproc:depgen)[*[@lvmc:type='retmap']//lvmm:map[@from]]: Remove
    unnecessary template.
  (preproc:symtable)[lvm:map[@value]]: Pass `no-deps' to `lvmc:mapsym'.
* src/current/include/depgen.xsl (preproc:depgen)[preproc:symtable]: Create
    and use XSLT 3 map in place of `preproc:symtable' tree.  This allows for
    constant-time lookups.  Provide to templates via tunnelling.  Use it in
    place of exiting tree references.  Process source tree rather than
    iterating over symbol table.
  (preproc:depgen)[lv:rate, c:sum[@generates], c:product[@generates],
    lv:classify, lv:function/lv:param, lv:function, lv:typedef]: Produce
      `preproc:sym-dep' nodes (which was previously done while iterating
      over the symbol table).
  (preproc:depgen)[preproc:sym]: Remove all such processing, since we no
    longer iterate over the symbol table.
  (preproc:depgen)[c:value-of]: Use symtable map.
  (preproc:depgen-match): Likewise.
  (preproc:depgen)[lv:union]: Modify to handle changes to lv:typedef
    template.
  (preproc:depgen)[text()]: Remove and replace with `node()'.
* src/current/include/preproc/package.xsl (preproc:resolv-syms): Remove
    logging of symbol resolution.  This has a slight performance impact since
    there is a lot of output.
* src/current/include/preproc/symtable.xsl
  (lv:function/lv:param, c:let/c:Values/c:value): Set `no-deps'.
* src/symtable/symbols.xsl: Add documentation of `no-deps'.
  (preproc:symtable)[lv:meta]: Set `no-deps'.
2019-02-07 11:39:50 -05:00
Mike Gerwitz 8e7a946127 vector/table: Add comparison operators 2019-02-04 12:21:05 -05:00
Mike Gerwitz 11109d4361 core: Add _where-*_ query predicate templates
These provide a more pleasant abstraction than having to reference CMP_OP_*
constants.

* core/test/core/vector/interpolate.xml: {t:when=>t:where-eq}.
* core/test/core/vector/table.xml: Likewise, but using the other variants
    where appropriate given the value of `@op'.
* core/vector/interpolate.xml: Likewise.
* core/vector/table.xml (_when_, _where_): Rename former to latter and
    provide deprecation warning.
  (_when-lt_, _when-lte_, _when-gt_, _when-gte_): Add abstractions.
* src/current/rater.xsd: Permit template variable as template name.
2019-02-04 10:22:46 -05:00
Mike Gerwitz 36a3e348b6 core: Add comparison operators for table query predicates
This is fairly primitive support and it completely sidesteps the bisect
algorithm for now.  The next commit will abstract this a little bit further
to make it less awkward to use.

* core/test/core/vector/table.xml: New test cases.
* core/vector/filter.xml (CmpOp): New typedef.
  (mfilter): Document that bisecting will not happen unless `CMP_OP_EQ'
    is used.  Implement that restriction.
    [op]: New parameter.  Provide it to `mrange'.
  (_mfilter, _mrange_cmp): Rename from `_mfilter'.  Implement new comparison
    check based on `op'
    [op]: New argument.
* core/vector/table.xml (_when_)[@op@]: New param.  Add it to the produced
    vector.
  (_mquery): Unpack op (from `_when_') in call to `mfilter'.
2019-02-04 10:22:46 -05:00
Mike Gerwitz 74f8b56fcc Use some modern shorthands for core/vector/{table,filter}
Just trying to clean up a little as I go to start to make it easier
to understand.

* core/vector/filter.xml: Use _when-*_ templates and c:recurse.
* core/vector/table.xml: Likewise.
2019-02-04 10:22:46 -05:00
Mike Gerwitz 9af38261b9 calc.xsd: Permit template applications within c:value-of
* src/current/calc.xsd (valueType): Permit ##other nodes.
2019-02-04 10:22:46 -05:00
Mike Gerwitz c68c2f41d5 core/vector/table: Add specification for main templates
* core/test/core/suite.xml: Import core/test/core/vector/table.
* core/test/core/vector/table.xml: New specification.
2019-02-04 10:22:46 -05:00
Mike Gerwitz a35844f3fb depgen: Do not recurse into templates
Same logic, more efficient implementation.

* src/current/include/depgen.xsl (preproc:depgen): Stop at lv:template.
2019-02-02 23:19:40 -05:00
Mike Gerwitz 07e5dbd94b Add expand-barrier and skip-child-expansion
It's going to be like TeX before you know it... ._.

* src/current/include/preproc/package.xsl (preproc:tpl-check)
  [lv:template|lv:const|lv:typedef|lv:param-copy]: Add lv:param-copy.
* src/current/include/preproc/template.xsl (preproc:apply-template)
 [lv:expand-barrier, lv:skip-child-expansion]: New expansion control
   structures.
2019-02-01 16:01:56 -05:00
Mike Gerwitz 7ac4c1ce9d Template variable expansion on (lv:param-value|lv:param-inherit)/@name
This allows for dynamically generated metadata names.

* src/current/include/preproc/template.xsl (preproc:apply-template)
  [lv:param-meta]: Expand @name.
  [lv:param-inherit]: Expand @meta.
2019-02-01 16:01:56 -05:00
Mike Gerwitz f719d391c7 Default short-hand constant description to c:value-of/@label
This is a much more useful description if present.

* src/current/include/preproc/macros.xsl (preproc:macros)[c:value-of...]:
    Default generated constant description to @label.
2019-02-01 16:01:56 -05:00
Mike Gerwitz 73d691273e core: Replace all occurrences of c:{set=>vector}
The former is deprecated and never made any sense at all.
2019-02-01 16:01:56 -05:00
Mike Gerwitz b725963722 Add c:vector as c:set alternative
The term "set" is all wrong---it is actally intended to be a vector, and can
absolutely have duplicate elements (and often does).

* src/current/calc.xsd (vector): Add, recommending in place of `set'.
* src/current/compiler/js-calc.xsl (compile-calc)[c:set|c:vector]:
    Add `c:vector' and provide deprecation notice for `c:set'.
* src/current/include/calc-display.xsl (c:set|c:vector): Likewise.
2019-02-01 16:01:56 -05:00
Mike Gerwitz e4ccf3e90a test/spec: Work around expand-sequence bug
* core/test/spec.xml (_describe_): Enclose aggregate classification in a
  series of nested expand-sequence to work around bug (described in
  comment), which was causing test cases to not be compiled.
2019-02-01 16:01:56 -05:00
Mike Gerwitz 304faa1f07 summary: Remove rate-group processing
* src/current/summary.css (.rate-group, .rate-groups): Remove.
* src/current/summary.xsl (gen-menu): Remove rate-group processing.
  (rate-group-title): Remove.
  (lv:rate-group): Remove.
2019-02-01 16:01:56 -05:00
Mike Gerwitz 22aa59b5cf map: Properly default value in translation
A better option is to pre-process all inputs, but I need a quick
fix to my stupidity.  0||""==="".

* src/current/compiler/map.xsl (lvmc:compile)[lvm:map//lvm:from[*]]: Correct oval default.
2019-02-01 16:01:56 -05:00
Mike Gerwitz c7fec6a240 csv2xml: Import /rater/core{=>/base} directly
/rater/core is being removed.
2019-02-01 00:36:48 -05:00
Mike Gerwitz 9070d97e87 doc (Core Concepts): Initial stub section
I wanted to get this section started so that I can easily add to it when I
have small bits of time to do so.  Our documentation needs to improve.

* doc/Makefile.am (tame_TEXINFOS): Add `concept.texi'.
* doc/concept.texi: New file.
* doc/preproc.texi: Remove accidentally added input line.
* doc/tame.texi (menu): Add `Core Concepts' node.
2019-01-30 13:45:43 -05:00
Mike Gerwitz 01a420fd81 Revert "set_default: Allow empty vectors"
I need to revert this for now because it breaks YAML test cases.  The proper
fix is a more expressive type system with dependent types that would allow
it to know the proper number of indexes to initialize relative to other
inputs.  I wanted to implement this anyway to help catch iteration-related
bugs.

I'm tabling this for now, though, since I have other things that I need to
work on.

This reverts commit 4406cbe553.
2019-01-30 13:45:17 -05:00
Mike Gerwitz a985cf1f23 doc ({About=>Using} TAME): {about=>usage.tex}
* doc/Makefile.am (tame_TEXINFOS): {about=>usage}.texi.
* doc/tame.texi: Include {about=>usage}.texi
* doc/about.texi: Rename file.
* doc/usage.texi: New file (renamed from about).
2019-01-30 13:45:15 -05:00
Mike Gerwitz 0c67e85676 doc: Add cindex entries for existing About
* doc/about.texi: Add miscellaneous entries.
2019-01-30 13:44:43 -05:00
Mike Gerwitz 2e0a3fa62f doc/macros.texi: TODO adds dnindex entry
* doc/macros.texi (todo): Add dnindex entry.
2019-01-30 13:44:43 -05:00
Mike Gerwitz 290cf1b6e6 doc: Copied developer-related macros from Liza
This includes, notably, the Developer Notes feature.  I did not copy any
SRCUI stuff since this project uses literate documentation, but I'll add it
if it seems like it will be useful.  Barely any of the project is written
literately right now.

* .gitignore: `{=>/}config.*'.
* configure.ac (SET_DEVNOTES): New variable.
  (AC_CONFIG_FILES): Add `doc/config.texi'.
* doc/.gitignore (config.texi): Ignore (generated).
* doc/Makefile.am (tame_TEXINFOS): Add `macros.texi' and `config.texi'.
* doc/config.texi.in: New file.
* doc/macros.texi: New file containing some macros from `doc/tame.texi' and
  some from Liza's `doc/macros.texi'.
* doc/tame.texi: Adjust position of header comment.  Include `config.texi'
    and `macros.texi'.  Add devnotice to header.  Strip out macros.
  (menu): Add `Concept Index' and conditional `Developer Notes Index'.
  (Concept Index, Developer Notes Index): New nodes (latter conditional).
2019-01-30 13:44:24 -05:00
Mike Gerwitz 7f6961272c doc (Preprocessor): Extract into own file
* doc/Makefile.am (tame_TEXINFOS): Add `preproc.texi'.
* doc/preproc.texi: New file.
* doc/tame.texi: Extract `Preprocessing' section.
2019-01-29 15:46:21 -05:00
Mike Gerwitz e30e69d904 doc/tame.texi: Copyright year update 2019-01-29 15:44:22 -05:00
Mike Gerwitz f3aa38a0c1 doc: Convert most sections index appendicies
I want this manual to be useful both to developers and users of TAME,
so this distinction needs to be made clear.

* doc/tame.texi (Preprocessor): chapter=>appendix.
* src/graph.texi: Top to appendix and raise subsections.
* src/symtable.texi: Top to appendix.
2019-01-29 15:38:00 -05:00
Mike Gerwitz 4406cbe553 set_default: Allow empty vectors
This is an assumption that's existed since the Summary Page was first
devised---that all vectors have at least one value.  This is because the
bucket (originating from Liza) always has at least one value in its vectors.

Of course, we still have a problem in that the Summary Page initializes
everything to have a single value by default, and that's still the
case.  But this will at least allow for things _outside_ the Summary Page to
provide an empty array.  I'll have to address the Summary Page separately,
and that's going to be difficult, since we don't really want to change the
behavior across the board.

* src/current/compiler/js.xsl (set_defaults): Default max index to 0 if
    `length' is unavailable, rather than 1.
2019-01-29 10:07:36 -05:00
Mike Gerwitz c9ab302f53 map: Proper array check for translation iteration
The previous length check existed as a really bad array check (before
Array.isArray was a thing).  This has been broken since Nov 2012.

The problem manifests itself when you want an empty array.  We then have:
  [] => [[]] => [DEFAULT_VALUE]

* src/current/compiler/map.xsl (lvmc:compile)[lvm:map//lvm:from[*]]: Use
    `Array.isArray' in place of length check.
2019-01-28 14:15:10 -05:00
Mike Gerwitz 017ca1f437 suppmk-gen: Properly exit with non-zero status on failure
* build-aux/m4/calcdsl.m4: Exit on suppmk-gen.
* build-aux/suppmk-gen: Exit on failure.
2019-01-23 16:04:01 -05:00
Mike Gerwitz ce08086b15 doc: Remove todo.texi
TODOs shouldn't be stored here, and they will get out of sync.

* Makefile.am (tame_TEXINFOS): Remove todo.texi.
* tame.texi: Remove include and menu entry.
* todo.texi: Remove file.
2019-01-23 09:53:37 -05:00
Mike Gerwitz a7f186beff [BC BREAK] rater/core/insurance (_premium_): Add zero and negative assertions
This is a BC break since this generates assertions by default.  To maintain
BC, set `@allow-zero@' and `@allow-negative@' to `true' in existing template
applications.

* core/insurance.xml
  (assert_ignore_premium_zero, assert_ignore_premium_negative): New params.
  (_premium_): Generate assertions.
    [@allow-zero@, @allow-negative@]: New params.
2019-01-02 16:58:56 -05:00
Mike Gerwitz dec3f2ef35 rater/core/insurance (_factor_): gt{=>e} for negative assertions 2019-01-02 16:56:57 -05:00
Mike Gerwitz c3c7cfeeff map.xsl: Escape all output in strings 2018-12-20 14:31:14 -05:00
Mike Gerwitz fa378a654a Add lsimports and check-coupling
lsimports will be able to be used to replace the last remaining Ant script
that generates depfiles.

* build-aux/check-coupling:
* build-aux/lsimports: New files.
2018-12-19 14:20:24 -05:00
Mike Gerwitz 73f6b77771 [BC BREAK] check target supplier customization
This allows customizing from the command-line what suppliers should be
checked.  The motivation for this is both to run as part of a distributed
pipeline (where each supplier may be built individually), and for during
development of a single supplier.

BC BREAK: Note that this will now check for `package' in the test path for
UI tests.  To keep the old directory around, a symlink of `packages' to `ui'
would suffice.

* build-aux/Makefile (SUPPLIERS, suppliers_strip): New variables.
  (check-am): BC-BREAK: Build and check only requested suppliers.
* build-aux/progtest-runner: BC-BREAK: First argument is now test directory
    and all remaining arguments specify the supplier XML files to check.
2018-12-18 21:46:18 -05:00
Mike Gerwitz 32e3b16ec9 Makefile.am (program-ui): Remove standalones dep
We want to be able to build the UI independently of the
suppliers.  Historically, this did not provide much of a benefit, but this
change allows us to build independently as a job in a distributed pipeline,
and allows testing out the UI when rating is unneeded.

* build-aux/Makefile.am (program-ui): Remove `standalones'.
2018-12-18 20:58:29 -05:00
Mike Gerwitz f50b49542e Makefile.am (program-ui-immediate): Remove old target
This target has not been used for years.

* build-aux/Makefile.am (program-ui-immediate): Remove target.
  (program-ui): Use dependency of old `program-ui-immediate'.
  (.PHONY): Remove `program-ui-immediate'.
2018-12-18 20:53:58 -05:00
Mike Gerwitz 4105dc8fef Makefile.am: Add common target
* src/build-aux/Makefile.am (common): New target.
2018-12-18 16:40:35 -05:00
Mike Gerwitz c7e84f2e29 DslCompiler: Use s9api instead of JAXP
The difference is described here:

  http://www.saxonica.com/html/documentation/using-xsl/embedding/

And s9api here:

  http://www.saxonica.com/html/documentation/using-xsl/embedding/s9api-transformation.html

* Makefile.am (DSLC_CLASSPATH): Export for submakes.
* configure.ac (DSLC_CLASSPATH): Prefix with SAXON_CP.
* rater/rater.xsd (classNameType): Increase length 50=>75 (generated
    identifiers can now exceed that, it seems).
* src/current/rater.xsd: Likewise.  These files need to be combined.
* src/current/src/Makefile (CLASSPATH): Set to DSLC_CLASSPATH.
* src/current/src/com/lovullo/dslc/DslCompiler.java: Update imports.
  (DslCompiler)[_DslCompiler]: New members _processor and
    _xsltCompiler.  Convert to s9api.
2018-12-18 13:33:25 -05:00
Mike Gerwitz 044498f03f Makefile.am: Copy srv/!(rater).js to destination paths
Note that such files may not actually exist, which is why `nullglob' is set
and the `for' loop is used.

* build-aux/Makefile.am (SHELL): Set `nullglob'.
  (program-data-copy, lvroot): Copy srv/!(rater).js to destination JS paths.
2018-12-10 10:51:03 -05:00
Mike Gerwitz 219a4b521a summary: Remove reset button
This has been broken for years.  I don't object to fixing it, it's just that
I have better things to do right now and we've gotten complaints about it;
no use in keeping around something that's broken if there's no desire to fix
it.  Workaround: refresh the page.

This does keep around the reset logic because it is actually used in other
places.

* src/current/include/entry-form.xsl (entry-form)[lv:package]: Remove reset
    button.
* src/current/include/entry-form.js (clearTestCases): Remove broken function
    call `Prior.setPriorMessage(null)'.
2018-12-05 10:21:25 -05:00
Mike Gerwitz bcd8a67bd9 summary: Sans-serif font stack
It wasn't until recently that I realized that the default browser font was
being used, since I have mine customized.

* src/current/summary.css (body)[font-family]: Sans-serif font stack.
2018-12-05 10:14:08 -05:00
Mike Gerwitz fe14db7379 map: Correctly set translation defaults given the symbol dimensions
* src/current/compiler/map.xsl:
  (lvmc:gen-input-default): Add argument.
    [dim]: New param, defaulting to `$sym/@dim'.
  (lvmc:compile)[lvm:map//lvm:from[*]]: Provide appropriate dimension value
    to `set_defaults'.  Provide compile-time error if nesting of `from'
    nodes exceeds what is appropriate for the symbol dimensions.
2018-12-04 16:56:53 -05:00
Mike Gerwitz 6e4d42f926 DslCompiler: Properly output errors and termination line
This fixes a number of obnoxious miscellaneous issues, summarized below.

* src/current/src/com/lovullo/dslc/DslCompiler.java (DslCompiler)[compile]:
    Output termination line (DONE) on missing destination path
    error.  Always output exception message before termination
    line (otherwise it won't output to the user).  Output termination line
    and remove destination file for XSD failure.
2018-12-04 13:45:46 -05:00
Mike Gerwitz f62f2ccacb bin/tame: Shift arguments after -v
* bin/tame (main): Fix issue where -v did not shift arguments.
2018-12-04 13:45:36 -05:00
Mike Gerwitz 10106993b5 map: Always terminate on missing destination symbol
This was a bit of a nasty one.  Fortunately, this was only used as a
validation, so the code that the compiler produced was still correct.

The problem was that a version of Saxon sometime between 9.5 and 9.8 added
an optimization to eliminate conditionals with no body.  Consequently, the
kluge to force the variable to be evaluated was optimized away,
`lvmc:get-symbol' was never called, and no error was ever produced.

This would be best refactored, but that's not something I have time to take
up at the moment priority-wise.  This should be future-proof since this
would never be a noop.

* src/current/compiler/map.xsl (lvmc:compile)[lvm:map//lvm:from[*]]: Force
    evaluation of `$sym' by ensuring that the condition will not be a noop.
2018-12-04 12:06:40 -05:00
Mike Gerwitz cd5440b8da tamed: Clarify usage output shell example
* bin/tamed (usage): Clarify killing when run on a shell.
2018-12-03 16:46:06 -05:00
Mike Gerwitz 079d1dcfaf tamed: Do not stall if TAMED_SPAWNER_PID is running
This will ensure that tamed does not stall while e.g. make is still
running.  This makes TAMED_STALL_SECONDS almost useless; maybe it'll be
removed in future versions.

* bin/tame (TAMED_SPAWNER_PID): Export variable.
* bin/tamed (TAMED_SPAWNER_PID): New variable, default to PPID.
  (spawner-dead): New function.
  (stall-monitor): Use it.
  (usage): Update documentation.
* build-aux/Makefile.am: Set TAMED_SPAWNER_PID to own id and export.
2018-12-03 16:25:25 -05:00
Mike Gerwitz 210693c22f [DEV-3958] c1map: Make output PHP namespace configurable
* src/current/c1map.xsl (lvm:c1map): Copy `@namespace' to generated
    `lvmp:root'.
* src/current/c1map/render.xsl (lvmp:render)[lvmp:root]: Output
    `@namespace' rather than using hardcoded string and dynamic program.
2018-11-28 16:42:04 -05:00
Mike Gerwitz cc7e09a700 Add new c1root and local c1-service copying to build
This maintains BC for existing raters that have not yet been migrated to use
the new c1-import service.

* build-aux/Makefile.am (path_c1root): New variable.
  (.PHONY): Add c1root target dependency.
  (program-data-copy): Copy to `@C1_IMPORT_MAPDEST@'.
  (c1root): New target.
* build-aux/m4/calcdsl.m4 (C1_IMPORT_MAPDEST): Configure depending on the
    existence of the `c1-import' directory.
2018-11-28 15:55:49 -05:00
Mike Gerwitz 7f3e279cfa anyValue: Always yield a matrix if any predicate is a matrix
This is a long-standing bug, apparently.  The location of this code makes it
difficult to test directly (that is in dire need of correcting), but
fortunately we have a number of tests in systems that use TAME that
indirectly test this.

The problem manifested when a matrix was already in the store, but a
scalar or vector predicate was then considered.  Without the branch that
was modified here, the store was modified such that it would always yield a
vector.

* src/current/compiler/js.xsl (anyValue): Consider store dimension when
    recursing.
2018-11-21 15:19:44 -05:00
Mike Gerwitz 98494edee5 core build
This is the start of a working build for core.

* .gitignore: Ignore generated files from configuration and build.
* build.xml: Copy from rater repo.  This is the last remaining ant-based
    dependency and can be gotten rid of; see comments.
* configure.ac: New file.
* rater/build-aux, rater/src: New symlinks.
2018-11-08 11:15:12 -05:00
Mike Gerwitz 62089877b2 Expose CALCROOT and new SRCPATHS to build scripts
This begins to decouple the rater directory conventions using an incremental
approach, defaulting to the existing structure.  Not all things were
modified (for example, cleaning will not work properly with a custom
SRCPATHS if those directories do not exist); WIP.

* build-aux/Makefile.am (path_dsl): Use `CALCROOT'.
  (suppliers.mk): Test for existence of program.dep and c1map directory
    before acting on them.
* build-aux/m4/calcdsl.m4: Default SRCPATHS.  Output it during configure.
    Expose CALCROOT and SRCPATHS using AC_SUBST.
    Invoke suppmk-gen using SRCPATHS.
* build-aux/suppmk-gen: Use arguments (SRCPATHS) in place of hard-coded paths.
2018-11-08 11:05:38 -05:00
Mike Gerwitz 5cb78cc47d dslc: Invoke with static rater path
This frees us from requiring a rater/ directory in the working
directory.  However, it is important that we continue using it if it
exists, since there are additional things that haven't yet been moved
into the tame repo.

* bin/dslc.in: Provide path to rater/ directory.
* src/current/src/com/lovullo/dslc/DslCompiler.java: Use provided rater/ path.
2018-11-08 09:26:07 -05:00
Mike Gerwitz 970c3531c5 core/COPYING: Remove duplicate
This is no longer necessary since tame-core was merged
with this repo.
2018-11-07 23:27:18 -05:00
Mike Gerwitz 1fb87106b0 core/test/core/insurance: Add missing descriptions
This was broken by a previous commit, but was not noticed because
the test cases aren't being compiled as part of the build yet!

Now that we have tamed, that is an option.

* test/core/insurance.xml: Add missing @desc@.
2018-10-29 11:55:34 -04:00
Joseph Frazer 21ffeb5841 [DEV-3836] make tests check recursively 2018-10-22 16:15:45 -04:00
Joseph Frazer ac00ce4401 [DEV-3836] make tests recursive 2018-10-22 16:15:23 -04:00
Mike Gerwitz f5913d6fa0 build-aux/Makefile.am: Correct program_fragments sorting
The sorting is intended to remove nondeterminism.

This fixes 9cce2b15.
2018-10-19 10:38:08 -04:00
Mike Gerwitz 9cce2b1542 build-aux/Makefile.am: Recognize all fragments as dependencies of program.expanded.xml
* build-aux/Makefile.am (program_fragments): New variable.
  (ui/program.expanded.xml): Add program_fragments as dependencies.
2018-10-19 10:15:14 -04:00
Mike Gerwitz 8fa7d9ece6 bin/tame: Inherit TAME_CMD_WAITTIME from environment
* bin/tame (TAME_CMD_WAITTIME): Renamed from `RUNNER_CMD_WAITTIME'.
    Inherit from environment, default 3.
  (command-runner): Sleep for an additional TAME_CMD_WAITTIME seconds after
    requesting runner reload to give more time in case of high load.
  (verify-runner-ack): Rename variable.
  (usage): Document env var.
* build-aux/Makefile.am: Export TAME_CMD_WAITTIME.
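For example (package target illustrative), a heavily loaded host can be given
more time to acknowledge commands before a runner reload is requested:

```bash
# TAME_CMD_WAITTIME is read from the environment as of this commit
export TAME_CMD_WAITTIME=10   # default is 3 seconds
make suppliers/foo.xmle
```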
2018-10-19 10:15:14 -04:00
Mike Gerwitz 8143207903 Support non-xmlo dependencies for input map @src
* build-aux/gen-make: Do not add ".xmlo" suffix for deps with a
    trailing `$'.
* src/current/pkg-dep.xsl (lvm:program|lvm:return-map): Append ".xml$" to
    dep for map/@src (new dep).
2018-10-19 10:15:11 -04:00
Mike Gerwitz 04a31ecca1 csv2xml: Remove @name
This has been autogenerated for some time (during compilation).

* build-aux/csv2xml: Remove @name from output root node.
2018-10-19 09:27:20 -04:00
Mike Gerwitz 74d1160533 build-aux/Makefile.am: Copy stripped ui/package.strip.js
This is the one we always want in the UI.  Rather than stripping with an
outside build process, just use this.

* build-aux/Makefile.am (program-data-copy, lvroot): Copy ui/program{=>.strip}.js.
2018-10-16 23:00:59 -04:00
Mike Gerwitz fba0f0df35 Run YAML test cases against stripped executable
This significantly improves speed and reduces memory usage when dealing with
hundreds of test cases.

* build-aux/Makefile.am (dest_standalone_strip): New variable.
  (strip, %.strip.js): New targets.
  (.PHONY): Add strip target.
  (check-am): Depend on strip.
* build-aux/progtest-runner: Use stripped executables.
2018-10-16 22:36:13 -04:00
Mike Gerwitz a4c8c0d840 bin/tame: Better runner re-try
Try to re-post message, since the previous message will have already been
read (otherwise the previous echo would have hung).

* bin/tame (EX_STALLED): New exit code.
  (command-runner): Re-post message after stall.  If unrecoverable, provide
    a more clear error and exit with EX_STALLED.
2018-10-16 22:23:57 -04:00
Mike Gerwitz b7167467b0 Propagate TAMED_STALL_SECONDS
* bin/tame (TAMED_STALL_SECONDS): Export variable.
* build-aux/Makefile.am (TAMED_STALL_SECONDS): Export variable.
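A hedged usage example (target illustrative): raising the stall window keeps
idle runners alive longer between jobs on a slow or heavily loaded host.

```bash
# seconds of runner inactivity tolerated before tamed reaps the runner
export TAMED_STALL_SECONDS=30
make suppliers/foo.xmle
```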
2018-10-16 09:26:37 -04:00
Mike Gerwitz 1e5cdf8c40 src/current/src/Makefile: Phony recursive targets
Otherwise *-recursive fails.

* src/current/src/Makefile (check, info, pdf, html): New phony targets.
2018-10-16 09:10:15 -04:00
Mike Gerwitz db1c03dfd9 tame{,d}: Reload runner when unresponsive
This tries to be a bit more resilient in case a runner becomes unresponsive,
rather than waiting for tamed to kill itself.

* bin/tame (RUNNER_CMD_WAITTIME): New variable.
  (command-runner): Tell runner to reload if it does not respond in
    RUNNER_CMD_WAITTIME seconds.
  (verify-runner-ack): New function.
* bin/tamed (mkfifos): Only keep stdin open.  stdout isn't necessary, and
    may have actually been causing subtle issues.
  (spawn-runner): Support restarting dslc on SIGHUP.
2018-10-16 08:53:04 -04:00
Mike Gerwitz 5679be281a Makefile.am: Correct build intermediates and target dependencies
* Makefile.am (.SECONDARY): Keep all intermediate files.
  (%.html): Add `%.xmle' dependency.
  (lvroot): Add program-ui and c1map dependencies.
2018-10-11 23:50:53 -04:00
Mike Gerwitz f44d89d4d2 Makefile.am: Specify hoxsl-generated apply stylesheets
This will speed up compilation a bit.

* Makefile.am (apply_src): Manually specify apply files.
2018-10-11 23:26:44 -04:00
Mike Gerwitz 1a7dbcb651 Makefile.am (all): New target (to complement all-nodoc)
`all-nodoc' was previously used in bootstrapping.

* Makefile.am (all): New target.
2018-10-11 23:00:28 -04:00
Mike Gerwitz 01671f8345 Improved build system
This fixes a lot of the problems with the build by using a normal Makefile
as it is intended to be used.  To do this, tamed was created.  See the
manual and commit messages for more information.  bin/tame{,d} also have
more information.  More information will follow in the manual in the future.

There is also more cleanup to follow; I just want to get this committed so
that people can take advantage of it and stop some of the suffering.
2018-10-11 22:28:25 -04:00
Mike Gerwitz 37c8af62b2 doc/about.texi: Begin adding `About TAME'
This does not include a great deal of information, but it is a start.

* README.md: Modernize.
* doc/Makefile.am (tame_TEXINFOS): Add `about.texi'.
* doc/about.texi: New file.
* doc/tame.texi: Include it.
2018-10-11 22:25:19 -04:00
Mike Gerwitz dc1d8036d6 build-aux/Makefile.am: .{PRECIOUS=>SECONDARY}
This will keep the intermediate files around but will still delete them on
build failure.

* build-aux/Makefile.am (.SECONDARY): Renamed from `.PRECIOUS'.
2018-10-11 22:25:19 -04:00
Mike Gerwitz 4442a3a3c2 bootstrap: New file
Please excuse the mess.  This was taken from an existing bootstrap script in
a private repository; it can be cleaned up in the future.

* bootstrap: New file.
* README.md (Getting Started): New section.
2018-10-11 22:25:19 -04:00
Mike Gerwitz 6027769633 Integrate new compilation scripts, remove cqueue and Makefile.2
This is a major step toward normalcy---removing the kluge of a build process
that was causing so many issues.  Rather than echoing all operations to a
queue file before passing it off to dslc, the new build scripts in `bin/'
are used to invoke tame normally, as needed.  This solves all of the current
issues with things not rebuilding when they should.  And, as a bonus, tab
completion on targets works.

Sorry this took so long.  There wasn't much motivation until we hired so
many people that are suffering from this.

This does a few major things, along with some miscellaneous others:
  - Invoke bin/tame directly;
  - Merge Makefile.2.in into Makefile.am; and
  - Fix up some targets.

* build-aux/Makefile.2.in: Delete file.  Mostly merged with Makefile.am.
* build-aux/Makefile.am: Add a bunch of new targets and definitions from
    Makefile.2.in.  Modify all that previously used .cqueue to now invoke
    `$(TAME)' directly.  Remove miscellaneous targets for trying to proxy
    targets to Makefile.2.
  (saneout, _go): Remove definitions.
  (.NOTPARALLEL): Add to prevent parallel builds.
  (ui/program.expanded.xml)[.version.xml]: Remove dependency for now.
  (clean): Also clean generated PHP files.  Follow symlinks to clean core.
    This is still incomplete (does not clean all rate table stuff).
  (suppliers.mk)[xmlo_cmd]: Remove.  See `gen-make' and `gen-c1make'.
  (lvroot)[summary-html]: New dependency.
  (kill-tamed, tamed-die): New targets (former alias of latter) to kill
    tamed.
* build-aux/gen-c1make: Generate `$(TAME)' invocation.
* build-aux/gen-make: Likewise.  Remove `xmlo_cmd' output.  Ignore recursive
    `tame' symlink (this can be removed once we clean `rater/' up).
* build-aux/m4/calcdsl.m4 (TAME): Update description to reflect that it
    should now be the path to `bin/tame'.  Adjust `AC_CHECK_FILE' lines
    accordingly.
  (tame_needed_ver): Remove.  We have been in the same repo as TAME itself
    for quite some time.  Remove associated code.
  (AC_CONFIG_FILES): Remove `Makefile.2'.
* src/current/src/com/lovullo/dslc/DslCompiler.java (_DslCompiler)[compile]:
    Perform validation before `compile' command rather than a separate
    `validate' step.  Remove `rm'.
  [compileSrc]: Stop echoing command.  This was only necessary because of
    the previous Makefile klugery; now Make echoes on its own correctly.
2018-10-11 22:25:18 -04:00
Mike Gerwitz 88da519c5e template.xsl: Remove eseq:expand-node function @override
* src/current/include/preproc/template.xsl (eseq:expand-node)[@override]:
  Remove attribute (deprecated by Saxon and unneeded).
2018-10-11 21:03:51 -04:00
Mike Gerwitz cf57857ce5 bin/: Server/client build scripts
These scripts allow the TAME compiler stack to be invoked naturally, rather
than requiring the use of a Makefile today.  This will not only allow users
to more easily invoke the compiler, but will also allow us to invoke TAME
naturally from Makefile and remove the klugery that has existed for so
long.

This uses a server/client architecture in order to mitigate the startup
cost of the JVM.  More documentation will follow.

Note that there are a bunch of symlinks in rater/---this is a transition
step to allow the build to continue working as it did before, which relies
on a directory structure that exists outside of this repository.  This will
be cleaned up in the future.

* .gitignore (bin/dslc): Add ignore for generated file.
* bin/dslc.in: New script to encapsulate Java invocation.
* bin/tame: New script (client).
* bin/tamed: New script (server).
* configure.ac (JAVA_OPTS, DSLC_CLASSPATH, AUTOGENERATED): New variables for
  dslc.in.  Output bin/dslc.
* rater/README.md: Note that this symlink mess is temporary.
* rater/c1map: New symlink for dslc assumptions.
* rater/c1map.xsl: Likewise.
* rater/calc.xsd: Likewise.
* rater/compile.xsl: Likewise.
* rater/compiler: Likewise.
* rater/dot.xsl: Likewise.
* rater/include: Likewise.
* rater/link.xsl: Likewise.
* rater/standalone.xsl: Likewise.
* rater/summary.xsl: Likewise.
* rater/tame: Likewise (warning: circular symlink).
* src/current/src/com/lovullo/dslc/DslCompiler.java (_DslCompiler)[compile]:
  Output `DONE' lines.
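A hedged sketch of the runner protocol these scripts implement; the run path,
runner id, and command line below are illustrative, and `bin/tame' itself is
the real client:

```bash
root=/run/user/$UID/tamed
id=0

# send one command line to the runner's stdin FIFO (command syntax illustrative)
echo "compile pkg.xml pkg.xmlo" > "$root/$id/0"

# echo runner output until it reports "DONE <exit code>"
while read -r line; do
    if [ "${line:0:5}" = "DONE " ]; then
        read -r _ code _ <<< "$line"
        exit "$code"
    fi
    echo "$line"
done < "$root/$id/1"
```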
2018-10-08 23:25:02 -04:00
Mike Gerwitz 4ad0c5d1be Include dslc Java build as submake
This will now automatically build on recursive target `all'.

* Makefile.am (SUBDIRS): Add `src/current/src'.
* src/current/src/Makefile: (.PHONY): Add `all'.
  (all): New target.  Alias to `dslc'.
2018-10-08 23:07:41 -04:00
Mike Gerwitz 7e69a0c2b6 build-aux/gen-make: Do not parse typelist as depfile 2018-10-04 16:46:48 -04:00
Mike Gerwitz b716e8c2cd csvm: Auto-sort expanded output 2018-10-03 14:44:55 -04:00
Mike Gerwitz 397710c055 csvm: Auto-sort expanded output
This will allow the variable abstractions to fully encapsulate values while
still permitting binary searches on sorted rows.

* csvm-expand: Renamed from `csvm2csv'.  Add directive support.
* csvm2csv: New script to perform sorting.  Invokes aforementioned.
* test/test-csvm2csv: Update for sorting.
2018-10-03 14:21:35 -04:00
412 changed files with 82090 additions and 9574 deletions

.gitignore

@ -7,6 +7,10 @@
*.info
# autotools- and configure-generated
/bin/dslc
/build-aux/install-sh
/build-aux/missing
/confdefs.h
/Makefile.in
/Makefile
/aclocal.m4
@ -23,3 +27,12 @@
# texinfo
/doc/*.fns
# generated by TAME build
suppliers.mk
run-[0-9].log
# binary data and profiling
a.out
perf.data

.gitlab-ci.yml

@ -1,27 +1,43 @@
image: lovullo/rater-ci
image: $BUILD_IMAGE
stages:
- check
- build
- deploy
before_script:
- apt-get update
- apt-get -y install --no-install-recommends texinfo texlive-latex-base
- apt-get -y install nodejs
- git submodule update --init --recursive
- git clone https://gitlab.com/mikegerwitz/hoxsl
release_check:
stage: check
script:
- build-aux/release-check
only:
- tags
build:
stage: build
script:
- git submodule update --init --recursive
- git clone https://gitlab.com/mikegerwitz/hoxsl
- export SAXON_CP=/usr/share/ant/lib/saxon9/saxon9he.jar
- autoreconf -fvi
- ./configure
- ( cd progtest && npm install && ./autogen.sh && ./configure )
- make all check info pdf html
- export HOXSL=hoxsl
- ./bootstrap
- make clean all check info pdf html
artifacts:
paths:
- doc/
- tamer/target/*/tamec
- tamer/target/*/tameld
- tamer/target/doc
expire_in: 30 min
build:doc:tpl:
image: $BUILD_IMAGE_TEXLIVE
stage: build
script:
- cd design/tpl/
- make
artifacts:
paths:
- design/tpl/tpl.pdf
expire_in: 30 min
pages:
@ -29,9 +45,28 @@ pages:
script:
- mkdir -p public/doc
- mv doc/tame.html/* doc/tame.pdf doc/tame.info public/
- mv tamer/target/doc public/tamer/
- mkdir -p public/design
- mv design/tpl/tpl.pdf public/design/
artifacts:
paths:
- public/
expire_in: 30 min
only:
- tags
- main
- stage
ci:merge:
stage: deploy
script:
- git config user.email "gitlab-ci@localhost"
- git config user.name "GitLab CI"
- git checkout main
- git reset --hard origin/main
- git merge --ff origin/stage
# Do not trigger the pipeline after pushing; there's no use in
# re-doing the work we just did, since the merge is a fast-forward.
- git push -o ci.skip http://ci:$STAGE_MERGE_ACCESS_TOKEN@$CI_SERVER_HOST/$CI_PROJECT_PATH.git HEAD:main
only:
- stage

.rev-xmle 100644

@ -0,0 +1,4 @@
# This number is incremented for every linker change to force rebuilding
# of xmle files.
5 # Removal of {ret,}map:___{head,tail}

.rev-xmlo 100644

@ -0,0 +1,7 @@
# This number is incremented for every compiler change to force rebuilding
# of xmlo files.
7
# 6: Return map omit preproc:sym/preproc:from
# 7: Add lv:package/@name for worksheet xmlo files

HACKING

@ -1,7 +1,7 @@
Hacking TAME
============
Copyright (C) 2018 R-T Speciality, LLC.
Copyright (C) 2012-2019 Ryan Specialty, LLC.
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,

Makefile.am

@ -1,6 +1,6 @@
## TAME Makefile
#
# Copyright (C) 2015 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -16,29 +16,31 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
##
SUBDIRS = doc progtest
SUBDIRS = tamer src/current/src doc progtest
path_src = src
path_test = test
path_aux = build-aux
# all source files will be run through hoxsl; see `applies' target
apply_src := $(shell find "$(path_src)" "$(path_test)" \
-name '*.xsl' \
-a \! -path "$(path_src)"/current/\* )
apply_src := $(path_src)/graph.xsl $(path_src)/symtable.xsl \
$(path_src)/symtable/symbols.xsl $(path_test)/graph-test.xsl
apply_dest := $(apply_src:%.xsl=%.xsl.apply)
# needed by test runner
export SAXON_CP
export DSLC_CLASSPATH
.DELETE_ON_ERROR:
.PHONY: check test texis applies FORCE
.PHONY: all all-nodoc check test texis applies FORCE
.DEFAULT_GOAL = all-nodoc
.DEFAULT_GOAL = all
all: applies progtest
all-nodoc: applies progtest
bin-local: applies
# the "applies" are hoxsl-generated stylesheets containing definitions to
# permit partial function application
@ -49,7 +51,7 @@ applies: $(apply_dest)
"$<" > "$@"
test: check
check: | applies
check-am: | applies
for test in $(path_aux)/test/test-*; do ./$$test || exit 1; done
$(path_test)/runner

README.md

@ -1,5 +1,5 @@
<!---
Copyright (C) 2015, 2016, 2017 LoVullo Associates, Inc.
Copyright (C) 2015-2022 Ryan Specialty Group, Inc.
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
@ -9,17 +9,18 @@
COPYING.FDL.
-->
# TAME
TAME is The Adaptive Metalanguage, a programming language and system of tools
TAME is The Algebraic Metalanguage, a programming language and system of tools
designed to aid in the development, understanding, and maintenance of systems
performing numerous calculations on a complex graph of dependencies,
conditions, and a large number of inputs.
This system was developed at LoVullo Associates to handle the complexity of
comparative insurance rating systems. It is a domain-specific language (DSL)
that itself encourages, through the use of templates, the creation of sub-DSLs.
TAME itself is at heart a calculator—processing only numerical input and
output—driven by quantifiers as predicates. Calculations and quantifiers are
written declaratively without concern for order of execution.
This system was developed at Ryan Specialty Group (formerly LoVullo Associates) to
handle the complexity of comparative insurance rating systems. It is a
domain-specific language (DSL) that itself encourages, through the use of
templates, the creation of sub-DSLs. TAME itself is at heart a
calculator—processing only numerical input and output—driven by quantifiers
as predicates. Calculations and quantifiers are written declaratively
without concern for order of execution.
The system has powerful dependency resolution and data flow capabilities.
@ -27,21 +28,13 @@ TAME consists of a macro processor (implementing a metalanguage), numerous
compilers for various targets (JavaScript, HTML documentation and debugging
environment, LaTeX, and others), linkers, and supporting tools. The input
grammar is XML, and the majority of the project (including the macro processor,
compilers, and linkers) are written in XSLT. There is a reason for that odd
choice; until an explanation is provided, know that someone is perverted enough
to implement a full compiler in XSLT.
More information will become available as various portions are liberated
during refactoring. [tame-core](https://github.com/lovullo/tame-core) is
TAME's core library, and [hoxsl](https://savannah.nongnu.org/projects/hoxsl)
was developed as a supporting library.
compilers, and linkers) is written in a combination of XSLT and Rust.
## "Current"
The current state of the project as used in production is found in
`src/current/`. The environment surrounding the development of this
project resulted in a bit of a mess, which is being refactored into
`src/` as it is touched. Documentation is virtually non-existent.
## TAMER
Due to performance requirements, this project is currently being
reimplemented in Rust. That project can be found in the [tamer/](./tamer/)
directory.
## Documentation
@ -54,6 +47,37 @@ instance. Available formats are:
- [Info][doc-info]
## Getting Started
To get started, make sure Saxon version 9 or later is available and its path
set as `SAXON_CP`; that the path to hoxsl is set via `HOXSL`; and then run
the `bootstrap` script:
```bash
$ export SAXON_CP=/path/to/saxon9he.jar
$ export HOXSL=/path/to/hoxsl/root
$ ./bootstrap
```
## Running Test Cases
To run the test cases, invoke `make check` (or its alias, `make test`).
##### Testing Core Features
In order to run tests located at `core/test/core/**`, a supporting environment
(e.g. a mega rater) is required.  Inside a supporting rater, either check out a
submodule containing the core tests, or temporarily add them into the
submodule.
Build the core test suite summary page using:
```bash
$ make rater/core/test/core/suite.html
```
Visit the summary page in a web browser and click the __Calculate Premium__
button. If all test cases pass, it will yield a value of __$1__.
## Hacking
Information for TAME developers can be found in the file `HACKING`.

RELEASES.md 100644

@ -0,0 +1,475 @@
TAME Release Notes
==================
This file contains notes for each release of TAME since v17.4.0.
TAME uses [semantic versioning][]. Any major version number increment
indicates that backwards-incompatible changes have been introduced in that
version. Each such version will be accompanied by notes that provide a
migration path to resolve incompatibilities.
[semantic versioning]: https://semver.org/
TAME developers: Add new changes under a "NEXT" heading as part of the
commits that introduce the changes. To make a new release, run
`tools/mkrelease`, which will handle updating the heading for you.
v19.1.0 (2022-09-22)
====================
Add gt/gte/lt/lte operators to if/unless
v19.0.3 (2022-04-01)
====================
Add upper/lower state code abbreviation
v19.0.2 (2022-03-07)
====================
This is a bugfix release that corrects applying param defaults via input
maps.
Compiler
--------
- Input maps using translations that fall back to `param/@default` will now
correctly apply that default.
- This was broken in the previous release v19.0.1.
v19.0.1 (2022-03-03)
====================
This is a bugfix release.
Compiler
--------
- Input maps will now ensure that non-numeric string values result in `0`
rather than `NaN`, the latter of which results in undefined behavior in
the new classification system.
v19.0.0 (2022-03-01)
====================
This version includes a backwards-incompatible change to enable the new
classification system for all packages, which was previously gated behind a
feature flag and the `_use-new-classification-system_` template. (This
template will remain for some time before being removed, but will emit
deprecation warnings.) This new system can be disabled for now by setting
`legacy-classify=true` using the new `TAME_PARAMS` configuration option or
Makefile variable.
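For example (build target illustrative), a legacy system can opt out for now
via the Makefile variable:

```bash
make TAME_PARAMS='legacy-classify=true' suppliers/foo.xmle
```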
Nightly Rust >= 1.53 is now required for TAMER.
Compiler
--------
- The new classification system is now enabled by default on all packages.
- Legacy systems may use `TAME_PARAMS` (see below) to append
`legacy-classify=true` to disable the new system, for now.
- The new system will consider `x/0=0`; see commit message for detailed
rationale on this change. This will also remove all `Infinity` and `NaN`
values from intermediate and return variables. These get serialized to
`null`s in JSON.
- `TAME_PARAMS`, now accepted by the `Makefile` and `configure` script, will
append `key=value` options to the XSLT-based compiler invocations.
- Input mappings will no longer emit the destination param as a dependency.
- `tamed --report` and `TAMED_TUI` for analyzing build performance.
- Runners now store start time and duration for each command, available in
the runpath for reporting.
- `TAMED_RUNTAB_OUT`, if set, will aggregate all runners' runtabs into a
single file as jobs are completed. See `tamed --help` for more
information and examples, and the sketch following this list.
- Improved symbol table processing performance.
- For packages/maps with thousands of dependencies, this may improve
processing time by a minute or more.
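A hedged example of the runtab aggregation mentioned above; the output path
and build target are illustrative.  Each row follows the runner runtab format
(start time in seconds, duration in milliseconds, command line):

```bash
export TAMED_RUNTAB_OUT=/tmp/tame-runtab.tsv
make suppliers/foo.xmle

# five slowest jobs by duration (column 2, milliseconds)
sort -t$'\t' -k2,2nr /tmp/tame-runtab.tsv | head -n5
```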
Documentation
-------------
- `@mdash` macro now formally defines an argument, correcting errors in
newer versions of Texinfo (~v6.7).
Linker
------
- Remove exception for input map dependency processing (now that compiler no
longer emits such a dependency).
- Reduce peak memory usage by clearing buffer of `xmlo` reader between
events.
- How effective this is varies depending on the size of individual
entities within the XML document. In some cases, it can reduce peak
memory usage by half.
Tools
=====
- `build-aux/check-coupling` will now prevent `supplier/` packages from
importing `ui/` packages; previously only the reverse was true.
v18.0.3 (2021-07-21)
====================
This release significantly improves the performance of executables
containing large constants, and fixes an optimization-related bug introduced
in v18.0.0.
Compiler
--------
- Place constants into static section in linked executable.
- This was the case in the old linker before the `tameld`
proof-of-concept. The benefits are significant when large constants are
used (e.g. for large tables of data).
- Do not report value list optimization error on duplicate conjunctive
predicates.
- This doesn't emit code any differently; it merely permits the situation,
which can occur in generated code.
v18.0.2 (2021-07-15)
====================
This is a bugfix release that corrects issues with the Summary Page compiler
and corrects behavior with the new classification system (that is currently
off unless explicitly requested).
Compiler
--------
- Make Summary Page less chatty.
- Fix incorrect package name for generated worksheet packages.
- Restrict `TRUE`-match optimization to classification matches (class
composition).
- This was mistakenly not considering the domain of the match, and
therefore was applying the optimization in situations where it should
not. Results of previous classifications are currently the only place
we guarantee a boolean value.
- Apply classification alias optimization to any `1`-valued constant match.
- Previously applied only to `TRUE`.
Summary Page
------------
- Correctly generate input fields for params using imported types.
- This is a long-standing (nearly 10-year-old) bug.
v18.0.1 (2021-06-24)
====================
This is a minor maintenance release.
Compiler
--------
- Remove internal notice when new system is used to emit code for a
particular classification.
v18.0.0 (2021-06-23)
====================
This release focuses primarily on compiler optimizations that affect runtime
performance (both CPU and memory). The classification system has undergone
a rewrite, but the new system is gated behind a template-based feature flag
`_use-new-classification-system_` (see Core below). Many optimizations
listed below are _not_ affected by this toggle.
Compiler
--------
- Numerous compiler optimizations including (but not limited to):
- Classification system rewrite with significant correctness and
performance improvements, with significantly less generated code.
- There is more work to be done in TAMER.
- This change is gated behind a feature toggle (see
`_use-new-classification-system_` in Core below).
- Significant reduction in byte count of JavaScript target output.
- Classifications with single-`TRUE` predicate matches are aliases and now
compile more efficiently.
- Classifications that are a disjunction of conjunctions with a common
predicate will have the common predicate hoisted out, resulting in more
efficient code generation.
- Classifications with equality matches entirely on a single param are
compiled into a `Set` lookup.
- Most self-executing functions in target JavaScript code have been
eliminated, resulting in a performance improvement.
- Floating point truncation now takes place without using `toFixed` in
JavaScript target, eliminating expensive number->string->number
conversions.
- Code paths are entirely skipped if a calculation predicate indicates
that it should not be executed, rather than simply multiplying by 0
after performing potentially expensive calculations.
- A bunch of wasteful casting has been eliminated, supplanted by proper
casting of param inputs.
- Unnecessary debug output removed, significantly improving performance in
certain cases.
- Single-predicate any/all blocks stripped rather than being extracted
into separate classifications.
- Extracted any/all classifications are inlined at the reference site when
the new classification system is enabled, reducing the number of
temporaries created at runtime in JavaScript.
- Summary Page now displays values of `lv:match/@on` instead of debug
values.
- This provides more useful information and is not subject to the
confusing reordering behavior of the compiler that is not reflected on
the page.
- Changes that have not yet been merged will remove debug values for the
classification system.
Core
----
- New feature flag template `_use-new-classification-system_`.
- This allows selectively enabling code generation for the new
classification system, which has BC breaks in certain buggy situations.
See `core/test/core/class` package for more information.
- Remove `core/aggregate`.
- This package is not currently utilized and is dangerous---it could
easily aggregate unintended values if used carelessly. Those who know
what they are doing can use `sym-set` if such a thing is a good thing
within the given context, and proper precautions are taken (as many
templates already do today).
Rust
----
- Version bump from 1.42.0 to 1.48.0 now that intra-doc links have been
stabilized.
Miscellaneous
-------------
- `build-aux/progtest-runner` will now deterministically concatenate files
based on name rather than some unspecified order.
v17.9.0 (2021-05-27)
====================
This is a documentation/design release, introducing The TAME Programming
Language in `design/tpl`.
Compiler
-------
- Allow the mapping of flag values from `program.xml`.
Design
------
- Introduce The TAME Programming Language.
v17.8.1 (2021-03-18)
====================
This release contains a bugfix for recent build changes in v17.8.0 that were
causing, under some circumstances, builds to fail during dependency
generation. It also contains minor improvements and cleanup.
Build System
------------
- [bugfix] Lookup tables will no longer build `rater/core/vector/table` when
generating the `xml` package.
- This was causing problems during `suppliers.mk` dependency generation.
The dependency is still in place for the corresponding `xmlo` file.
- This was broken by v17.8.0.
- Minor improvements to `tame` and `tamed` scripts to ensure that certain
unlikely failures are not ignored.
v17.8.0 (2021-02-23)
====================
This release contains changes to the build system to accommodate
liza-proguic's introduction of step-based packages (in place of a monolithic
`package-dfns.xml`), as well as miscellaneous improvements.
Compiler
--------
- `rater.xsd`, used for certain validations of TAME's grammar, has been
updated to an out-of-tree version; it had inadvertently gotten out of
date, and the discrepancy won't happen again in the future.
- Further, limits on the length of `@yields` identifiers have been
removed; the lack of namespacing and generation of identifiers from
templates can yield longer identifier names.
Build System
------------
- Only modify `.version.xml` timestamp when hash changes. This allows
its use as a dependency without forcefully rebuilding each and every time.
- `configure` will no longer immediately generate `suppliers.mk`.
- Additionally, `build-aux/suppmk-gen`, which `configure` directly invoked
until now, was removed in favor of generic rules in `Makefile.am`.
- Step-level imports in program definitions are now recognized to
accommodate liza-proguic's step-level package generation.
- Step-level program packages are now properly accounted for as dependencies
for builds.
- `supplier.mk` is now automatically regenerated when source files
change. This previously needed to be done manually when imports changed.
- `supplier.mk` generation will no longer be verbose (it'll instead be
only one line), which makes it more amenable to more frequent
regeneration.
v17.7.0 (2020-12-09)
====================
This release provides tail-call optimizations aimed at the query system in
core.
Compiler
--------
- [bugfix] Recursive calls using TCO will wait to overwrite their function
arguments until all expressions calculating the new argument values have
completed.
`tame-core`
-----------
- `mrange` is now fully tail-recursive and has experimental TCO applied.
- It was previously only recursive for non-matching rows.
v17.6.5 (2020-12-03)
====================
This release improves Summary Page performance when populating the page with
data loaded from an external source.
Summary Page
------------
- Populating the DOM with loaded data now runs in linear time.
v17.6.4 (2020-11-23)
====================
This release tolerates invalid map inputs in certain circumstances.
Compiler
--------
- Tolerate non-string inputs to `uppercase` and `hash` map methods.
v17.6.3 (2020-11-03)
====================
- Update the CDN used to get MathJax.
v17.6.2 (2020-10-01)
====================
- Optionally include a "program.mk" file if it is present in the project's root
directory. This allows us to move program specific tasks outside of TAME.
v17.6.1 (2020-09-23)
====================
Compiler
--------
- `lv:param-class-to-yields` will now trigger a failure rather than relying
on propagating bad values, which may not result in failure if the symbol
is represented by another type (non-class) of object.
Miscellaneous
-------------
- `package-lock.json` additions.
v17.6.0 (2020-08-19)
====================
This release provides a new environment variable for JVM tuning. It does
not provide any new compiler features or performance enhancements in itself,
though it enables optimizations through JVM tuning.
Compiler
--------
- The new environment variable `TAMED_JAVA_OPTS` can now be used to provide
arguments to the JVM. This feature was added to support heap ratio
tuning.
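A hedged example of the JVM tuning described above (flag values illustrative);
tamed must be restarted for new options to take effect:

```bash
export TAMED_JAVA_OPTS='-Xms512m -Xmx4g'
bin/tame --kill           # stop any running tamed
make suppliers/foo.xmle   # the next build starts tamed with the new options
```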
Miscellaneous
-------------
- `build-aux/lsimports` was causing Gawk to complain about the third
argument to `gensub`; fixed.
- `bootstrap` will test explicitly whether `hoxsl` is a symbolic link, since
`-e` fails if the symlink is broken.
v17.5.0 (2020-07-15)
====================
This release adds support for experimental human-guided tail call
optimizations (TCO) to resolve issues of stack exhaustion during runtime for
tables with a large number of rows after having applied the first
predicate. This feature should not be used outside of `tame-core`, and will
be done automatically by TAMER in the future.
`tame-core`
-----------
- `vector/filter/mrange`, used by the table lookup system, has had its
mutually recursive function inlined and now uses TCO.
- This was the source of stack exhaustion on tables whose predicates were
unable to filter rows sufficiently.
Compiler
--------
- Experimental guided tail call optimizations (TCO) added to XSLT-based
compiler, allowing a human to manually indicate recursive calls in tail
position.
- This is undocumented and should only be used by `tame-core`. The
experimental warning will be removed in future releases if the behavior
proves to be sound.
- TAMER will add support for proper tail calls that will be detected
automatically.
v17.4.3 (2020-07-02)
====================
This release fixes a bug caused by previous refactoring that caused
unresolved externs to produce an obscure and useless error for the end
user.
Linker
------
- Provide useful error for unresolved identifiers.
- This was previously falling through to an `unreachable!` block,
producing a very opaque and useless internal error message.
v17.4.2 (2020-05-13)
====================
This release adds GraphML output for linked objects to allow us to
inspect the graph.
Linker
------
- Add `--emit` option to `tamer/src/bin/tameld.rs` that allows us to specify
the type of output we want.
- Minor refactoring.
Miscellaneous
-------------
- Added `make` target to build linked GraphML files.
- Updated `make *.xmle` target to explicitly state it is emitting `xmle`.
- Added Cypher script to use in Neo4J after a GraphML file is imported.
- `RELEASES.md`
- Add missing link to semver.org.
- Fix `tame-core` heading, which was erroneously Org-mode-styled.
- Rephrase and correct formatting of an introduction paragraph.
v17.4.1 (2020-04-29)
====================
This release refactors the linker, adds additional tests, and improves
errors slightly. There are otherwise no functional changes.
Compiler
--------
- Refactor proof-of-concept dependency graph construction code.
- Improvements to error abstraction which will later aid in reporting.
Miscellaneous
-------------
- `RELEASES.md` added.
- `tools/mkrelease` added to help automate updating `RELEASES.md`.
- `build-aux/release-check` added to check releases.
- This is invoked both by `tools/mkrelease` and by CI via
`.gitlab-ci.yml` on tags.
v17.4.0 (2020-04-17)
====================
This release focuses on moving some code out of the existing XSLT-based
compiler so that the functionality does not need to be re-implemented in
TAMER. There are no user-facing changes aside from the introduction of two
new templates, which are not yet expected to be used directly.
`tame-core`
-----------
- New `rate-each` template to replace XSLT template in compiler.
- New `yields` template to replace XSLT template in compiler.
- Users should continue to use `rate-each` and `yields` as before rather
than invoking the new templates directly.
- The intent is to remove the `t` namespace prefix in the future so that
templates will be applied automatically.
Compiler
--------
- XSLT-based compiler now emits `t:rate-each` in place of the previous XSLT
template.
- XSLT-based compiler now emits `t:yields` in place of the previous XSLT
template.

View File

@ -1,7 +1,7 @@
#!/bin/bash
# Configuration script to be run before `make'
# Listen for TAME commands (compilers, linker, etc)
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -15,26 +15,28 @@
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# @AUTOGENERATED@
##
echo "Generating suppliers.mk..."
# TODO: kluge; do properly.
rm -f suppliers.mk
make suppliers.mk
declare -r mypath=$( dirname "$( readlink -f "$0" )" )
declare -r dslc_jar="$mypath/../src/current/src/dslc.jar"
# XXX: paths are hard-coded here!
while read csv; do
csvbase="${csv%%.*}"
echo "$csvbase.xmlo: $csvbase.xml"
echo "$csvbase.xml: $csvbase.csvo"
done < <( find suppliers common -regex '^.+\.csv.?$' ) \
>> suppliers.mk
# TODO: decouple from old rater/ directory
rater-path()
{
# use rater/ in cwd if available to maintain previous behavior
if [ -d "$(pwd)/rater" ]; then
echo "$(pwd)/rater"
return
fi
while read tdat; do
tbase="${tdat%%.*}"
echo "$tbase.xmlo: $tbase.xml rater/core/tdat.xmlo"
echo "$tbase.xml: $tdat"
echo -e "\trater/tools/tdat2xml \$< > \$@"
done < <( find suppliers common -regex '^.+territories/.+\.dat$' ) \
>> suppliers.mk
# otherwise use our own
echo "$mypath/../rater"
}
CLASSPATH="$CLASSPATH:@DSLC_CLASSPATH@:$dslc_jar" \
"@JAVA@" @JAVA_OPTS@ $JAVA_OPTS \
com.lovullo.dslc.DslCompiler \
"$( rater-path )"

bin/tame 100755

@ -0,0 +1,536 @@
#!/bin/bash
# Client for TAME daemon (tamed)
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
##
set -euo pipefail
declare mypath; mypath=$( dirname "$( readlink -f "$0" )" )
readonly mypath
declare -ri EX_NOTAMED=1 # tried to start tamed or runner but failed
declare -ri EX_STALLED=2 # runner stalled and could not recover
declare -ri EX_NORUN=3 # no available runners
declare -ri EX_DLOCK=4 # failed to get a lock to start tamed
declare -ri EX_BLOCK=5 # failed to get a lock for busy runner check
declare -ri EX_NODONE=6 # tamed did not provide a DONE with exit code
declare -ri EX_USAGE=64 # incorrect usage; sysexits.h
# maximum amount of time in seconds to wait for runner to ack
# before forcibly restarting it
declare -ri TAME_CMD_WAITTIME="${TAME_CMD_WAITTIME:-3}"
# propagate to daemon
export TAMED_STALL_SECONDS
export TAMED_SPAWNER_PID
export TAMED_JAVA_OPTS
# Send a single command to the next available runner and
# observe the result
#
# See `command-runner' for more information.
command-available-runner()
{
local -r root="${1?Missing root run path}"
shift 1
local -r id=$( reserve-runner "$root" )
test -n "$id" || {
echo "no available runners at $root" >&2
exit $EX_NORUN
}
command-runner "$id" "$root" "$@" \
| tee -a "run-$id.log"
}
# Send a single command to a runner and observe the result
#
# stdin will be directed to the runner. stdout of the runner will be
# echoed until a line beginning with "DONE" is found, after which this
# procedure will return with the exit code indicated by the runner.
command-runner()
{
local -ri id="${1?Missing id}"
local -r root="${2?Missing root run path}"
shift 2
local -r base="$root/$id"
local -ri pid=$( cat "$base/pid" )
verify-runner "$base" "$id" "$pid"
# forward signals to runner so that build is actually halted
# (rather than continuing in background after we die)
trap 'kill -TERM $pid &>/dev/null' INT TERM
# log the provided command line and starting time so that we can determine
# what is currently being compiled and how long it is taking
millis > "$base/cmdstart"
echo "$*" > "$base/cmdline"
# all remaining arguments are passed to the runner
echo "$*" > "$base/0"
# we should immediately get a response from the runner;
# if not, then it may have stalled for some reason
verify-runner-ack "$*" < "$base/1" || {
echo "warning: failed runner $id ack; requesting reload" >&2
kill -HUP "$pid"
# give some extra time in case the host is under high load
sleep "$TAME_CMD_WAITTIME"
# try one last time
echo "$*" > "$base/0"
verify-runner-ack "$*" < "$base/1" || {
echo "error: runner $id still unresponsive; giving up" >&2
exit "$EX_STALLED"
}
}
# output lines from runner until we reach a line stating "DONE"
while read -r line; do
# don't parse words in the initial read because we may be
# dealing with a lot of lines
if [ "${line:0:5}" == "DONE " ]; then
read -r _ code _ <<< "$line"
runtab-append "$base"
mark-available "$base"
return "$code"
fi
echo "$line"
done < "$base/1"
# We should have returned as soon as we received DONE. If this was not
# provided, then something probably went wrong (e.g. JVM crash).
return "$EX_NODONE"
}
# Get id of the first available runner and mark it as busy
#
# If no runners are available, tamed is signalled to spawn a new one.
#
# This command calls `mark-busy' so that it can acquire a runner in an
# atomic manner. The caller is responsible for invoking `mark-available'
# after processing is complete.
#
# If no runner is available, then the result will be empty.
reserve-runner()
{
local -r root=${1?Missing root}
local -r timeout=10
(
flock -w $timeout 7 || {
echo "error: failed to acquire busy lock at $root" >&2
exit $EX_BLOCK
}
# grab the first available or request a new one
local id; id=$( get-available-runner-id "$root" )
if [ -z "$id" ]; then
id=$( spawn-runner-and-wait "$root" ) || {
echo "error: failed to reserve runner at $root" >&2
exit $EX_NORUN
}
fi
# mark it as busy while we still have the lock
mark-busy "$root/$id"
echo "$id"
) 7>"$root/busy-lock"
}
# Get the id of the next available runner
#
# THIS FUNCTION MUST BE GUARDED BY A MUTEX! Otherwise there is a race
# between acquiring the available id and then actually making use of it.
#
# If multiple runners are available, then the first available runner sorted
# numerically will be chosen. This helps to give the same runners more
# work, since they're more likely to have source (and compiled) already
# parsed in memory. As such, runners will have load disproportionately
# spread, and may exhibit large variances in resource consumption.
#
# Sorting numerically is done because globbing sorts lexically---if runner
# 10 is spawned, then it would find itself after "1" in the list rather than
# after runner "9".
#
# If all runners are busy, then nothing will be returned.
get-available-runner-id()
{
local -r root=${1?Missing root}
grep -l 0 "$root"/*/busy \
| awk -F/ '{ print $(NF-1) }' \
| sort -n \
| head -n1
}
# Tell tamed to spawn a new runner and output the new runner id
#
# THIS FUNCTION MUST BE GUARDED BY A MUTEX! Otherwise there is a race
# between signaling and reading from `maxid'.
#
# This sends USR1 to tamed indicating that the next available runner should
# be spawned, and then waits on that expected runner. See `wait-for-runner'
# for more information on waiting.
spawn-runner-and-wait()
{
local -r root=${1?Missing root}
local -r pid=$( < "$root/pid" )
local -ri maxid=$( < "$root/maxid" )
# request runner
kill -USR1 "$pid"
# wait on the expected id
local -ri nextid=$(( maxid + 1 ))
wait-for-runner "$root" "$nextid"
echo "$nextid"
}
# Mark a runner as busy (unable to accept new commands)
#
# Once work is done, use `mark-available' to undo this operation.
mark-busy()
{
local -r base=${1?Missing runner base path}
echo 1 > "$base/busy"
}
# Mark a runner as available (able to accept new commands)
#
# Once work is available, use `mark-busy' to undo this operation.
mark-available()
{
local -r base=${1?Missing runner base path}
echo 0 > "$base/busy"
echo idle > "$base/cmdline"
# this can be used to determine how long the worker has been idle
millis > "$base/cmdstart"
}
# Output seconds and milliseconds, space-delimited
millis()
{
local date
date=( $(date '+%s %N') )
# %N returns nanoseconds and it may be 0-prefixed, which would be
# interpreted as octal without the explicit base specification
echo "${date[0]}" "$(( 10#"${date[1]}" / 1000000 ))"
}
# Append data to the runner table (runtab)
#
# This takes information about the most recently executed command and
# appends it to a table representing the work that the runner has
# done. This should be done at the end of processing a particular job but
# before marking the runner as available using `mark-available'.
#
# The columns of this report are, tab-delimited:
# 1. Start date (Unix timestamp, seconds);
# 2. Duration (milliseconds); and
# 3. Runner command line
runtab-append()
{
local -r base=${1?Missing runner base path}
local cmd duration
local -a cmdstart now
cmd=$(< "$base/cmdline")
cmdstart=( $(< "$base/cmdstart") )
now=( $(millis) )
# duration consists of seconds and nanoseconds; let's just deal with
# milliseconds, since any greater precision is not useful to us with how
# slow the system is today, and convert it into a decimal for
# reporting. Nanoseconds may be 0-prefixed, which will be interpreted as
# octal without an explicit base specification.
duration=$((
((now[0] * 1000) + now[1])
- ((cmdstart[0] * 1000) + cmdstart[1])
))
# the duration is in milliseconds
printf "%d\t%s\t%s\n" "$cmdstart" "$duration" "$cmd" >> "$base/runtab"
}
# Verify that a runner is available
#
# If the runner is offline or not owned by $UID, then exit with
# a non-zero status.
verify-runner()
{
local -r base="${1?Missing base}"
local -ri id="${2?Missing id}"
local -ri pid="${3?Missing pid}"
ps "$pid" &>/dev/null || {
echo "error: runner $id ($pid) is offline!" >&2
exit "$EX_NOTAMED"
}
test -O "$base/0" || {
echo "error: runner $id ($pid) is not owned by $USER!" >&2
exit "$EX_NOTAMED"
}
}
# Wait for command acknowledgment from runner
#
# The runner must respond within TAME_CMD_WAITTIME seconds
# and must echo back the command that was given. Otherwise,
# this function returns with a non-zero status.
verify-runner-ack()
{
local -r cmd="${1?Missing command}"
read -t"$TAME_CMD_WAITTIME" -r ack || return
test "COMMAND $cmd" == "$ack" || {
# TODO check for ack mismatch once output race condition is fixed
:
}
}
# Wait somewhat impatiently for a runner
#
# Assumes that the runner is ready once the pidfile becomes
# available. Polls for a maximum of six seconds before giving up
# and exiting with a non-zero status.
wait-for-runner()
{
local -r root=${1?Missing root}
local -r id=${2?Missing runner id}
# we could use inotify, but that is not installed by default
# on Debian systems, so let's just poll rather than introduce
# another dependency (give up after 6 seconds)
local -i i=12
while test $((i--)) -gt 0; do
test ! -f "$root/$id/pid" || return 0
sleep 0.5
done
# still not available
echo "error: runner $id still unavailable; giving up" >&2
exit "$EX_NOTAMED"
}
# Attempts to start tamed if it's not already running
#
# This is designed to be safe for parallel builds by allowing only the first
# process to start tamed and hanging the others until spawning is complete.
#
# See `_start-tamed' for more information.
start-tamed-safe()
{
local -r root=${1?Missing root}
local -ri timeout=5
local -r guard="$root-guard"
mkdir -p "$( dirname "$root" )"
(
flock -w $timeout 6 || {
echo "error: failed to acquire tamed spawning lock at $root" >&2
exit $EX_DLOCK
}
_start-tamed "$root"
flock -u 6
rm -f "$guard"
) 6>"$guard"
}
# Start tamed if it is not already running
#
# If tamed is already running, nothing will happen; otherwise, start
# tamed and wait impatiently for the runner to become available.
#
# Even if tamed is started, wait for runner 0 to become available;
# this ensures that tamed is initialized even if this script is run
# after tamed is started but before it has fully come online (e.g
# parallel make).
_start-tamed()
{
local -r root="${1?Missing root}"
local -ri pid=$( cat "$root/pid" 2>/dev/null )
ps "$pid" &>/dev/null || {
echo "starting tamed at $root..."
# tell tamed to clean up so that we eliminate race conditions
# with wait-for-tamed (this will also kill any stray processes
# that a previous tamed may have spawned but didn't get the
# chance to clean up)
kill-tamed "$root" || true
# start tamed and allow it to persist for future commands
"$mypath/tamed" "$root" & disown
}
# wait for tamed even if it was already started (just in
# case this script was executed right after tamed started
# but before it is done initializing)
wait-for-runner "$root" 0
}
# Kill tamed
#
# Ask tamed to kill itself.
kill-tamed()
{
local -r root="${1?Missing root}"
"$mypath/tamed" --kill "$root"
}
# Filter dslc output to essential information
#
# The original output of dslc is quite noisy; this filters it down
# to only errors and warnings.
#
# Eventually, dslc ought to be modified to handle filtering its own
# output rather than wasting cycles doing this filtering.
saneout()
{
# the final line clears the entire line before outputting in an attempt to
# better accommodate the runner status line from tamed; this can be
# removed once the Makefile properly takes up this task.
awk '
/^~~~~\[begin /,/^~~~~\[end / { next }
/^rm / { next }
/^COMMAND / { next }
/^Exception|^\t+at / {
if ( /^E/ ) {
print;
print "Stack trace written to run-*.log";
}
next;
}
/([Ww]arning|[Nn]otice)[: ]/ { printf "\033[0;33m"; w++; out=1; }
/[Ff]atal:/ { printf "\033[0;31m"; out=1; }
/!|[Ee]rror:/ { printf "\033[0;31m"; e++; out=1; }
/internal:/ { printf "\033[0;35m"; out=1; }
/internal error:/ { printf "\033[1m"; out=1; }
/^[^[]/ || out { print; printf "\033[0;0m"; out=0; }
' | sed 's/^/\x1b[2K\r/'
}
# Output usage information and exit
usage()
{
cat <<EOF
Usage: $0 [-v|--verbose] cmdline
Or: $0 --kill
Send command line CMDLINE to a tamed runner. Start tamed if
not already running.
If a runner does not acknowledge a request in TAME_CMD_WAITTIME
seconds, it will be reloaded and given TAME_CMD_WAITTIME seconds
to come online. After that time has elapsed, the command will
be re-attempted, timing out again after TAME_CMD_WAITTIME and
at that point giving up.
The first available runner sorted numerically will be
chosen. This helps to give the same runners more work,
since they're more likely to have source (and compiled)
already parsed in memory. As such, runners will have load
disproportionately spread, and may exhibit large variances
in resource consumption.
If all runners are busy, then a new runner will be spawned,
allowing for parallel builds.
Options:
--help show this message
--kill kill tamed
-v, --verbose show runner logs
Environment Variables:
TAME_VERBOSE when greater than zero, show runner logs
(see also --verbose)
TAME_CMD_WAITTIME number of seconds to wait for ack from
runner (default 3)
EOF
exit $EX_USAGE
}
# Run tame
main()
{
local -r root=/run/user/$UID/tamed
local outcmd=saneout
test $# -gt 0 || usage
case "${1:-}" in
--kill) kill-tamed "$root"; exit;;
-v|--verbose) outcmd=cat; shift;;
--help) usage;;
esac
# alternative to --verbose
if [ "${TAME_VERBOSE:-0}" -ge 1 ]; then
outcmd=cat
fi
start-tamed-safe "$root"
# for now we only support a single runner
command-available-runner "$root" "$@" \
| "$outcmd"
}
main "$@"

bin/tamed 100755

@ -0,0 +1,672 @@
#!/bin/bash
# Daemon for accepting TAME commands (compilers, linker, etc)
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
##
set -euo pipefail
declare mypath; mypath=$( dirname "$( readlink -f "$0" )" )
readonly mypath
declare -ri EX_RUNNING=1
declare -ri EX_NOTRUNNING=2 # tamed is not running
declare -ri EX_RUNTAB_LOCK=3 # failed to acquire aggregate runtab lock
declare -ri EX_RUNTAB_OUT=4 # failed to write to aggregate runtab
declare -ri EX_USAGE=64 # incorrect usage; sysexits.h
declare -ri EX_CANTCREAT=73 # cannot create file; sysexits.h
# number of seconds of output silence before runners are considered unused
# and are subject to termination (see stall-monitor)
declare -ri TAMED_STALL_SECONDS="${TAMED_STALL_SECONDS:-1}"
# id of process that indirectly spawned tamed (default $PPID)
declare -ri TAMED_SPAWNER_PID="${TAMED_SPAWNER_PID:-$PPID}"
# options to pass to JVM via dslc
declare -r TAMED_JAVA_OPTS="${TAMED_JAVA_OPTS:-}"
export JAVA_OPTS="$TAMED_JAVA_OPTS"
# set by `main', global for `cleanup' and `runner-report-all'
declare root=
# non-empty if in TUI (terminal UI) mode (use `in-tui-mode')
declare -r TAMED_TUI="${TAMED_TUI:-}"
declare tui_mode=
# file into which aggregate runner report will be placed (none if empty)
declare -r TAMED_RUNTAB_OUT="${TAMED_RUNTAB_OUT:-}"
# Create FIFOs for runner
#
# The FIFOs are intended to be attached to stderr and stdout
# of the runner and will be created relative to the given
# root path ROOT.
#
# If a FIFO cannot be created, exit with EX_CANTCREAT.
mkfifos()
{
local -r root="${1?Missing root path}"
mkdir -p "$root"
# note that there's no stderr; see `add-runner'
for n in 0 1; do
rm -f "$root-$n"
mkfifo -m 0600 "$root/$n" || {
log "fatal: failed to create FIFO at $root/n" >&2
exit $EX_CANTCREAT
}
done
# keep FIFOs open so we don't get EOF from writers
tail -f >"$root/0" &
}
# Output a line, clearing the remainder of the line if in TUI mode
log()
{
if in-tui-mode; then
echo -en "\e[2K"
fi
echo "$@"
}
# Spawn a new runner using the next available runner id
#
# See `spawn-runner' for more information.
spawn-next-runner()
{
local -r root="${1?Missing root path}"
# get the next available id
local -ri id=$( < "$root/maxid" )
spawn-runner "$(( id + 1 ))" "$root"
}
# Spawn a runner
#
# A new runner is created by spawning dslc and attaching
# new FIFOs under the given id ID relative to the given
# run path ROOT. The PID of the runner will be stored
# alongside the FIFOs in a pidfile `pid'.
spawn-runner()
{
local -ri id="${1?Missing id}"
local -r root="${2?Missing root run path}"
local -r base="$root/$id"
mkfifos "$base"
# flag as available (the client will manipulate these)
echo 0 > "$base/busy"
# runtab is used for reporting, which we will optionally aggregate
> "$base/runtab"
monitor-runner-runtab "$root" "$base/runtab" &
# monitor runner usage and kill when inactive
stall-monitor "$base" &
# loop to restart runner in case of crash
while true; do
declare -i job=0
trap 'kill -INT $job' HUP
"$mypath/dslc" < "$base/0" &> "$base/1" & job=$!
declare -i status=0
wait -n 2>/dev/null || status=$?
echo "warning: runner $id exited with code $status; restarting" >&2
done &
echo "$!" > "$base/pid"
# we assume that this is the new largest runner id
echo "$id" > "$root/maxid"
log "runner $id ($!): $base"
}
# Monitor the given runner runtab and append to the aggregate runtab
#
# The aggregate runtab is append-only and has a row-level lock to support
# concurrent writes without having to rely on kernel buffering.
monitor-runner-runtab()
{
local -r root="${1?Missing root run path}"
local -r runtab="${2?Missing runtab path}"
# no use in aggregating if it was not requested
test -n "$TAMED_RUNTAB_OUT" || return 0
while ! spawner-dead; do
# this is a shared file, and while buffering _should_ be sufficient, we
# may as well avoid potential headaches entirely by locking during the
# operation
tail -f "$runtab" | while read -r row; do
# we want to lock _per row write_, since output will be interleaved
# between all the runners
(
local -ri timeout=3
flock -w $timeout 7 || {
echo "error: failed to acquire lock on aggregate runtab" >&2
exit $EX_RUNTAB_LOCK
}
echo "$row" >&7
) 7>> "$TAMED_RUNTAB_OUT"
done
done
}
# Check that we can write to the provided runtab, and clear it
runtab-check-and-clear()
{
test -n "$TAMED_RUNTAB_OUT" || return 0
# clear the runtab, and see if we can write to it
>"$TAMED_RUNTAB_OUT" || {
echo "error: unable to write to '$TAMED_RUNTAB_OUT' (TAMED_RUNTAB_OUT)"
exit $EX_RUNTAB_OUT
}
echo "tamed: aggregating runner runtabs into '$TAMED_RUNTAB_OUT'"
}
# Kill runner at BASE when it becomes inactive for TAMED_STALL_SECONDS
# seconds
#
# This monitors the modification time on the stdout FIFO. stdin does not
# need to be monitored since dslc immediately echoes back commands it
# receives.
#
# dslc is pretty chatty at the time of writing this, so TAMED_STALL_SECONDS
# can easily be <=30s even for large packages. This may need to change in
# the future if dslc becomes significantly less chatty. Increase that environment
# variable if runners stall unexpectedly in the middle of builds.
#
# If the id of the spawning process has been provided then we will never
# consider ourselves to be stalled if that process is still running. This
# prevents, for example, tamed from killing itself while a parent make
# process is still running.
stall-monitor()
{
local -r base="${1?Missing base}"
# monitor output FIFO modification time
while true; do
local -i since last
since=$( date +%s )
sleep "$TAMED_STALL_SECONDS"
last=$( stat -c%Y "$base/1" )
# keep waiting if there has been activity since $since
test "$last" -le "$since" || continue
spawner-dead || continue
# no activity; kill
local -r pid=$( cat "$base/pid" )
kill "$pid"
wait "$pid" 2>/dev/null
# this stall subprocess is no longer needed
break
done
}
# Check to see if the spawning process has died
#
# If no spawning process was provided, then this always returns a zero
# status. Otherwise, it returns whether the given pid is _not_ running.
spawner-dead()
{
test "$TAMED_SPAWNER_PID" -gt 0 || return 0
! ps "$TAMED_SPAWNER_PID" &>/dev/null
}
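For example (a hedged usage sketch, not part of this script), a build driver can tie runner lifetime to itself by exporting its own pid and relaxing the stall window:
    # hypothetical invocation from a build script: runners are never treated as
    # stalled while this shell (e.g. the one driving make) is alive, and are
    # otherwise reaped after 30 seconds of output silence
    TAMED_SPAWNER_PID=$$ TAMED_STALL_SECONDS=30 bin/tamed &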
# Exit if tamed is already running at path ROOT
#
# If tamed is already running at ROOT, exit with status
# EX_RUNNING; otherwise, do nothing except output a warning
# if a stale pid file exists.
abort-if-running()
{
local -r root="${1?Missing root rundir}"
local -ri pid=$( cat "$root/pid" 2>/dev/null )
test "$pid" -gt 0 || return 0
! ps "$pid" &>/dev/null || {
log "fatal: tamed is already running at $root (pid $pid)!" >&2
exit $EX_RUNNING
}
test -z "$pid" || {
log "warning: clearing stale tamed (pid $pid)" >&2
}
}
# Exit with EX_NOTRUNNING if tamed is not running at path ROOT
#
# ROOT must both exist and contain a `pid` file of a running process.
abort-if-not-running()
{
local -r root="${1?Missing root rundir}"
test -d "$root" || {
log "tamed is not running at $root: path does not exist" >&2
exit $EX_NOTRUNNING
}
local -ri pid=$( cat "$root/pid" 2>/dev/null )
# this should not happen unless bash crashed
ps "$pid" &>/dev/null || {
log "tamed is not running at $root: process $pid has terminated" >&2
exit $EX_NOTRUNNING
}
}
# Kill running tamed at path ROOT
#
# If no pidfile is found at ROOT, do nothing. This sends a
# signal only to the parent tamed process, _not_ individual
# runners; the target tamed is expected to clean up itself.
# Consequently, if a tamed terminated abnormally without
# cleaning up, this will not solve that problem.
#
# Note that this is also called by tame to clean up an old tamed
# before spawning a new one.
kill-running()
{
local -r root="${1?Missing root}"
test -d "$root" || return 0
local -r pid=$( < "$root"/pid 2>/dev/null )
test -n "$pid" || return 0
log "killing tamed at $root ($pid)..."
kill "$pid"
}
runner-report-all()
{
local -r root="${1?Missing root}"
abort-if-not-running "$root"
for-each-runner "$root" runner-report
}
for-each-runner()
{
local -r root="${1?Missing root}"
local -r cmd="${2?Missing command}"
shift 2
local -ri maxid=$(cat "$root/maxid")
echo "tamed is running at $root with $((maxid+1)) runner(s)"
for runner in $(seq 0 "$maxid"); do
echo
"$cmd" "$root" "$@" "$runner"
done
}
# Report on the status and current operation of each runner
#
# This report is generated by tamed rather than delegating to the runners
# themselves to avoid the complexity of mitigating output races.
runner-report()
{
local -r root="${1?Missing root}"
local -ri id="${2?Missing runner id}"
local -r path="$root/$id"
test -f "$path/cmdline" || return 0
local cmdline=$(< "$path/cmdline" )
local -a cmdstart cmdstart_fmt
cmdstart=( $(< "$path/cmdstart" ) )
cmdstart_fmt=$(date --date=@"${cmdstart[0]}" +%Y-%m-%dT%H:%M:%S)
local -i now=$(date +%s)
cat <<EOF
runner: $id
command: $cmdline
start: ${cmdstart[0]}.${cmdstart[1]} ($cmdstart_fmt)
elapsed: $((now - cmdstart)) seconds
EOF
}
elide-paths()
{
local -r cols="${1?Missing columns}"
local -r buffer="${2?Missing buffer}"
# first, keep the first letter and last three of each dir, if doing so
# would remove three or more characters; for example:
# "suppliers/foobarbaz/quux/quuux.xmlo" => "s…ers/f…baz/quux/quuux.xmlo"
result=$(
echo "$buffer" \
| sed 's|\([a-zA-Z0-9_-]\)[a-zA-Z0-9_-]\{3,\}\([a-zA-Z0-9_-]\{3\}\)/|\1…\2/|g'
)
[ "${#result}" -gt $cols ] || {
echo -n "$result"
return
}
# more aggressive: remove all but the first letter if it would save at
# least three characters, as in:
# "suppliers/foobarbaz/quux/quuux.xmlo" => "s…/f…/quux/quuux.xmlo"
result=$(
echo "$buffer" | sed 's|\([a-zA-Z0-9_-]\)[^ /]\{3,\}/|\1…/|g'
)
[ "${#result}" -gt $cols ] || {
echo -n "$result"
return
}
# even more aggressive: elide all but the filename, as in:
# "suppliers/foobarbaz/quux/quuux.xmlo" => "…/quuux.xmlo"
result=$(
echo "$buffer" | sed 's|[a-zA-Z0-9_-/]*/|…/|g'
)
[ "${#result}" -gt $cols ] || {
echo -n "$result"
return
}
# at this point, it's better to provide _some_ useful information for
# _some_ runners, so just truncate the previous result (we probably have
# too many runners for the current terminal width)
echo -n "${result::$((cols-1))}…"
}
# Report of all runners' status on a single line
#
# Idle runners are omitted for now; this increases the likelihood that we
# output nothing once runners have finished their jobs, which in turn reduces
# the chance of overwriting the PS1.
runner-report-line() {
local -r root="${1?Missing root}"
# buffer output so that our report does not get mixed with normal
# runner output
local buffer=$( runner-report-all "$root" | awk '
/^command: idle/,/^$/ { next } # skip idle
/^command:/ { printf "[%s ", $NF } # e.g. "[foo/bar.xmlo "
/^elapsed:/ { printf "%ds] ", $2 } # e.g. "2s] "
' )
# ensure proper empty output without formatting if there is no line
test -n "$buffer" || return 0
# bash has checkwinsize, but that runs after every command; try to use
# tput, defaulting to 80. Note that we have to check this every time, in
# case the terminal has been resized.
local -ri cols=$(tput cols || echo 80)
# rather than worrying about line wrapping, fit to one line
if [[ "${#buffer}" -gt $cols ]]; then
buffer=$(elide-paths $cols "$buffer")
fi
# output in bold, overwrite our line that may already be present here, and
# place cursor at beginning of the line so any runner output will
# overwrite
echo -en "\e[1m$buffer\e[0m\r"
}
# Clean up child processes before exit
#
# This should be called before exit (perhaps by a trap). Kills
# the entire process group.
#
# Do not attach this to a SIGTERM trap or it will infinitely
# recurse.
cleanup()
{
rm -rf "$root"
kill 0
}
# Output usage information and exit
usage()
{
cat <<EOF
Usage: $0 [--kill] [runpath]
Start tamed and runners. Do not fork into background process.
The default value of RUNPATH is \`/run/user/$UID/tamed'.
Only one runner is currently supported. tamed exits once all
runners have terminated. Runners will be killed once they are
inactive for at least TAMED_STALL_SECONDS (default 1), unless
the process identified by TAMED_SPAWNER_PID is still running.
For example, a build script may wish to set TAMED_SPAWNER_PID
to the process id of make itself. It defaults to the actual
parent process id (PPID), so tamed will not kill itself if
run manually on a shell (unless the shell exits first).
TAMED_RUNTAB_OUT can specify a file in which to write job
start times (as seconds from the Unix epoch); durations
(in milliseconds); and commands from each of the runners.
The table is tab-delimited. Here are some useful examples:
# format nicely into columns and view in pager
$ column runtab | less
# sort by runtime descending (second column)
$ sort -rnk2 runtab
# take the runtime and command columns
$ cut -f2,3 runtab
# convert milliseconds into minutes (!) and sort desc
$ awk '{ \$2 = \$2 / 1000 / 60; print }' runtab | sort -nrk2
# convert to CSV (assuming no quoting is needed)
$ tr '\t' , < runtab > runtab.csv
Options:
--help show this message
--kill kill a running tamed at path RUNPATH
--report display runner report (this is subject to change
in later versions)
Environment Variables:
TAMED_STALL_SECONDS number of seconds of runner inactivity before
runner is automatically killed (default 1)
TAMED_SPAWNER_PID inhibit stalling while this process is running
(default PPID)
TAMED_JAVA_OPTS opts to pass to dslc, and in turn, the JVM
TAMED_TUI run in TUI mode (provide UI features like a
dynamic runner status line)
TAMED_RUNTAB_OUT file into which aggregate runner report will
be written (otherwise reports are only
available per-runner while tamed is running)
EOF
exit $EX_USAGE
}
# Determine whether to enable TUI mode
#
# TUI (terminal UI) mode will augment the output with features that only
# make sense when running on a user's terminal, such as the runner status
# line.
tui-check()
{
test "$TAMED_TUI" == 1 || return 0
tui_mode=1
log "tamed is running in TUI mode (TAMED_TUI=0 to disable)"
}
# Whether we're running in TUI mode
in-tui-mode()
{
test -n "$tui_mode"
}
# If in TUI mode, continuously update the last line of output with runner
# status
#
# This is not an easy undertaking with how our build process currently
# works. Make is responsible, currently, for echoing lines, and so we must
# frequently re-echo our status line in an attempt to redisplay the line
# after it is overwritten.
#
# Further, most output is unaware that the entire line needs to be
# overwritten; if output is not properly transformed in the Makefile, then
# portions of the status line may remain in the history, partly overwritten
# by build output.
#
# Another concern is that we do not want to keep outputting after the
# process is finished, which would overwrite the PS1. To try to avoid this,
# we omit idle runner output and only clear the line _once_ when the status
# line is empty, in the hope that all runners will be idle for long enough
# before the build completes, make exits, and the PS1 is output.
#
# If not in TUI mode, this does nothing.
tui-runner-status-line()
{
in-tui-mode || return 0
local cache= cleared=
while ! spawner-dead; do
# this will fail if no runners have been created yet, so just ignore
# it; if we fail to output the status line, the build will still work
cache=$(runner-report-line "$root" 2>/dev/null)
# if the line is empty, clear the output _once_ (to get rid of
# whatever was there before), but do not do it again, otherwise we
# risk overwriting lines post-build (like the PS1 or late-stage make
# targets).
if [ -z "$cache" -a -z "$cleared" ]; then
log -n ""
cleared=1
sleep 1
continue
fi
cleared=
# output the cache frequently to try to overcome build output
for i in {0..9}; do
log -n "$cache"
sleep 0.1
done
done
}
# Run tamed
main()
{
local kill= report=
case "${1:-}" in
--kill) kill=1; shift;;
--report) report=1; shift;;
--help) usage;;
esac
root="${1:-/run/user/$UID/tamed}"
# report requested
test -z "$report" || {
runner-report-all "$root"
exit
}
# kill if requested
test -z "$kill" || {
kill-running "$root"
exit
}
abort-if-running "$root"
tui-check
runtab-check-and-clear
# clean up background processes before we exit
trap exit TERM
trap cleanup EXIT
# start fresh
rm -rf "$root"; mkdir -p "$root"
local -i pid=$$
echo $pid > "$root/pid"
# start with a single runner; we'll spawn more if requested
spawn-runner 0 "$root"
trap "spawn-next-runner '$root'" USR1
# status line reporting on runners for TUI mode
tui-runner-status-line &
# wait for runners to complete or for a signal to be received by this
# process that terminates `wait'
while true; do
wait -n || {
status=$?
# ignore USR{1,2}
if [ $status -ne 138 -a $status -ne 140 ]; then
exit $status
fi
}
done
}
main "$@"

37
bootstrap 100755

@ -0,0 +1,37 @@
#!/bin/bash
# Bootstrap from source repository
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
##
set -euo pipefail
export SAXON_CP="${SAXON_CP?Missing path to saxon9he.jar}"
export RATER_CLASSPATH="${RATER_CLASSPATH:-$SAXON_CP}"
export HOXSL="${HOXSL?Missing path to hoxsl}"
test "${1:-}" = -n || git submodule update --init --recursive
(
cd progtest \
&& { which npm && npm install || true; } \
&& ./autogen.sh && ./configure
) \
&& ( cd tamer && ./bootstrap ) \
&& { test -e hoxsl || test -L hoxsl || ln -s ../hoxsl; } \
&& autoreconf -fvi \
&& ./configure \


@ -1,215 +0,0 @@
# @configure_input@
#
# Compiles packages written in the Calc DSL.
#
# Copyright (C) 2018 R-T Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Note that this build process is unconventional in order to avoid the startup
# costs that would be associated with executing dslc with each and every package
# (see the other Makefile for more information). Therefore, everything is
# written to .cqueue for later processing by dslc.
#
# The issue of re-building based on timestamps---which Make would normally take
# care of exclusively---must also be given special care now that we are handling
# the building separately from Make. Each enqueued request also touches the
# destination file to update its timestamp, ensuring that it is seen by Make as
# modified (as if it were compiled) and therefore will trigger the building of
# the targets that depend upon it. In the case of the object files (xmlo), a
# temporary file is created when it is enqueued. As part of the queued request
# for compilation is a request to delete this temporary file. In the event that
# the build fails, this temporary file will be seen and will force a rebuild of
# the file, despite its timestamp.
#
# The same issue does not exist for xmle, js, and html files, since they have
# linear dependency trees and dslc will rm the file on failure, which
# obliterates the timestamp.
##
path_rates := $(path_suppliers)/rates
path_map := map
path_c1map := $(path_map)/c1
path_dsl := rater
path_ui := ui
path_suppliers := suppliers
path_lv := lovullo
path_srv := srv
src_suppliers := $(wildcard $(path_suppliers)/*.xml)
src_map := $(wildcard $(path_map)/*.xml)
src_c1map := $(wildcard $(path_c1map)/*.xml)
dest_summary_html := $(patsubst \
$(path_suppliers)/%.xml, \
$(path_suppliers)/%.html, \
$(src_suppliers))
dest_standalone := $(patsubst \
$(path_suppliers)/%.xml, \
$(path_suppliers)/%.js, \
$(src_suppliers))
dest_map := $(patsubst \
$(path_map)/%.xml, \
$(path_map)/%.xmle, \
$(src_map))
dest_c1map := $(patsubst \
$(path_c1map)/%.xml, \
$(path_c1map)/%.php, \
$(src_c1map))
compiled_suppliers := $(src_suppliers:.xml=.xmlo)
linked_suppliers := $(src_suppliers:.xml=.xmle)
comma := ,
extless_supp_delim := $(subst .xml,,$(subst .xml ,$(comma),$(src_suppliers)))
cqueue=.cqueue
ant = @ANT@ -e
.DELETE_ON_ERROR:
.PHONY: default clean \
interp-rate-tables summary-html c1map \
standalones program-ui program-ui-immediate program-data-copy \
do-build version FORCE
# these files will never be deleted when Make considers them to be intermediate
# (e.g. when building summary pages), since they are still needed or take a
# while to build
.PRECIOUS: %.js %.xml %.xmle %.xmlo
SHELL = /bin/bash -O extglob
default: program-ui c1map FORCE
program-ui: standalones ui/package.js ui/Program.js program-ui-immediate
program-ui-immediate: ui/html/index.phtml
include suppliers.mk
# starts with a fresh cqueue
prexmlo:
@>$(cqueue)
summary-html: $(dest_summary_html) ;
%.html: %.js
@echo "summary $*.xmle $@" >>.cqueue
@touch $@
standalones: $(dest_standalone)
%.xmle: %.xmlo
@echo "link $< $@" >>.cqueue
@touch $@
%.js: %.xmle
@echo "standalone $< $@" >>.cqueue
@touch $@
# C1 XML (specific recipes are in suppliers.mk)
c1map: $(dest_c1map)
%.dot: %.xmlo
@echo "dot $< $@" >> .cqueue
%.dote: %.xmle
@echo "dot $< $@" >> .cqueue
%.svg: %.dote
dot -Tsvg "$<" > "$@"
%.svg: %.dot
dot -Tsvg "$<" > "$@"
%.xml: %.dat
rater/tools/tdat2xml $< > $@
%.xml: %.typelist
rater/tame/build-aux/list2typedef $(*F) < $< > $@
%.csvo: %.csvm
rater/tools/csvm2csv $< > $@
%.csvo: %.csvi
rater/tools/csvi $< > $@
%.csvo: %.csv
cp $< $@
%.xml: %.csvo
rater/tools/csv2xml $< > $@
version: .version.xml
.version.xml: FORCE
git log HEAD^.. -1 --pretty=format:'<version>%h</version>' > .version.xml
ui/program.expanded.xml: ui/program.xml | .version.xml
@echo "progui-expand $< $@" >> .cqueue
ui/Program.js: ui/program.expanded.xml ui/package.js
@echo "progui-class $< $@ include-path=../../../ui/" >> .cqueue
ui/html/index.phtml: ui/program.expanded.xml
@echo "progui-html $< $@ out-path=./" >> .cqueue
ui/package-dfns.xmlo: ui/package-dfns.xml
ui/package-dfns.xml: ui/program.expanded.xml
@echo "progui-pkg $< $@" >> .cqueue
ui/package-map.xmlo: ui/package-map.xml
ui/package-map.xml: ui/program.expanded.xml ui/package-dfns.xml
@echo "progui-pkg-map $< $@" >> .cqueue
# for the time being, this does not depend on clean-rate-tables because $(ant) will
specs:
$(MAKE) -C doc/specs
#
# this will eventually go away once we don't have X-repo klugery
# for the time being, this does not depend on clean-rate-tables because ant will
# run it
clean:
find $(path_suppliers) $(path_map) $(path_c1map) common/ rater/core rater/lv \( \
-name '*.xmlo' \
-o -name '*.xmle' \
-o -name '*.js' \
-o -name '*.html' \
-o -name '*.dep' \
-o -name '*.tmp' \
\) -exec rm -v {} \;
rm -rf $(path_ui)/package-dfns.* \
$(path_ui)/package-map.* \
$(path_ui)/program.expanded.xml \
$(path_ui)/include.js \
$(path_ui)/Program.js \
$(path_ui)/html
find . -path '*/tables/*.csvm' -o -path '*/territories/*.dat' \
| sed 's/\.csvm$$/\.xml/; s/\.dat$$/\.xml/' \
| xargs rm -fv
# generates a Makefile that will properly build all package dependencies; note
# that territory and rate packages also have includes; see top of this file for
# an explanation
suppliers.mk:
$(ant) pkg-dep \
&& mv $(path_ui)/program.dep $(path_ui)/package-dfns.dep
xmlo_cmd='@echo "validate $$(patsubst %.tmp,%.xml,$$<) $$@" >> .cqueue \
&& echo "compile $$(patsubst %.tmp,%.xml,$$<) $$@" >> .cqueue \
&& echo "rm $$(patsubst %.xmlo,%.tmp,$$@)" >> .cqueue \
&& touch $$@ \
&& touch -d +1sec $$(patsubst %.xmlo,%.tmp,$$@) >> .cqueue' \
./rater/tame/build-aux/gen-make common/ $(path_suppliers)/ $(path_dsl)/ $(path_map)/ $(path_ui)/ >$@ \
&& ./rater/tame/build-aux/gen-c1make $(path_c1map)/*.xml >>$@
me-a-sandwich:
@test $$EUID -eq 0 \
&& echo 'You actually ran me as root? Are you insane!?' \
|| echo 'Make it yourself.'
# simply forces a job to run, thereby forcing the invocation of the secondary
# Makefile (this is not explicitly required, because of prepare, but signifies
# intent and is self-documenting)
FORCE: ;


@ -2,7 +2,7 @@
#
# TAME Makefile
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -16,85 +16,295 @@
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# This fragment exists as a kluge to provide support for running a command
# after all targets have been run (in this case, dslc).
#
# A list of everything to be compiled is output into .cqueue, which is then
# picked up by dslc; this avoids the overhead of starting the JVM,
# recompiling XSL stylesheets, etc, which is quite substantial.
#
# !!! Unfortunately, this does not yet support parallel job execution.
##
path_rates := $(path_suppliers)/rates
path_map := map
path_c1map := $(path_map)/c1
path_dsl := rater
path_dsl := $(CALCROOT)
path_tame := $(path_dsl)/tame
path_ui := ui
path_tests := test
path_suppliers := suppliers
path_lv := lovullo
path_srv := srv
path_lvroot := lvroot
path_c1root := c1root
path_common := common
path_intralov_root := "intralov-root/@program@"
.PHONY: FORCE prepare program-data-copy lvroot program-ui-immediate test
src_suppliers := $(wildcard $(path_suppliers)/*.xml)
src_map := $(wildcard $(path_map)/*.xml)
src_c1map := $(wildcard $(path_c1map)/*.xml)
JAVA_HEAP_SIZE ?= 5120M
JAVA_STACK_SIZE ?= 5M
src_common := $(shell find $(path_common) -name '*.xml')
xmlo_common := $(patsubst %.xml, %.xmlo, $(src_common))
# Intended to be (optionally) overridden from the command line
SUPPLIERS=$(src_suppliers) $(path_ui)/package.xml
suppliers_strip=$(patsubst %.xml, %.strip.js, $(SUPPLIERS))
dest_summary_html := $(patsubst \
$(path_suppliers)/%.xml, \
$(path_suppliers)/%.html, \
$(src_suppliers))
dest_standalone := $(patsubst \
$(path_suppliers)/%.xml, \
$(path_suppliers)/%.js, \
$(src_suppliers))
dest_standalone_strip := $(patsubst \
$(path_suppliers)/%.js, \
$(path_suppliers)/%.strip.js, \
$(dest_standalone))
dest_map := $(patsubst \
$(path_map)/%.xml, \
$(path_map)/%.xmle, \
$(src_map))
dest_c1map := $(patsubst \
$(path_c1map)/%.xml, \
$(path_c1map)/%.php, \
$(src_c1map))
# Program fragments combined to form one large program.expanded.xml
# TODO: Move into liza-proguic
program_fragments=$(shell \
find $(path_ui)/program/ -name '*.xml' 2>/dev/null \
| LC_ALL=C sort \
| tr '\n' ' ' \
)
# Packages associated with each program step.
# TODO: Move into liza-proguic
package_dfns_pkgs = $(shell \
find $(path_ui)/package/ -name 'progui-pkg-*.xml' 2>/dev/null \
| LC_ALL=C sort \
| tr '\n' ' ' \
)
package_dfns_xmlos = $(patsubst %.xml, %.xmlo, $(package_dfns_pkgs))
# Dependencies for suppliers.mk.
src_suppliersmk = $(src_common) \
$(shell find $(path_suppliers) ui/package ui/map rater/core -name '*.xml') \
$(program_fragments) \
ui/program.xml \
ui/package-dfns.xml
compiled_suppliers := $(src_suppliers:.xml=.xmlo)
linked_suppliers := $(src_suppliers:.xml=.xmle)
comma := ,
extless_supp_delim := $(subst .xml,,$(subst .xml ,$(comma),$(src_suppliers)))
ant = @ANT@ -e
.PHONY: FORCE default program-data-copy lvroot c1root test \
default clean interp-rate-tables summary-html c1map standalones common \
strip program-ui version FORCE
default: program-ui c1map FORCE
.DELETE_ON_ERROR:
# less verbose output; all to runlog
define saneout
time -f 'total time: %E' awk ' \
BEGIN { e=0; w=0; } \
{ printf "[%d] ", systime() >> ".runlog"; print >> ".runlog"; } \
/^~~~~\[begin /,/^~~~~\[end / { next } \
/^rm / { next } \
/^Exception|^\t+at / { \
if ( /^E/ ) { \
print; \
print "Stack trace written to .runlog"; \
} \
next; \
} \
/[Ww]arning:|[Nn]otice:/ { printf "\033[0;33m"; w++; out=1; } \
/[Ff]atal:/ { printf "\033[0;31m"; out=1; } \
/!|[Ee]rror:/ { printf "\033[0;31m"; e++; out=1; } \
/internal:/ { printf "\033[0;35m"; out=1; } \
/internal error:/ { printf "\033[1m"; out=1; } \
/^[^[]/ || out { print; printf "\033[0;0m"; out=0; } \
END { printf "%d error(s); %d warning(s).\n", e, w; } \
'
endef
# keep all intermediate files for easy introspection
.SECONDARY:
define _go
touch .cqueue \
&& ( test -s .cqueue || echo "Nothing to be done for \`$@'." ) \
&& echo "$(JAVA_HEAP_SIZE) $(JAVA_STACK_SIZE)" \ \
&& CLASSPATH="$(RATER_CLASSPATH):rater/src/dslc.jar" \
$(JAVA) -Xmx$(JAVA_HEAP_SIZE) -Xss$(JAVA_STACK_SIZE) \
com.lovullo.dslc.DslCompiler < .cqueue 2>&1 \
| $(saneout); \
exit $${PIPESTATUS[0]}; \
@>.cqueue
endef
SHELL = /bin/bash -O extglob -O nullglob
SHELL = /bin/bash -O extglob
# propagate to tame{,d}
export TAME_CMD_WAITTIME
export TAMED_STALL_SECONDS
export TAMED_JAVA_OPTS
export TAMED_TUI
export TAMED_RUNTAB_OUT
TAMED_SPAWNER_PID=$(shell echo $$PPID)
export TAMED_SPAWNER_PID
# Optional timestamping for TAME commands
TS = 0
TS_FMT=%s
tamed_clear__1 = @printf '\e[2K' # clear line
tame__ts_0 = $(tamed_clear__$(TAMED_TUI)) # clear line if TUI
tame__ts_1 = @printf '[%($(TS_FMT))T] '
TAME_TS = $(tame__ts_$(TS))
all: program-data-copy
program-ui-immediate:
@>.cqueue
@$(MAKE) --no-print-directory -f Makefile.2 program-ui-immediate
@$(MAKE) program-data-copy
@$(_go)
# Building all common files is useful in a distributed pipeline so that
# suppliers can be concurrently built without rebuilding common dependencies
common: $(xmlo_common)
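As a hedged illustration of that workflow (the supplier name is hypothetical), an early pipeline stage can build the shared packages once, and later jobs can then build individual suppliers against those cached artifacts:
    # stage 1: build all common object files up front
    make common
    # stage 2 (per supplier, possibly on another worker): build just that supplier
    make suppliers/some-supplier.js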
program-data-copy:
@>.cqueue
@$(MAKE) --no-print-directory -f Makefile.2 standalones program-ui c1map
@$(_go)
program-ui: ui/package.strip.js ui/Program.js ui/html/index.phtml
# Handle an intermediate step as we transition to the new compiler.
# If a source file is paired with an `*.experimental` file with the same
# stem, then it will trigger compilation using `xmlo-experimental`. The
# file may contain additional arguments to pass to the compiler.
%.xmli: %.xml %.experimental
$(path_tame)/tamer/target/release/tamec --emit xmlo-experimental $$(grep -v '^#' $*.experimental) -o $@ $<
%.xmli: %.xml
$(path_tame)/tamer/target/release/tamec --emit xmlo -o $@ $<
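For example (a hedged sketch; the package name is hypothetical), a package is opted into the experimental backend simply by creating the paired file; since comment lines are stripped, an empty file is sufficient, and any remaining lines are forwarded to tamec as additional arguments:
    # compile suppliers/foo.xml with `--emit xmlo-experimental' on the next build
    touch suppliers/foo.experimental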
%.graphml: %.xmlo
$(TAME_TS)
$(path_tame)/tamer/target/release/tameld --emit graphml -o $@ $<
# Individual dependencies appear in suppliers.mk (see below)
%.xmlo: %.xmli $(path_tame)/.rev-xmlo
$(TAME_TS)
$(TAME) compile $< $@ $(TAME_PARAMS)
# Note the `$()' here to prevent Automake from inlining this file---it is
# to be generated when imports change, which can be at any time.
include $()suppliers.mk
summary-html: $(dest_summary_html) ;
%.html: %.js %.xmle
$(TAME_TS)
$(TAME) summary $*.xmle $@ $(TAME_PARAMS)
standalones: $(dest_standalone)
strip: $(dest_standalone_strip) ui/package.strip.js
%.xmle: %.xmlo $(path_tame)/.rev-xmle
$(TAME_TS)
$(path_tame)/tamer/target/release/tameld --emit xmle -o $@ $<
%.js: %.xmle
$(TAME_TS)
$(TAME) standalone $< $@ $(TAME_PARAMS)
%.strip.js: %.js
cp $< $@
$(path_tame)/tools/strip $@
# C1 XML (specific recipes are in suppliers.mk)
c1map: $(dest_c1map)
%.dot: %.xmlo
$(TAME_TS)
$(TAME) dot $< $@ $(TAME_PARAMS)
%.dote: %.xmle
$(TAME_TS)
$(TAME) dot $< $@ $(TAME_PARAMS)
%.neo4j: %.xmlo
$(TAME) neo4j $< $@ $(TAME_PARAMS)
%.neo4je: %.xmle
$(TAME) neo4j $< $@ $(TAME_PARAMS)
%.svg: %.dote
dot -Tsvg "$<" > "$@"
%.svg: %.dot
dot -Tsvg "$<" > "$@"
# These are deprecated and will be removed in a future version of TAME (in
# favor of CSVM tables).
%.xml: %.dat rater/core/tdat.xmlo rater/tools/tdat2xml
rater/tools/tdat2xml $< > $@
%.xml: %.typelist rater/tame/build-aux/list2typedef
rater/tame/build-aux/list2typedef $(*F) < $< > $@
%.csvo: %.csvm rater/tools/csvm2csv
rater/tools/csvm2csv $< > $@
%.csvo: %.csvi rater/tools/csvi
rater/tools/csvi $< > $@
%.csvo: %.csv
cp $< $@
%.xml: %.csvo rater/tools/csv2xml
rater/tools/csv2xml $< > $@
# All lookup tables rely on rater/core/vector/package. This rule applies to
# xmlo files only when there is a corresponding csvo file. Note that this
# relies on .SECONDARY above to work properly.
#
# TODO: This is necessary right now because of the current depgen
# process. Once that is eliminated in favor of individual dependency files
# (e.g. the %.d convention), this can go away since dependency generation
# can properly take place for the various file formats.
%.xmlo: %.csvo rater/core/vector/table.xmlo
# This target is always run, but only update the file (and thus its
# timestamp) if the hash actually changes, so that we do not rebuild any
# dependencies unnecessarily.
version: .version.xml
.version.xml: FORCE
git log HEAD^.. -1 --pretty=format:'<version>%h</version>' > $@.new
cmp $@ $@.new || mv $@.new $@
$(RM) $@.new
ui/program.expanded.xml: ui/program.xml $(program_fragments) .version.xml
$(TAME_TS)
$(TAME) progui-expand $< $@ $(TAME_PARAMS)
ui/Program.js: ui/program.expanded.xml ui/package.js
$(TAME_TS)
$(TAME) progui-class $< $@ include-path=$$(pwd)/ui/ $(TAME_PARAMS)
ui/html/index.phtml: ui/program.expanded.xml
$(TAME_TS)
$(TAME) progui-html $< $@ out-path=./ $(TAME_PARAMS)
ui/package-dfns.xmlo: ui/package-dfns.xml $(package_dfns_xmlos)
ui/package-dfns.xml: ui/program.expanded.xml
$(TAME_TS)
$(TAME) progui-pkg $< $@ $(TAME_PARAMS)
$(package_dfns_pkgs): ui/package-dfns.xml
ui/package-map.xmlo: ui/package-map.xml ui/package-dfns.xmlo $(package_dfns_xmlos)
ui/package-map.xml: ui/program.expanded.xml ui/package-dfns.xml
$(TAME_TS)
$(TAME) progui-pkg-map $< $@ $(TAME_PARAMS)
# for the time being, this does not depend on clean-rate-tables because $(ant) will
specs:
$(MAKE) -C doc/specs
# for the time being, this does not depend on clean-rate-tables because ant will
# run it
clean:
find -L $(path_suppliers) $(path_map) $(path_c1map) common/ rater/core rater/lv \( \
-name '*.xmlo' \
-o -name '*.xmle' \
-o -name '*.xmli' \
-o -name '*.js' \
-o -name '*.html' \
-o -name '*.dep' \
-o -name '*.tmp' \
-o -name '*.php' \
\) -exec rm -v {} \;
rm -rf $(path_ui)/package-dfns.* \
$(path_ui)/package-map.* \
$(path_ui)/program.expanded.xml \
$(path_ui)/include.js \
$(path_ui)/Program.js \
$(path_ui)/html
find . -path '*/tables/*.csvm' -o -path '*/territories/*.dat' \
| sed 's/\.csvm$$/\.xml/; s/\.dat$$/\.xml/' \
| xargs rm -fv
# A target to be optionally overridden by `bootstrap.mk`.
.PHONY: bootstrap-if-necessary
bootstrap-if-necessary: FORCE
# Targets intended to be run before the generation of `suppliers.mk`.
# This should be used to re-bootstrap the system if necessary
# (see `bootstrap-if-necessary` target).
-include bootstrap.mk
# Generates a Makefile that will properly build all package
# dependencies. The redirect of ant to /dev/null is because it's still too
# noisy even with -q---the "BUILD SUCCESSFUL" line is confusing, considering
# it's merely a small part of a broader build.
suppliers.mk: $(src_suppliersmk) | bootstrap-if-necessary
$(ant) -q pkg-dep >/dev/null
find $(path_ui)/program/ -name '*.dep' | xargs cat $(path_ui)/program.dep | sort -u \
> $(path_ui)/package-dfns.dep
$(RM) $(path_ui)/program.dep
$(path_dsl)/tame/build-aux/gen-make $(SRCPATHS) > $@
test ! -d $(path_c1map) || $(path_dsl)/tame/build-aux/gen-c1make $(path_c1map)/*.xml >> $@
# TODO: There is a potential for conflict in copying files to
# src/node/programs/rater/programs/@program@. Note that the `for' loop is
# used here to handle the situation where no such files exist.
program-data-copy: standalones program-ui c1map .version.xml
mkdir -p "$(path_lv)/src/node/program/rater/programs/@program@"
mkdir -p "$(path_lv)/src/node/program/classify"
mkdir -p "$(path_lv)/src/node/program/ui/custom"
@ -111,7 +321,10 @@ program-data-copy:
"$(path_lv)/src/node/program/ui/custom/"
cp -v "$(path_srv)/rater.js" \
"$(path_lv)/src/node/program/rater/programs/@program@.js"
cp -v "$(path_ui)/package.js" \
for f in "$(path_srv)/"!(rater).js; do \
cp -v "$$f" "$(path_lv)/src/node/program/rater/programs/@program@/"; \
done
cp -v "$(path_ui)/package.strip.js" \
"$(path_lv)/src/node/program/classify/@program@.js"
cp -v "$(path_ui)/"{Program,include,package}.js \
"$(path_lv)/src/_gen/scripts/program/@program@/"
@ -120,11 +333,11 @@ program-data-copy:
cp -v "$(path_suppliers)/"*.js \
"$(path_lv)/src/node/program/rater/programs/@program@"
test ! -d "$(path_c1map)" || cp -v "$(path_c1map)/"*.php \
"$(path_lv)/src/lib/c1/interfaces/c1/contract/@program@/"
"@C1_IMPORT_MAPDEST@/@program@/"
ant -f "$(path_lv)/build.xml" js-mod-order
# TODO: merge this and the above
lvroot: prepare
lvroot: summary-html program-ui c1map strip
mkdir -p "$(path_lvroot)/src/node/program/rater/programs/@program@"
mkdir -p "$(path_lvroot)/src/node/program/classify"
mkdir -p "$(path_lvroot)/src/node/program/ui/custom"
@ -134,9 +347,12 @@ lvroot: prepare
mkdir -p "$(path_lvroot)/src/lib/c1/interfaces/c1/contract/@program@"
cp -v "$(path_srv)/rater.js" \
"$(path_lvroot)/src/node/program/rater/programs/@program@.js"
for f in "$(path_srv)/"!(rater).js; do \
cp -v "$$f" "$(path_lvroot)/src/node/program/rater/programs/@program@/"; \
done
cp -v "$(path_suppliers)/"*.js \
"$(path_lvroot)/src/node/program/rater/programs/@program@"
cp -v "$(path_ui)/package.js" \
cp -v "$(path_ui)/package.strip.js" \
"$(path_lvroot)/src/node/program/classify/@program@.js"
cp -v "$(path_ui)/"{Program,include,package}.js \
"$(path_lvroot)/src/_gen/scripts/program/@program@/"
@ -149,34 +365,40 @@ lvroot: prepare
test ! -d "$(path_c1map)" || cp -v "$(path_c1map)/"*.php \
"$(path_lvroot)/src/lib/c1/interfaces/c1/contract/@program@/"
# used by newer systems (note that lvroot still contains the c1map files so
# as not to break BC)
c1root: c1map
mkdir -p "$(path_c1root)/src/RSG/ImportBundle/Lib/interfaces/c1/contract/@program@/"
cp -v "$(path_c1map)/"*.php \
"$(path_c1root)/src/RSG/ImportBundle/Lib/interfaces/c1/contract/@program@/"
intralov-root: summary-html
mkdir -p "$(path_intralov_root)/"{rater/scripts,suppliers}
ln -fL $(path_dsl)/summary.css "$(path_intralov_root)/rater"
ln -fL $(path_dsl)/scripts/*.js "$(path_intralov_root)/rater/scripts/"
ln -fL $(path_suppliers)/*.{html,js} "$(path_intralov_root)/suppliers"
# because of the crazy wildcard target below, we want to ignore
# some Automake-generated stuff
%.am:
%.m4:
%.ac:
%: prepare
@if [[ "$@" != [Mm]akefile ]]; then \
$(MAKE) --no-print-directory -f Makefile.2 $@; \
$(_go); \
fi
clean:
$(MAKE) --no-print-directory -f Makefile.2 clean
prepare: FORCE
@>.cqueue
# Suppliers to check may be overridden using SUPPLIERS. Multiple suppliers
# should be space-delimited. Note that the UI is considered to be a special
# type of supplier (ui/package.xml) and is included by default in the value
# of SUPPLIERS.
check-am: $(suppliers_strip)
$(path_dsl)/build-aux/progtest-runner $(path_tests) $(SUPPLIERS)
test: check
check-am: standalones ui/package.js
@$(path_dsl)/build-aux/progtest-runner $(path_suppliers) $(path_tests)
@$(path_dsl)/build-aux/progtest-runner ui/package.xml $(path_tests)/ui
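A hedged usage example (the supplier paths are hypothetical): restrict the progtest run to specific suppliers by overriding SUPPLIERS on the command line, keeping in mind that ui/package.xml is part of the default value:
    # run tests against only these two suppliers (space-delimited list)
    make check SUPPLIERS='suppliers/foo.xml suppliers/bar.xml'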
kill-tamed: tamed-die
tamed-die:
$(TAME_TS)
$(TAME) --kill
me-a-sandwich:
@test $$EUID -eq 0 \
&& echo 'You actually ran me as root? Are you insane!?' \
|| echo 'Make it yourself.'
FORCE: ;
# optionally include a "program.mk" file if it is
# present in the project's root directory
-include program.mk


@ -0,0 +1,98 @@
#!/bin/bash
# Check for inappropriate coupling between packages
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# THIS SCRIPT ASSUMES NO SPACES IN FILE NAMES (because `make' wouldn't work
# with those files anyway)!
#
# This script works by filtering `lsimports' output and then reformatting it
# for display.
##
declare -r mypath="$( dirname $0 )"
# Invoke `lsimports' relative to our path.
lsimports()
{
"$mypath/lsimports" "$@"
}
# Report violations in a user-friendly manner by reformatting lsimports
# output
report-violations()
{
awk '{ print "coupling violation: " $1 " must not import " $2 }'
}
# Input and return maps that do not match supplier names. Note that this
# will return ui.xml.
non-supplier-maps()
{
find map -name '*.xml' -a \! -wholename 'map/c1/*' \
| grep -vf <( ls suppliers/*.xml | xargs -n1 basename )
}
# Output packages associated with a given supplier (with the exception of
# the suppliers/$name.xml package).
supplier-packages()
{
local -r name=${1?Missing supplier name}
test ! -d "suppliers/$name" || find "suppliers/$name" -name '*.xml'
test ! -f "map/$name.xml" || echo "map/$name.xml"
test ! -f "map/return/$name.xml" || echo "map/return/$name.xml"
}
# Find violations, producing filtered lsimports output.
find-violations()
{
# Suppliers must not be imported by common or UI packages.
lsimports $( find common ui -name '*.xml' ) \
$( non-supplier-maps ) \
| grep ' /suppliers/'
# Suppliers must not import other suppliers or UI packages.
# TODO: Check against supplier maps
for supplier in suppliers/*.xml; do
local name=$( basename "$supplier" .xml )
lsimports "$supplier" $( supplier-packages "$name" ) \
| grep ' /suppliers/\|/ui/' \
| grep -v " /suppliers/$name/"
done
}
# Find violations and report any failures, exiting with a non-zero status if
# any violations are found.
main()
{
local -r bad=$( find-violations )
test -z "$bad" || {
report-violations <<< "$bad"
return 1
}
}
main "$@"


@ -0,0 +1,17 @@
// externs for Closure Compiler
var module = {
exports: {
rater: {
knownFields: {},
supplier: "",
meta: {},
consts: {},
params: {},
classify: {
desc: {},
fromMap: {},
},
fromMap: {},
},
},
};


@ -2,7 +2,7 @@
#
# Compiles the given CSV into a table definition
#
# Copyright (C) 2016 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -62,17 +62,16 @@ BEGIN {
# output package header
printf \
"<?xml-stylesheet type=\"text/xsl\" href=\"%1$srater/summary.xsl\"?>\n" \
"<?xml version=\"1.0\"?>\n" \
"<package\n" \
" xmlns=\"http://www.lovullo.com/rater\"\n" \
" xmlns:c=\"http://www.lovullo.com/calc\"\n" \
" xmlns:t=\"http://www.lovullo.com/rater/apply-template\"\n" \
" name=\"suppliers/rates/tables/%2$s\"\n" \
" desc=\"%2$s rate table\">\n\n" \
" <!--\n" \
" WARNING: This file was generated by csv2xml; do not modify!\n" \
" -->\n\n" \
" <import package=\"/rater/core\" />\n" \
" <import package=\"/rater/core/base\" />\n" \
" <import package=\"/rater/core/vector/table\" />\n\n", \
rootpath, name


@ -2,7 +2,7 @@
#
# Performs interpolation for columns in a CSV and outputs the result
#
# Copyright (C) 2016 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by


@ -0,0 +1,223 @@
#!/usr/bin/awk -f
#
# Expands a "magic" CSV file into a normal CSV
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# "Magic" CSVs simply exist to make life easier: they permit comments, blank
# lines, variables, sub-delimiter expansion, and any number of ranges per line.
# Ranges will be expanded in every combination, making rate tables highly
# maintainable.
#
# Variables are also supported when defined using :var=val. Variables may
# expand into ranges, 'cause they're awesome. Multiple variables may be
# delimited by semi-colons, as may multiple values.
#
# For example:
# :foo=1--3
# $foo;7;9--10:$foo, 5--10,1/1/2017
#
# Would generate:
# 1, 5, 1483246800
# 1, 6, 1483246800
# ...
# 5, 10, 1483246800
# 2, 5, 1483246800
# ...
# 9, 5, 14832468005
# ...
# 1, 5, 1483246800
# 1, 6, 1483246800
# ...
##
BEGIN {
date_cmd = "stdbuf -o0 date -f- +%s"
}
END {
close( date_cmd )
}
# Parse a date string into a Unix timestamp (memoized)
#
# This spawns a single process for date and reads from standard in. Even
# then, though, date parsing is very slow for many thousands of rows, so the
# output is also cached in `date_cache'.
function parse_date( i, src )
{
src = $i
if ( date_cache[ src ] )
{
$i = date_cache[ src ]
return
}
print $i |& date_cmd
date_cmd |& getline $i
date_cache[ src ] = $i;
}
# Expand variable with its value, if any
function expand_vars( s, value )
{
# attempt to parse variable (may expand into a range)
if ( match( s, /^\$([a-zA-Z_-]+)$/, m ) )
{
value = vars[ m[1] ];
if ( value == "" )
{
print "error: unknown variable reference: `$" m[1] "'" > "/dev/stderr"
exit 1
}
return value
}
return s
}
# Expand line
function parseline( i, m, j, me, orig )
{
if ( i > NF )
{
print
return
}
orig = $i
# expand variables before any processing so that expansions
# can include any type of formatting
$i = expand_vars( $i )
if ( match( $i, /^([0-9]+\/){2}[0-9]+$/, m ) )
{
parse_date( i );
}
# check first for delimiters
if ( match( $i, /^([^;]+);(.*)$/, m ) )
{
# give it a shot with the first value
$i = m[1]
parseline( i )
# strip off the first value and process with following value(s)
$i = m[2]
parseline( i )
# we've delegated; we're done
$i = orig
return
}
# parse range
if ( match( $i, /^([^-]+)--([^-]+)$/, m ) )
{
j = expand_vars( m[1] )
me = expand_vars( m[2] )
if ( !match( j, /^[0-9]+$/ ) || !match( me, /^[0-9]+$/ ) )
{
print "error: invalid range: `" $i "'" > "/dev/stderr"
exit 1
}
do
{
$i = j
parseline( i + 1 )
} while ( j++ < me )
}
else
{
parseline( i + 1 );
}
# restore to original value
$i = orig
}
BEGIN {
# we're parsing CSVs
FS = "[[:space:]]*,[[:space:]]*"
OFS = ","
has_directives = 0
directives = "!(NODIRECTIVES)"
}
# skip all lines that begin with `#', which denotes a comment, or are empty
/^#|^$/ { next; }
# directives are echoed back and are intended for processing by
# the parent csvm2csv script
/^!/ && output_started {
print "error: directive must appear before header: `" $0 "'" > "/dev/stderr"
exit 1
}
/^!/ && has_directives {
print "error: all directives must be on one line: `" $0 "'" > "/dev/stderr"
exit 1
}
/^!/ {
has_directives = 1
directives = $0
next
}
# lines that begin with a colon are variable definitions
/^:/ {
if ( !match( $0, /^:([a-zA-Z_-]+)=(.*?)$/, m ) )
{
print "error: invalid variable definition: `" $0 "'" > "/dev/stderr"
exit 1
}
vars[ m[1] ] = m[2]
next
}
# Always begin output with a line for directives, even if there are
# none. This makes subsequent processing much easier, since we won't have
# to conditionally ignore the top line.
!output_started {
print directives
output_started = 1
}
# lines that need any sort of processing (ranges, dates, etc)
/--|;|\$[a-zA-Z_-]|\// { parseline( 1 ); next; }
# all other lines are normal; simply output them verbatim
{
# this assignment will ensure that awk processes the output, ensuring that
# extra spaces between commas are stripped
$1=$1
print
}


@ -1,8 +1,7 @@
#!/usr/bin/awk -f
#
#!/bin/bash
# Compiles a "magic" CSV file into a normal CSV
#
# Copyright (C) 2016, 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -17,150 +16,97 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# "Magic" CSVs simply exist to make life easier: they permit comments, blank
# lines, variables, sub-delimiter expansion, and any number of ranges per line.
# Ranges will be expanded in every combination, making rate tables highly
# maintainable.
# For format of CSVMs, see `csvm-expand'.
#
# Variables are also supported when defined using :var=val. Variables may
# expand into ranges, 'cause they're awesome. Multiple variables may be
# delimited by semi-colons, as may multiple values.
#
# For example:
# :foo=1--3
# $foo;7;9--10:$foo, 5--10,1/1/2017
#
# Would generate:
# 1, 5, 1483246800
# 1, 6, 1483246800
# ...
# 5, 10, 1483246800
# 2, 5, 1483246800
# ...
# 9, 5, 14832468005
# ...
# 1, 5, 1483246800
# 1, 6, 1483246800
# ...
# To disable sorting of CSVM output, use the `!NOSORT' directive before the
# header line.
##
set -o pipefail
# Expand variable with its value, if any
function expand_vars( s, value )
# account for symlinks, since historically this script lives in a different
# directory and has been symlinked for compatibility
declare -r mypath=$( dirname "$( readlink -f "$0" )" )
# Generate -k arguments for GNU sort given a CSV header
#
# The generated arguments will be of the form -k1,1n ... -kl,ln, where `l'
# is the total number of header entries.
#
# For example, given this header:
# foo, bar, baz
# the output would be:
# -k1,1n -k2,2n -k3,3n
sort-key-args()
{
# attempt to parse variable (may expand into a range)
if ( match( s, /^\$([a-zA-Z_-]+)$/, m ) )
{
value = vars[ m[1] ];
local -r header="${1?Missing CSV header}"
if ( value == "" )
{
print "error: unknown variable reference: `$" m[1] "'" > "/dev/stderr"
exit 1
local -i i=0
# generate -ki,in for each column (notice that a trailing
# comma is added to the header because of the read delimiter)
while read -d,; do
echo -n "-k$((++i)),${i}n "
done <<< "$header,"
}
# Sort every column of CSV
#
# The columns will all be sorted left-to-right. The header is left in place
# as the first row.
csv-sort()
{
# the first line of the expanded CSVM is the CSV header
local header; read -r header
local -r keys=$( sort-key-args "$header" )
# all remaining input (which is now sans header) is sorted
echo "$header"
sort -t, $keys -
}
# Output usage information
#
# Kudos to you if you understand the little Easter egg.
usage()
{
cat <<EOU
Usage: $0 [FILE]
Expand CSVM represented by FILE or stdin into a CSV
The columns of the expanded CSV will be automatically sorted
left-to-right. To inhibit this behavior, use the \`!NOSORT'
directive anywhere before the header line in the source CSVM.
Options:
--help Output usage information.
This program has magic CSV powers.
EOU
exit 64 # EX_USAGE
}
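As a hedged illustration (the column names and values are made up purely to show the syntax), a CSVM that must keep its expanded rows in source order carries the directive anywhere before its header line:
    # hypothetical input: !NOSORT inhibits the left-to-right sort of the output
    !NOSORT
    class, rate
    1--3, 0.05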
# Sort CSV rows left-to-right unless the `!NOSORT' directive is provided
main()
{
test ! "$1" == --help || usage
"$mypath/csvm-expand" "$@" \
| {
local directives; read -r directives
# ignore sorting if given NOSORT directive
if [[ "$directives" =~ NOSORT ]]; then
cat
else
csv-sort "$sort"
fi
}
return value
}
return s
}
# Expand line
function parseline( i, m, j, me, orig )
{
if ( i > NF )
{
print
return
}
orig = $i
# expand variables before any processing so that expansions
# can include any type of formatting
$i = expand_vars( $i )
if ( match( $i, /^([0-9]+\/){2}[0-9]+$/, m ) )
{
cmd = "date --date=" $i " +%s"
cmd |& getline $i
close(cmd)
}
# check first for delimiters
if ( match( $i, /^([^;]+);(.*)$/, m ) )
{
# give it a shot with the first value
$i = m[1]
parseline( i )
# strip off the first value and process with following value(s)
$i = m[2]
parseline( i )
# we've delegated; we're done
$i = orig
return
}
# parse range
if ( match( $i, /^([^-]+)--([^-]+)$/, m ) )
{
j = expand_vars( m[1] )
me = expand_vars( m[2] )
if ( !match( j, /^[0-9]+$/ ) || !match( me, /^[0-9]+$/ ) )
{
print "error: invalid range: `" $i "'" > "/dev/stderr"
exit 1
}
do
{
$i = j
parseline( i + 1 )
} while ( j++ < me )
}
else
{
parseline( i + 1 );
}
# restore to original value
$i = orig
}
BEGIN {
# we're parsing CSVs
FS = " *, *"
OFS = ","
}
# skip all lines that begin with `#', which denotes a comment, or are empty
/^#|^$/ { next; }
# lines that begin with a colon are variable definitions
/^:/ {
if ( !match( $0, /^:([a-zA-Z_-]+)=(.*?)$/, m ) )
{
print "error: invalid variable definition: `" $0 "'" > "/dev/stderr"
exit 1
}
vars[ m[1] ] = m[2]
next
}
# lines that need any sort of processing (ranges, dates, etc)
/--|;|\$[a-zA-Z_-]|\// { parseline( 1 ); next; }
# all other lines are normal; simply output them verbatim
{
# this assignment will ensure that awk processes the output, ensuring that
# extra spaces between commas are stripped
$1=$1
print
}
main "$@"


@ -1,7 +1,7 @@
#!/bin/bash
# Generates GNU Make recipes for c1map build
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -75,8 +75,7 @@ c1recipe()
)
echo "$dir/$base.php: $file $includes"
echo -e '\t@echo "c1map $< $@" >> .cqueue'
echo -e '\t@touch $@'
echo -e '\t$(TAME) c1map $< $@'
}


@ -2,7 +2,7 @@
#
# Generates Makefile containing dependencies for each package
#
# Copyright (C) 2016 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -54,24 +54,12 @@ resolv-path()
# rule for building
[ -z "$GEN_MAKE" ] && {
echo "%.xmlo:: %.tmp"
echo -e "\t@rm -f \$@ \$<"
[ -n "$xmlo_cmd" ] \
&& echo -e "\t$xmlo_cmd" \
|| echo -e "\ttouch \$@"
echo "%.xmlo:: %.xml | prexmlo"
[ -n "$xmlo_cmd" ] \
&& echo -e "\t$xmlo_cmd" \
|| echo -e "\ttouch \$@"
export GEN_MAKE="$( pwd )/$0"
exec "$GEN_MAKE" "$@"
}
until [ $# -eq 0 ]; do (
path="${1%%/}"
echo "[gen-make] scanning $path" >&2
cd "$path" || exit $?
@ -83,8 +71,6 @@ until [ $# -eq 0 ]; do (
d="${dpath##*/}"
sansext="${d%.*}"
echo "[gen-make] found $path/$d" >&2
# this might be derived from another file
# TODO: handle all cases, not just typelists!
if [ -f "$sansext.typelist" ]; then
@ -94,17 +80,32 @@ until [ $# -eq 0 ]; do (
# begin this file's dependencies
echo -n "$path/$sansext.xmlo: $path/$sansext.xml "
# only further process dependency files
if [[ ! $dpath =~ .dep$ ]]; then
echo
continue;
fi
# output deps
while read dep; do
ext=.xmlo
# a trailing `$' means "leave the path alone"; don't automatically
# add the extension in this case
if [[ "$dep" =~ \$$ ]]; then
dep="${dep:0:-1}"
ext=
fi
# if the first character is a slash, then it's relative to the project
# root---the resolution has already been done for us!
if [ "${dep:0:1}" == '/' ]; then
echo -n " ${dep:1}.xmlo"
echo -n " ${dep:1}$ext"
continue
fi
echo -n ' '
resolv-path "$path/$dep.xmlo"
resolv-path "$path/$dep$ext"
done < "$d"
echo
@ -113,6 +114,8 @@ until [ $# -eq 0 ]; do (
# recurse on every subdirectory
for p in */; do
[ "$p" == ./ -o "$p" == ../ ] && continue
[ "$p" == node_modules/ -o "$p" == tame/ ] && continue
[ ! -d "$p" ] || ( cd "$OLDPWD" && "$GEN_MAKE" "$path/$p" ) || {
echo "fatal: failed to recurse on $( pwd )/$path/$p" >&2
exit 1


@ -2,7 +2,7 @@
/**
* Generate regular expressions to match a list of zip codes
*
* Copyright (C) 2016 R-T Specialty, LLC.
* Copyright (C) 2014-2023 Ryan Specialty, LLC.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by


@ -1,7 +1,7 @@
#!/bin/bash
# Generates typedef from list of strings
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -150,9 +150,12 @@ main()
cat <<EOF
<?xml version="1.0"?>
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
title="$typedef Type">
<typedef name="$typedef" desc="$typedef">
<enum type="integer">
<item name="${typedef^^}_NONE" value="0" desc="NONE" />
EOF
while read line; do


@ -0,0 +1,51 @@
#!/bin/bash
# Output absolute import paths for each provided package
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# All arguments must be paths to the source XML files for a package. The
# output will be of the form:
#
# file.xml: /absolute/import/path
#
# Relative paths will be concatenated with the directory name of the source
# file. Parent references are resolved by replacing "/foo/../bar" with
# "/bar". Absolute imports are taken as-is.
#
# Namespace prefixes are ignored and the first attribute to the `import'
# node must be `@package', and must appear on the same line.
##
grep -H '<\([a-z]\+:\)\?import \+\(package\|path\)=' "$@" \
| awk -F': |"' '
# prefix with filename
{ printf "%s ", $1 }
# absolute paths should just be echoed
$3 ~ /^\// { print $3; next }
# otherwise concatenate import with package directory
{
dir = gensub( /[^/]+.xml/, "", 1, $1 )
path = "/" dir $3
# resolve parent references
while ( path ~ /\/\.\.\// ) {
sub( /\/[^/]+\/\.\.\//, "/", path )
}
print path
}'
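A brief sketch of that resolution in action, with a hypothetical package file and an invented script name:
# Hypothetical illustration only; names are invented.
# Suppose suppliers/foo.xml contains:
#
#   <lv:import package="../common/rates" />
#
# Running the script over that file,
#
#   $ ./resolve-imports suppliers/foo.xml
#
# pairs the source file with its root-relative import, roughly:
#
#   suppliers/foo.xml /common/rates
#
# ("/suppliers/../common/rates" collapses to "/common/rates" through the
# parent-reference loop above.)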


@ -1,6 +1,6 @@
# Common build configuration for TAME-based build systems
#
# Copyright (C) 2017 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -33,17 +33,23 @@ AM_INIT_AUTOMAKE([foreign])
AC_ARG_VAR([JAVA], [The Java executable])
AC_ARG_VAR([ANT], [Apache Ant])
AC_ARG_VAR([DSLC_JAR], [Path to DSL Compiler JAR])
AC_ARG_VAR([TAME], [Path to TAME])
AC_ARG_VAR([TAME], [The TAME compiler])
AC_ARG_VAR([TAME_PARAMS], [key=value pairs for XSLT-based compiler global params])
AC_ARG_VAR([RATER_CLASSPATH], [DSL Compiler Saxon class path])
AC_ARG_VAR([PROGUI_TEST_PATH], [Path to JavaScript tests for Program UI])
# Required version of TAME
AC_SUBST([tame_needed_ver], [1.0.0])
# This can also be set via the environment or during a make invocation
AC_ARG_VAR([TAMED_TUI], [Enable TUI (text UI) mode for tamed])
# Auto-discover Java and Ant paths
AC_CHECK_PROGS(JAVA, [java])
AC_CHECK_PROGS(ANT, [ant])
# Destination paths for local development
AC_CHECK_FILE(c1-import,
[AC_SUBST(C1_IMPORT_MAPDEST, c1-import/src/RSG/ImportBundle/Lib/interfaces/c1/contract)],
[AC_SUBST(C1_IMPORT_MAPDEST, lovullo/src/lib/c1/interfaces/c1/contract)])
AS_IF([test "$JAVA"],,
[AC_MSG_ERROR([missing java])])
AS_IF([test "$ANT"],,
@ -52,6 +58,13 @@ AS_IF([test "$ANT"],,
# Automake runs before shell is available, thus the separate m4 variable
CALCROOT="m4_defn(`calc_root')"
# Default source paths for BC
test -n "$SRCPATHS" || SRCPATHS='common/ suppliers/ map/ ui/ rater/'
AC_MSG_NOTICE([using source paths: $SRCPATHS])
AC_SUBST([CALCROOT], [$CALCROOT])
AC_SUBST([SRCPATHS], [$SRCPATHS])
# Checks to ensure that dslc is built, and gives instructions on how to
# build it otherwise. We do not want to build that for them---that can be
# added to a bootstrap script, but isn't permissible in build scripts.
@ -64,42 +77,18 @@ AS_IF([test ! "$DSLC_JAR"],
# TAME is the compiler (whereas dslc invokes it, keeps things in memory, etc)
AS_IF([test ! "$TAME"],
[AC_CHECK_FILE([$CALCROOT/tame],
[AC_SUBST([TAME], [$CALCROOT/tame])],
[AC_CHECK_FILE([$CALCROOT/tame/bin/tame],
[AC_SUBST([TAME], [$CALCROOT/tame/bin/tame])],
[AC_MSG_ERROR(
[TAME not found])])],
[])
AC_MSG_CHECKING([TAME version])
AC_SUBST_FILE([tame_version])
tame_version=$( cat "$TAME/VERSION" )
# We get subtle errors or potential compiler bugs if the TAME version is
# incorrect; check for >= the required version
AS_VERSION_COMPARE([$tame_version], [$tame_needed_ver],
[
AC_MSG_RESULT([$tame_version])
AC_MSG_ERROR([TAME version $tame_needed_ver or greater required])
],
[AC_MSG_RESULT([$tame_version])],
[AC_MSG_RESULT([$tame_version (>$tame_needed_ver)])])
# @program@ in *.in files will be replaced with the program name provided by AC_INIT
AC_SUBST([program], AC_PACKAGE_NAME)
# Final files to be output by `configure'. The path before the colon is the
# destination name; after the colon is the source.
AC_CONFIG_FILES(Makefile:m4_defn(`calc_root')/build-aux/Makefile.in
Makefile.2:m4_defn(`calc_root')/build-aux/Makefile.2.in)
AC_CONFIG_FILES(Makefile:m4_defn(`calc_root')/build-aux/Makefile.in)
# Generate configure script
AC_OUTPUT
# we want this to run as part of the configure script, not during M4
# expansion
"$CALCROOT/build-aux/suppmk-gen"
AC_MSG_NOTICE([complete
You may now run `make` to build.])


@ -1,7 +1,7 @@
#!/bin/bash
# Generates all dependency graphs
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by


@ -1,7 +1,7 @@
#!/bin/bash
# Run test cases for supplier
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -15,28 +15,24 @@
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# This program is intended to be called directly by make; its API is
# subject to change. Please use `make check` as appropriate.
##
declare path_suppliers="${1?Missing supplier path}"
declare path_tests="${2?Missing supplier test path}"
declare -i result=0
declare suppliers
# if a file was provided, use it as the sole supplier; otherwise,
# treat it as a directory of suppliers
if [ -f "$path_suppliers" ]; then
suppliers=( "$path_suppliers" )
path_suppliers=$( dirname "$path_suppliers" )
else
suppliers=( "$path_suppliers"/*.xml )
fi
# The first argument indicates the test directory.
declare -r path_tests=${1?Missing test path}
shift
# run tests for each supplier individually
for supplier in "${suppliers[@]}"; do
# All remaining arguments are taken to be a list of suppliers to test.
for supplier in "$@"; do
base=$( basename "$supplier" .xml )
tests=$( find -L "$path_tests"/"$base"/ -name '*.yml' )
path_suppliers=$( dirname "$supplier" )
tests=$( find -L "$path_tests"/"$base"/ -name '*.yml' | LC_ALL=c sort )
echo
echo "$path_suppliers/$base"
@ -47,7 +43,8 @@ for supplier in "${suppliers[@]}"; do
exit 1
}
rater/tame/progtest/bin/runner "$path_suppliers/$base.js" $tests \
# note that this depends on the _stripped_ version
rater/tame/progtest/bin/runner "$path_suppliers/$base.strip.js" $tests \
|| result=1
done
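Under the reworked interface above, make passes the test directory first and then each supplier to run; a hypothetical invocation (script and path names invented) looks roughly like this:
# Hypothetical names for illustration only.
#
#   $ bash runtests test/cases suppliers/foo.xml suppliers/bar.xml
#
# For each supplier, the YAML cases under test/cases/<name>/ are collected
# and sorted, then run by the progtest runner against the stripped build
# artifact <dir>/<name>.strip.js.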


@ -0,0 +1,80 @@
#!/bin/bash
# Determine whether a release looks okay.
#
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# This should be run as part of a CI system to prohibit bad tags.
declare -r RELEASE_FILE="${RELEASE_FILE:-RELEASES.md}"
tag-date()
{
local -r t="${1?Missing tag}"
git show "$t" --date=short \
| awk '/^Date: / { print $2; exit }'
}
declare -r tag=$(git describe --abbrev=0)
declare -r tagdate=$(tag-date "$tag")
suggest-fix()
{
echo
echo "Here are the commands you should use to correct this"
echo "bad tag:"
echo " \$ git tag -d $tag"
echo " \$ tools/mkrelease $tag"
echo " \$ git push -f --tags $tag"
}
# Check for NEXT heading first so that we can provide more clear guidance
# for what to do.
echo -n "checking $RELEASE_FILE for missing 'NEXT' heading... "
! grep -q '^NEXT$' "$RELEASE_FILE" || {
echo "FAIL"
echo "error: $RELEASE_FILE contains 'NEXT' heading" >&2
echo
echo "$RELEASE_FILE must be updated to replace the 'NEXT'"
echo "heading with the version and date being deployed."
echo
echo "The script in tools/mkrelease will do this for you."
suggest-fix
exit 1
}
echo "OK"
# A missing NEXT heading could also mean that no release notes exist at all
# for this tag. Check.
echo -n "checking $RELEASE_FILE for '$tag' heading... "
grep -q "^$tag ($tagdate)\$" "$RELEASE_FILE" || {
echo "FAIL"
echo "error: $RELEASE_FILE does not contain heading for $tag" >&2
echo
echo "$RELEASE_FILE has not been updated with release notes"
echo "for $tag."
echo
echo "The heading should read: '$tag ($tagdate)'"
suggest-fix
exit 1
}
echo "OK"


@ -3,7 +3,7 @@
/**
* Generate territory matrices from data files
*
* Copyright (C) 2016 R-T Specialty, LLC.
* Copyright (C) 2014-2023 Ryan Specialty, LLC.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@ -19,7 +19,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
?>
<?xml-stylesheet type="text/xsl" href="../../rater/summary.xsl"?>
<?xml version="1.0"?>
<lv:package
xmlns:lv="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"


@ -1,7 +1,7 @@
#!/bin/bash
# Test csvm2csv
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -38,7 +38,10 @@ run-test()
test $? -eq 0 || return 1
# expected output
diff <( cat <<< "$expected" ) <( cat <<< "$given" )
diff <( cat <<< "$expected" ) <( cat <<< "$given" ) || {
echo "test $testsum failure" >&2
return 1
}
}
@ -92,11 +95,11 @@ test-delim()
declare -r expected='header,line
1,2
3,6
3,9
4,2
4,6
4,9
3,6
3,9'
4,9'
run-test "$input" "$expected"
}
@ -117,6 +120,22 @@ $bar_baz-quux,$foo'
run-test "$input" "$expected"
}
# same as above but with whitespace
test-whitespace-including-tabs-ok()
{
declare -r input='header, line
:foo=1
:bar_baz-quux=2
$foo, 1
$bar_baz-quux, $foo'
declare -r expected='header,line
1,1
2,1'
run-test "$input" "$expected"
}
test-range-delim()
{
@ -179,11 +198,12 @@ test-var-with-var()
:baz=$range;$foo
$baz, 5'
# note that the output is sorted
declare -r expected='header,line
2,5
2,5
3,5
4,5
2,5'
4,5'
run-test "$input" "$expected"
}
@ -203,17 +223,56 @@ $foo'
}
test-directive-stripped()
{
declare -r input='!DIRECTIVE
header, line'
declare -r expected='header,line'
run-test "$input" "$expected"
}
test-no-sort()
{
declare -r input='!NOSORT
header, line
1,1
0,0'
declare -r expected='header,line
1,1
0,0'
run-test "$input" "$expected"
}
# all directives should be put on a single line
test-fail-multi-directive()
{
declare -r input='!DIRECTIVE1
!DIRECTIVE2
header, line'
((testsum++))
local result
! result=$( ../csvm2csv 2>&1 <<< "$input" ) || return 1
[[ "$result" =~ !DIRECTIVE2 ]]
}
test-fail-unknown-var-ref()
{
((testsum++))
local -r result=$(
../csvm2csv 2>&1 <<< '$undefined' \
&& echo '(test failure: expected failure)'
)
local result
! result=$( ../csvm2csv 2>&1 <<< '$undefined' ) || return 1
grep -q 'unknown.*\$undefined' <<< "$result" \
|| return 1
[[ "$result" =~ unknown.*\$undefined ]]
}
@ -221,13 +280,10 @@ test-fail-non-numeric-range()
{
((testsum++))
local -r result=$(
../csvm2csv 2>&1 <<< 'A--Z' \
&& echo '(test failure: expected failure)'
)
local result
! result=$( ../csvm2csv 2>&1 <<< 'A--Z' ) || return 1
grep -q 'invalid range.*A--Z' <<< "$result" \
|| return 1
[[ "$result" =~ invalid\ range.*A--Z ]]
}
@ -235,13 +291,10 @@ test-fail-invalid-var-dfn()
{
((testsum++))
local -r result=$(
../csvm2csv 2>&1 <<< ':BAD@#=var' \
&& echo '(test failure: expected failure)'
)
local result
! result=$( ../csvm2csv 2>&1 <<< ':BAD@#=var' ) || return 1
grep -q 'invalid variable definition.*:BAD@#=var' <<< "$result" \
|| return 1
[[ "$result" =~ invalid\ variable\ definition.*:BAD@#=var ]]
}
@ -254,6 +307,10 @@ test-comment \
&& test-var-with-range-delim \
&& test-var-with-var \
&& test-var-zero-ref \
&& test-directive-stripped \
&& test-no-sort \
&& test-whitespace-including-tabs-ok \
&& test-fail-multi-directive \
&& test-fail-unknown-var-ref \
&& test-fail-non-numeric-range \
&& test-fail-invalid-var-dfn \
@ -263,7 +320,7 @@ test-comment \
}
# safety check
test "$testsum" -eq 12 || {
test "$testsum" -eq 16 || {
echo 'error: did not run all csvm2csv tests!' >&2
exit 1
}


@ -1,7 +1,7 @@
#!/bin/bash
# Test list2typedef
#
# Copyright (C) 2018 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -38,9 +38,12 @@ Second'\''s @ @Line
declare -r expected='<?xml version="1.0"?>
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
title="FooType Type">
<typedef name="FooType" desc="FooType">
<enum type="integer">
<item name="FOOTYPE_NONE" value="0" desc="NONE" />
<item name="FOOTYPE_FIRST" value="2706493105" desc="First" />
<item name="FOOTYPE_SECONDS_LINE" value="3512333918" desc="Second'\''s @ @Line" />
<item name="FOOTYPE_THIRD" value="519392729" desc="!!!THIRD!!!" />


@ -4,7 +4,7 @@
* Given a set of sorted zips, generates a regular expression to match only the
* given input
*
* Copyright (C) 2016 R-T Specialty, LLC.
* Copyright (C) 2014-2023 Ryan Specialty, LLC.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by


@ -1,6 +1,6 @@
# For use by automake and autoconf
#
# Copyright (C) 2015 R-T Specialty, LLC.
# Copyright (C) 2014-2023 Ryan Specialty, LLC.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@ -23,9 +23,7 @@ m4_if(ver, [], [m4_exit(1)])
AC_INIT([tame], [ver], [dev@lovullo.com])
AC_CONFIG_AUX_DIR([tools])
AM_INIT_AUTOMAKE([foreign])
# target that should be added to everything except doc/
AM_EXTRA_RECURSIVE_TARGETS([all-nodoc])
AM_EXTRA_RECURSIVE_TARGETS([bin all-nodoc])
# provide more granular version numbers based on the version string, using
# the format MAJOR.MINOR.REV[-SUFFIX], where SUFFIX can itself contain
@ -40,6 +38,7 @@ AC_SUBST(REV, m4_argn(3, ver_split))
AC_SUBST(SUFFIX, m4_argn(4, ver_split))
AC_ARG_VAR([JAVA], [The Java executable])
AC_ARG_VAR([JAVA_OPTS], [Java options])
AC_CHECK_PROGS(JAVA, [java])
AC_ARG_VAR([SAXON_CP], [Saxon class path])
@ -52,7 +51,26 @@ AS_IF(test ! -d "$HOXSL",
AC_MSG_ERROR([hoxsl path '$HOXSL' does not exist!]))
AC_MSG_RESULT(found)
AC_CONFIG_FILES([Makefile doc/Makefile src/init.xsl VERSION])
# BC with RATER_CLASSPATH
DSLC_CLASSPATH="$SAXON_CP:${DSLC_CLASSPATH:-$RATER_CLASSPATH}"
AC_SUBST(DSLC_CLASSPATH, [$DSLC_CLASSPATH])
AC_SUBST([AUTOGENERATED],
["THIS FILE IS AUTOGENERATED! DO NOT MODIFY! See *.in."])
# Documentation
set_devnotes='@set DEVNOTES'
AC_ARG_ENABLE(
[devnotes],
[AS_HELP_STRING([--enable-devnotes],
[include notes in manual for TAME developers (enabled by default)])],
[test "x$enableval" != xno || set_devnotes="@c $set_devnotes"])
AC_SUBST([SET_DEVNOTES], [$set_devnotes])
AC_CONFIG_FILES([Makefile doc/Makefile doc/config.texi src/init.xsl VERSION])
AC_CONFIG_FILES([bin/dslc],
[chmod +x bin/dslc])
AC_OUTPUT

core/.gitignore

@ -1,5 +1,6 @@
# when core is built
*.xmlo
*.xmli
*.xmle
*.js
*.dep


@ -1,674 +0,0 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.


@ -1,57 +0,0 @@
# TAME Core
Core library for TAME, providing generic abstractions for common
operations.
This library has accumulated a bit of cruft, is disorganized, and has
not been substantially refactored to take advantage of new language
features. It is a work in progress.
## Features
- BDD abstraction;
- Classification match manipulation;
- Common operations on numbers;
- Conditional evaluation helpers;
- Core primitive declarations;
- Interpolation;
- Interval mapping;
- Matrix and vector manipulation;
- Query matrices as data tables;
- Value mappings; and
- Other miscellaneous stuff.
## What is TAME?
TAME is The Adaptive Metalanguage, a programming language and system of tools
designed to aid in the development, understanding, and maintenance of systems
performing numerous calculations on a complex graph of dependencies,
conditions, and a large number of inputs.
This system was developed at R-T Specialty Buffalo to handle the complexity of
comparative insurance rating systems. It is a domain-specific language (DSL)
that itself encourages, through the use of templates, the creation of sub-DSLs.
TAME itself is at heart a calculator—processing only numerical input and
output—driven by quantifiers as predicates. Calculations and quantifiers are
written declaratively without concern for order of execution.
The system has powerful dependency resolution and data flow capabilities.
TAME consists of a macro processor (implementing a metalanguage), numerous
compilers for various targets (JavaScript, HTML documentation and debugging
environment, LaTeX, and others), linkers, and supporting tools. The input
grammar is XML, and the majority of the project (including the macro processor,
compilers, and linkers) are written in XSLT. There is a reason for that odd
choice; until an explanation is provided, know that someone is perverted enough
to implement a full compiler stack in XSLT.
## License
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option)
any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details.


@ -1,154 +0,0 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Aggregating Values">
<import package="base" export="true" />
<import package="vector/cmatch" export="true" />
Aggregate templates simplify aggregating values through various means.
Unless otherwise specified,
the default means of aggregation is summation.
<section title="Symbol-Based Aggregation">
For large numbers of values,
the most convenient way to aggregate is by matching on symbol names.
Note that symbols must be available for a match to occur.
All imported symbols are immediately available,
but \tt{expand-sequence} may need to be used for symbols produced by
the same package.
\ref{_aggregate-rate-each_} aggregates values of generators (usually
referred to by \tt{rate-each}) through summation.
A \tt{rate-each} block is generated to perform the summation.
Since \tt{rate-each} multiplies its body by \tt{_CMATCH_},
zero symbols would normally result in the summation of \tt{_CMATCH_}
itself, which is not desirable;
this template always includes \ref{ZERO} in the body to defend
against this,
causing a yield of~$0.00$ if there are no symbol matches.
<template name="_aggregate-rate-each_"
desc="Aggregate generator values by symbol prefix">
<param name="@class@" desc="Iterator class (omit for scalars)" />
<param name="@prefix@" desc="Symbol prefix" />
<param name="@yields@" desc="Scalar yield name (optional)">
<text></text>
</param>
<param name="@generates@" desc="Generator name (optional)">
<text></text>
</param>
<rate-each class="@class@" yields="@yields@"
generates="@generates@" index="k">
<c:sum>
<!-- prevent summing _CMATCH_ if there are no symbols (see above
comments) -->
<c:value-of name="ZERO"
label="Guard against zero symbol matches" />
<inline-template>
<for-each>
<sym-set name-prefix="@prefix@" type="gen" />
</for-each>
<c:value-of name="@sym_name@" index="k" />
</inline-template>
</c:sum>
</rate-each>
</template>
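As a hedged sketch of how this template might be applied (attribute values are invented, and the application syntax assumes the usual t: apply-template convention declared by this package):
<!-- hypothetical application; names invented for illustration -->
<t:aggregate-rate-each class="property"
                       prefix="premFee"
                       generates="premFeeTotal"
                       yields="totalPremFee" />
This would sum, per index of the \tt{property} classification, every generator whose symbol name begins with \tt{premFee}, yielding~$0.00$ when no such symbols exist.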
\ref{_aggregate-rate_} is analogous to \ref{_aggregate-rate-each_},
handling only scalar~\tt{@yields@}.
A \tt{rate} block is generated to aggregate by summation.
To prevent an empty rate block from being generated if there are no
symbol matches,
\ref{ZERO} is always included as part of the summation.
<template name="_aggregate-rate_"
desc="Aggregate scalar results by symbol prefix">
<param name="@prefix@" desc="Symbol prefix" />
<param name="@yields@" desc="Scalar yield name" />
<rate yields="@yields@">
<c:sum>
<!-- prevent completely empty rate block -->
<c:value-of name="ZERO"
label="Guard against zero symbol matches" />
<inline-template>
<for-each>
<sym-set name-prefix="@prefix@" type="rate" />
</for-each>
<c:value-of name="@sym_name@" />
</inline-template>
</c:sum>
</rate>
</template>
\ref{_aggregate-classify_} aggregates classifications.
Keep in mind that classifications act as universal quantifiers by default,
meaning zero symbol matches will produce a match and a scalar~$1$;
existential quantifiers (\tt{@any@} set to \tt{true}) will \emph{not}
match and will produce the scalar~$0$.
<template name="_aggregate-classify_"
desc="Aggregate classification results by symbol prefix">
<param name="@prefix@" desc="Symbol prefix" />
<param name="@as@" desc="Classification name" />
<param name="@desc@" desc="Generated classification description" />
<param name="@yields@" desc="Vector yield name (optional)">
<text></text>
</param>
<param name="@any@"
desc="Existential classification (default false, universal)">
<text></text>
</param>
<classify as="@as@" yields="@yields@" desc="@desc@" any="@any@">
<inline-template>
<for-each>
<sym-set name-prefix="@prefix@" type="class" />
</for-each>
<t:match-class name="@sym_name@" />
</inline-template>
</classify>
</template>
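A similarly hedged sketch of an existential application (names invented; per the note above, setting \tt{@any@} means zero symbol matches produce a scalar~$0$ rather than~$1$):
<!-- hypothetical application; names invented for illustration -->
<t:aggregate-classify prefix="excluded-"
                      as="any-excluded"
                      desc="Any exclusion class matches"
                      any="true" />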
</section>
</package>


@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2017 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Aliasing Values">


@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -18,6 +18,7 @@
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Assertions">

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015, 2017, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Base features">
@ -45,19 +46,6 @@
desc="Dummy value; this set is populated upon entering
each rate block" />
</const>
The runtime is responsible for populating \ref{__DATE_YEAR__} with
a proper value representing the current year.
\todo{TAME is deterministic with this one exception; remove it and
have users use the params from {\tt datetime} instead if they need this
datum.}
<const name="__DATE_YEAR__" magic="true"
value="0" type="integer"
desc="Current year"
sym="\widehat{D^\gamma}" />
</section>
@ -151,14 +139,12 @@
<classify as="always"
desc="Always true"
yields="alwaysTrue"
keep="true" />
yields="alwaysTrue" />
<classify as="never"
any="true"
desc="Never true"
yields="neverTrue"
keep="true" />
yields="neverTrue" />
</section>
@ -216,5 +202,112 @@
<warning>Ignored block!</warning>
</template>
</section>
<section title="Calculations">
These templates represent calculations that used to be defined as XSLT
templates before TAME's template system existed.
<extern name="___yield" type="rate" dtype="float" dim="0" />
<template name="_yield_"
desc="Final scalar result provided to caller">
<param name="@values@" desc="Yield calculation" />
<rate yields="___yield" local="true">
<param-copy name="@values@" />
</rate>
</template>
<template name="_rate-each_"
desc="Convenience template that expands to a lv:rate block summing over
the magic _CMATCH_ set with the product of its value">
<param name="@values@"
desc="Yield calculation" />
<param name="@generates@" desc="Generator name (optional)">
<text></text>
</param>
<param name="@yields@" desc="Yield (optional)">
<text>_</text>
<param-value name="@generates@" />
</param>
<!-- at least one of generates or yields is required -->
<if name="@yields@" eq="">
<if name="@generates@" eq="">
<error>must provide at least one of @generates or @yields</error>
</if>
</if>
<param name="@class@"
desc="Space-delimited classifications for predicated iteration" />
<param name="@no@"
desc="Space-delimited classifications for predicated iteration to prevent matches">
<text></text>
</param>
<param name="@index@"
desc="Generator index" />
<param name="@dim@" desc="Dim (optional)">
<text></text>
</param>
<param name="@gensym@" desc="Generator TeX symbol">
<text></text>
</param>
<rate class="@class@" no="@no@" yields="@yields@"
gentle-no="true"
desc="Total {@yields@} premium">
<c:sum of="_CMATCH_" dim="@dim@" sym="@gensym@"
generates="@generates@" index="@index@"
desc="Set of individual {@yields@} premiums">
<c:product>
<c:value-of name="_CMATCH_" index="@index@"
label="One if {@class@} and not {@no@} (if provided), otherwise zero" />
<param-copy name="@values@" />
</c:product>
</c:sum>
</rate>
</template>
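As a usage sketch (the classification and value names here are
hypothetical):
<rate-each class="collision" generates="premCollision" index="k">
  <c:value-of name="rateCollision" index="k" />
</rate-each>
Each index~\tt{k} of \tt{premCollision} is the product of
\tt{_CMATCH_} at index~\tt{k} (one when that row matches
\tt{collision}, zero otherwise) and the body value;
since \tt{@yields@} was omitted,
it defaults to \tt{_premCollision} per the parameter definition above.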
</section>
<section title="Feature Flags">
These templates alter the behavior of the TAME compiler or runtime.
They will be removed at some point in the future.
<section title="Classification System">
The template \tt{_use-new-classification-system_} sets a compile-time
flag that will cause all following sibling classifications to be
compiled using the new classification system.
Once the feature is enabled by default,
this template will become a noop and will begin to emit a warning,
before eventually being removed.
It is possible to mix both old and new classifications within the same
package,
though such behavior may lead to confusion in certain cases.
For more information on where the new and old system differ,
see the \tt{core/test/core/class} specification.
<template name="_use-new-classification-system_"
desc="Compile following-sibling::lv:classify using the new
classification system">
<!-- Even though this is a template param-meta, it will only affect
following-sibling for performance reasons -->
<param-meta name="___feature-newclassify" value="1" />
<t:todo desc="remove _use-new-classification-system_ application;
the new classification system is enabled by default
and this template no longer has any effect" />
</template>
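Applying it requires no parameters;
a single application ahead of the affected classifications suffices,
as the class specification later in this comparison demonstrates:
<t:use-new-classification-system />
All \tt{lv:classify} definitions that follow it as siblings are then
compiled with the new system.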
</section>
</section>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Generic conditionals">
@ -35,7 +36,7 @@
<c:case>
<c:when name="or_a">
<c:eq>
<c:const value="0" type="integer" desc="Return B if A is 0" />
<c:const value="0" desc="Return B if A is 0" />
</c:eq>
</c:when>
@ -65,7 +66,7 @@
<param-value name="@name@" />
</param>
<c:const value="@value@" type="@type@" desc="@desc@">
<c:const value="@value@" desc="@desc@">
<!-- TODO: non-index option -->
<c:when name="@name@" index="@index@">
<c:eq>
@ -86,14 +87,14 @@
<if name="@index@">
<c:when name="@name@" index="@index@">
<c:gt>
<c:const value="0" type="integer" desc="Use override if greater than 0" />
<c:const value="0" desc="Use override if greater than 0" />
</c:gt>
</c:when>
</if>
<unless name="@index@">
<c:when name="@name@">
<c:gt>
<c:const value="0" type="integer" desc="Use override if greater than 0" />
<c:const value="0" desc="Use override if greater than 0" />
</c:gt>
</c:when>
</unless>
@ -122,11 +123,11 @@
<c:case>
<c:when name="@name@" index="@index@">
<c:eq>
<c:const value="0" type="integer" desc="No value" />
<c:const value="0" desc="No value" />
</c:eq>
</c:when>
<c:const value="@default@" type="integer" desc="Default value" />
<c:const value="@default@" desc="Default value" />
</c:case>
<c:otherwise>
@ -149,13 +150,13 @@
</param>
<!-- simply returns a constant value for the class match -->
<rate-each class="@class@" accumulate="none" generates="@generates@" index="k">
<rate-each class="@class@" generates="@generates@" index="k">
<c:product>
<if name="@value@">
<c:value-of name="@value@" />
</if>
<unless name="@value@">
<c:const value="@const@" type="float" desc="@desc@" />
<c:const value="@const@" desc="@desc@" />
</unless>
<!-- if this is not provided, then the c:product will be optimized away -->

View File

@ -0,0 +1,7 @@
AC_INIT([tame-core], [0.0.0])
m4_define(`calc_root', ../rater)
SRCPATHS=.
m4_include([../build-aux/m4/calcdsl.m4])

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015, 2017 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -94,7 +94,7 @@
<c:value-of name="@default@" index="k" />
</if>
<unless name="@default@">
<c:const value="0" type="integer" desc="Condition not met, but no default" />
<c:const value="0" desc="Condition not met, but no default" />
</unless>
</c:case>
</if>
@ -105,7 +105,7 @@
<c:case>
<c:when name="@yearset@" index="k">
<c:gt>
<c:const value="0" type="integer" desc="Only calculate difference if a value is available" />
<c:const value="0" desc="Only calculate difference if a value is available" />
</c:gt>
</c:when>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2016 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
title="Dummy Values">

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2017 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Extern Definition">
@ -46,7 +47,7 @@
<text>1</text>
</param>
<extern name=":class:{@as@}" type="class" dim="0"
<extern name=":class:{@as@}" type="class" dim="@dim@"
yields="@yields@" />
<if name="@yields@">

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2016, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -37,15 +37,19 @@
Some notable TODOs:
\begin{enumerate}
\item Support scalar results;
\item Fail on zero premium unless explicitly stated;
\item Fail on negative premium (use a credit template); and
\item Support scalar results; and
\item Rounding direction (currently only nearest).
\end{enumerate}
\todo{Template to abstract these {\tt rate-each} generation
templates.}
<param name="assert_ignore_premium_zero" type="boolean"
desc="Ignore assertion failures for $0 premiums" />
<param name="assert_ignore_premium_negative" type="boolean"
desc="Ignore assertion failures for negative premiums" />
<template name="_premium_"
desc="A premium dollar amount">
@ -113,6 +117,14 @@
</if>
</unless>
<param name="@allow-zero@" desc="Allow value of zero (default false)">
<text>false</text>
</param>
<param name="@allow-negative@" desc="Allow negative value (default false)">
<text>false</text>
</param>
<rate-each class="@class@" no="@no@" yields="@yields@"
generates="@generates@" index="@index@"
@ -155,6 +167,52 @@
</unless>
</unless>
</rate-each>
<!-- assertion for non-zero -->
<unless name="@allow-zero@" eq="true">
<unless name="@generates@" eq="">
<t:assert failure="{@desc@} ({@generates@}) must not yield a value
of 0 for any index">
<any>
<match on="assert_ignore_premium_zero" />
<t:match-ne on="@generates@" value="ZERO" />
</any>
</t:assert>
</unless>
<unless name="@yields@" eq="">
<t:assert failure="{@desc@} ({@yields@}) must not yield a value of 0">
<any>
<match on="assert_ignore_premium_zero" />
<t:match-ne on="@yields@" value="ZERO" />
</any>
</t:assert>
</unless>
</unless>
<!-- assertion for non-negative -->
<unless name="@allow-negative@" eq="true">
<unless name="@generates@" eq="">
<t:assert failure="{@desc@} ({@generates@}) must not yield a negative
value for any index">
<any>
<match on="assert_ignore_premium_negative" />
<t:match-gte on="@generates@" value="ZERO" />
</any>
</t:assert>
</unless>
<unless name="@yields@" eq="">
<t:assert failure="{@desc@} ({@yields@}) must not yield a negative
value">
<any>
<match on="assert_ignore_premium_negative" />
<t:match-gte on="@yields@" value="ZERO" />
</any>
</t:assert>
</unless>
</unless>
</template>
@ -314,13 +372,13 @@
<unless name="@generates@" eq="">
<t:assert failure="{@generates@} must not yield a negative value
for any index">
<t:match-gt on="@generates@" value="ZERO" />
<t:match-gte on="@generates@" value="ZERO" />
</t:assert>
</unless>
<unless name="@yields@" eq="">
<t:assert failure="{@yields@} must not yield a negative value">
<t:match-gt on="@yields@" value="ZERO" />
<t:match-gte on="@yields@" value="ZERO" />
</t:assert>
</unless>
</unless>
@ -380,10 +438,15 @@
<text></text>
</param>
<param name="@allow-zero@" desc="Allow value of zero (default false)">
<text></text>
</param>
<t:factor _prefix="credit"
class="@class@" no="@no@" yields="@yields@" sym="@sym@"
generates="@generates@" index="@index@" gensym="@gensym@"
allow-zero="@allow-zero@"
default="@default@"
desc="@desc@">
<param-copy name="@values@" />
@ -444,10 +507,15 @@
<text></text>
</param>
<param name="@allow-zero@" desc="Allow value of zero (default false)">
<text></text>
</param>
<t:factor _prefix="debit"
class="@class@" no="@no@" yields="@yields@" sym="@sym@"
generates="@generates@" index="@index@" gensym="@gensym@"
allow-zero="@allow-zero@"
default="@default@"
desc="@desc@">
<param-copy name="@values@" />

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -68,7 +68,7 @@
<unless name="@default@" eq="">
<c:otherwise>
<c:const value="@default@" type="integer" desc="No mapping" />
<c:const value="@default@" desc="No mapping" />
</c:otherwise>
</unless>
</c:cases>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Numeric computations dealing with boolean algebra">
@ -28,10 +29,10 @@
<function name="not" desc="Negates a boolean value" sym="\lnot">
<param name="not_value" type="boolean" desc="Boolean value to negate" />
<c:const value="1" type="boolean" desc="Value of 1 if given value is zero">
<c:const value="1" desc="Value of 1 if given value is zero">
<c:when name="not_value">
<c:eq>
<c:const value="0" type="boolean" desc="Value to assert against for returning 1" />
<c:const value="0" desc="Value to assert against for returning 1" />
</c:eq>
</c:when>
</c:const>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -33,30 +33,6 @@
<import package="round" export="true" />
<!-- even more trivial, but again, cuts down on code -->
<template name="_scalarToAccum_" desc="Simply accumulates a scalar">
<param name="@scalar@" desc="Scalar to accumulate" />
<param name="@accum@" desc="Accumulator to accumulate into" />
<!-- this is useless, but required -->
<param name="@yields@" desc="Value to yield into, since it's required (useless)">
<text>__accum_</text>
<param-value name="@accum@" />
<text>_</text>
<param-value name="@scalar@" />
</param>
<param name="@type@" desc="Accumulation method">
<text>all</text>
</param>
<rate yields="@yields@">
<accumulate into="@accum@" type="@type@" />
<c:value-of name="@scalar@" />
</rate>
</template>
<!--
Map values falling within adjacent intervals

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -79,7 +79,7 @@
<c:apply name="max" label="@label@">
<c:arg name="max1">
<c:const value="0" type="integer" desc="Do not allow a value under 0" />
<c:const value="0" desc="Do not allow a value under 0" />
</c:arg>
<c:arg name="max2">
@ -102,7 +102,7 @@
<param-value name="@generates@" />
</param>
<rate-each class="@class@" accumulate="none" yields="@yields@" generates="@generates@" index="k">
<rate-each class="@class@" yields="@yields@" generates="@generates@" index="k">
<c:apply name="max">
<c:arg name="max1">
<c:value-of name="@a@" index="k" />
@ -135,12 +135,11 @@
<c:arg name="min1">
<!-- deprecated -->
<if name="@value@">
<c:const value="@value@" type="float" desc="@desc@" />
<c:const value="@value@" desc="@desc@" />
</if>
<unless name="@value@">
<c:value-of name="@name@"
index="@index@"
type="float"
label="@desc@" />
</unless>
</c:arg>
@ -162,7 +161,7 @@
<c:apply name="max" label="{@label@}, minimum of 1">
<c:arg name="max1">
<c:const value="@min@" type="float" desc="Minimum value" />
<c:const value="@min@" desc="Minimum value" />
</c:arg>
<c:arg name="max2">
@ -179,10 +178,10 @@
<param name="@desc@" desc="Description" />
<c:gte>
<c:const value="@min@" type="float" desc="{@desc@}; minimum" />
<c:const value="@min@" desc="{@desc@}; minimum" />
</c:gte>
<c:lte>
<c:const value="@max@" type="float" desc="{@desc@}; maximum" />
<c:const value="@max@" desc="{@desc@}; maximum" />
</c:lte>
</template>
@ -202,7 +201,6 @@
<param name="@generates@" desc="Variable to generate into" />
<param name="@when@" desc="Conditional bump" />
<param name="@class@" desc="Class to match on" />
<param name="@keep@" desc="Value of keep flag" />
<!-- alternative to @name@ -->
<param name="@const@" desc="Constant value, instead of named" />
@ -211,7 +209,7 @@
<param name="@maxpercent@" desc="Maximum percent" />
<rate yields="_{@generates@}" keep="@keep@">
<rate yields="_{@generates@}">
<c:sum of="@name@" index="k" generates="@generates@" desc="Bumped value">
<c:cases>
<!-- if a condition was provided, check it first -->
@ -238,7 +236,7 @@
</c:when>
<!-- just return the value provided -->
<c:const value="0" type="float" desc="Zero value" />
<c:const value="0" desc="Zero value" />
</c:case>
</if>
@ -266,8 +264,8 @@
</unless>
<c:quotient label="Percent as real number">
<c:const value="@percent@" type="integer" desc="Whole percent" />
<c:const value="100" type="integer" desc="Divisor to convert percent to real number" />
<c:const value="@percent@" desc="Whole percent" />
<c:const value="100" desc="Divisor to convert percent to real number" />
</c:quotient>
</if>
@ -279,7 +277,7 @@
</unless>
<if name="@const@">
<c:const value="@const@" type="float" desc="Constant minimum value" />
<c:const value="@const@" desc="Constant minimum value" />
</if>
</c:value>
</c:values>
@ -305,8 +303,8 @@
</unless>
<c:quotient label="Max percent as real number">
<c:const value="@maxpercent@" type="integer" desc="Whole max percent" />
<c:const value="100" type="integer" desc="Divisor to convert max percent to real number" />
<c:const value="@maxpercent@" desc="Whole max percent" />
<c:const value="100" desc="Divisor to convert max percent to real number" />
</c:quotient>
</c:product>
</c:value>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -132,11 +132,11 @@
<c:value-of name="@name@" index="@index@" />
</if>
<unless name="@name@">
<c:const value="@value@" type="float" desc="@desc@" />
<c:const value="@value@" desc="@desc@" />
</unless>
</c:product>
<c:const value="100" type="integer" desc="Convert to rational number" />
<c:const value="100" desc="Convert to rational number" />
</c:quotient>
</c:sum>
</template>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -42,8 +42,8 @@
<c:arg name="round_real_n">
<c:expt>
<c:const value="10" type="integer" desc="Decimal base" />
<c:const value="@precision@" type="integer" desc="Exponent" />
<c:const value="10" desc="Decimal base" />
<c:const value="@precision@" desc="Exponent" />
</c:expt>
</c:arg>
</c:apply>
@ -111,7 +111,7 @@
desc="Exponential/step divisor">
<c:product>
<c:expt>
<c:const value="10" type="integer"
<c:const value="10"
desc="Decimal base" />
<c:value-of name="@exp@" />
</c:expt>
@ -196,7 +196,7 @@
<c:floor>
<c:sum>
<c:value-of name="roundval" />
<c:const value="0.5" type="float" desc="Raises value in a manner that it can be properly rounded by a floor" />
<c:const value="0.5" desc="Raises value in a manner that it can be properly rounded by a floor" />
</c:sum>
</c:floor>
</function>
@ -229,7 +229,7 @@
<c:apply name="round_real">
<c:arg name="round_real_n">
<c:const value="100" type="integer" desc="Round to the nearest 100th" />
<c:const value="100" desc="Round to the nearest 100th" />
</c:arg>
<c:arg name="round_real_val">
@ -248,15 +248,15 @@
<c:quotient>
<param-copy name="@values@" />
<c:expt>
<c:const value="10" type="integer" desc="Decimal base" />
<c:const value="@digits@" type="integer" desc="Number of digits" />
<c:const value="10" desc="Decimal base" />
<c:const value="@digits@" desc="Number of digits" />
</c:expt>
</c:quotient>
</c:ceil>
<c:expt>
<c:const value="10" type="integer" desc="Decimal base" />
<c:const value="@digits@" type="integer" desc="Number of digits" />
<c:const value="10" desc="Decimal base" />
<c:const value="@digits@" desc="Number of digits" />
</c:expt>
</c:product>
</template>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

28
core/retry.xml 100644
View File

@ -0,0 +1,28 @@
<?xml version="1.0"?>
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
title="Retry rating state">
<import package="assert" export="true" />
<import package="base" />
<import package="extern" />
<import package="vector/cmatch" export="true" />
<t:classify-extern yields="__retry" dim="0" />
<template name="_suggest-retry-when_" desc="Retry Rating">
<param name="@values@" desc="Rule matches" />
<classify as="__retry" yields="__retry"
desc="Retry state for a supplier">
<param-copy name="@values@" />
</classify>
<t:assert failure="Retrying suppliers are ineligible"
as="-assert-supplier-pending">
<t:match-eq on="__retry" value="FALSE" />
</t:assert>
</template>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2017, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -122,57 +122,57 @@
<inline-template>
<for-each>
<set state_const="STATE_AK" state_upper="AK" state_lower="ak" state_name="Alaska" />
<set state_const="STATE_AL" state_upper="AL" state_lower="al" state_name="Alabama" />
<set state_const="STATE_AR" state_upper="AR" state_lower="ar" state_name="Arkansas" />
<set state_const="STATE_AZ" state_upper="AZ" state_lower="az" state_name="Arizona" />
<set state_const="STATE_CA" state_upper="CA" state_lower="ca" state_name="California" />
<set state_const="STATE_CO" state_upper="CO" state_lower="co" state_name="Colorado" />
<set state_const="STATE_CT" state_upper="CT" state_lower="ct" state_name="Connecticut" />
<set state_const="STATE_DC" state_upper="DC" state_lower="dc" state_name="Washington DC" />
<set state_const="STATE_DE" state_upper="DE" state_lower="de" state_name="Delaware" />
<set state_const="STATE_FL" state_upper="FL" state_lower="fl" state_name="Florida" />
<set state_const="STATE_GA" state_upper="GA" state_lower="ga" state_name="Georgia" />
<set state_const="STATE_HI" state_upper="HI" state_lower="hi" state_name="Hawaii" />
<set state_const="STATE_IA" state_upper="IA" state_lower="ia" state_name="Iowa" />
<set state_const="STATE_ID" state_upper="ID" state_lower="id" state_name="Idaho" />
<set state_const="STATE_IL" state_upper="IL" state_lower="il" state_name="Illinois" />
<set state_const="STATE_IN" state_upper="IN" state_lower="in" state_name="Indiana" />
<set state_const="STATE_KS" state_upper="KS" state_lower="ks" state_name="Kansas" />
<set state_const="STATE_KY" state_upper="KY" state_lower="ky" state_name="Kentucky" />
<set state_const="STATE_LA" state_upper="LA" state_lower="la" state_name="Louisiana" />
<set state_const="STATE_MA" state_upper="MA" state_lower="ma" state_name="Massachusetts" />
<set state_const="STATE_MD" state_upper="MD" state_lower="md" state_name="Maryland" />
<set state_const="STATE_ME" state_upper="ME" state_lower="me" state_name="Maine" />
<set state_const="STATE_MI" state_upper="MI" state_lower="mi" state_name="Michigan" />
<set state_const="STATE_MN" state_upper="MN" state_lower="mn" state_name="Minnesota" />
<set state_const="STATE_MO" state_upper="MO" state_lower="mo" state_name="Missouri" />
<set state_const="STATE_MS" state_upper="MS" state_lower="ms" state_name="Mississippi" />
<set state_const="STATE_MT" state_upper="MT" state_lower="mt" state_name="Montana" />
<set state_const="STATE_NC" state_upper="NC" state_lower="nc" state_name="North Carolina" />
<set state_const="STATE_ND" state_upper="ND" state_lower="nd" state_name="North Dakota" />
<set state_const="STATE_NE" state_upper="NE" state_lower="ne" state_name="Nebraska" />
<set state_const="STATE_NH" state_upper="NH" state_lower="nh" state_name="New Hampshire" />
<set state_const="STATE_NJ" state_upper="NJ" state_lower="nj" state_name="New Jersey" />
<set state_const="STATE_NM" state_upper="NM" state_lower="nm" state_name="New Mexico" />
<set state_const="STATE_NV" state_upper="NV" state_lower="nv" state_name="Nevada" />
<set state_const="STATE_NY" state_upper="NY" state_lower="ny" state_name="New York" />
<set state_const="STATE_OH" state_upper="OH" state_lower="oh" state_name="Ohio" />
<set state_const="STATE_OK" state_upper="OK" state_lower="ok" state_name="Oklahoma" />
<set state_const="STATE_OR" state_upper="OR" state_lower="or" state_name="Oregon" />
<set state_const="STATE_PA" state_upper="PA" state_lower="pa" state_name="Pennsylvania" />
<set state_const="STATE_RI" state_upper="RI" state_lower="ri" state_name="Rhode Island" />
<set state_const="STATE_SC" state_upper="SC" state_lower="sc" state_name="South Carolina" />
<set state_const="STATE_SD" state_upper="SD" state_lower="sd" state_name="South Dakota" />
<set state_const="STATE_TN" state_upper="TN" state_lower="tn" state_name="Tennessee" />
<set state_const="STATE_TX" state_upper="TX" state_lower="tx" state_name="Texas" />
<set state_const="STATE_UT" state_upper="UT" state_lower="ut" state_name="Utah" />
<set state_const="STATE_VA" state_upper="VA" state_lower="va" state_name="Virginia" />
<set state_const="STATE_VT" state_upper="VT" state_lower="vt" state_name="Vermont" />
<set state_const="STATE_WA" state_upper="WA" state_lower="wa" state_name="Washington" />
<set state_const="STATE_WI" state_upper="WI" state_lower="wi" state_name="Wisconsin" />
<set state_const="STATE_WV" state_upper="WV" state_lower="wv" state_name="West Virginia" />
<set state_const="STATE_WY" state_upper="WY" state_lower="wy" state_name="Wyoming" />
<set state_const="STATE_AK" state_upper="AK" state_lower="ak" upper_lower="Ak" state_name="Alaska" />
<set state_const="STATE_AL" state_upper="AL" state_lower="al" upper_lower="Al" state_name="Alabama" />
<set state_const="STATE_AR" state_upper="AR" state_lower="ar" upper_lower="Ar" state_name="Arkansas" />
<set state_const="STATE_AZ" state_upper="AZ" state_lower="az" upper_lower="Az" state_name="Arizona" />
<set state_const="STATE_CA" state_upper="CA" state_lower="ca" upper_lower="Ca" state_name="California" />
<set state_const="STATE_CO" state_upper="CO" state_lower="co" upper_lower="Co" state_name="Colorado" />
<set state_const="STATE_CT" state_upper="CT" state_lower="ct" upper_lower="Ct" state_name="Connecticut" />
<set state_const="STATE_DC" state_upper="DC" state_lower="dc" upper_lower="Dc" state_name="Washington DC" />
<set state_const="STATE_DE" state_upper="DE" state_lower="de" upper_lower="De" state_name="Delaware" />
<set state_const="STATE_FL" state_upper="FL" state_lower="fl" upper_lower="Fl" state_name="Florida" />
<set state_const="STATE_GA" state_upper="GA" state_lower="ga" upper_lower="Ga" state_name="Georgia" />
<set state_const="STATE_HI" state_upper="HI" state_lower="hi" upper_lower="Hi" state_name="Hawaii" />
<set state_const="STATE_IA" state_upper="IA" state_lower="ia" upper_lower="Ia" state_name="Iowa" />
<set state_const="STATE_ID" state_upper="ID" state_lower="id" upper_lower="Id" state_name="Idaho" />
<set state_const="STATE_IL" state_upper="IL" state_lower="il" upper_lower="Il" state_name="Illinois" />
<set state_const="STATE_IN" state_upper="IN" state_lower="in" upper_lower="In" state_name="Indiana" />
<set state_const="STATE_KS" state_upper="KS" state_lower="ks" upper_lower="Ks" state_name="Kansas" />
<set state_const="STATE_KY" state_upper="KY" state_lower="ky" upper_lower="Ky" state_name="Kentucky" />
<set state_const="STATE_LA" state_upper="LA" state_lower="la" upper_lower="La" state_name="Louisiana" />
<set state_const="STATE_MA" state_upper="MA" state_lower="ma" upper_lower="Ma" state_name="Massachusetts" />
<set state_const="STATE_MD" state_upper="MD" state_lower="md" upper_lower="Md" state_name="Maryland" />
<set state_const="STATE_ME" state_upper="ME" state_lower="me" upper_lower="Me" state_name="Maine" />
<set state_const="STATE_MI" state_upper="MI" state_lower="mi" upper_lower="Mi" state_name="Michigan" />
<set state_const="STATE_MN" state_upper="MN" state_lower="mn" upper_lower="Mn" state_name="Minnesota" />
<set state_const="STATE_MO" state_upper="MO" state_lower="mo" upper_lower="Mo" state_name="Missouri" />
<set state_const="STATE_MS" state_upper="MS" state_lower="ms" upper_lower="Ms" state_name="Mississippi" />
<set state_const="STATE_MT" state_upper="MT" state_lower="mt" upper_lower="Mt" state_name="Montana" />
<set state_const="STATE_NC" state_upper="NC" state_lower="nc" upper_lower="Nc" state_name="North Carolina" />
<set state_const="STATE_ND" state_upper="ND" state_lower="nd" upper_lower="Nd" state_name="North Dakota" />
<set state_const="STATE_NE" state_upper="NE" state_lower="ne" upper_lower="Ne" state_name="Nebraska" />
<set state_const="STATE_NH" state_upper="NH" state_lower="nh" upper_lower="Nh" state_name="New Hampshire" />
<set state_const="STATE_NJ" state_upper="NJ" state_lower="nj" upper_lower="Nj" state_name="New Jersey" />
<set state_const="STATE_NM" state_upper="NM" state_lower="nm" upper_lower="Nm" state_name="New Mexico" />
<set state_const="STATE_NV" state_upper="NV" state_lower="nv" upper_lower="Nv" state_name="Nevada" />
<set state_const="STATE_NY" state_upper="NY" state_lower="ny" upper_lower="Ny" state_name="New York" />
<set state_const="STATE_OH" state_upper="OH" state_lower="oh" upper_lower="Oh" state_name="Ohio" />
<set state_const="STATE_OK" state_upper="OK" state_lower="ok" upper_lower="Ok" state_name="Oklahoma" />
<set state_const="STATE_OR" state_upper="OR" state_lower="or" upper_lower="Or" state_name="Oregon" />
<set state_const="STATE_PA" state_upper="PA" state_lower="pa" upper_lower="Pa" state_name="Pennsylvania" />
<set state_const="STATE_RI" state_upper="RI" state_lower="ri" upper_lower="Ri" state_name="Rhode Island" />
<set state_const="STATE_SC" state_upper="SC" state_lower="sc" upper_lower="Sc" state_name="South Carolina" />
<set state_const="STATE_SD" state_upper="SD" state_lower="sd" upper_lower="Sd" state_name="South Dakota" />
<set state_const="STATE_TN" state_upper="TN" state_lower="tn" upper_lower="Tn" state_name="Tennessee" />
<set state_const="STATE_TX" state_upper="TX" state_lower="tx" upper_lower="Tx" state_name="Texas" />
<set state_const="STATE_UT" state_upper="UT" state_lower="ut" upper_lower="Ut" state_name="Utah" />
<set state_const="STATE_VA" state_upper="VA" state_lower="va" upper_lower="Va" state_name="Virginia" />
<set state_const="STATE_VT" state_upper="VT" state_lower="vt" upper_lower="Vt" state_name="Vermont" />
<set state_const="STATE_WA" state_upper="WA" state_lower="wa" upper_lower="Wa" state_name="Washington" />
<set state_const="STATE_WI" state_upper="WI" state_lower="wi" upper_lower="Wi" state_name="Wisconsin" />
<set state_const="STATE_WV" state_upper="WV" state_lower="wv" upper_lower="Wv" state_name="West Virginia" />
<set state_const="STATE_WY" state_upper="WY" state_lower="wy" upper_lower="Wy" state_name="Wyoming" />
</for-each>
<param-copy name="@values@" />

View File

@ -1 +0,0 @@
state.xml

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Territory data support (used in conjunction with tdat script)">
@ -36,8 +37,8 @@
</param>
<rate-each class="@class@" accumulate="none" yields="@yields@" generates="@generates@" index="k">
<c:const value="@code@" type="integer" desc="Territory code" />
<rate-each class="@class@" yields="@yields@" generates="@generates@" index="k">
<c:const value="@code@" desc="Territory code" />
</rate-each>
</template>
</package>

View File

@ -1,264 +0,0 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Aggregate Package Specification">
<import package="../../base" />
<import package="../../test/spec" />
<import package="../../base" />
<import package="../../vector/cmatch" />
<import package="../../vector/stub" />
<import package="../../aggregate" />
<rate-each class="nclass3"
generates="aggregateGen1" index="k">
<c:const value="1" desc="Constant value" />
</rate-each>
<rate-each class="nclass3"
generates="aggregateGen2" index="k">
<c:value-of name="k" />
</rate-each>
<rate yields="aggregateRate1">
<c:const value="1" desc="Constant value" />
</rate>
<rate yields="aggregateRate2">
<c:const value="3" desc="Constant value" />
</rate>
<classify as="agg-class-1"
desc="Aggregate test 1">
<match on="AGG_1VEC" />
</classify>
<classify as="agg-class-2"
desc="Aggregate test 2">
<t:match-gt on="AGG_INCVEC" const="0" />
</classify>
<t:n-vector n="3" name="AGG_1VEC" value="1" />
<const name="AGG_INCVEC" desc="Incrementing vector">
<item value="0" />
<item value="1" />
<item value="2" />
</const>
<t:describe name="aggregate template">
<t:describe name="_aggregate-rate-each_">
<t:aggregate-rate-each class="nclass3" yields="yieldAggReEmpty"
prefix="doesNotExist"
generates="genAggReEmpty" />
<t:aggregate-rate-each class="nclass3" yields="yieldAggReNonEmpty"
prefix="aggregateGen"
generates="genAggReNonEmpty" />
<t:describe name="with no symbols">
<t:it desc="produces 0">
<t:given>
<c:sum>
<c:value-of name="yieldAggReEmpty" />
<c:sum of="genAggReEmpty" />
</c:sum>
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with symbols">
<t:it desc="sums respective index of each symbol">
<t:given>
<c:sum of="genAggReNonEmpty" />
</t:given>
<t:expect>
<!-- 1 + 2 + 3 -->
<t:match-result eq="6" />
</t:expect>
</t:it>
<t:it desc="yields sum of symbols">
<t:given>
<c:value-of name="yieldAggReNonEmpty" />
</t:given>
<t:expect>
<!-- same as above -->
<t:match-result eq="6" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
<t:describe name="_aggregate-rate_">
<t:aggregate-rate prefix="doesNotExist" yields="yieldAggRateEmpty" />
<t:aggregate-rate prefix="aggregateRate" yields="yieldAggRateNonEmpty" />
<t:describe name="with no symbols">
<t:it desc="yields 0">
<t:given>
<c:value-of name="yieldAggRateEmpty" />
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with symbols">
<t:it desc="yields sum of symbols">
<t:given>
<c:value-of name="yieldAggRateNonEmpty" />
</t:given>
<t:expect>
<t:match-result eq="4" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
<t:describe name="_aggregate-classify_">
<t:describe name="as a univiersal quantifier">
<t:aggregate-classify prefix="does-not-exist" as="class-agg-univ-empty"
desc="Aggregate universal class empty test"
yields="classAggUnivEmpty" />
<t:aggregate-classify prefix="agg-class-" as="class-agg-univ-nonempty"
desc="Aggregate class nonempty test"
yields="classAggUnivNonEmpty" />
<t:describe name="with no symbols">
<t:it desc="produces scalar 1">
<t:given>
<c:value-of name="classAggUnivEmpty" />
</t:given>
<t:expect>
<t:match-result eq="1" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with symbols">
<t:it desc="generates matching class">
<rate-each class="class-agg-univ-nonempty"
yields="aggUnivNonEmptyCheck"
index="k">
<c:const value="1" desc="Truth check" />
</rate-each>
<t:expect>
<!-- two non-zero in AGG_INCVEC -->
<t:match-eq on="aggUnivNonEmptyCheck" const="2" />
</t:expect>
</t:it>
<t:it desc="produces vector">
<t:given>
<c:sum of="classAggUnivNonEmpty" />
</t:given>
<t:expect>
<!-- two non-zero in AGG_INCVEC -->
<t:match-result eq="2" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
<t:describe name="as a existential quantifier">
<t:aggregate-classify prefix="does-not-exist" as="class-agg-exist-empty"
desc="Aggregate existersal class empty test"
yields="classAggExistEmpty"
any="true" />
<t:aggregate-classify prefix="agg-class-" as="class-agg-exist-nonempty"
desc="Aggregate class nonempty test"
yields="classAggExistNonEmpty"
any="true" />
<t:describe name="with no symbols">
<t:it desc="produces scalar 0">
<t:given>
<c:value-of name="classAggExistEmpty" />
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with symbols">
<t:it desc="generates matching class">
<rate-each class="class-agg-exist-nonempty"
yields="aggExistNonEmptyCheck"
index="k">
<c:const value="1" desc="Truth check" />
</rate-each>
<t:expect>
<!-- all match in AGG_1VEC -->
<t:match-eq on="aggExistNonEmptyCheck" const="3" />
</t:expect>
</t:it>
<t:it desc="produces vector">
<t:given>
<c:sum of="classAggExistNonEmpty" />
</t:given>
<t:expect>
<!-- all match in AGG_1VEC -->
<t:match-result eq="3" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
</t:describe>
</t:describe>
</package>

View File

@ -0,0 +1,733 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Classification System Specs">
<import package="../../test/spec" />
<import package="../../vector/stub" />
<import package="../../base" />
Note that many of these classifications match on similar values in an
attempt to thwart potential optimizations,
present or future;
these approaches may need further adjustment as new optimizations are
introduced (or a way to explicitly inhibit them may be needed).
These tests are also written a bit lazily,
given the difficulty of matching comprehensively;
that ought to be fixed in the future.
<const name="MAT3X3" desc="3x3 Matrix, Ones">
<set desc="Row 0">
<item value="1" desc="0,0" />
<item value="1" desc="0,1" />
<item value="1" desc="0,2" />
</set>
<set desc="Row 1">
<item value="1" desc="1,0" />
<item value="1" desc="1,1" />
<item value="1" desc="1,2" />
</set>
<set desc="Row 2">
<item value="1" desc="2,0" />
<item value="1" desc="2,1" />
<item value="1" desc="2,2" />
</set>
</const>
<const name="MAT3X3Z" desc="3x3 Matrix, Zeroes">
<set desc="Row 0">
<item value="0" desc="0,0" />
<item value="0" desc="0,1" />
<item value="0" desc="0,2" />
</set>
<set desc="Row 1">
<item value="0" desc="1,0" />
<item value="0" desc="1,1" />
<item value="0" desc="1,2" />
</set>
<set desc="Row 2">
<item value="0" desc="2,0" />
<item value="0" desc="2,1" />
<item value="0" desc="2,2" />
</set>
</const>
<const name="MAT3X3OOZ" desc="3x3 Matrix, Columns 1, 1, 0">
<set desc="Row 0">
<item value="1" desc="0,0" />
<item value="1" desc="0,1" />
<item value="0" desc="0,2" />
</set>
<set desc="Row 1">
<item value="1" desc="1,0" />
<item value="1" desc="1,1" />
<item value="0" desc="1,2" />
</set>
<set desc="Row 2">
<item value="1" desc="2,0" />
<item value="1" desc="2,1" />
<item value="0" desc="2,2" />
</set>
</const>
<const name="MAT3X1" desc="3x2 Matrix, Ones">
<set desc="Row 0">
<item value="1" desc="0,0" />
</set>
<set desc="Row 1">
<item value="1" desc="1,0" />
</set>
<set desc="Row 2">
<item value="1" desc="2,0" />
</set>
</const>
<const name="MAT3X1Z" desc="3x2 Matrix, Zeroes">
<set desc="Row 0">
<item value="0" desc="0,0" />
</set>
<set desc="Row 1">
<item value="0" desc="1,0" />
</set>
<set desc="Row 2">
<item value="0" desc="2,0" />
</set>
</const>
<const name="MAT1X3Z" desc="1x3 Matrix, Zeroes">
<set desc="Row 0">
<item value="0" desc="0,0" />
<item value="0" desc="0,1" />
<item value="0" desc="0,2" />
</set>
</const>
<template name="_class-tests_" desc="Classification system tests">
<param name="@system@" desc="SUT (lowercase)" />
<param name="@systemuc@" desc="SUT (title case)">
<param-value name="@system@" ucfirst="true" />
</param>
<t:describe name="{@system@} classify">
<t:describe name="without predicates">
<t:it desc="yields TRUE for conjunction">
<classify as="conj-no-pred-{@system@}"
yields="conjNoPred{@systemuc@}"
desc="No predicate, conjunction" />
<t:given name="conjNoPred{@systemuc@}" />
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields FALSE for disjunction">
<classify as="disj-no-pred-{@system@}"
yields="disjNoPred{@systemuc@}"
any="true"
desc="No predicate, disjunction" />
<t:given name="disjNoPred{@systemuc@}" />
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with scalar predicates">
<t:it desc="yields TRUE when scalar value is TRUE">
<t:given-classify>
<match on="alwaysTrue" />
</t:given-classify>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields FALSE when scalar value is FALSE">
<t:given-classify>
<match on="neverTrue" />
</t:given-classify>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:it desc="yields TRUE for all-true scalar conjunction">
<t:given-classify>
<match on="alwaysTrue" />
<match on="neverTrue" value="FALSE" />
</t:given-classify>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields TRUE for all-true scalar disjunction">
<t:given-classify>
<any>
<match on="alwaysTrue" />
<match on="neverTrue" value="FALSE" />
</any>
</t:given-classify>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields TRUE for single-true scalar disjunction">
<t:given-classify>
<any>
<match on="alwaysTrue" />
<match on="neverTrue" />
</any>
</t:given-classify>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with vector predicates">
<t:it desc="yields TRUE for all-true element-wise conjunction">
<t:given-classify-scalar>
<match on="NVEC3" value="ZERO" />
<match on="nClass3" value="TRUE" />
</t:given-classify-scalar>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields FALSE for some-true element-wise conjunction">
<t:given-classify-scalar>
<match on="NVEC3" value="ZERO" />
<match on="nClass3" value="FALSE" />
</t:given-classify-scalar>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:it desc="yields TRUE for some-true element-wise disjunction">
<t:given-classify-scalar>
<any>
<match on="NVEC3" value="ZERO" />
<match on="nClass3" value="FALSE" />
</any>
</t:given-classify-scalar>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields FALSE for all-false element-wise disjunction">
<t:given-classify-scalar>
<any>
<match on="NVEC3" value="TRUE" />
<match on="nClass3" value="FALSE" />
</any>
</t:given-classify-scalar>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
The old classification system would interpret missing values as $0$,
which could potentially trigger a match.
The new classification system will always yield \tparam{FALSE}
regardless of predicate when values are undefined.
<t:describe name="of different lengths">
<if name="@system@" eq="legacy">
<t:describe name="with legacy classification system">
<t:it desc="interprets undefined values as zero during match">
<classify as="vec-len-mismatch-conj-{@system@}"
yields="vecLenMismatchConj{@systemuc@}"
desc="Multi vector length mismatch (legacy)">
<!-- actually ZERO for all indexes -->
<match on="NVEC3" value="ZERO" />
<!-- legacy system, implicitly zero for match -->
<match on="NVEC2" value="ZERO" />
</classify>
<t:given>
<c:value-of name="vecLenMismatchConj{@systemuc@}" index="#2" />
</t:given>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
</t:describe>
</if>
<if name="@system@" eq="new">
<t:describe name="with new classification system">
<t:it desc="yields false for conjunction rather than implicit zero">
<classify as="vec-len-mismatch-conj-{@system@}"
yields="vecLenMismatchConj{@systemuc@}"
desc="Multi vector length mismatch (new system)">
<!-- actually ZERO for all indexes -->
<match on="NVEC3" value="ZERO" />
<!-- must not be implicitly ZERO for third index -->
<match on="NVEC2" value="ZERO" />
</classify>
<t:given>
<c:value-of name="vecLenMismatchConj{@systemuc@}" index="#2" />
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
</t:describe>
</if>
</t:describe>
</t:describe>
<t:describe name="with matrix predicates">
<t:it desc="yields TRUE for all-true element-wise conjunction">
<t:given-classify-scalar>
<match on="MAT3X3Z" value="FALSE" />
<match on="MAT3X3" value="TRUE" />
</t:given-classify-scalar>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields FALSE for some-true element-wise conjunction">
<t:given-classify-scalar>
<match on="MAT3X3Z" value="TRUE" />
<match on="MAT3X3" value="TRUE" />
</t:given-classify-scalar>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:it desc="yields TRUE for some-true element-wise disjunction">
<t:given-classify-scalar>
<any>
<match on="MAT3X3Z" value="ZERO" />
<match on="MAT3X3" value="ZERO" />
</any>
</t:given-classify-scalar>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="yields FALSE for all-false element-wise disjunction">
<t:given-classify-scalar>
<any>
<match on="MAT3X3Z" value="TRUE" />
<match on="MAT3X3" value="FALSE" />
</any>
</t:given-classify-scalar>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:describe name="of different column lengths">
Certain behavior is the same between the old and the new system---%
in particular,
when the match of lower length is first.
<classify as="mat-len-mismatch-first-conj-{@system@}"
yields="matLenMismatchFirstConj{@systemuc@}"
desc="Multi matrix length mismatch when first match">
<!-- fallthrough for undefined (note that this is
intentionally matching on FALSE to test against an
implicit 0 in place of undefined) -->
<match on="MAT3X1Z" value="FALSE" />
<!-- first two columns ones, last column zero -->
<match on="MAT3X3OOZ" value="TRUE" />
</classify>
<t:it desc="always yields FALSE when first match (TRUE)">
<t:given>
<c:value-of name="matLenMismatchFirstConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
<c:index>
<c:value-of name="#1" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:it desc="always yields FALSE when first match (FALSE)">
<t:given>
<c:value-of name="matLenMismatchFirstConj{@systemuc@}">
<c:index>
<c:value-of name="#2" />
</c:index>
<c:index>
<c:value-of name="#2" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<if name="@system@" eq="legacy">
<t:describe name="with legacy classification system">
The legacy system is frighteningly problematic when the matrix of
lesser column length appears after the first match---%
the commutative properties of the system are lost,
and the value from the previous match falls through!
<classify as="mat-len-mismatch-conj-{@system@}"
yields="matLenMismatchConj{@systemuc@}"
desc="Multi matrix length mismatch (legacy)">
<!-- first two columns ones, last column zero -->
<match on="MAT3X3OOZ" value="TRUE" />
<!-- fallthrough for undefined (note that this is
intentionally matching on FALSE to test against an
implicit 0 in place of undefined) -->
<match on="MAT3X1Z" value="FALSE" />
</classify>
<!-- which means that it's not commutative! -->
<t:it desc="causes values from previous match to fall through
into undefined (TRUE)">
<t:given>
<c:value-of name="matLenMismatchConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
<c:index>
<c:value-of name="#1" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="TRUE" />
</t:expect>
</t:it>
<t:it desc="causes values from previous match to fall through
into undefined (FALSE)">
<t:given>
<c:value-of name="matLenMismatchConj{@systemuc@}">
<c:index>
<c:value-of name="#2" />
</c:index>
<c:index>
<c:value-of name="#2" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
</t:describe>
</if>
<if name="@system@" eq="new">
<t:describe name="with new classification system">
<classify as="mat-len-mismatch-conj-{@system@}"
yields="matLenMismatchConj{@systemuc@}"
desc="Multi matrix length mismatch (new)">
<!-- first two columns ones, last column zero -->
<match on="MAT3X3OOZ" value="TRUE" />
<!-- must not fall through like legacy (must always be
FALSE; note that this is intentionally matching on
FALSE to test against an implicit 0 in place of
undefined) -->
<match on="MAT3X1Z" value="FALSE" />
</classify>
<!-- which means that it's not commutative! -->
<t:it desc="is FALSE regardless of previous match or current
value (TRUE)">
<t:given>
<c:value-of name="matLenMismatchConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
<c:index>
<c:value-of name="#1" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:it desc="is FALSE regardless of previous match or current
value (FALSE)">
<t:given>
<c:value-of name="matLenMismatchConj{@systemuc@}">
<c:index>
<c:value-of name="#2" />
</c:index>
<c:index>
<c:value-of name="#2" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
</t:describe>
</if>
</t:describe>
<t:describe name="of different row lengths">
<if name="@system@" eq="legacy">
<t:describe name="with legacy classification system">
The legacy classification system does something terrible when
the second match is the shorter---%
it discards the indexes entirely!
<classify as="mat-len-mismatch-rows-conj-{@system@}"
yields="matLenMismatchRowsConj{@systemuc@}"
desc="Multi matrix row mismatch (legacy)">
<!-- we _should_ have a 3x3 result matrix -->
<match on="MAT3X3OOZ" value="TRUE" />
<!-- but instead we get [[1, 1, 1], [0], [0]] because of
this match being second! -->
<match on="MAT1X3Z" value="FALSE" />
</classify>
<classify as="mat-len-mismatch-rows-first-conj-{@system@}"
yields="matLenMismatchRowsFirstConj{@systemuc@}"
desc="Multi matrix row mismatch first match (legacy)">
<match on="MAT1X3Z" value="TRUE" />
<match on="MAT3X3OOZ" value="TRUE" />
</classify>
<!-- note that this is testing buggy behavior; the new system
corrects it -->
<t:it desc="replaces all inner vectors of other rows">
<t:given>
<c:length-of>
<c:value-of name="matLenMismatchRowsConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
</c:value-of>
</c:length-of>
</t:given>
<t:expect>
<!-- were it not for this bug, it should be 3 -->
<t:match-result value="#1" />
</t:expect>
</t:it>
<!-- note that this is testing buggy behavior; the new system
corrects it -->
<t:it desc="considers only defined rows' values when smaller
is first">
<t:given>
<c:value-of name="matLenMismatchRowsFirstConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
<c:index>
<c:value-of name="#0" />
</c:index>
</c:value-of>
</t:given>
<!-- we get [[0, 0, 0], [1, 1, 0], [1, 1, 0]] -->
<!-- ^ -->
<t:expect>
<t:match-result value="#1" />
</t:expect>
</t:it>
</t:describe>
</if>
<if name="@system@" eq="new">
<t:describe name="with new classification system">
<classify as="mat-len-mismatch-rows-conj-{@system@}"
yields="matLenMismatchRowsConj{@systemuc@}"
desc="Multi matrix row mismatch (new)">
<match on="MAT3X3OOZ" value="TRUE" />
<!-- must yield FALSE rather than matching on 0 -->
<match on="MAT1X3Z" value="FALSE" />
</classify>
<classify as="mat-len-mismatch-rows-first-conj-{@system@}"
yields="matLenMismatchRowsFirstConj{@systemuc@}"
desc="Multi matrix row mismatch first match (new)">
<match on="MAT1X3Z" value="TRUE" />
<match on="MAT3X3OOZ" value="TRUE" />
</classify>
<t:it desc="retains shape of larger matrix">
<t:given>
<c:length-of>
<c:value-of name="matLenMismatchRowsConj{@systemuc@}">
<c:index>
<c:value-of name="#2" />
</c:index>
</c:value-of>
</c:length-of>
</t:given>
<t:expect>
<!-- unlike legacy system -->
<t:match-result value="#3" />
</t:expect>
</t:it>
<t:it desc="always yields FALSE for each undefined element (TRUE)">
<t:given>
<c:length-of>
<c:value-of name="matLenMismatchRowsConj{@systemuc@}">
<c:index>
<c:value-of name="#2" />
</c:index>
<c:index>
<c:value-of name="#1" />
</c:index>
</c:value-of>
</c:length-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<t:it desc="always yields FALSE for each undefined element (FALSE)">
<t:given>
<c:length-of>
<c:value-of name="matLenMismatchRowsConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
<c:index>
<c:value-of name="#2" />
</c:index>
</c:value-of>
</c:length-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
<!-- unlike legacy -->
<t:it desc="is commutative with different row lengths">
<t:given>
<c:value-of name="matLenMismatchRowsFirstConj{@systemuc@}">
<c:index>
<c:value-of name="#1" />
</c:index>
<c:index>
<c:value-of name="#2" />
</c:index>
</c:value-of>
</t:given>
<t:expect>
<t:match-result value="FALSE" />
</t:expect>
</t:it>
</t:describe>
</if>
</t:describe>
</t:describe>
</t:describe>
</template>
<section title="Legacy System Tests">
<t:class-tests system="legacy" />
</section>
<section title="New System Tests">
<t:use-new-classification-system />
<t:class-tests system="new" />
</section>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2016 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -37,7 +37,8 @@
<t:it desc="Performs no round when 'none'">
<t:premium class="length1"
generates="premRoundNone" index="k"
round="none">
round="none"
desc="Non-rounded premium">
<c:value-of name="#0.5" />
</t:premium>
@ -56,7 +57,8 @@
<t:it desc="Rounds to nearest integer when 'dollar' (up)">
<t:premium class="length1"
generates="premRoundDollarUp" index="k"
round="dollar">
round="dollar"
desc="Rounded-up premium">
<c:value-of name="#1.5" />
</t:premium>
@ -75,7 +77,8 @@
<t:it desc="Rounds to nearest integer when 'dollar' (down)">
<t:premium class="length1"
generates="premRoundDollarDown" index="k"
round="dollar">
round="dollar"
desc="Rounded-down premium">
<c:value-of name="#1.4" />
</t:premium>
@ -94,7 +97,8 @@
<t:it desc="Rounds to nearest cent when 'cent' (up)">
<t:premium class="length1"
generates="premRoundCentUp" index="k"
round="cent">
round="cent"
desc="Rounded-up premium to nearest cent">
<c:value-of name="#1.505" />
</t:premium>
@ -113,7 +117,8 @@
<t:it desc="Rounds to nearest cent when 'cent' (down)">
<t:premium class="length1"
generates="premRoundCentDown" index="k"
round="cent">
round="cent"
desc="Rounded-down premium to nearest cent">
<c:value-of name="#1.504" />
</t:premium>
@ -132,7 +137,8 @@
<t:it desc="Performs ceiling when 'up' (low)">
<t:premium class="length1"
generates="premRoundCeilLow" index="k"
round="up">
round="up"
desc="Ceiling premium (low)">
<c:value-of name="#1.1" />
</t:premium>
@ -151,7 +157,8 @@
<t:it desc="Performs ceiling when 'up' (high)">
<t:premium class="length1"
generates="premRoundCeilHigh" index="k"
round="up">
round="up"
desc="Ceiling premium (high)">
<c:value-of name="#1.7" />
</t:premium>
@ -170,7 +177,8 @@
<t:it desc="Performs floor when 'down' (low)">
<t:premium class="length1"
generates="premRoundFloorLow" index="k"
round="down">
round="down"
desc="Floor premium (low)">
<c:value-of name="#1.1" />
</t:premium>
@ -189,7 +197,8 @@
<t:it desc="Performs floor when 'down' (high)">
<t:premium class="length1"
generates="premRoundFloorHigh" index="k"
round="down">
round="down"
desc="Floor premium (high)">
<c:value-of name="#1.7" />
</t:premium>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -50,7 +50,7 @@
<const name="VALUE_VEC" sym="V"
type="float"
desc="Vector of values">
<item value="0" />
<item value="0" desc="Unused (see VALUE_VEC_INDEX)" />
<item value="5.5" desc="Same as VALUE_MID" />
</const>
<const name="VALUE_VEC_INDEX" sym="\nu"

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,19 +19,31 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
core="true"
desc="Vector operations">
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Retry definition specs">
<import package="../spec" />
<import package="../../base" />
<import package="../../retry" />
<!-- we are a meta-package -->
<import package="vector/arithmetic" export="true" />
<import package="vector/cmatch" export="true" />
<import package="vector/convert" export="true" />
<import package="vector/count" export="true" />
<import package="vector/list" export="true" />
<import package="vector/matrix" export="true" />
<import package="vector/table" export="true" />
<t:describe name="_suggest-retry-when_">
<!-- Due to the assertion nature of this template, positive cases aren't
testable -->
<t:it desc="retry is never allowed">
<t:suggest-retry-when>
<t:match-class name="never" />
</t:suggest-retry-when>
<t:given>
<c:value-of name="__retry" />
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
</t:describe>
<import package="vector/common" export="true" />
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -30,13 +30,17 @@
<import package="numeric/percent" />
<import package="numeric/round" />
<import package="vector/define" />
<import package="vector/filter" />
<import package="vector/fold" />
<import package="vector/interpolate" />
<import package="vector/length" />
<import package="vector/stub" />
<import package="vector/minmax" />
<import package="vector/table" />
<import package="aggregate" />
<import package="class" />
<import package="insurance" />
<import package="retry" />
<import package="symbol" />
<import package="tplgen" />

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2016 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,15 +19,32 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
core="true"
desc="Numeric computations">
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Vector definition specs">
<!-- we are a meta-package -->
<import package="numeric/boolean" export="true" />
<import package="numeric/convert" export="true" />
<import package="numeric/minmax" export="true" />
<import package="numeric/round" export="true" />
<import package="../../spec" />
<import package="numeric/common" export="true" />
<import package="../../../base" />
<import package="../../../vector/define" />
<t:describe name="_define-vector_">
<t:it desc="defines a global vector">
<t:define-vector generates="testVector"
desc="Test vector">
<c:vector>
<c:value-of name="#1" />
<c:value-of name="#2" />
</c:vector>
</t:define-vector>
<t:given>
<c:sum of="testVector" />
</t:given>
<t:expect>
<t:match-result eq="3" />
</t:expect>
</t:it>
</t:describe>
</package>

View File

@ -0,0 +1,176 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Testing minimum and maximum values of vectors">
<import package="../../spec" />
<import package="../../../base" />
<import package="../../../vector/filter" />
<import package="../../../vector/stub" />
<const name="VFILTER_MASK_NONE" type="integer"
desc="4-vector masking nothing">
<item value="1" desc="First set" />
<item value="1" desc="Second set" />
<item value="1" desc="Third set" />
<item value="1" desc="Fourth set" />
</const>
<const name="VFILTER_MASK_MIDDLE" type="integer"
desc="4-vector with middle elements unmasked">
<item value="0" desc="First unset" />
<item value="1" desc="Second set" />
<item value="1" desc="Third set" />
<item value="0" desc="Fourth unset" />
</const>
<t:describe name="_vfilter-mask_">
<t:it desc="produces an empty vector given an empty vector">
<t:given>
<c:length-of>
<!-- empty body with no @name@ -->
<t:vfilter-mask mask="NVEC1">
</t:vfilter-mask>
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
<t:it desc="acts as identity given an all-set mask">
<t:given>
<c:length-of>
<t:vfilter-mask name="NVEC4_SEQ" mask="VFILTER_MASK_NONE" />
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="4" />
</t:expect>
</t:it>
<t:it desc="retains original vector values">
<t:given>
<c:let>
<c:values>
<c:value name="vec" type="integer" set="vector"
desc="Result">
<t:vfilter-mask name="NVEC4_SEQ" mask="VFILTER_MASK_NONE" />
</c:value>
</c:values>
<c:sum of="vec" />
</c:let>
</t:given>
<t:expect>
<!-- 0 + 1 + 2 + 3 (NVEC4_SEQ) -->
<t:match-result eq="6" />
</t:expect>
</t:it>
<t:it desc="produces empty vector given all-unset mask">
<t:given>
<c:length-of>
<!-- NVEC4 is a 4-vector of 0s -->
<t:vfilter-mask name="NVEC4_SEQ" mask="NVEC4" />
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
<t:describe name="given a partly set mask">
<t:it desc="removes masked vector elements">
<t:given>
<c:length-of>
<t:vfilter-mask name="NVEC4_SEQ" mask="VFILTER_MASK_MIDDLE" />
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="2" />
</t:expect>
</t:it>
<t:it desc="retains value of original vector elements">
<t:given>
<c:let>
<c:values>
<c:value name="vec" type="integer" set="vector"
desc="Result">
<t:vfilter-mask name="NVEC4_SEQ" mask="VFILTER_MASK_MIDDLE" />
</c:value>
</c:values>
<c:sum of="vec" />
</c:let>
</t:given>
<t:expect>
<!-- 1 + 2 -->
<t:match-result eq="3" />
</t:expect>
</t:it>
<!-- same as above, but inline -->
<t:it desc="masks inline vectors">
<t:given>
<c:let>
<c:values>
<c:value name="vec" type="integer" set="vector"
desc="Result">
<!-- inline values -->
<t:vfilter-mask mask="VFILTER_MASK_MIDDLE">
<c:value-of name="#10" />
<c:value-of name="#12" />
<c:value-of name="#14" />
<c:value-of name="#16" />
</t:vfilter-mask>
</c:value>
</c:values>
<c:sum of="vec" />
</c:let>
</t:given>
<t:expect>
<!-- 12 + 14 -->
<t:match-result eq="26" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -130,9 +130,9 @@
<rate yields="interpTableMaxFieldValue">
<t:query-first-field table="interp-query-field-test"
field="value">
<t:when field="key">
<t:where-eq field="key">
<c:value-of name="interpTableMaxKeyValue" />
</t:when>
</t:where-eq>
</t:query-first-field>
</rate>
</section>
@ -170,10 +170,10 @@
key="key"
step="INTERP_TABLE_STEP"
actual="#300">
<t:when field="pred">
<c:const value="31" type="float"
<t:where-eq field="pred">
<c:const value="31"
desc="Test predicate value" />
</t:when>
</t:where-eq>
</t:interpolate-query-field>
</t:given>
@ -220,10 +220,10 @@
key="key"
step="INTERP_TABLE_STEP"
actual="#350">
<t:when field="pred">
<c:const value="31" type="float"
<t:where-eq field="pred">
<c:const value="31"
desc="Test predicate value" />
</t:when>
</t:where-eq>
</t:interpolate-query-field>
</t:given>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2017 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -53,7 +53,7 @@
<t:given>
<c:length-of>
<t:first-nonempty>
<c:set />
<c:vector />
</t:first-nonempty>
</c:length-of>
</t:given>
@ -74,8 +74,8 @@
<t:given>
<c:length-of>
<t:first-nonempty>
<c:set />
<c:set />
<c:vector />
<c:vector />
</t:first-nonempty>
</c:length-of>
</t:given>
@ -96,10 +96,10 @@
<t:given>
<c:car>
<t:first-nonempty>
<c:set />
<c:set>
<c:vector />
<c:vector>
<c:const value="50" desc="Non-empty vector" />
</c:set>
</c:vector>
</t:first-nonempty>
</c:car>
</t:given>
@ -120,12 +120,12 @@
<t:given>
<c:car>
<t:first-nonempty>
<c:set>
<c:vector>
<c:const value="60" desc="Non-empty vector" />
</c:set>
<c:set>
</c:vector>
<c:vector>
<c:const value="70" desc="Non-empty vector" />
</c:set>
</c:vector>
</t:first-nonempty>
</c:car>
</t:given>

View File

@ -0,0 +1,99 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Testing minimum and maximum values of vectors">
<import package="../../spec" />
<import package="../../../base" />
<import package="../../../vector/minmax" />
<t:describe name="_minreduce_">
<t:it desc="reduces empty vector to 0">
<t:given>
<t:minreduce>
</t:minreduce>
</t:given>
<t:expect>
<t:match-result eq="0" />
</t:expect>
</t:it>
<t:it desc="reduces one-element vector to its only element">
<t:given>
<t:minreduce>
<c:value-of name="#3" />
</t:minreduce>
</t:given>
<t:expect>
<t:match-result eq="3" />
</t:expect>
</t:it>
<t:it desc="reduces n-element vector to its minimum">
<t:given>
<t:minreduce>
<c:value-of name="#7" />
<c:value-of name="#3" />
<c:value-of name="#1" />
<c:value-of name="#6" />
</t:minreduce>
</t:given>
<t:expect>
<t:match-result eq="1" />
</t:expect>
</t:it>
<t:it desc="reduces existing vectors when @isvector@">
<t:given>
<c:let>
<c:values>
<c:value name="vector" type="integer" set="vector"
desc="Test vector">
<c:vector>
<c:value-of name="#9" />
<c:value-of name="#3" />
<c:value-of name="#2" />
<c:value-of name="#5" />
</c:vector>
</c:value>
</c:values>
<t:minreduce isvector="true">
<c:value-of name="vector" />
</t:minreduce>
</c:let>
</t:given>
<t:expect>
<t:match-result eq="2" />
</t:expect>
</t:it>
</t:describe>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -0,0 +1,382 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
desc="Table Querying Specs">
<import package="../../spec" />
<import package="../../../base" />
<import package="../../../vector/table" />
<t:create-table name="test-table"
desc="Dummy table for query testing">
<t:table-column name="a"
index="0"
desc="Column A" />
<t:table-column name="b"
index="1"
desc="Column B" />
<t:table-column name="c"
index="2"
desc="Column C" />
<t:table-rows>
<t:table-row>
<t:table-value const="1" />
<t:table-value const="11" />
<t:table-value const="111" />
</t:table-row>
<t:table-row>
<t:table-value const="1" />
<t:table-value const="12" />
<t:table-value const="121" />
</t:table-row>
<t:table-row>
<t:table-value const="2" />
<t:table-value const="21" />
<t:table-value const="111" />
</t:table-row>
</t:table-rows>
</t:create-table>
<t:create-table name="test-table-seq"
desc="Dummy sequential table for query testing">
<t:table-column name="a"
index="0"
desc="Column A" />
<t:table-column name="b"
index="1"
desc="Column B" />
<t:table-rows>
<t:table-row>
<t:table-value const="1" />
<t:table-value const="1" />
</t:table-row>
<t:table-row>
<t:table-value const="2" />
<t:table-value const="1" />
</t:table-row>
<t:table-row>
<t:table-value const="2" />
<t:table-value const="2" />
</t:table-row>
<t:table-row>
<t:table-value const="5" />
<t:table-value const="1" />
</t:table-row>
<t:table-row>
<t:table-value const="5" />
<t:table-value const="2" />
</t:table-row>
<t:table-row>
<t:table-value const="5" />
<t:table-value const="3" />
</t:table-row>
<t:table-row>
<t:table-value const="7" />
<t:table-value const="1" />
</t:table-row>
<t:table-row>
<t:table-value const="7" />
<t:table-value const="2" />
</t:table-row>
<t:table-row>
<t:table-value const="7" />
<t:table-value const="3" />
</t:table-row>
<t:table-row>
<t:table-value const="7" />
<t:table-value const="4" />
</t:table-row>
</t:table-rows>
</t:create-table>
<t:describe name="_query-first-field_">
<t:it desc="returns first row of multi-row result">
<t:given>
<t:query-first-field table="test-table" field="c">
<t:where-eq field="a">
<c:value-of name="#1" />
</t:where-eq>
</t:query-first-field>
</t:given>
<t:expect>
<t:match-result eq="111" />
</t:expect>
</t:it>
<t:it desc="returns first row of single-row result">
<t:given>
<t:query-first-field table="test-table" field="c">
<t:where-eq field="a">
<c:value-of name="#1" />
</t:where-eq>
<t:where-eq field="b">
<c:value-of name="#12" />
</t:where-eq>
</t:query-first-field>
</t:given>
<t:expect>
<t:match-result eq="121" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="_query-field_">
<t:describe name="with predicates">
<t:it desc="returns vector of field values">
<t:given>
<c:length-of>
<t:query-field table="test-table" field="c">
<t:where-eq field="a">
<c:value-of name="#1" />
</t:where-eq>
</t:query-field>
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="2" />
</t:expect>
</t:it>
<t:it desc="returns vector of field values even for single result">
<t:given>
<c:car>
<t:query-field table="test-table" field="c">
<t:where-eq field="a">
<c:value-of name="#1" />
</t:where-eq>
<t:where-eq field="b">
<c:value-of name="#11" />
</t:where-eq>
</t:query-field>
</c:car>
</t:given>
<t:expect>
<t:match-result eq="111" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with no predicates">
<t:it desc="returns vector of all field values">
<t:given>
<c:length-of>
<t:query-field table="test-table" field="c" />
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="3" />
</t:expect>
</t:it>
</t:describe>
<!-- TODO: tried using inline-template here but id generation was not
working as expected -->
<t:describe name="with CMP_OP_LT">
<t:it desc="matches less than a given value">
<t:given>
<c:let>
<c:values>
<c:value name="results" type="integer" set="vector">
<t:query-field table="test-table-seq" field="a">
<t:where-lt field="a">
<c:value-of name="#5" />
</t:where-lt>
</t:query-field>
</c:value>
</c:values>
<c:sum of="results" />
</c:let>
</t:given>
<t:expect>
<!-- 1 + 2 + 2 -->
<t:match-result eq="5" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with CMP_OP_LTE">
<t:it desc="matches less than or equal to a given value">
<t:given>
<c:let>
<c:values>
<c:value name="results" type="integer" set="vector">
<t:query-field table="test-table-seq" field="a">
<t:where-lte field="a">
<c:value-of name="#5" />
</t:where-lte>
</t:query-field>
</c:value>
</c:values>
<c:sum of="results" />
</c:let>
</t:given>
<t:expect>
<!-- 1 + 2 + 2 + 5 + 5 + 5 -->
<t:match-result eq="20" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with CMP_OP_GT">
<t:it desc="matches greater than a given value">
<t:given>
<c:let>
<c:values>
<c:value name="results" type="integer" set="vector">
<t:query-field table="test-table-seq" field="a">
<t:where-gt field="a">
<c:value-of name="#5" />
</t:where-gt>
</t:query-field>
</c:value>
</c:values>
<c:sum of="results" />
</c:let>
</t:given>
<t:expect>
<!-- 7 + 7 + 7 + 7 -->
<t:match-result eq="28" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with CMP_OP_GTE">
<t:it desc="matches greater than or equal to a given value">
<t:given>
<c:let>
<c:values>
<c:value name="results" type="integer" set="vector">
<t:query-field table="test-table-seq" field="a">
<t:where-gte field="a">
<c:value-of name="#5" />
</t:where-gte>
</t:query-field>
</c:value>
</c:values>
<c:sum of="results" />
</c:let>
</t:given>
<t:expect>
<!-- 5 + 5 + 5 + 7 + 7 + 7 + 7 -->
<t:match-result eq="43" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
<t:describe name="_query-row_">
<t:describe name="with predicates">
<t:it desc="returns vector of rows">
<t:given>
<c:length-of>
<t:query-row table="test-table">
<t:where-eq field="a">
<c:value-of name="#1" />
</t:where-eq>
</t:query-row>
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="2" />
</t:expect>
</t:it>
<t:it desc="returns vector of rows even for single result">
<t:given>
<c:let>
<c:values>
<c:value name="first_row" type="integer" set="vector">
<c:car>
<t:query-row table="test-table">
<t:where-eq field="a">
<c:value-of name="#1" />
</t:where-eq>
<t:where-eq field="b">
<c:value-of name="#11" />
</t:where-eq>
</t:query-row>
</c:car>
</c:value>
</c:values>
<c:sum of="first_row" />
</c:let>
</t:given>
<t:expect>
<t:match-result eq="123" />
</t:expect>
</t:it>
</t:describe>
<t:describe name="with no predicates">
<t:it desc="returns vector of all rows">
<t:given>
<c:length-of>
<t:query-row table="test-table" />
</c:length-of>
</t:given>
<t:expect>
<t:match-result eq="3" />
</t:expect>
</t:it>
</t:describe>
</t:describe>
</package>

View File

@ -2,7 +2,7 @@
<!--
BDD specification framework
Copyright (C) 2015, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -49,8 +49,6 @@
<package xmlns="http://www.lovullo.com/rater"
xmlns:t="http://www.lovullo.com/rater/apply-template"
xmlns:c="http://www.lovullo.com/calc"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.lovullo.com/rater ../../rater.xsd"
core="true"
@ -76,8 +74,7 @@
-->
<classify as="expect-ok"
yields="@result@"
desc="All features conform to specifications"
keep="true">
desc="All features conform to specifications">
<inline-template>
<for-each>
<sym-set name-prefix="expect-conform-"
@ -136,29 +133,39 @@
<expand-sequence>
<expand-group>
<param-copy name="@values@">
<param-meta name="spec-name"
value="@__full_name@" />
<param-meta name="spec-prefix"
value="@__prefix@" />
</param-copy>
</expand-group>
<param-copy name="@values@">
<param-meta name="spec-name"
value="@__full_name@" />
<param-meta name="spec-prefix"
value="@__prefix@" />
</param-copy>
<!-- joins all generated classifications to provide a higher-level
failure if any expectations fail -->
<classify as="expect-conform-{@__prefix@}{@__uniq@}"
desc="{@__full_name@} meets expectations"
keep="true">
<inline-template>
<for-each>
<sym-set name-prefix="expect-that-{@__prefix@}"
type="class" />
</for-each>
<!-- XXX: there is a bug in expand-sequence where it does not wait for
all template expansions before continuing to expand the next item
in the sequence -->
<expand-sequence>
<expand-sequence>
<expand-sequence>
<expand-sequence>
<expand-sequence>
<classify as="expect-conform-{@__prefix@}{@__uniq@}"
desc="{@__full_name@} meets expectations">
<inline-template>
<for-each>
<sym-set name-prefix="expect-that-{@__prefix@}"
type="class" />
</for-each>
<t:match-class name="{@sym_name@}" />
</inline-template>
</classify>
<t:match-class name="{@sym_name@}" />
</inline-template>
</classify>
</expand-sequence>
</expand-sequence>
</expand-sequence>
</expand-sequence>
</expand-sequence>
</expand-sequence>
</template>
@ -244,6 +251,72 @@
<!--
Describe a classification-based value for expectation groups
The defined value is available to adjacent expectations through use of
`_match-result_`.
A `_given_` definition is not required; it exists as a convenient and
concise way to represent test data in clear terms.
Permitted children:
- Any match
-->
<template name="_given-classify_"
desc="Describe a classification-based value for expectation groups">
<param name="@values@" desc="Classification predicates" />
<param name="@__id@"
desc="Unique identifier to avoid symbol conflicts">
<text unique="true">given</text>
</param>
<param-meta name="spec-given-id"
value="{@__id@}Yield" />
<classify as="@__id@" yields="{@__id@}Yield"
desc="Given value generated via _given-clasify_">
<param-copy name="@values@" />
</classify>
</template>
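A rough usage sketch follows (the class name is borrowed from the premium specs above and is otherwise arbitrary): the template body takes ordinary match predicates, and the generated classification's yield is recorded under the spec-given-id meta key so that an adjacent _match-result_ expectation can pick it up.
<t:given-classify>
  <t:match-class name="length1" />
</t:given-classify>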
<template name="_given-classify-scalar_"
desc="Describe a classification-based value for expectation groups">
<param name="@values@" desc="Classification predicates" />
<param name="@__id@"
desc="Unique identifier to avoid symbol conflicts">
<text unique="true">given</text>
</param>
<param-meta name="spec-given-id"
value="{@__id@}Yield" />
<classify as="{@__id@}-pre" yields="{@__id@}PreYield"
desc="Given value generated via _given-clasify-scalar_ pre-scalar">
<param-copy name="@values@" />
</classify>
<rate class="{@__id@}-pre" yields="__{@__id@}ScalarSum">
<c:sum of="_CMATCH_" />
</rate>
<classify as="@__id@" yields="{@__id@}Yield"
desc="Given value generated via _given-classify_">
<match on="__{@__id@}ScalarSum">
<c:gt>
<c:const value="0" desc="Any match" />
</c:gt>
</match>
</classify>
</template>
<!--
Describe a feature expectation

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2016 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -25,7 +25,7 @@
<import package="base" />
<!-- contains template dependencies -->
<import package="/rater/core/vector/cmatch" export="true" />
<import package="vector/cmatch" export="true" />
This package provides elementary integration with the UI through
@ -87,7 +87,76 @@
</param>
<t:match-class name="@__set_class@" value="@value@" />
<all>
<t:match-ui-applicable on="@on@" />
<t:match-class name="@__set_class@" value="@value@" />
</all>
</template>
The templates below are analogous to the generic match templates,
but they translate \tt{@on@} to the question param
and also check that the question is applicable (using
\ref{_match-ui-applicable_}).
<inline-template>
<for-each>
<set cmp="eq" />
<set cmp="ne" />
<set cmp="gt" />
<set cmp="gte" />
<set cmp="lt" />
<set cmp="lte" />
</for-each>
<template name="_match-ui-{@cmp@}_"
desc="Match UI value {@cmp@}">
<param name="@on@" desc="Question id" />
<!-- pick one -->
<param name="@value@" desc="Match against variable" />
<all>
<t:match-ui-applicable on="@on@" />
<match on="ui_q_{@on@}">
<dyn-node name="c:{@cmp@}">
<c:value-of name="@value@" />
</dyn-node>
</match>
</all>
</template>
<template name="_match-extern-ui-{@cmp@}_"
desc="Match UI value {@cmp@} - using externs">
<param name="@on@" desc="Question id" />
<param name="@value@" desc="Match against variable" />
<param name="@__yields@" desc="Generated visibility yields name">
<text>__isvis</text>
<param-value name="@on@" rmunderscore="true" lower="true" />
</param>
<!-- TODO: this default is transitionary, since there's no
meaningful default -->
<param name="@vis-dim@" desc="Classification dimensions (default 2)">
<text>2</text>
</param>
<extern name="ui_q_{@on@}" type="param" dtype="integer" dim="1" />
<extern name="@__yields@" type="cgen" dtype="boolean" dim="@vis-dim@" />
<all>
<match on="@__yields@" />
<match on="ui_q_{@on@}">
<dyn-node name="c:{@cmp@}">
<c:value-of name="@value@" />
</dyn-node>
</match>
</all>
</template>
</inline-template>
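As an illustration only (the question id coverage_question is hypothetical), a generated template such as _match-ui-eq_ is intended to be used as a predicate within a classification; per the definition above, it expands to an <all> containing the _match-ui-applicable_ check and a match on ui_q_coverage_question:
<classify as="coverage-selected"
          desc="Coverage question answered affirmatively">
  <t:match-ui-eq on="coverage_question" value="TRUE" />
</classify>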
</section>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -56,7 +56,7 @@
<text></text>
</param>
<rate accumulate="none" yields="@yields@">
<rate yields="@yields@">
<c:sum of="@a@" index="k" generates="@into@" desc="@gendesc@" sym="@sym@">
<c:value-of name="@a@" index="k" />
<c:value-of name="@b@" index="k" />

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -52,9 +52,6 @@
<template name="_cmatch-to-vector_" desc="Vectorizes a classification match">
<param name="@class@" desc="Classification match string" />
<param name="@generates@" desc="Variable to yield generates (will yield a vector)" />
<param name="@keep@" desc="Rate block @keep">
<text></text>
</param>
<param name="@yields@" desc="Dummy variable to yield generates (useless, but required)">
<text>__</text>
@ -74,7 +71,7 @@
<!-- this conversion is as simple as using a generator to yield the value
of _CMATCH_ for each index -->
<rate class="@class@" accumulate="none" yields="@yields@" always="true" keep="@keep@">
<rate class="@class@" yields="@yields@">
<c:sum of="_CMATCH_" index="k" generates="@generates@" desc="@gendesc@" sym="@sym@">
<c:value-of name="_CMATCH_" index="k" />
</c:sum>
@ -90,12 +87,8 @@
<text></text>
</param>
<param name="@keep@" desc="Rate block @keep">
<text></text>
</param>
<rate class="@class@" accumulate="none" yields="@yields@" sym="@sym@" keep="@keep@">
<rate class="@class@" yields="@yields@" sym="@sym@">
<!-- if any single one matches, then we want to yield a 1 -->
<c:apply name="maxreduce" maxreduce_set="_CMATCH_" />
</rate>
@ -124,10 +117,6 @@
<param-value snake="true" name="@as@" />
</param>
<param name="@keep@" desc="Whether to force compilation">
<text></text>
</param>
<param name="@sym@" desc="Optional yield symbol">
<text></text>
</param>
@ -141,18 +130,56 @@
<t:cmatch-to-scalar class="--{@as@}-pre"
yields="__{@yields@}Scalar"
sym="@sym@"
keep="@keep@" />
sym="@sym@" />
<classify as="@as@" yields="@yields@"
desc="@desc@"
keep="@keep@"
sym="@sym@">
<match on="__{@yields@}Scalar" />
</classify>
</template>
The vector analog to \ref{_classify-scalar_} is
\ref{_classify-vector_}. The results are undefined if the classification
would have otherwise yielded a scalar value (that is, do not use this to
force a scalar into a vector). It may work in an intuitive way, but it's
not designed to.
<template name="_classify-vector_"
desc="Classification with a forced-vector result">
<param name="@values@" desc="Predicates" />
<param name="@as@" desc="Classification name" />
<param name="@desc@" desc="Classification description" />
<param name="@yields@" desc="Scalar result name">
<text>__</text>
<param-value snake="true" name="@as@" />
</param>
<param name="@sym@" desc="Optional yield symbol">
<text></text>
</param>
<classify as="--{@as@}-pre"
yields="__{@yields@}Pre"
desc="{@desc@}, pre-vector">
<param-copy name="@values@" />
</classify>
<t:cmatch-to-vector class="--{@as@}-pre"
generates="__{@yields@}Vector"
sym="@sym@" />
<classify as="@as@" yields="@yields@"
desc="@desc@"
sym="@sym@">
<match on="__{@yields@}Vector" />
</classify>
</template>
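A sketch of a possible application (both names here are hypothetical); the predicates form the template body, and the classification yields a vector named via @yields@ or derived from @as@:
<t:classify-vector as="eligible-location"
                   desc="Eligible locations">
  <t:match-class name="location-covered" />
</t:classify-vector>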
<!--
Counts one for each classification vector match
@ -188,25 +215,11 @@
<template name="_match-{@cmp@}_" desc="Match value {@cmp@}">
<param name="@on@" desc="Value to assert" />
<!-- pick one -->
<param name="@const@" desc="Match against constant value" />
<param name="@value@" desc="Match against variable" />
<if name="@const@">
<warning>
@const@ is deprecated; use @value@ with a #-prefix instead.
</warning>
</if>
<match on="@on@">
<dyn-node name="c:{@cmp@}">
<if name="@const@">
<c:const value="@const@" type="float" desc="Comparison" />
</if>
<unless name="@const@">
<c:value-of name="@value@" />
</unless>
<c:value-of name="@value@" />
</dyn-node>
</match>
</template>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -136,7 +136,7 @@
</c:gte>
</c:when>
<c:const value="-1" type="integer" desc="Not found" />
<c:const value="-1" desc="Not found" />
</c:case>
@ -199,7 +199,7 @@
<!-- generates a variable that can be recognized as an empty set (useful for
defaults to params that require sets) -->
<rate-each class="always" yields="__empty" generates="__emptySet" index="k">
<c:const value="0" type="integer" desc="Nothing" />
<c:const value="0" desc="Nothing" />
</rate-each>
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Convert vectors into other types">

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -19,6 +19,7 @@
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Vector element counting">
@ -30,7 +31,7 @@
<param name="count_set" type="integer" set="vector" desc="Vector to count" />
<c:sum of="count_set" index="k">
<c:const value="1" type="integer" desc="Add 1 for each value in the set" />
<c:const value="1" desc="Add 1 for each value in the set" />
</c:sum>
</function>
@ -51,7 +52,7 @@
</c:apply>
<!-- ensure the equation is not undefined if length = 0 -->
<c:const value="1" type="integer" desc="Add 1 to ensure equation is always defined" />
<c:const value="1" desc="Add 1 to ensure equation is always defined" />
</c:sum>
</c:quotient>
</c:ceil>

View File

@ -0,0 +1,66 @@
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
tame-core is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
<package xmlns="http://www.lovullo.com/rater"
xmlns:c="http://www.lovullo.com/calc"
xmlns:t="http://www.lovullo.com/rater/apply-template"
core="true"
desc="Vector definition">
This package provides a type of global~\tt{let}.
The term ``define'' is consistent with the influence of the language
Scheme on this system.
Once a primitive is provided,
this abstraction can be modified to make use of it.
This addresses the challenge with defining arbitrary values.
TAME was designed for a map-reduce style of computation that
maintains index associations.
This makes it easy to reason about data flow.
Consequently,
arbitrary definitions should be used sparingly until they become a core
feature of TAME.
<template name="_define-vector_"
desc="Define a vector with an arbitrary value">
<param name="@values@" desc="Definition" />
<param name="@generates@" desc="Vector name" />
<param name="@desc@" desc="Vector description" />
<rate yields="_{@generates@}">
<c:let>
<c:values>
<c:value name="value" type="float" set="vector"
desc="{@desc@} (intermediate value)">
<param-copy name="@values@" />
</c:value>
</c:values>
<!-- effectively means @generates@ = value -->
<c:sum of="value" generates="@generates@" index="k"
desc="@desc@">
<c:value-of name="value" index="k" />
</c:sum>
</c:let>
</rate>
</template>
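The spec added earlier in this changeset exercises the template like so; it is reproduced here as a usage sketch (in that spec the resulting testVector sums to 3):
<t:define-vector generates="testVector"
                 desc="Test vector">
  <c:vector>
    <c:value-of name="#1" />
    <c:value-of name="#2" />
  </c:vector>
</t:define-vector>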
</package>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -24,9 +24,23 @@
desc="Filtering Vectors and Matrices">
<import package="../base" />
<import package="../when" />
<import package="list" />
<typedef name="CmpOp"
desc="Comparison operator">
<enum type="integer">
<!-- DO NOT REORDER; see mrange 'over' check -->
<item name="CMP_OP_EQ" value="1" desc="Equal (=)" />
<item name="CMP_OP_LT" value="2" desc="Less than (&lt;)" />
<item name="CMP_OP_LTE" value="3" desc="Less than or equal to (&lt;=)" />
<item name="CMP_OP_GT" value="4" desc="Greater than (&gt;)" />
<item name="CMP_OP_GTE" value="5" desc="Greater than or equal to (&gt;=)" />
</enum>
</typedef>
<section title="Vector Filtering">
<function name="vfilter_lookup"
desc="Filter predicate by value and use corresponding index in
@ -40,15 +54,126 @@
<c:value-of name="vector_src" index="start_index" />
</t:cons-until-empty>
</function>
\ref{_vfilter-mask_} allows filtering a vector using a boolean vector as
a mask.
If an index in the mask is~$0$,
then that corresponding index in the source vector will be removed.
The mask vector \should be the same length as the source vector.\footnote{
Remember that TAME treats undefined values as~$0$.}
<template name="_vfilter-mask_"
desc="Filter vector using a binary vector as a mask">
<param name="@values@" desc="Inline vector" />
<param name="@name@" desc="Named vector (in place of inline)" />
<param name="@mask@" desc="Mask vector" />
<c:apply name="_vfilter_mask" mask="@mask@">
<c:arg name="vector">
<if name="@name@">
<c:value-of name="@name@" />
</if>
<unless name="@name@">
<c:vector>
<param-copy name="@values@" />
</c:vector>
</unless>
</c:arg>
</c:apply>
</template>
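Both invocation forms appear in the new specs above and are repeated here as usage sketches (NVEC4_SEQ and VFILTER_MASK_MIDDLE are the constants those specs use): a named source vector via @name@, or inline values as the template body:
<t:vfilter-mask name="NVEC4_SEQ" mask="VFILTER_MASK_MIDDLE" />
<t:vfilter-mask mask="VFILTER_MASK_MIDDLE">
  <c:value-of name="#10" />
  <c:value-of name="#12" />
  <c:value-of name="#14" />
  <c:value-of name="#16" />
</t:vfilter-mask>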
<function name="_vfilter_mask"
desc="Filter source vector using binary vector as a mask">
<param name="vector" type="float" set="vector"
desc="Source vector" />
<param name="mask" type="integer" set="vector"
desc="Binary vector used to filter source vector" />
<c:let>
<c:values>
<c:value name="length" type="integer"
desc="Length of source vector">
<c:length-of>
<c:value-of name="vector" />
</c:length-of>
</c:value>
<!-- TODO: support constants in @index -->
<c:value name="curmask" type="integer"
desc="Current mask value">
<c:value-of name="mask">
<c:index>
<c:value-of name="#0" />
</c:index>
</c:value-of>
</c:value>
</c:values>
<c:cases>
<c:case label="No more elements in source vector">
<t:when-eq name="length" value="#0" />
<c:vector />
</c:case>
<c:case label="Skip non-match">
<t:when-eq name="curmask" value="FALSE" />
<c:recurse>
<c:arg name="vector">
<c:cdr>
<c:value-of name="vector" />
</c:cdr>
</c:arg>
<c:arg name="mask">
<c:cdr>
<c:value-of name="mask" />
</c:cdr>
</c:arg>
</c:recurse>
</c:case>
<c:otherwise>
<c:cons>
<c:value-of name="vector">
<c:index>
<c:value-of name="#0" />
</c:index>
</c:value-of>
<c:recurse>
<c:arg name="vector">
<c:cdr>
<c:value-of name="vector" />
</c:cdr>
</c:arg>
<c:arg name="mask">
<c:cdr>
<c:value-of name="mask" />
</c:cdr>
</c:arg>
</c:recurse>
</c:cons>
</c:otherwise>
</c:cases>
</c:let>
</function>
</section>
<section title="Matrix Filtering">
\ref{mfilter} handles complex filtering of matrices.
If the requested column~\tt{@col@} is marked as sequential with~\tt{@seq@},
a~$O(lg n)$ bisect algorithm will be used;
otherwise,
it will undergo a~$O(n)$ linear scan.
If the requested column~\tt{@col@} is marked as sequential with~\tt{@seq@}
\emph{and} the comparison operator is~\ref{CMP_OP_EQ},
then an~$O(lg n)$ bisect algorithm will be used;
otherwise,
it will undergo a~$O(n)$ linear scan.
<function name="mfilter"
desc="Filter matrix rows by column value">
@ -56,6 +181,7 @@
<param name="col" type="integer" desc="Column index to filter on" />
<param name="vals" type="float" desc="Column value to filter on" />
<param name="seq" type="boolean" desc="Is data sequential?" />
<param name="op" type="integer" desc="Comparison operator" />
<!-- merge the result of each condition in vals into a single set, which
has the effect of supporting multiple conditions on a single column of
@ -63,18 +189,16 @@
the lookups separately for each, we preserve the bisect-ability of the
condition. -->
<t:merge-until-empty set="vals" car="val" glance="TABLE_WHEN_MASK_VALUE">
<c:apply name="mrange" matrix="matrix" col="col" val="val" seq="seq">
<c:apply name="mrange" matrix="matrix" col="col" val="val" seq="seq" op="op">
<c:arg name="start">
<c:cases>
<!-- if we know that the data is sequential, then we may not need to
perform a linear search (if the dataset is large enough and the
column value is relatively distinct) -->
<!-- TODO: bisect is currently only performed for CMP_OP_EQ -->
<c:case>
<c:when name="seq">
<c:eq>
<c:value-of name="TRUE" />
</c:eq>
</c:when>
<t:when-eq name="op" value="CMP_OP_EQ" />
<t:when-eq name="seq" value="TRUE" />
<c:apply name="bisect" matrix="matrix" col="col" val="val">
<c:arg name="start">
@ -120,6 +244,105 @@
<param name="start" type="integer" desc="Starting index (inclusive)" />
<param name="end" type="integer" desc="Ending index (inclusive)" />
<param name="seq" type="boolean" desc="Is data sequential?" />
<param name="op" type="integer" desc="Comparison operator" />
<c:let>
<c:values>
<c:value name="matches" type="integer" set="vector"
desc="Matching row indexes of matrix">
<c:apply name="mrange_accum"
matrix="matrix" col="col" val="val"
start="start" end="end" seq="seq" op="op">
<!-- Matching indexes will be accumulated into a vector (in
reverse) to permit TCO -->
<c:arg name="accum">
<c:vector />
</c:arg>
</c:apply>
</c:value>
</c:values>
<c:apply name="_mextract_rows" matrix="matrix" indexes="matches" i="#0">
<!-- Pre-compute so _mextract_rows doesn't have to -->
<c:arg name="length">
<c:length-of>
<c:value-of name="matches" />
</c:length-of>
</c:arg>
<!-- The final matrix will be accumulated to permit TCO (note that
this reverses the original reversal mentioned above, so the
final matrix is in the right order) -->
<c:arg name="accum">
<c:vector />
</c:arg>
</c:apply>
</c:let>
</function>
<function name="_mextract_rows"
desc="Pull rows from a matrix by index">
<param name="matrix" type="float" set="matrix" desc="Source matrix" />
<param name="indexes" type="integer" set="vector" desc="Indexes to extract" />
<param name="length" type="integer" desc="Length of indexes vector" />
<param name="i" type="integer" desc="Current index offset" />
<param name="accum" type="float" set="matrix" desc="Accumulator (matrix)" />
<param name="__experimental_guided_tco" type="float" desc="Experimental guided TCO" />
<c:cases>
<!-- When we're done, yield the accumulated value, representing our
final matrix -->
<c:case>
<t:when-eq name="i" value="length" />
<c:value-of name="accum" />
</c:case>
<c:otherwise>
<c:recurse __experimental_guided_tco="TRUE">
<c:arg name="i">
<c:sum>
<c:value-of name="i" />
<c:const value="1" desc="Proceed to next index" />
</c:sum>
</c:arg>
<c:arg name="accum">
<c:cons>
<!-- Add the row identified by the current index to the
accumulator. Note that this uses cons, so it adds it
to the head, but since the original mrange results are
reversed, this is precisely what we want—to reverse the
reversal -->
<c:value-of name="matrix">
<c:index>
<c:value-of name="indexes" index="i" />
</c:index>
</c:value-of>
<c:value-of name="accum" />
</c:cons>
</c:arg>
</c:recurse>
</c:otherwise>
</c:cases>
</function>
<function name="mrange_accum"
desc="Filter matrix rows by column value within a certain
range of indexes (inclusive)">
<param name="matrix" type="float" set="matrix" desc="Matrix to filter" />
<param name="col" type="integer" desc="Column index to filter on" />
<param name="val" type="float" desc="Column value to filter on" />
<param name="start" type="integer" desc="Starting index (inclusive)" />
<param name="end" type="integer" desc="Ending index (inclusive)" />
<param name="seq" type="boolean" desc="Is data sequential?" />
<param name="op" type="integer" desc="Comparison operator" />
<param name="accum" type="integer" set="vector" desc="Accumulator (row indexes)" />
<param name="__experimental_guided_tco" type="float" desc="Experimental guided TCO" />
<c:let>
<c:values>
@ -145,17 +368,8 @@
<c:value name="over" type="boolean"
desc="Did we pass the potential value in a sorted list?">
<c:value-of name="TRUE">
<c:when name="seq">
<c:eq>
<c:value-of name="TRUE" />
</c:eq>
</c:when>
<c:when name="curval">
<c:gt>
<c:value-of name="val" />
</c:gt>
</c:when>
<t:when-eq name="seq" value="TRUE" />
<t:when-gt name="curval" value="val" />
</c:value-of>
</c:value>
</c:values>
@ -163,47 +377,124 @@
<c:cases>
<!-- if we're done filtering, then return an empty set -->
<c:case>
<c:when name="start">
<c:gt>
<c:value-of name="end" />
</c:gt>
</c:when>
<t:when-gt name="start" value="end" />
<!-- empty set -->
<c:set />
<c:value-of name="accum" />
</c:case>
<!-- if the data is sequential and the next element is over the
requested value, then we're done -->
requested value, then we're done (can only be used for
equality and LT{,E}; need a GT{,E} version) -->
<c:case>
<c:when name="over">
<c:eq>
<c:value-of name="TRUE" />
</c:eq>
</c:when>
<t:when-lte name="op" value="CMP_OP_LTE" />
<t:when-eq name="over" value="TRUE" />
<!-- empty set -->
<c:set />
<c:value-of name="accum" />
</c:case>
<c:otherwise>
<c:apply name="_mfilter" matrix="matrix" col="col" val="val"
start="start" end="end" seq="seq">
<c:arg name="cur">
<c:value-of name="matrix">
<!-- current row -->
<c:index>
<c:value-of name="start" />
</c:index>
<c:let>
<c:values>
<c:value name="cur" type="float"
desc="Current value">
<c:value-of name="matrix">
<!-- current row -->
<c:index>
<c:value-of name="start" />
</c:index>
<!-- requested column -->
<c:index>
<c:value-of name="col" />
</c:index>
</c:value-of>
</c:arg>
</c:apply>
<!-- requested column -->
<c:index>
<c:value-of name="col" />
</c:index>
</c:value-of>
</c:value>
</c:values>
<c:let>
<c:values>
<c:value name="found" type="boolean"
desc="Whether comparison matches">
<c:cases>
<c:case label="Equal">
<t:when-eq name="op" value="CMP_OP_EQ" />
<c:value-of name="TRUE">
<t:when-eq name="cur" value="val" />
</c:value-of>
</c:case>
<c:case label="Less than">
<t:when-eq name="op" value="CMP_OP_LT" />
<c:value-of name="TRUE">
<t:when-lt name="cur" value="val" />
</c:value-of>
</c:case>
<c:case label="Less than or equal to">
<t:when-eq name="op" value="CMP_OP_LTE" />
<c:value-of name="TRUE">
<t:when-lte name="cur" value="val" />
</c:value-of>
</c:case>
<c:case label="Greater than">
<t:when-eq name="op" value="CMP_OP_GT" />
<c:value-of name="TRUE">
<t:when-gt name="cur" value="val" />
</c:value-of>
</c:case>
<c:case label="Greater than or equal to">
<t:when-eq name="op" value="CMP_OP_GTE" />
<c:value-of name="TRUE">
<t:when-gte name="cur" value="val" />
</c:value-of>
</c:case>
</c:cases>
</c:value>
</c:values>
<!-- continue recursion using TCO so that we do not
exhaust the stack (this is an undocumented,
experimental feature that requires explicitly stating
that a recursive call is in tail position) -->
<c:recurse __experimental_guided_tco="TRUE">
<c:arg name="accum">
<c:cases>
<!-- If match, add the current row index to the
accumulator (cons, so note that it is added
in reverse) -->
<c:case>
<t:when-eq name="found" value="TRUE" />
<c:cons>
<c:value-of name="start" />
<c:value-of name="accum" />
</c:cons>
</c:case>
<!-- If no match, no change to accumulator -->
<c:otherwise>
<c:value-of name="accum" />
</c:otherwise>
</c:cases>
</c:arg>
<c:arg name="start">
<c:sum>
<c:value-of name="start" />
<c:const value="1" desc="Check next element" />
</c:sum>
</c:arg>
</c:recurse>
</c:let>
</c:let>
</c:otherwise>
</c:cases>
</c:let>
@ -211,58 +502,6 @@
</function>
<function name="_mfilter" desc="mfilter helper">
<param name="matrix" type="float" set="matrix" desc="Matrix to filter" />
<param name="col" type="integer" desc="Column index to filter on" />
<param name="val" type="float" desc="Column value to filter on" />
<param name="start" type="integer" desc="Starting index (aka current index)" />
<param name="end" type="integer" desc="Ending index" />
<param name="seq" type="integer" desc="Is data sequential?" />
<param name="cur" type="float" desc="Current value" />
<c:cases>
<c:case>
<c:when name="cur">
<c:eq>
<c:value-of name="val" />
</c:eq>
</c:when>
<c:cons>
<c:value-of name="matrix">
<c:index>
<c:value-of name="start" />
</c:index>
</c:value-of>
<c:apply name="mrange" matrix="matrix" col="col" val="val"
end="end" seq="seq">
<c:arg name="start">
<c:sum>
<c:value-of name="start" />
<c:const value="1" desc="Check next element" />
</c:sum>
</c:arg>
</c:apply>
</c:cons>
</c:case>
<c:otherwise>
<c:apply name="mrange" matrix="matrix" col="col" val="val"
end="end" seq="seq">
<c:arg name="start">
<c:sum>
<c:value-of name="start" />
<c:const value="1" desc="Check next element" />
</c:sum>
</c:arg>
</c:apply>
</c:otherwise>
</c:cases>
</function>
<section title="Bisecting">
Perform an~$O(lg n)$ bisect on a data set.
@ -329,16 +568,13 @@
<c:cases>
<!-- give up if we've reached our gap limit -->
<c:case>
<c:when name="gap">
<c:lte>
<c:value-of name="MFILTER_BISECT_GAP_MAX" />
</c:lte>
</c:when>
<t:when-lte name="gap" value="MFILTER_BISECT_GAP_MAX" />
<!-- we tried our best; return our current position -->
<c:value-of name="start" />
</c:case>
<!-- we have not yet reached our gap limit; keep going -->
<c:otherwise>
<c:let>
@ -378,22 +614,14 @@
<c:cases>
<!-- if the middle value is lower than our value, then take the upper half -->
<c:case>
<c:when name="mid">
<c:lt>
<c:value-of name="val" />
</c:lt>
</c:when>
<t:when-lt name="mid" value="val" />
<c:recurse start="mid_index" />
</c:case>
<!-- similarly, take the lower half if we overshot -->
<c:case>
<c:when name="mid">
<c:gt>
<c:value-of name="val" />
</c:gt>
</c:when>
<t:when-gt name="mid" value="val" />
<c:recurse end="mid_index" />
</c:case>
@ -460,24 +688,14 @@
<c:cases>
<!-- if we have no more indexes to check, then we're done -->
<c:case>
<c:when name="i">
<c:eq>
<c:const value="0"
desc="Did we check the final (first) index?" />
</c:eq>
</c:when>
<t:when-eq name="i" value="#0" />
<!-- well, then, we're done -->
<c:value-of name="i" />
</c:case>
<!-- if the previous column value is the same value, then continue checking -->
<c:case>
<c:when name="prev">
<c:eq>
<c:value-of name="val" />
</c:eq>
</c:when>
<t:when-eq name="prev" value="val" />
<c:recurse>
<c:arg name="i">
@ -507,23 +725,7 @@
<c:cases>
<!-- if masked -->
<c:case>
<!-- no index provided -->
<unless name="@index@">
<c:when name="@name@">
<c:eq>
<c:value-of name="FALSE" />
</c:eq>
</c:when>
</unless>
<!-- index provided -->
<if name="@index@">
<c:when name="@name@" index="@index@">
<c:eq>
<c:value-of name="FALSE" />
</c:eq>
</c:when>
</if>
<t:when-eq name="@name@" index="@index@" value="FALSE" />
<!-- TODO: configurable mask via meta and/or param -->
<c:value-of name="TABLE_WHEN_MASK_VALUE" />

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -121,7 +121,7 @@
</c:case>
<c:otherwise label="Ignore on class vector non-match">
<c:set />
<c:vector />
</c:otherwise>
</c:cases>
</c:sum>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -57,7 +57,7 @@
<c:value name="b" type="float" desc="Second set value">
<c:value-of name="orig_set">
<c:index>
<c:const value="1" type="integer" desc="Second index" />
<c:const value="1" desc="Second index" />
</c:index>
</c:value-of>
</c:value>
@ -72,10 +72,10 @@
</c:gt>
</c:when>
<c:set label="Re-ordered set such that the lower value is first in the vector">
<c:vector label="Re-ordered set such that the lower value is first in the vector">
<c:value-of name="b" />
<c:value-of name="a" />
</c:set>
</c:vector>
</c:case>
<!-- already ordered -->
@ -110,14 +110,14 @@
<c:case>
<c:when name="step">
<c:eq>
<c:const value="0" type="integer" desc="No step indicates identical values" />
<c:const value="0" desc="No step indicates identical values" />
</c:eq>
</c:when>
<!-- just return the first value; it's exact and no interpolation is necessary -->
<c:value-of name="set">
<c:index>
<c:const value="0" type="integer" desc="First index" />
<c:const value="0" desc="First index" />
</c:index>
</c:value-of>
</c:case>
@ -202,10 +202,10 @@
<t:query-field table="@table@"
field="@field@">
<!-- query for upper and lower values for interpolation -->
<t:when field="@key@">
<t:where-eq field="@key@">
<c:value-of name="low" />
<c:value-of name="high" />
</t:when>
</t:where-eq>
<param-copy name="@values@" />
</t:query-field>

View File

@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2017 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -45,9 +45,9 @@
<!-- avoid having to copy @values@ multiple times -->
<c:value name="_list" type="float"
desc="Result of body expression">
<c:set>
<c:vector>
<param-copy name="@values@" />
</c:set>
</c:vector>
</c:value>
</c:values>
@ -106,7 +106,7 @@
</c:gte>
</c:when>
<c:set />
<c:vector />
</c:case>
<!-- non-empty vector, return it -->

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015, 2017 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -73,7 +73,7 @@
<c:case>
<c:when name="__valn">
<c:eq>
<c:const value="0" type="integer" desc="When there are no more elements in the set" />
<c:const value="0" desc="When there are no more elements in the set" />
</c:eq>
</c:when>
@ -83,7 +83,7 @@
</if>
<unless name="@base@">
<!-- return an empty set -->
<c:set />
<c:vector />
</unless>
</c:case>

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -80,7 +80,7 @@
<param-value name="@line@" />
</param>
<rate-each class="@line@" accumulate="none" yields="@yields@" generates="@into@" index="k">
<rate-each class="@line@" yields="@yields@" generates="@into@" index="k">
<!-- take the dot product of the two vectors (each part of a larger matrix)
to get the rate for the associated class code -->
<c:product dot="true" label="Dot product between the class and rate vectors for each location will yield the respective rate per location">

View File

@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@ -24,20 +24,110 @@
title="Maximum and Minimum Elements">
<import package="../base" />
<import package="../when" />
<import package="../numeric/common" />
<import package="../numeric/minmax" />
<section title="Vector Reduction">
Core currently only offers a~maximum reduction on
a~vector. \todo{Add a~minimum reduction.} \ref{_maxreduce_}
provides a convenient template-based abstraction.
Two types of reductions are provided for minimum and maximum respectively:
\ref{_minreduce_} and \ref{_maxreduce_}.\footnote{
This is because TAME does not have first-class functions.}
They both produce scalar values of the minimums and maximums
(respectively) of vectors.
<template name="_minreduce_"
desc="Reduce a vector to its minimum">
<param name="@values@" desc="Values to reduce" />
<param name="@isvector@" desc="Set to 'true' if the nodes should
not be wrapped in c:vector" />
<param name="@label@" desc="Application label">
<!-- default empty -->
<text></text>
</param>
<c:apply name="_minreduce" label="@label@">
<c:arg name="vector">
<if name="@isvector@" eq="true">
<param-copy name="@values@" />
</if>
<unless name="@isvector@" eq="true">
<c:vector>
<param-copy name="@values@" />
</c:vector>
</unless>
</c:arg>
</c:apply>
</template>
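As exercised by the _minreduce_ specs above, a usage sketch; the inline form wraps its body in c:vector (this one reduces to 1), while isvector="true" passes an existing vector through unwrapped:
<t:minreduce>
  <c:value-of name="#7" />
  <c:value-of name="#3" />
  <c:value-of name="#1" />
  <c:value-of name="#6" />
</t:minreduce>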
<function name="_minreduce"
desc="Minimum value in a vector">
<param name="vector" type="float" set="vector"
desc="Vector to find minimum of" />
<c:let>
<c:values>
<c:value name="length" type="integer"
desc="Length of vector">
<c:length-of>
<c:value-of name="vector" />
</c:length-of>
</c:value>
</c:values>
<c:cases>
<c:case label="Empty vector">
<t:when-eq name="length" value="#0" />
<c:value-of name="#0" />
</c:case>
<c:case label="Single-element vector">
<t:when-eq name="length" value="#1" />
<c:value-of name="vector">
<c:index>
<c:value-of name="#0" />
</c:index>
</c:value-of>
</c:case>
<c:otherwise label="Non-empty vector">
<c:apply name="min">
<c:arg name="min1">
<c:value-of name="vector">
<c:index>
<c:value-of name="#0" />
</c:index>
</c:value-of>
</c:arg>
<c:arg name="min2">
<c:recurse>
<c:arg name="vector">
<c:cdr>
<c:value-of name="vector" />
</c:cdr>
</c:arg>
</c:recurse>
</c:arg>
</c:apply>
</c:otherwise>
</c:cases>
</c:let>
</function>
<template name="_maxreduce_"
desc="Reduce a set to its maximum">
<param name="@values@" desc="Values to reduce" />
<param name="@isvector@" desc="Set to 'true' if the nodes should
not be wrapped in c:set" />
not be wrapped in c:vector" />
<param name="@label@" desc="Application label">
<!-- default empty -->
@ -50,9 +140,9 @@
<!-- if we were not provided with a vector (default), create
one out of the given nodes -->
<unless name="@isvector@" eq="true">
<c:set>
<c:vector>
<param-copy name="@values@" />
</c:set>
</c:vector>
</unless>
<!-- if they told us that they have provided a vector, then
@ -72,6 +162,8 @@
let~expressions and other convenience templates. It has since
been refactored slightly, but can be made to be more concise.}
<!-- TODO: rewrite this to be more concise, with the more lisp-like
recursive strategy of minreduce -->
<function name="maxreduce" desc="Reduce a set to its maximum">
<param name="maxreduce_set" type="float" set="vector"
desc="Set to find max of" />
@@ -97,7 +189,7 @@
</c:eq>
</c:when>
<c:const value="0" type="integer" desc="No value" />
<c:const value="0" desc="No value" />
</c:case>
<!-- we have values; perform reduction -->
@@ -195,7 +287,7 @@
<c:arg name="_maxreduce_i">
<c:sum>
<c:value-of name="_maxreduce_i" />
<c:const value="-1" type="integer" desc="Decrement index by 1" />
<c:const value="-1" desc="Decrement index by 1" />
</c:sum>
</c:arg>
</c:apply>
@@ -223,12 +315,12 @@
<param-value name="@generates@" />
</param>
<rate-each class="@class@" accumulate="none" yields="@yields@" generates="@generates@" index="@index@">
<rate-each class="@class@" yields="@yields@" generates="@generates@" index="@index@">
<c:apply name="maxreduce">
<c:arg name="maxreduce_set">
<c:set>
<c:vector>
<param-copy name="@values@" />
</c:set>
</c:vector>
</c:arg>
</c:apply>
</rate-each>

View File

@@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

View File

@@ -1,6 +1,6 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright (C) 2015, 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.
@@ -28,6 +28,7 @@
<!-- since templates are inlined, we need to make these symbols available to
avoid terrible confusion -->
<import package="../numeric/common" export="true"/>
<import package="../when" export="true"/>
<import package="common" export="true" />
<import package="filter" export="true" />
<import package="matrix" export="true" />
@@ -314,37 +315,78 @@
<text>_RATE_TABLE</text>
</param>
<c:apply name="_mquery">
<c:arg name="matrix">
<c:value-of name="@matrix@" />
</c:arg>
<c:arg name="criteria">
<c:set>
<param-copy name="@values@">
<param-meta name="table_basename" value="@matrix@" />
</param-copy>
</c:set>
</c:arg>
<c:let>
<c:values>
<c:value name="_qparams" type="integer" set="matrix"
desc="Query parameters">
<c:vector>
<param-copy name="@values@">
<param-meta name="table_basename" value="@matrix@" />
</param-copy>
</c:vector>
</c:value>
</c:values>
<c:arg name="i">
<!-- begin with the last predicate (due to the way we'll recurse, it
will be applied *last*) -->
<t:dec>
<c:length-of>
<c:set>
<param-copy name="@values@">
<param-meta name="table_basename" value="@matrix@" />
</param-copy>
</c:set>
</c:length-of>
</t:dec>
</c:arg>
</c:apply>
<c:apply name="_mquery" matrix="@matrix@">
<c:arg name="criteria">
<c:value-of name="_qparams" />
</c:arg>
<c:arg name="i">
<!-- begin with the last predicate (due to the way we'll recurse, it
will be applied *last*) -->
<t:dec>
<c:length-of>
<c:value-of name="_qparams" />
</c:length-of>
</t:dec>
</c:arg>
</c:apply>
</c:let>
</template>
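The rewrite above replaces the duplicated criteria construction (one copy passed as the `criteria` argument and a second inside `c:length-of` to compute the starting index) with a single `c:let`-bound `_qparams` vector that is referenced by name in both places. As a minimal, self-contained sketch of that binding pattern (the `example_vec` name and values are illustrative, not from this diff):

  <c:let>
    <c:values>
      <c:value name="example_vec" type="integer" set="vector"
               desc="Example of a let-bound vector">
        <c:vector>
          <c:const value="1" desc="First element" />
          <c:const value="2" desc="Second element" />
        </c:vector>
      </c:value>
    </c:values>
    <!-- the binding can now be referenced any number of times by name -->
    <c:length-of>
      <c:value-of name="example_vec" />
    </c:length-of>
  </c:let>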
<template name="_when_" desc="Create field predicate for query definition">
There are a series of \tt{_where-*_} templates for query predicates that
are analogous to the \tt{_match-*_} and \tt{_when-*_} templates used in
other contexts.
<inline-template>
<for-each>
<set tplname="_where-eq_" op="CMP_OP_EQ" desc="equal" />
<set tplname="_where-lt_" op="CMP_OP_LT" desc="less than" />
<set tplname="_where-lte_" op="CMP_OP_LTE" desc="less than or equal to" />
<set tplname="_where-gt_" op="CMP_OP_GT" desc="greater than" />
<set tplname="_where-gte_" op="CMP_OP_GTE" desc="greater than or equal to" />
</for-each>
<template name="@tplname@" desc="Field predicate for table query ({@desc@})">
<param name="@values@" desc="Field value (provide only one node)" />
<param name="@id@" desc="Field index" />
<param name="@field@" desc="Field name (to be used with base)" />
<param name="@name@" desc="Field name (as a variable/constant)">
<text></text>
</param>
<param name="@seqvar@" desc="Var/constant containing whether field is sequential">
<text></text>
</param>
<t:where id="@id@" seqvar="@seqvar@"
field="@field@" name="@name@" op="@op@">
<expand-barrier>
<param-copy name="@values@" />
</expand-barrier>
</t:where>
</template>
</inline-template>
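A hypothetical application of one of the generated predicates, say `_where-eq_`, might then look like the sketch below inside a query definition (by the same `t:` application convention noted above; the `state` field and `state_code_tx` constant are illustrative and not part of this diff):

  <!-- illustrative sketch only; not part of this diff -->
  <t:where-eq field="state">
    <c:value-of name="state_code_tx" />
  </t:where-eq>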
<template name="_where_" desc="Create field predicate for query definition">
<param name="@id@" desc="Field index" />
<param name="@values@" desc="Field value (provide only one node)" />
<param name="@sequential@" desc="Is data sequential?" />
@@ -367,23 +409,29 @@
<text>_IS_SEQ</text>
</param>
<c:set label="Conditional for {@field@}">
<param name="@op@"
desc="Comparison operator (default CMP_OP_EQ; see CmpOp typedef)">
<text>CMP_OP_EQ</text>
</param>
<c:vector label="Conditional for {@field@}">
<!-- the first element will represent the column (field) index -->
<unless name="@name@">
<c:const value="@id@" type="integer" desc="Field index" />
<c:const value="@id@" desc="Field index" />
</unless>
<if name="@name@">
<c:value-of name="@name@" />
</if>
<!-- the second element will represent the expected value(s) -->
<c:set>
<c:vector>
<param-copy name="@values@" />
</c:set>
</c:vector>
<!-- the final element will represent whether or not this field is sequential -->
<!-- the third element will represent whether or not this field is sequential -->
<if name="@sequential@">
<c:const value="@sequential@" type="boolean" desc="Whether data is sequential" />
<c:const value="@sequential@" desc="Whether data is sequential" />
</if>
<unless name="@sequential@">
<!-- if a field name was given, we can get the sequential information
@@ -396,7 +444,50 @@
<c:value-of name="FALSE" />
</unless>
</unless>
</c:set>
<!-- the fourth and final element is the comparison operator -->
<c:value-of name="@op@" />
</c:vector>
</template>
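Given the template above, a single `t:where` application now expands into a four-element criteria vector: field index, expected value(s), sequential flag, and comparison operator. As an illustrative sketch of that expansion (the field and constant names are hypothetical), a non-sequential equality predicate on field index 0 would produce roughly:

  <c:vector label="Conditional for state">
    <c:const value="0" desc="Field index" />
    <c:vector>
      <c:value-of name="state_code_tx" />
    </c:vector>
    <c:value-of name="FALSE" />      <!-- sequential flag -->
    <c:value-of name="CMP_OP_EQ" />  <!-- comparison operator -->
  </c:vector>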
<!--
_when_ is deprecated in favor of _where-eq_.
This old template aimed to be consistent with the use of `when'
elsewhere (for cases and value predicates), but it was awkward in a
query abstraction.
-->
<template name="_when_"
desc="Create field predicate for query definition (deprecated;
use _where-*_)">
<param name="@values@" desc="Field value (provide only one node)" />
<param name="@id@" desc="Field index" />
<param name="@sequential@" desc="Is data sequential?" />
<param name="@field@" desc="Field name (to be used with base)" />
<param name="@name@" desc="Field name (as a variable/constant)">
<text></text>
</param>
<param name="@seqvar@" desc="Var/constant containing whether field is sequential">
<text></text>
</param>
<param name="@op@"
desc="Comparison operator (default CMP_OP_EQ; see CmpOp typedef)">
<text></text>
</param>
<warning>
_when_ is deprecated; use _where-eq_ instead
</warning>
<t:where id="@id@" sequential="@sequential@" seqvar="@seqvar@"
field="@field@" name="@name@" op="CMP_OP_EQ">
<param-copy name="@values@" />
</t:where>
</template>
@@ -417,38 +508,27 @@
<c:cases>
<c:case>
<c:when name="i">
<c:eq>
<!-- it's important that we allow index 0, since that is a valid
predicate -->
<c:const value="-1" type="integer" desc="We're done." />
</c:eq>
</c:when>
<!-- it's important that we allow index 0, since that is a valid
predicate -->
<t:when-eq name="i" value="#-1" />
<!-- we're done; stick with the result -->
<c:value-of name="matrix" />
</c:case>
<c:otherwise>
<c:apply name="mfilter">
<!-- matrix to search -->
<c:arg name="matrix">
<!-- >> recursion happens here << -->
<c:apply name="_mquery">
<c:arg name="matrix">
<c:value-of name="matrix" />
</c:arg>
<c:arg name="criteria">
<c:value-of name="criteria" />
</c:arg>
<c:recurse>
<c:arg name="i">
<t:dec>
<c:value-of name="i" />
</t:dec>
</c:arg>
</c:apply>
</c:recurse>
</c:arg>
<!-- field (column) -->
@@ -458,7 +538,7 @@
<c:value-of name="i" />
</c:index>
<c:index>
<c:const value="0" type="integer" desc="Field id" />
<c:const value="0" desc="Field id" />
</c:index>
</c:value-of>
</c:arg>
@@ -470,7 +550,7 @@
<c:value-of name="i" />
</c:index>
<c:index>
<c:const value="1" type="integer" desc="Field value" />
<c:const value="1" desc="Field value" />
</c:index>
</c:value-of>
</c:arg>
@@ -482,7 +562,19 @@
<c:value-of name="i" />
</c:index>
<c:index>
<c:const value="2" type="integer" desc="Sequential flag" />
<c:const value="2" desc="Sequential flag" />
</c:index>
</c:value-of>
</c:arg>
<!-- comparison operator -->
<c:arg name="op">
<c:value-of name="criteria">
<c:index>
<c:value-of name="i" />
</c:index>
<c:index>
<c:const value="3" desc="Comparison operator" />
</c:index>
</c:value-of>
</c:arg>

View File

@@ -1,6 +1,6 @@
<?xml version="1.0"?>
<!--
Copyright (C) 2018 R-T Specialty, LLC.
Copyright (C) 2014-2023 Ryan Specialty, LLC.
This file is part of tame-core.

design/tpl/.gitignore vendored 100644
View File

@@ -0,0 +1,34 @@
# Ignored files for The TAME Programming Language Git repository
# Targets
*.pdf
*.dvi
# (La)TeX
*.aux
*.fls
*.log
*.out
*.toc
# BibLaTeX
*.bbl
*.bcf
*.blg
*.run.xml
# Index
*.idx
*.ilg
*.ind
# latexmk
*.fdb_latexmk
# configure
conf.tex
aclocal.m4
autom4te.cache
config.status
configure

Some files were not shown because too many files have changed in this diff.