Commit Graph

1512 Commits (417df548cf68e0990993c345529be508c820a38c)

Author SHA1 Message Date
Mike Gerwitz ca6ef3ed36 tamer: frontend: Begin basic XML parsing
The first step in the process is to emit the raw XML events that can then be
immediately output again to echo the results into another file.  This will
then allow us to begin parsing the input incrementally, and begin to morph
the output into a real `xmlo` file.
2021-07-27 00:37:13 -04:00
Mike Gerwitz d9dcfe8777 tamer: Introduce tpwrap module to contain quick_xml::Error adapter
This adapter exists to implement PartialEq so that it can be derived on
Error objects.  This is used primarily (well, exclusively atm) for tests.
2021-07-23 23:23:55 -04:00
Mike Gerwitz fb8422d670 tamer: Initial frontend concept
This introduces the beginnings of frontends for TAMER, gated behind a
`wip-features` flag.

This will be introduced in stages:

  1. Replace the existing copy with a parser-based copy (echo back out the
     tokens), when the flag is on.
  2. Begin to parse portions of the source, augmenting the output xmlo (xmli
     at the moment).  The XSLT-based compiler will be modified to skip
     compilation steps as necessary.

As portions of the compilation are implemented in TAMER, they'll be placed
behind their own feature flags and stabalized, which will incrementally
remove the compilation steps from the XSLT-based system.  The result should
be substantial incremental performance improvements.

Short-term, the priorities are for loading identifiers into an IR
are (though the order may change):

  1. Echo
  2. Imports
  3. Extern declarations.
  4. Simple identifiers (e.g. param, const, template, etc).
  5. Classifications.
  6. Documentation expressions.
  7. Calculation expressions.
  8. Template applications.
  9. Template definitions.
  10. Inline templates.

After each of those are done, the resulting xmlo (xmli) will have fully
reconstructed the source document from the IR produced during parsing.
2021-07-23 22:24:08 -04:00
Mike Gerwitz 60372d2960 tamer: Makefile.am (all): Binaries and doc
`all` was previously the target for binaries only.
2021-07-23 22:23:10 -04:00
Mike Gerwitz 6ec1a49506 tamer: Makefile.am: Include feature flags for doc generation and tests
This was forgotten in the previous commit.
2021-07-23 15:56:33 -04:00
Mike Gerwitz f1a3273ee3 tamer: configure.ac: Configure-time feature flags (via Cargo) 2021-07-23 10:16:44 -04:00
Mike Gerwitz 5aaa1106cb tamer: obj::xmlo::reader::mock: Extract into crate::test::quick_xml
Other mocks exist here, and here it can be re-used for the upcoming XML
frontend.
2021-07-22 15:32:30 -04:00
Mike Gerwitz 2e50af1220 Copyright year update 2021 2021-07-22 15:00:15 -04:00
Mike Gerwitz e5bbd49166 tamer: obj::xmlo::reader: Extract tests separate file
The file's getting a bit large and the tests are rather complex.  Further,
LSP does better on smaller, less complex files.
2021-07-22 14:39:06 -04:00
Mike Gerwitz 1f24cfdf25 Remove :map: sym-dep generation
This was incorrect to begin with---it does not make sense that an input
mapping should depend upon the identifier that it maps to, in the sense that
we make use of these dependencies.  If we add weak symbol references in the
future, then this can be reintroduced.

By removing this, we free tameld from having to perform the check itself.

.rev-xmlo bumped to force rebuilding of object files since the linker now
expects that no such dependencies will exist within them.
2021-07-22 14:27:15 -04:00
Mike Gerwitz 8a2cc28ddb RELEASES.md: Update for v18.0.3 2021-07-21 15:05:52 -04:00
Mike Gerwitz c90566056d RELEASES.md: NEXT summary 2021-07-21 15:04:59 -04:00
Mike Gerwitz 90c6b51fd5 tamer: tameld: Place constants into static section in executable
This is something that changed when the TAMER POC was initially created, as
I was learning Rust.  I don't recall the original reason why this was moved,
but it could have been moved back long ago.

In our systems, constants can hold tables (as matrices) with tens or
hundreds of thousands of rows, and there are a number of them in certain
projects.  As an example, the YAML-based test cases for one of our systems
went from ~2m30s to ~45s after this change was made.  Much of the cost
savings comes from saving GC.
2021-07-21 14:53:15 -04:00
Mike Gerwitz 53360548da tame: Ignore duplicate conjunctive predicates in value list optimization error
This can occur in generated code (e.g. from proguic if a question-based
predicate inherits a predicate already specified).  This commit does not
change anything that's emitted; it merely allows proceeding.

TAMER can be smarter about this; I don't want to invest more time into
generalizing deduplication of predicates.
2021-07-19 14:53:25 -04:00
Mike Gerwitz 5dab913ecb RELEASES.md: Update for v18.0.2 2021-07-15 23:50:53 -04:00
Mike Gerwitz b2323e80ef RELEASES.md: Summary of NEXT 2021-07-15 23:50:00 -04:00
Mike Gerwitz 2ad0d1425a compiler: Correct handling of TRUE matches
There was a bug whereby TRUE matches would keep whatever value was being
matched on, even if it was not a boolean.  That was an oversight from the
proof-of-concept code, and this fixes it; that's why this is behind a flag!

This also adjusts the class aliasing optimization so that it doesn't check
for a `TRUE` symbol name, which was a bad idea to begin with.

This change also ends up expanding `lv:match[@value="TRUE"]` into the long
form, where it didn't previously; this will result in slightly larger xmlo
files in some cases, but it's nothing significant, and it does not impact
compilation times.
2021-07-15 14:55:32 -04:00
Mike Gerwitz 37977a8816 entry-form.xsl: Correctly generate HTML for params with imported types
This is a nearly-10-year-old bug that was introduced when the Summary Page
was modified to use the then-new symbol table.  The compiler previously
concatenated all packages into a single XML tree and processed that, so no
package resolution was necessary here before.
2021-07-14 09:59:45 -04:00
Mike Gerwitz 513b8d7b86 worksheet.xsl: Allow package name to auto-generate
A long time ago (about a decade), package names were required, but they are
now generated by the compiler relative to the root path.  The name here was
incorrect, which was generating an incorrect path for the linked symbols,
which was causing problems with the Summary Page.
2021-07-14 09:51:08 -04:00
Mike Gerwitz f5ba4b013b summary: Make Summay Page compiler less chatty
It produces a lot of output that either results in spam (internal errors) or
pollutes the log with unnecessary information.
2021-07-01 13:54:34 -04:00
Mike Gerwitz bc9c667c9d RELEASES.md: Update for v18.0.1 2021-06-24 10:37:25 -04:00
Mike Gerwitz d0e3a5622c Remove class-level notice for new system
This was not intentionally committed.
2021-06-24 09:59:00 -04:00
Mike Gerwitz 9a62bb2ace RELEASES.md: Update for v18.0.0 2021-06-23 12:54:25 -04:00
Mike Gerwitz eef2a5d4bc Compiler runtime optimizations with classification system rewrite
See RELEASES.md for a list of changes.

This was a significant effort that began about six months ago, but was
paused at a number of points.  Rather than risking further pauses from
interruptions, the new classification system has been gated behind a
package-level feature flag, since it causes BC breaks in certain buggy
situations.

Since this flag was introduced late, there is the potential that it causes
bugs when new optimizations are mixed with the old system.
2021-06-23 12:48:42 -04:00
Mike Gerwitz dd432d249d RELEASES.md: Update with compiler optimizations 2021-06-23 12:46:37 -04:00
Mike Gerwitz 4e859148c0 tools/pkg-graph: Debugging tool to output graph of package dependencies 2021-06-23 11:44:36 -04:00
Mike Gerwitz e9598b7cb5 Correct short runtime var declarations
They were not actually defined before being aliased.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 6f2b4090cd Correct behavior of matrix matching with separate index sets in new system
This behavior was largely correct, but was not commutative if the size of
the matrices (rows or columns) was smaller than a following match.
2021-06-23 11:44:36 -04:00
Mike Gerwitz e90ebd226c Remove arrow functions from classifier runtime
We need to support as far back as IE11, unfortunately, which is ES5.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 934824b2ee Reintroduce legacy classification system, place new behind flag
This largely reintroduces the legacy classification system, but there are a
number of things that are not affected by the flag.  For example:

  1. Alias classifications are still optimized when the flag is off;
  2. Classifications without predicates emit slightly different code than
     before, though their functionality has not changed;
  3. There's been a lot of refactoring and minor optimizations that are
     unaffected by the flag;
  4. lv:match/@pattern will now emit a warning; and
  5. Cleaning and casting of input data is not gated.

This allows us to incrementally migrate to the new system where behavior may
be different, but this is admittedly a bit dangerous in that the new system
was aggressively tested and reasoned about, so reintroducing the legacy
system may combine in unexpected ways.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 5f6cb4cf51 .rev-xmlo: Bump version
The old and new classification systems are currently incompatible, but if the
old is reintroduced, this commit can go away.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 7dbb653624 Inline intermediate any/all classifications
This is another significant milestone.

The next logical step with classification optimization is to inline all of
those intermediate classifications generated from any and all blocks, since
there are so many of them.  This means having the parent classification
absorb all dependencies; not output dependencies for the classification; not
compile the assignments for those classifications; and to inline them at the
match site.  They’re used only once, since they’re generated for each
individual block.

We need to keep the actual classification generation around (and just inline
them) for now, probably until TAMER, because we depend upon their symbol for
determining their dimensionality, which we need for the optimization work we
just did---we must inline them into the proper group (matrix, vector, or
scalar).

The optimization work done up to this point had inlining in mind---only a
little bit of work was needed to make sure that every classification can
simply be stripped of its assignment and be a valid expression that can be
inlined in place of the original reference.

The result of that was predictably significant for the `ui/package` program
that I've been testing with:

  - 4,514 classifications were inlined;
  - The file size dropped to 7.5MiB (from 8.2MiB previously---remember that
    we started at 16MiB); and
  - GC ticks were cut in half, from 67->31.

Unfortunately, this optimization added nearly 1m of time to the compilation
of that program.  Speaking from the future: the UI build optimizations in
liza-proguic were introduced to offset this difference (and provide a net
gain in performance).
2021-06-23 11:44:36 -04:00
Mike Gerwitz 97caefab1b Extract classify/@terminate into own template
Note that next-match does not cause a return from the template, as odd as it
looks.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 1517a03994 Combine all class optimizations into one 2021-06-23 11:44:36 -04:00
Mike Gerwitz d1dae3e1b1 Explicit types for match raising 2021-06-23 11:44:36 -04:00
Mike Gerwitz 5adf1b7589 Combine all m* optimizations
With the recent refactoring, it's clear that these are the same thing.
2021-06-23 11:44:35 -04:00
Mike Gerwitz a563c3ce62 Remove lv:match checks from class optimization checks
We handle all cases now, and prohibited @pattern that wasn't.
2021-06-23 11:44:35 -04:00
Mike Gerwitz e3fd9388bb Abstract function wrapping for class type raising
This will let us clean up the implementation a bit more.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 10089659b1 Extract lv:classify compilation into function
To support following commits for inlining.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 0cd6d40dd9 compiler: Remove whitespace from vector/matrix constants 2021-06-23 11:44:35 -04:00
Mike Gerwitz 9dbda93b4f {precision=>p} to reduce byte count 2021-06-23 11:44:35 -04:00
Mike Gerwitz f14417f32a Remove unused domains var 2021-06-23 11:44:35 -04:00
Mike Gerwitz e0907c6db2 compiler: Do not output whitespace between nodes 2021-06-23 11:44:35 -04:00
Mike Gerwitz 4ee050323a Apply hositing optimization to classify/@any
This convets disjunctive classifications into conjunctive and places an
<any> within it.

This ends up handling all the generated qwhen classifications from proguic,
which were probably converted into <any> by a previous optimization pass.

The UI program I've been using to test these compiler optimizations has
decreased in size down from 8.2MiB since the beginning of this branch; we
started at ~16MiB.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 658e55f2fa Hoist any-all common predicate for binary conjunctive classifications
See comments.  This is meant to help mitigate the damage done by one of our
code generation systems.  The benefit is significant, allowing the code
generator to remain simple.  By placing this optimization within the
compiler, hand-written and template-generated code also benefit.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 25d500fec5 Generalized value list optimization
Note that this was also broken for vectors and scalars by the commit that
expanded non-TRUE @value.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8e457dab34 Strip single-predicate any/all instead of extracting
Rather than extracting every any/all into their own classifications,
eliminate them (and replace them with their body) if they contain only one
predicate.  This is most likely to happen after template expansion, and
there were an alarming number of them in our system.

Stripping them out of one of our programs saved ~0.2MiB of output, and
removed many intermediate classifications.  It removed ~1,075 lines, which
should correspond closely to the actual number of classifications.

Discovering this required stripping the template barriers, which was done in
a previous commit.

Unfortunately, the performance improvement from this wasn't significantly,
largely because of the nondeterminisim of GC, which can easily mask the
gains.  But a new line `v8::internal::FixedArray::set(int,
v8::internal::Object)` appeared in the profiler output, making me wonder
whether the JIT is starting to understand more interesting properties of the
system.

`mprotect` and `v8::internal::heap_internals::GenerationalBarrier` also
appeared, which are related to GC.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 2d519947f7 Strip template barriers from expanded classifications
The barriers deeply frustrate static analysis.
2021-06-23 11:44:35 -04:00
Mike Gerwitz f8b166a42d Remove lv:join
This is a long-forgotten and long-unused feature that has been
long-superceded by symbol table introspection in inline-template.
2021-06-23 11:44:35 -04:00
Mike Gerwitz c191af8d53 Remove anyValue and related code
!!!

(Message from the future: this ends up being reintroduced and the new
classification system being placed behind a feature toggle.  But it will be
eliminated eventually.)
2021-06-23 11:44:35 -04:00