There was a bug whereby TRUE matches would keep whatever value was being
matched on, even if it was not a boolean. That was an oversight from the
proof-of-concept code, and this fixes it; that's why this is behind a flag!
This also adjusts the class aliasing optimization so that it doesn't check
for a `TRUE` symbol name, which was a bad idea to begin with.
This change also ends up expanding `lv:match[@value="TRUE"]` into the long
form, where it didn't previously; this will result in slightly larger xmlo
files in some cases, but it's nothing significant, and it does not impact
compilation times.
This is a nearly-10-year-old bug that was introduced when the Summary Page
was modified to use the then-new symbol table. The compiler previously
concatenated all packages into a single XML tree and processed that, so no
package resolution was necessary here before.
This largely reintroduces the legacy classification system, but there are a
number of things that are not affected by the flag. For example:
1. Alias classifications are still optimized when the flag is off;
2. Classifications without predicates emit slightly different code than
before, though their functionality has not changed;
3. There's been a lot of refactoring and minor optimizations that are
unaffected by the flag;
4. lv:match/@pattern will now emit a warning; and
5. Cleaning and casting of input data is not gated.
This allows us to incrementally migrate to the new system where behavior may
be different, but this is admittedly a bit dangerous in that the new system
was aggressively tested and reasoned about, so the reintroduced legacy
system may combine with it in unexpected ways.
This is another significant milestone.
The next logical step with classification optimization is to inline all of
those intermediate classifications generated from any and all blocks, since
there are so many of them. This means having the parent classification
absorb all dependencies; not output dependencies for the classification; not
compile the assignments for those classifications; and to inline them at the
match site. They’re used only once, since they’re generated for each
individual block.
We need to keep the actual classification generation around (and just inline
them) for now, probably until TAMER, because we depend upon their symbol for
determining their dimensionality, which we need for the optimization work we
just did---we must inline them into the proper group (matrix, vector, or
scalar).
The optimization work done up to this point had inlining in mind---only a
little bit of work was needed to make sure that every classification can
simply be stripped of its assignment and be a valid expression that can be
inlined in place of the original reference.
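As a rough illustration of the inlining described above (this is not the compiler's code, which is XSLT), here is a minimal Python sketch. The tuple-based expression shape, the `inline_generated` name, and the `_gen1` classification are all hypothetical; it also ignores the dimensionality grouping (matrix, vector, or scalar) that the real optimization must respect.

```python
def inline_generated(classes, generated):
    """Inline every generated classification at its reference site.

    classes:   name -> expression, where an expression is a tuple such as
               ("all", ...), ("any", ...), ("match", on, value), or
               ("ref", name)  (hypothetical representation)
    generated: names of intermediate classifications to absorb
    """
    def expand(expr):
        op, *args = expr
        if op == "ref" and args[0] in generated:
            # Replace the reference with the body of the classification,
            # i.e. the classification stripped of its assignment.
            return expand(classes[args[0]])
        if op in ("all", "any"):
            return (op, *[expand(a) for a in args])
        return expr

    # Generated classifications are no longer output on their own.
    return {
        name: expand(body)
        for name, body in classes.items()
        if name not in generated
    }


classes = {
    "risky": ("all", ("match", "state", "NY"), ("ref", "_gen1")),
    "_gen1": ("any", ("match", "a", 1), ("match", "b", 2)),
}
inlined = inline_generated(classes, {"_gen1"})
```

After the call, `inlined` contains only `risky`, with the body of `_gen1` in place of the reference.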
The result of that was predictably significant for the `ui/package` program
that I've been testing with:
- 4,514 classifications were inlined;
- The file size dropped to 7.5MiB (from 8.2MiB previously---remember that
we started at 16MiB); and
- GC ticks were cut in half, from 67->31.
Unfortunately, this optimization added nearly 1m of time to the compilation
of that program. Speaking from the future: the UI build optimizations in
liza-proguic were introduced to offset this difference (and provide a net
gain in performance).
This converts disjunctive classifications into conjunctive and places an
<any> within it.
This ends up handling all the generated qwhen classifications from proguic,
which were probably converted into <any> by a previous optimization pass.
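A sketch of the shape of this rewrite, in Python rather than XSLT. It assumes that a disjunctive classification is marked with an `any="true"` attribute on the `classify` element and omits namespaces; both are assumptions made for illustration, and `to_conjunctive` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def to_conjunctive(classify):
    """Turn a disjunctive classification into a conjunctive one whose
    predicates are wrapped in a single <any>."""
    if classify.get("any") != "true":  # assumed marker for "disjunctive"
        return classify

    del classify.attrib["any"]

    wrapper = ET.Element("any")
    for child in list(classify):
        classify.remove(child)
        wrapper.append(child)
    classify.append(wrapper)
    return classify


example = to_conjunctive(ET.fromstring(
    '<classify as="c" any="true"><match on="a"/><match on="b"/></classify>'
))
```

The result has the same meaning, but every classification is now conjunctive at the top level, so later passes need to handle only one form.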
The UI program I've been using to test these compiler optimizations has
decreased in size down from 8.2MiB since the beginning of this branch; we
started at ~16MiB.
See comments. This is meant to help mitigate the damage done by one of our
code generation systems. The benefit is significant, allowing the code
generator to remain simple. By placing this optimization within the
compiler, hand-written and template-generated code also benefit.
Rather than extracting every any/all into their own classifications,
eliminate them (and replace them with their body) if they contain only one
predicate. This is most likely to happen after template expansion, and
there were an alarming number of them in our system.
Stripping them out of one of our programs saved ~0.2MiB of output, and
removed many intermediate classifications. It removed ~1,075 lines, which
should correspond closely to the actual number of classifications.
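For illustration only, the elimination can be sketched in Python as below. Namespaces are omitted and `collapse_single` is a hypothetical name; the actual pass is written in XSLT.

```python
import xml.etree.ElementTree as ET


def collapse_single(parent):
    """Replace each any/all that has exactly one child with that child."""
    for i, child in enumerate(list(parent)):
        # Work bottom-up so that nested single-predicate blocks collapse too.
        collapse_single(child)
        if child.tag in ("any", "all") and len(child) == 1:
            parent[i] = child[0]


root = ET.fromstring(
    '<classify>'
    '<any><match on="a"/></any>'
    '<all><match on="b"/><match on="c"/></all>'
    '<all><any><match on="d"/></any></all>'
    '</classify>'
)
collapse_single(root)
```

Here the first and third blocks are replaced by their lone `match`, while the two-predicate `all` is left alone.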
Discovering this required stripping the template barriers, which was done in
a previous commit.
Unfortunately, the performance improvement from this wasn't significant,
largely because of the nondeterminism of GC, which can easily mask the
gains. But a new line `v8::internal::FixedArray::set(int,
v8::internal::Object)` appeared in the profiler output, making me wonder
whether the JIT is starting to understand more interesting properties of the
system.
`mprotect` and `v8::internal::heap_internals::GenerationalBarrier` also
appeared, which are related to GC.
This shaves ~1m off of the total build time for our largest system. Output
is impressively slow.
Around this point in time, we have the following profile from V8's sampling
profiler:
[JavaScript]:
  ticks  total  nonlib   name
     36   2.8%   10.7%  LazyCompile: *anyValue [...]/ui/package.strip.new.js:31020:22
      3   0.2%    0.9%  LazyCompile: *m1v1u [...]/ui/package.strip.new.js:30941:19
      2   0.2%    0.6%  LazyCompile: *precision [...]/ui/package.strip.new.js:30934:23
      1   0.1%    0.3%  LazyCompile: *vu [...]/ui/package.strip.new.js:30964:16
      1   0.1%    0.3%  LazyCompile: *init_defaults [...]/ui/package.strip.new.js:31341:27
This problem manifested when the name of the attempted classification is the
same name as another object. For example, if we have `t:match-class
name="foo"`, and `foo` is a param instead of a class, then `@yields` will
fail, and it'd fall back to matching on the param.
This is absolutely not what we want.
The error message in this context is ugly, but it does work.
Example:
!!! Unknown match @on (/lv:package/lv:classify/match): `error: unable to
determine @yields for class `scheduled_ai' (has the class been imported?)'
is unknown for classification --vis-scheduled-ai-type
This implements TCO in the XSLT compiler by requiring a human to manually
indicate when a recursive call is in tail position. This was somewhat
urgently needed to resolve stack exhaustion on large rate tables.
TAMER will do this properly by determining itself whether a call is in tail
position. Until then, this will serve as a test for this type of feature.
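To show the general idea of manually marked tail calls, here is a small Python sketch using a trampoline. This is a stand-in technique chosen for illustration; it is not necessarily what the XSLT compiler emits, and `TailCall`, `run`, and `sum_to` are hypothetical names.

```python
class TailCall:
    """Returned in place of a recursive call that the author has marked
    as being in tail position."""

    def __init__(self, *args):
        self.args = args


def run(fn, *args):
    """Drive fn in a loop so that marked tail calls do not grow the stack."""
    result = fn(*args)
    while isinstance(result, TailCall):
        result = fn(*result.args)
    return result


def sum_to(n, acc=0):
    if n == 0:
        return acc
    # The human, not the compiler, asserts that this call is a tail call.
    return TailCall(n - 1, acc + n)


total = run(sum_to, 100000)
```

A plain recursive `sum_to` would exhaust Python's default recursion limit at this depth; the marked version runs in constant stack space. If a call is marked incorrectly (that is, work remains after it), the result is simply wrong, which is why having the compiler determine tail position itself is the proper fix.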
Replacing the existing macros with templates will allow us to not have
to deal with macros in the new compiler.
The `indexNameType` pattern needed to change to allow for variables. I
also had to remove the prefix for the `gentle-no` option of `rate`.
Create a "yield" and add backwards compatibility for the macro of the
same name. This is one of 2 macros that need to be replaced so we do not
have to worry about them with the new compiler.
This ordering will simplify streaming processing of xmlo files in
TAMER. Specifically, we know that symbols will have been declared by the
time dependencies are added to the graph (and so we should only be creating
edges to existing nodes); and we can halt reading as soon as the closing
fragments tag is encountered, avoiding parsing the entirety of these massive
XML files.
On one particularly large program, this cuts time down from ~0.333s to
~0.300s in the POC linker.
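A sketch of why the ordering helps a streaming reader, in Python rather than in the linker's own language. The element names (`sym`, `sym-dep`, `sym-ref`, `fragments`, `body`) and the absence of namespaces are simplifications assumed for illustration, and `load_header` is a hypothetical name.

```python
import io
import xml.etree.ElementTree as ET


def load_header(stream):
    """Collect symbols and dependency edges, stopping at </fragments>."""
    symbols = set()
    edges = []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "sym":
            symbols.add(elem.get("name"))
        elif elem.tag == "sym-dep":
            src = elem.get("name")
            for ref in elem.iter("sym-ref"):
                dst = ref.get("name")
                # The ordering guarantees both ends were already declared,
                # so we only ever create edges between existing nodes.
                if src not in symbols or dst not in symbols:
                    raise ValueError(f"undeclared symbol in {src} -> {dst}")
                edges.append((src, dst))
        elif elem.tag == "fragments":
            # Nothing after this point is needed; stop consuming events.
            break
    return symbols, edges


XMLO = b"""<package>
  <symtable><sym name="a"/><sym name="b"/></symtable>
  <sym-deps><sym-dep name="a"><sym-ref name="b"/></sym-dep></sym-deps>
  <fragments><fragment id="a">...</fragment></fragments>
  <body><sym name="never-read"/></body>
</package>"""

symbols, edges = load_header(io.BytesIO(XMLO))
```

Because the loop exits at the closing `fragments` tag, the `sym` inside `body` is never recorded.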
This is left over from f2db9f1268, in which I
should have cleaned all of this up. One of our developers was hitting the
removed warning, which isn't necessary since the concept of a separate
"classifier" is no longer a thing after the aforementioned commit.
* rater/rater.xsd (no-extclass, no-extclass-keeps): Remove.
* src/current/rater.xsd: Likewise. (I really need to deduplicate these.)
* src/current/compiler/js.xsl (compiler:entry-rater): Remove inaccurate
comment (genclasses is used for other things).
* src/current/include/depgen.xsl (preproc:depgen-match): Remove error
checking for pulling in non-external classes (this is the error that the
developer hit that is no longer needed).
* src/current/include/preproc/eligclass.xsl (preproc:sym): Remove
`@extclass' predicate. Remove portion of comment.
* src/current/include/preproc/expand.xsl: Remove ancient footnote that
even references an old internal rater!
* src/current/include/preproc/macros.xsl (preproc:class-groupgen): Remove
external propagation.
* src/current/include/preproc/symtable.xsl (preproc:symimport): Remove
extclass checks and propagation.
(preproc:symtable)[lv:rate]: Remove external propagation.
[lv:classify]: Likewise.
* src/current/include/preproc/template.xsl (preproc:inline-apply): Remove
external sym metadata support.
This further improves performance of the symbol table processing. The next
step will be to address how symbols are handled on a more intimate level,
since it's a huge mess atm. But I'll save that for later, after the
low-hanging fruit has been resolved.
* src/current/include/preproc/symtable.xsl (preproc:sym-discover): Use
`for-each-group' in place of `preceding-sibling'. Aggressive use of
maps for generating the `dedup' sequence, which is a mess.
(preproc:symtable-process-symbols): Additional maps to avoid
preceding-sibling and following-sibling selectors (O(n²)=>O(n)).
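The complexity difference can be illustrated with a small Python analogy (the real code is XSLT; the function names are hypothetical, and "keep the first occurrence" is an assumed deduplication rule). Looking back over all preceding siblings for each symbol is quadratic, whereas a map needs only one pass.

```python
def dedup_quadratic(names):
    """For each name, scan everything before it: O(n^2)."""
    return [n for i, n in enumerate(names) if n not in names[:i]]


def dedup_linear(names):
    """Single pass using a map keyed by name: O(n)."""
    seen = {}
    for n in names:
        seen.setdefault(n, n)  # first occurrence wins
    return list(seen)  # dicts preserve insertion order


names = ["a", "b", "a", "c", "b"]
```

Both functions produce the same sequence for `names`; only the amount of work differs.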
This uses the same map strategy (and same duplicate code) as previous
commits, but this one generates a map for two separate tables.
There is more room for improvement, but this cuts down on the time a
lot. Also keep in mind that this is performed multiple times (once per
pass), so it's still worth revisiting. Performance is still very poor for
very large (many thousands of symbols) symbol tables.
The next slowest part appears to be the fragment compilation. I'm nearing
the end of the low-low-hanging fruit for maps. The /common/gl package
mentioned in previous commits that previously took over a minute to compile
now compiles in 20s as of this commit on equivalent hardware.
* src/current/include/preproc/symtable.xsl (@xmlns:map): New namespace
declaration.
(preproc:symtable-process-symbols): Create map for `cursym' and
`extresults'. Use it. Remove unused `dup'. Output message when
done (another is output slightly later on in the process).
The existing code was not only complex (because of XSLT 1), but mostly
unnecessary. We don't need to consult remote symbol tables at all anymore.
This shaves off an additional few seconds on large packages.
* src/current/include/preproc/package.xsl (preproc:resolv-syms)[preproc:sym]:
Only consult local symbol table. Simplify max dimension calculation.
This is a first step (low-hanging-fruit kinda thing) for improving the
performance of symbol resolution, where the compiler has to figure out the
dimensions of a symbol by first resolving its dependencies,
recursively. This is approximately an O(n³) polynomial-time algorithm _per
recursive step_. Yikes.
This is traditionally where dynamic programming methods would be used, but
that's considerably more difficult in an immutable language like XSLT, so
I'll do my best without. (Saxon does offer some support for mutability, but
I'd prefer to avoid it if possible.)
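For contrast, this is roughly what the dynamic programming approach looks like in a language with mutable state. It is a Python sketch under stated assumptions: a symbol's dimension is taken to be the maximum of its own dimension and those of its dependencies, the dependency graph is assumed to be acyclic, and `resolve_dims` is a hypothetical name.

```python
from functools import lru_cache


def resolve_dims(own_dim, deps):
    """Resolve each symbol's dimension, computing each symbol only once.

    own_dim: name -> dimension declared on the symbol itself
    deps:    name -> names of the symbols it depends on
    """
    @lru_cache(maxsize=None)
    def dim(name):
        # The cache is the "mutable" part: results are remembered across
        # recursive steps instead of being recomputed.
        return max([own_dim.get(name, 0)] + [dim(d) for d in deps.get(name, ())])

    return {name: dim(name) for name in own_dim}


dims = resolve_dims(
    {"a": 0, "b": 1, "c": 0, "d": 2},
    {"a": ("b", "c"), "c": ("d",)},
)
```

Here `a` and `c` both resolve to 2 by way of `d`, and `d` is visited only once.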
This first change improves performance 30--40%. For example, on two large
packages we have, build times drop from 55s to 35s and from 1m42s to 1m13s
respectively.
Good start, but much more to be done!
* src/current/include/preproc/package.xsl (preproc:resolv-syms)[lv:package]:
Compute maps for preproc:symtable and preproc:sym-deps at each recursive
step. Pass along via tunneling.
(preproc:resolv-syms)[preproc:sym]: Use them.
DEV-4354
This now uses year ranges, which I'll update annually.
This also renames "R-T Specialty" to "Ryan Specialty Group". The latter is
the parent company of the former. I was originally employed under the
former when LoVullo Associates was purchased, but I now work for the parent
company.
This is a significant performance improvement for dependency
generation (which is responsible for building the dependency graph for a
package).
The previous algorithm ran in O(n²) time: it would iterate over the given
symbol table, and for _each_ symbol, do a linear scan of the entire document
to search for the corresponding source block. This resulted in explosive
depgen time for larger packages.
This makes the algorithm run in O(n) by:
- Using an XSLT 3 map for the symbol table for O(1) lookups; and
- Iterating over the _document_ a single time rather than the symbol
table, referencing the symbol table as needed (in O(1) time).
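The two approaches can be compared with a Python analogy (not the XSLT itself). The data shapes and function names here are hypothetical: `blocks` stands in for the source blocks of the document, and a Python set stands in for the XSLT 3 map.

```python
def depgen_quadratic(symtable, blocks):
    """Old approach: for each symbol, scan the whole document."""
    deps = {}
    for sym in symtable:
        for block in blocks:  # linear scan per symbol
            if block["name"] == sym:
                deps[sym] = block["refs"]
                break
    return deps


def depgen_linear(symtable, blocks):
    """New approach: walk the document once, with O(1) symbol lookups."""
    known = set(symtable)
    return {b["name"]: b["refs"] for b in blocks if b["name"] in known}


symtable = ["a", "b"]
blocks = [
    {"name": "b", "refs": ["a"]},
    {"name": "x", "refs": []},  # no matching symbol; ignored by both
    {"name": "a", "refs": []},
]
```

Both produce the same dependencies; the second does so without the per-symbol scan.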
There are other parts of the system that can benefit from these same
improvements. This is important, since we need to be able to handle many
thousands of symbols efficiently.
* src/current/compiler/linker.xsl (l:depgen-sym): Recognize symbol `no-deps'
property, permitting missing dependencies. This allows us to avoid
creating nonsense nodes just to satisfy the linker, while still allowing
the linker to perform essential checks to defend against compiler bugs.
* src/current/compiler/map.xsl (lvmc:stub-symtable): Set @no-deps on
`___head' and `___tail' symbols.
(lvmc:mapsym): Set `no-deps' as appropriate on map symbols.
(preproc:depgen)[lvm:map[@from]]: Generate `preproc:sym-dep' node, which
is now expected by the depgen process.
(preproc:depgen)[lvm:map[*]]: Likewise.
(preproc:depgen)[*[@lvmc:type='retmap']//lvmm:map[@from]]: Remove
unnecessary template.
(preproc:symtable)[lvm:map[@value]]: Pass `no-deps' to `lvmc:mapsym'.
* src/current/include/depgen.xsl (preproc:depgen)[preproc:symtable]: Create
and use XSLT 3 map in place of `preproc:symtable' tree. This allows for
constant-time lookups. Provide to templates via tunnelling. Use it in
place of existing tree references. Process source tree rather than
iterating over symbol table.
(preproc:depgen)[lv:rate, c:sum[@generates], c:product[@generates],
lv:classify, lv:function/lv:param, lv:function, lv:typedef]: Produce
`preproc:sym-dep' nodes (which was previously done while iterating
over the symbol table).
(preproc:depgen)[preproc:sym]: Remove all such processing, since we no
longer iterate over the symbol table.
(preproc:depgen)[c:value-of]: Use symtable map.
(preproc:depgen-match): Likewise.
(preproc:depgen)[lv:union]: Modify to handle changes to lv:typedef
template.
(preproc:depgen)[text()]: Remove and replace with `node()'.
* src/current/include/preproc/package.xsl (preproc:resolv-syms): Remove
logging of symbol resolution. This has a slight performance impact since
there is a lot of output.
* src/current/include/preproc/symtable.xsl
(lv:function/lv:param, c:let/c:Values/c:value): Set `no-deps'.
* src/symtable/symbols.xsl: Add documentation of `no-deps'.
(preproc:symtable)[lv:meta]: Set `no-deps'.
It's going to be like TeX before you know it... ._.
* src/current/include/preproc/package.xsl (preproc:tpl-check)
[lv:template|lv:const|lv:typedef|lv:param-copy]: Add lv:param-copy.
* src/current/include/preproc/template.xsl (preproc:apply-template)
[lv:expand-barrier, lv:skip-child-expansion]: New expansion control
structures.
This is a much more useful description if present.
* src/current/include/preproc/macros.xsl (preproc:macros)[c:value-of...]:
Default generated constant description to @label.
The term "set" is all wrong---it is actually intended to be a vector, and can
absolutely have duplicate elements (and often does).
* src/current/calc.xsd (vector): Add, recommending in place of `set'.
* src/current/compiler/js-calc.xsl (compile-calc)[c:set|c:vector]:
Add `c:vector' and provide deprecation notice for `c:set'.
* src/current/include/calc-display.xsl (c:set|c:vector): Likewise.
This has been broken for years. I don't object to fixing it, it's just that
I have better things to do right now and we've gotten complaints about it;
no use in keeping around something that's broken if there's no desire to fix
it. Workaround: refresh the page.
This does keep around the reset logic because it is actually used in other
places.
* src/current/include/entry-form.xsl (entry-form)[lv:package]: Remove reset
button.
* src/current/include/entry-form.js (clearTestCases): Remove broken function
call `Prior.setPriorMessage(null)'.
This was throwing a warning in non-ancient versions of Saxon. It does not
need to be there, nor should it be, nor do I know why it was put there.
* src/current/include/preproc/template.xsl (eseq:is-expandable): Remove
@override.
* src/current/include/dslc-base.xsl (__path-root): New param.
* src/current/src/com/lovullo/dslc/DslCompiler.java
(DslCompiler)[compile]: Resolve TAME root path.
[_transform]: Set it.
DEV-3115
We need to cut down on symbol imports as much as possible; the whole system
starts dragging if we are importing thousands of symbols into a single
package.
* src/current/include/preproc/symtable.xsl (preproc:symtable)[lv:rate,c:*]: Mark
as local if `@preproc:generated`.
* src/current/include/preproc/template.xsl (preproc:macros)[lv:inline-template]:
Mark generated templates as such.
* src/symtable/symbols.xsl (preproc:symtable)[lv:template]: Mark as local if
`@preproc:generated'.
* src/current/compiler/js.xsl (compile-class-condtion)[lv:rate]: Do not
consider @no's in predicate generation when `@preproc:gentle-no' is set.
* src/current/include/preproc/macros.xsl (preproc:macros)[lv:rate-each]: Set
`@preproc:gentle-no' on generated `lv:rate', since the generator handles
`@no' itself.
* src/current/include/preproc/template.xsl
(preproc:gen-param-value)[lv:param-sym-value]: Suppress warning for
missing symbol and yield empty string if `@ignore-missing='true'`.
This ensures that they are compiled into the `consts' object.
* src/current/include/depgen.xsl (preproc:depgen)[lv:typedef]: Include
`lv:enum/lv:item/@name' as dependencies.
The problem with this implementation was that, any time a generator had an
associated generated @yields (which is common), it wouldn't be included in
the summary page.
We can address this in the future. It's not necessarily that it was
incorrect; it's just how the system made use of it.
* src/current/include/preproc/symtable.xsl (preproc:symtable)[lv:rate]:
Do not mark @preproc:yields-generated symbols as @preproc:generated.
Templates can expand into unexpected places, so sometimes warnings are
inappropriately issued.
* src/current/include/depgen.xsl (preproc:depgen)[lv:template]: Ignore.
[lv:template/lv:param]: Remove (now unnecessary with above).