This shaves ~1m off of the total build time for our largest system. Output
is impressively slow.
Around this point in time, we have the following profile from V8's sampling
profiler:
[JavaScript]:
ticks total nonlib name
36 2.8% 10.7% LazyCompile: *anyValue [...]/ui/package.strip.new.js:31020:22
3 0.2% 0.9% LazyCompile: *m1v1u [...]/ui/package.strip.new.js:30941:19
2 0.2% 0.6% LazyCompile: *precision [...]/ui/package.strip.new.js:30934:23
1 0.1% 0.3% LazyCompile: *vu [...]/ui/package.strip.new.js:30964:16
1 0.1% 0.3% LazyCompile: *init_defaults [...]/ui/package.strip.new.js:31341:27
This allows us to easily see their shape looking at the compiled code. See
the previous commit for more of an explanation and examples. And future
commits.
This allows us to analyze the compiler runlog and determine the frequency of
certain shapes to prioritize optimization efforts.
This is a proof-of-concept. It also contains arrow functions, which do not
exist in ES5.
The notation m#v#s# refers to matrix, vector, and scalar counts of a
classification. This optimization therefore focuses on classifications with
a single vector and a single matrix.
I'd like to note that this commit message was written in retrospect, months
later, after I returned to these proof-of-concept commits to finalize
them. I'll try my best to have things make sense in a historical context
based on my notes.
The choice to focus on m1v1 was based on taking survey of the shape of
classifications in our largest rating system. m1v*, and specifically m1v1,
was the largest by far, followed by v1s1. Here's an example program used
for a UI:
$ grep -h 'internal: [svm][0-9]\+[svm][0-9]\+ ' run*.log > result
$ cut -d' ' -f2 result | sort | uniq -c | sort -rn
10056 m1v1
1788 m1v2
473 v1s1
18 v2s1
13 v1s5
8 v1s3
7 v1s2
4 v2s5
2 v4s4
2 v4s2
2 v2s8
2 v2s6
2 v1s9
2 v1s4
1 v7s7
1 v6s2
1 v5s7
1 v5s5
1 v5s4
1 v5s2
1 v4s9
1 v4s7
1 v4s3
1 v3s9
1 v3s7
1 v3s5
1 v3s2
1 v3s1
1 v33s21
1 v2s60
1 v2s4
1 v2s3
1 v2s2
1 v28s1
1 v23s8
1 v22s9
1 v1s8
1 v1s6
1 v18s24
1 v15s14
1 v14s6
1 v14s5
1 v13s7
1 v13s6
1 v12s6
1 v11s1
1 m76v7
1 m3v1
1 m1v3
1 m1374v1
The excessively large ones (like the last one) are aggregate classifications
that are generated by a template. But note the first count.
Here's another example, one of the raters:
8812 m1v1
311 v1s1
17 v2s1
14 v1s5
4 v2s5
4 v1s6
4 v11s10
3 v3s1
3 v1s8
2 v5s14
2 v4s7
2 v3s9
2 v3s5
2 v2s4
2 v1s9
2 v1s4
2 v1s2
1 v8s7
1 v7s7
1 v7s15
1 v6s4
1 v6s2
1 v6s10
1 v5s8
1 v5s7
1 v5s4
1 v5s2
1 v53s9
1 v4s9
1 v4s4
1 v4s3
1 v4s2
1 v4s11
1 v3s8
1 v3s7
1 v3s20
1 v3s2
1 v3s19
1 v3s15
1 v2s8
1 v2s60
1 v2s6
1 v2s2
1 v2s12
1 v29s20
1 v28s1
1 v23s8
1 v1s3
1 v15s23
1 v13s6
1 v13s20
1 v12s6
1 v12s10
1 v11s1
1 m1v2
1 m1s1
Given these examples, m1v1 is an easy first choice for this commit.
The general pattern for this commit and those that follow is to match on a
specific shape of classification that we're optimizing for, falling back to
the old anyValue-based system for all other cases, with the intent of
eventually removing it.
This has long been a curse, and I don't know why I didn't resolve it sooner.
This makes explicit some of the odd things that this is doing, to maintain
the previous behavior. Changing that behavior would be ideal, but ought to
be done separately and put behind a feature flag.
This reverts commit e2d9467633bb75d79dbc8fe9f8971bfa412ea59f.
BUT: it does cause more data to be returned, perhaps unnecessarily. See if
that may offset the slight increase in GC cost.
Further, we may end up getting rid of some of these generated values; check
after we do some class optimizations.
This was a waste of time; it actually reduces performance slightly and increased
GC, unintuitively enough.
Leaving commit here and reverting to keep it for reference.
When the Summary Page was _first written_ (the first part of TAME), it was
compiled in the browser---development consisted of refreshing the page,
which was familiar to how we wrote PHP at the time. No compile process.
In that situation, we couldn't have the XSLT stylesheet failing to
translate. But of course those days are long since gone, and this must be a
compile-time error.
It shouldn't ever get to this point, granted.
Single-predicate classifications matching on TRUE can be optimized into
aliases. These sometimes occur in hand-written code, but can also be
generated by templates.
The classification system rewrite removed the debug value collection that
previously existed. It didn't make a whole lot of sense anyway, given that
that compiler rearranges matches.
This falls back to showing the value of the @on, which should be good
enough, and is honestly better than what we had before.
This represents the old cmatch system (which is in use today, but the
classification system has since been rewritten, though it has not yet been
merged). It was my attempt over a decade ago to reason about how this
system ought to work.
I think it's fair to say that this is absolute insanity and that the new
formulation is significantly better.
The intent originally was to try to keep developers to a reasonable name
length, but generated identifiers can easily exceed this, and we further do
not support namespacing.
This can be handled at a template level instead for enforcing naming
conventions.
This had gotten quite out of date from the actual rater.xsd, which existed
outside of this repository, that is used during our build process. That was
an unintended artifact from moving files around.
That file has been removed and symlinked to this one.
First thing to note: this belong in liza-proguic, not here. But it's here
right now, so for now I'm making the change. The relationship between TAME
and proguic is awkward and will hopefully be improved upon in the near
future.
As for this actual change: step-level fragments will be concatenated such
that the imports will appear at the step level rather than the root.
I did say it was _experimental_ guided TRO.
This waits to perform the actual argument reassignment until after
processing the expressions associated with the new arguments, since they
will otherwise be replaced when their original values are still needed.
This change simply prevents failure in such situations, (e.g. on invalidated
fields in Liza). We'll worry about proper errors and correctness, which
ought to be compile-time, in TAMER.
The MathJax CDN stopped working in April 2017. I updated it to the
recommended CDN with the last version from April 2017 to ensure it works
like it used to work before the CDN stopped.
I added the checksum to ensure the content of the script.
This problem manifested when the name of the attempted classification is the
same name as another object. For example, if we have `t:match-class
name="foo"`, and `foo` is a param instead of a class, then `@yields` will
fail, and it'd fall back to matching on the param.
This is absolutely not what we want.
The error message in this context is ugly, but it does work.
Example:
!!! Unknown match @on (/lv:package/lv:classify/match): `error: unable to
determine @yields for class `scheduled_ai' (has the class been imported?)'
is unknown for classification --vis-scheduled-ai-type
This implements TCO in the XSLT compiler by requiring a human to manually
indicate when a recursive call is in tail position. This was somewhat
urgently needed to resolve stack exhaustion on large rate tables.
TAMER will do this properly by determining itself whether a call is in tail
position. Until then, this will serve as a test for this type of feature.
Replacing the existing macros with templates will allow us to now have
to deal with macros in the new compiler.
The `indexNameType` pattern needed to change to allow for variables. I
also had to remove the prefix for the `gentle-no` option of `rate`.
Create a "yield" and add backwards compatibility for the macro of the
same name. This is one of 2 macros that need to be replaced so we do not
have to worry about them with the new compiler.
All systems should be using the provided Makefile, so this shouldn't be
invoked anymore. The new linker is still considered a proof-of-concept, but
bugs have been encountered in the old one that are not worth investing the
time into fixing.
The new linker has been used in production for nearly a couple months and is
functioning properly.
This ordering will simplify streaming processing of xmlo files in
TAMER. Specifically, we know that symbols will have been declared by the
time dependencies are added to the graph (and so we should only be creating
edges to existing nodes); and we can halt reading as soon as the closing
fragments tag is encountered, avoiding parsing the entirety of these massive
XML files.
On one particularly large program, this cuts time down from ~0.333s to
~0.300 in the POC linker.
This not only reduces file size, but also has a significant performance
benefit for the UI, which is almost entirely classifications. A run for one
of our systems was reduced from 1m30s to 11s from this change.
This was used to provide additional information on the stack for debugging
the compiled code. Since this is very rarely needed, and is only needed by
someone debugging the compiler, it can be manually enabled if desired.
This also wraps it so that it'll be stripped if it is included.
The `<t:match-class-code-lookup />` matches were not showing in the
summary pages. I loosened the selector so it is able to find the matches
when it generates the summary pages.
If an `lvm:if` is immediately followed by another 'lvm:if`, both should
be used to create the conditional. The existing code wouild only "select
the nearest condition".
The LOB being passed into the function was being ignored and instead it
was pulling it from the contract object. With Package, this caused all 3
LOB to be "COMMPKGE" rather than the correct LOB being processed at the
time.
Going forward, one cannot `map` or `pass` to "line_code" as it will be
considered a reserved word.
Co-Authored-By: Jim Grundner <james.grundner@rtspecialty.com>
This is left over from f2db9f1268, in which I
should have cleaned all of this up. One of our developers was hitting the
removed warning, which isn't necessary since the concept of a separate
"classifier" is no longer a thing after the aforementioned commit.
* rater/rater.xsd (no-extclass, no-extclass-keeps): Remove.
* src/current/rater.xsd: Likewise. (I really need to deduplicate these.)
* src/current/compiler/js.xsl (compiler:entry-rater): Remove inaccurate
comment (genclasses is used for other things).
* src/current/include/depgen.xsl (preproc:depgen-match): Remove error
checking for pulling in non-external classes (this is the error that the
developer hit that is no longer needed).
* src/current/include/preproc/eligclass.xsl (preproc:sym): Remove
`@extclass' predicate. Remove portion of comment.
* src/current/include/preproc/expand.xsl: Remove ancient footnote that
even references an old internal rater!
* src/current/include/preproc/macros.xsl (preproc:class-groupgen): Remove
external propagation.
* src/current/include/preproc/symtable.xsl (preproc:symimport): Remove
extclass checks and propagation.
(preproc:symtable)[lv:rate]: Remove external propagation.
[lv:classify]: Likewise.
* src/current/include/preproc/template.xsl (preproc:inline-apply): Remove
external sym metadata support.
These exist because TAME is nondeterministic, so all state must be passed
into it. But it's inconvenient to have users have to manually fill in
dates, so we derive them from the environment unless they are set.
* src/current/scripts/entry-form.js (fillTimeValues): New function.
(rater): Use it.
This further improves performance of the symbol table processing. The next
step will be to address how symbols are handled on a more intimate level,
since it's a huge mess atm. But I'll save that for later, after the
low-hanging fruit has been resolved.
* src/current/include/preproc/symtable.xsl (preproc:sym-discover): Use
`for-each-group' in place of `preceding-sibling'. Aggressive use of
maps for geneating the `dedup' sequence, which is a mess.
(preproc:symtable-process-symbols): Additional maps to avoid
preceding-sibling and following-sibling selectors (O(n²)=>O(n)).
Same concept as previous commits: rather than iterating over the symbol
table and scanning the tree for the matching node, iterate over the document
and look up from a symbol map: O(n²) => O(n).
This gives a respectable performance boost to compilation of certain
packages (best improving packages with many classifications or rate blocks).
* src/current/compiler/fragments.xsl (@xmlns:xs, @xmlns:map): New namespace
declarations.
(preproc:compile-fragments): Generate `preproc:fragment' nodes and match
on document rather than symbols.
[lv:package]: Generate map and tunnel it.
* src/current/compiler/js.xsl (compile)[lv:classify, lv:match]: Use
symtable-map.
(compile-class-condition)[lv:rate]: Likewise.
(compile-cmatch)[lv:rate]: Likewise.
This uses the same map strategy (and same duplicate code) as previous
commits, but this one generates a map for two separate tables.
There is more room for improvement, but this cuts down on the time a
lot. Also keep in mind that this is performed multiple times (once per
pass), so it's still worth revisiting. Performance is still very poor for
very large (many thousands of symbols) symbol tables.
The next slowest part appears to be the fragment compilation. I'm nearing
the end of the low-low-hanging fruit for maps. The /common/gl package
mentioned in previous commits that previously took over a minute to compile
now compiles in 20s as of this commit on equivalent hardware.
* src/current/include/preproc/symtable.xsl (@xmlns:map): New namespace
declaration.
(preproc:symtable-process-symbols): Create map for `cursym' and
`extresults'. Use it. Remove unused `dup'. Output message when
done (another is output slightly later on in the process).
This is the first step to improving the map. Note that this duplicates the
symbol table generation code that's used in a few other places
now---that'll be cleaned up in future commits once I have a better idea of
all the places this will be used and try to move it to a higher level.
* src/current/compiler/validate.xsl (@xmlns:xs, @xmlns:map): New namespace
definitions.
(lvv:validate)[lv:package]: Generate symbol table map. Tunnel to
templates.
[c:apply[@name], lv:classify[@as]//lv:match, lv:match[@value]]
[c:*[@name or @of], c:apply/c:arg[@name], lv:rate/lv:class]: Use it.
The existing code was not only complex (because of XSLT 1), but mostly
unnecessary. We don't need to consult remote symbol tables at all anymore.
This shaves off an additional few seconds on large packages.
* src/current/include/preproc/package.xsl (preproc:resolv-syms)[preproc:sym]:
Only consult local symbol table. Simplify max dimension calculation.
This is a first step (low-hanging-fruit kinda thing) for improving the
performance of symbol resolution, where the compiler has to figure out the
dimensions of a symbol by first resolving its dependencies,
recursively. This is approximately an O(n³) polynomial-time algorithm _per
recursive step_. Yikes.
This is traditionally where dynamic programming methods would be used, but
that's considerably more difficult in a immutable languages like XSLT, so
I'll do my best without. (Saxon does offer some support for mutability, but
I'd prefer to avoid it if possible.)
This first change improves performance 30--40%. For example, on two large
packages we have, build times drop from 55s to 35s and from 1m42s to 1m13s
respectively.
Good start, but much more to be done!
* src/current/include/preproc/package.xsl (preproc:resolv-syms)[lv:package]:
Compute maps for preproc:symtable and preproc:sym-deps at each recursive
step. Pass along via tunneling.
(preproc:resolv-syms)[preproc:sym]: Use them.
DEV-4354
This only saves 1--2s on a 30s run, but I want to move into this direction,
so it'll simplify future refactoring if I just add it. Small changes like
these will accumulate, too.
* src/current/compiler/linker.xsl (l:orig-package, l:root-symtable-map): New
variables.
(l:resov-extern): Use it.
This now uses year ranges, which I'll update annually.
This also renames "R-T Specialty" to "Ryan Specialty Group". The latter is
the parent company of the former. I was originally employed under the
former when LoVullo Associates was purchased, by I now work for the parent
company.
The previous commit made dependency lists optional for certain symbols. The
Summary Page needs to be updated to permit such a thing.
The whole Summary Page needs aggressive refactoring, though, so this doesn't
bother checking for `no-deps' to see if this is a bad thing.
* src/current/summary.xsl (typeset-final)[preproc:sym-ref]: Permit missing
symbol dependencies.
(lv:param|lv:const|lv:item): Likewise.
This is a significant performance improvement for dependency
generation (which is responsible for building the dependency graph for a
package).
The previous algorithm ran in O(n²) time: it would iterate over the given
symbol table, and for _each_ symbol, do a linear scan of the entire document
to search for the corresponding source block. This resulted in explosive
depgen time for larger packages.
This makes the algorithm run in O(n) by:
- Using an XSLT 3 map for the symbol table for O(1) lookups; and
- Iterating over the _document_ a single time rather than the symbol
table, referencing the symbol table as needed (in O(1) time).
There are other parts of the system that can benefit from these same
improvements. This is important, since we need to be able to handle many
thousands of symbols efficiently.
* src/current/compiler/linker.xsl (l:depgen-sym): Recognize smybol `no-deps'
property, permitting missing dependencies. This allows us to avoid
creating nonsense nodes just to satisfy the linker, while still allowing
the linker to perform essential checks to defend against compiler bugs.
* src/current/compiler/map.xsl (lvmc:stub-symtable): Set @no-deps on
`___head' and `___tail' symbols.
(lvmc:mapsym): Set `no-deps' as appropriate on map symbols.
(preproc:depgen)[lvm:map[@from]]: Generate `preproc:sym-dep' node, which
is now expected by the depgen process.
(preproc:depgen)[lvm:map[*]]: Likewise.
(preproc:depgen)[*[@lvmc:type='retmap']//lvmm:map[@from]]: Remove
unnecessary template.
(preproc:symtable)[lvm:map[@value]]: Pass `no-deps' to `lvmc:mapsym'.
* src/current/include/depgen.xsl (preproc:depgen)[preproc:symtable]: Create
and use XSLT 3 map in place of `preproc:symtable' tree. This allows for
constant-time lookups. Provide to templates via tunnelling. Use it in
place of exiting tree references. Process source tree rather than
iterating over symbol table.
(preproc:depgen)[lv:rate, c:sum[@generates], c:product[@generates],
lv:classify, lv:function/lv:param, lv:function, lv:typedef]: Produce
`preproc:sym-dep' nodes (which was previously done while iterating
over the symbol table).
(preproc:depgen)[preproc:sym]: Remove all such processing, since we no
longer iterate over the symbol table.
(preproc:depgen)[c:value-of]: Use symtable map.
(preproc:depgen-match): Likewise.
(preproc:depgen)[lv:union]: Modify to handle changes to lv:typedef
template.
(preproc:depgen)[text()]: Remove and replace with `node()'.
* src/current/include/preproc/package.xsl (preproc:resolv-syms): Remove
logging of symbol resolution. This has a slight performace impact since
there is a lot of output.
* src/current/include/preproc/symtable.xsl
(lv:function/lv:param, c:let/c:Values/c:value): Set `no-deps'.
* src/symtable/symbols.xsl: Add documentation of `no-deps'.
(preproc:symtable)[lv:meta]: Set `no-deps'.
These provide a more pleasent abstraction than having to reference CMP_OP_*
constants.
* core/test/core/vector/interpolate.xml: {t:when=>t:where-eq}.
* core/test/core/vector/table.xml: Likewise, but using the other variants
where appropriate given the value of `@op'.
* core/vector/interpolate.xml: Likewise.
* core/vector/table.xml (_when_, _where_): Rename former to latter and
provide deprecation warning.
(_when-lt_, _when-lte_, _when-gt_, _when-gte_): Add abstractions.
* src/current/rater.xsd: Permit template variable as tenplate name.