Commit Graph

905 Commits (2e50af1220118e7cd48135ebe8374c1d049ea707)

Author SHA1 Message Date
Mike Gerwitz db88f6aba5 div function 2021-06-23 11:44:33 -04:00
Mike Gerwitz fc96880b85 lv:match/c:* optimizations
A large number of classification optimizations were being thwarted by my not
handling this case.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 73696657fc Optimize @anyOf m0v0s* 2021-06-22 15:00:58 -04:00
Mike Gerwitz 5d9970c853 Optimize @anyOf m0v*s0
This sets the foundation to applying this optimization to the others as
well.
2021-06-22 15:00:58 -04:00
Mike Gerwitz f86eaf6aa2 More concise anyOf checks
These also use unary functions, which will be able to be composed
for upcoming changes.
2021-06-22 15:00:58 -04:00
Mike Gerwitz e59a3b3ff5 Remove unnecessary debug output (writes are very slow)
This shaves ~1m off of the total build time for our largest system.  Output
is impressively slow.

Around this point in time, we have the following profile from V8's sampling
profiler:

  [JavaScript]:
     ticks  total  nonlib   name
       36    2.8%   10.7%  LazyCompile: *anyValue [...]/ui/package.strip.new.js:31020:22
        3    0.2%    0.9%  LazyCompile: *m1v1u [...]/ui/package.strip.new.js:30941:19
        2    0.2%    0.6%  LazyCompile: *precision [...]/ui/package.strip.new.js:30934:23
        1    0.1%    0.3%  LazyCompile: *vu [...]/ui/package.strip.new.js:30964:16
        1    0.1%    0.3%  LazyCompile: *init_defaults [...]/ui/package.strip.new.js:31341:27
2021-06-22 15:00:58 -04:00
Mike Gerwitz d828ad6a1f Extract optimized vec and scalar matches into functions
The vector one will be reused by m1v1 to become m1v*.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 917977effc Use Em instead of destructuring for m1v1
Similar to previous commit.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 3a6695c873 Use E instead of destructuring for v{u,e} functions
This also has an added benefit: that it's ES5-compatible.  Aside from the
arrow functions that need to be removed in future commits.
2021-06-22 15:00:58 -04:00
Mike Gerwitz cfbdc35a55 m0v*s0 single-distinct-@on optimization
I have been wanting to do this for many years.  This is quite
gratifying.  Here is some example output:

  c['foo']=E(A['fooState']=A['state'].map(s => +[2,7,8,9,10,11,19,20,21,22,26,28,31,32,35,39,40,41,46,47,44].includes(s)));

Previously, it looked like this:

  classes['foo'] = (function(){var result,tmp;  tmp = anyValue(
  args['state'], 2, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1924'] || ( debug['d1124644e1924'] = [] ) ).push( tmp
  );/*!-*/ result = tmp; tmp = anyValue( args['state'], 7, args['fooState'],
  false, false ) ;/*!+*/( debug['d1124644e1925'] || ( debug['d1124644e1925'] =
  [] ) ).push( tmp );/*!-*/ result = result || tmp; tmp = anyValue(
  args['state'], 8, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1926'] || ( debug['d1124644e1926'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 9,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1927'] || (
  debug['d1124644e1927'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 10, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1928'] || ( debug['d1124644e1928'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 11,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1929'] || (
  debug['d1124644e1929'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 19, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1930'] || ( debug['d1124644e1930'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 20,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1931'] || (
  debug['d1124644e1931'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 21, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1932'] || ( debug['d1124644e1932'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 22,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1933'] || (
  debug['d1124644e1933'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 26, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1934'] || ( debug['d1124644e1934'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 28,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1936'] || (
  debug['d1124644e1936'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 31, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1937'] || ( debug['d1124644e1937'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 32,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1938'] || (
  debug['d1124644e1938'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 35, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1939'] || ( debug['d1124644e1939'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 40,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1940'] || (
  debug['d1124644e1940'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 41, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1941'] || ( debug['d1124644e1941'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 46,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1942'] || (
  debug['d1124644e1942'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 44, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1943'] || ( debug['d1124644e1943'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; return tmp;})();

The source XML is:

  <classify as="foo" yields="fooState"
            desc="Foo">
    <any>
      <match on="state" value="STATE_AL" />
      <match on="state" value="STATE_CT" />
      <match on="state" value="STATE_DC" />
      <match on="state" value="STATE_DE" />
      <match on="state" value="STATE_FL" />
      <match on="state" value="STATE_GA" />
      <match on="state" value="STATE_LA" />
      <match on="state" value="STATE_MA" />
      <match on="state" value="STATE_MD" />
      <match on="state" value="STATE_ME" />
      <match on="state" value="STATE_MS" />
      <match on="state" value="STATE_NC" />
      <match on="state" value="STATE_NH" />
      <match on="state" value="STATE_NJ" />
      <match on="state" value="STATE_NY" />
      <match on="state" value="STATE_PA" />
      <match on="state" value="STATE_RI" />
      <match on="state" value="STATE_SC" />
      <match on="state" value="STATE_VA" />
      <match on="state" value="STATE_VT" />
      <match on="state" value="STATE_TX" />
    </any>
  </classify>
2021-06-22 15:00:58 -04:00
Mike Gerwitz a2f846f9c4 {gen,}classes name reduction to reduce byte count 2021-06-22 15:00:58 -04:00
Mike Gerwitz a880605511 Optimal m0v0s* single-distinct-@on scalar match
See comments for more information.

This will require a polyfill for Array.prototype.includes for IE11, if we
stick with it.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 1f72f756ca m0v0s* optimization 2021-06-22 15:00:57 -04:00
Mike Gerwitz d352919807 m0v*s0 optimization 2021-06-22 15:00:57 -04:00
Mike Gerwitz 736d9278bf Temporarily output mvs lengths for unoptimized classifications
This allows us to easily see their shape looking at the compiled code.  See
the previous commit for more of an explanation and examples.  And future
commits.

This allows us to analyze the compiler runlog and determine the frequency of
certain shapes to prioritize optimization efforts.
2021-06-22 15:00:57 -04:00
Mike Gerwitz d9bbf0282e m1v1 classification optimizations
This is a proof-of-concept.  It also contains arrow functions, which do not
exist in ES5.

The notation m#v#s# refers to matrix, vector, and scalar counts of a
classification.  This optimization therefore focuses on classifications with
a single vector and a single matrix.

I'd like to note that this commit message was written in retrospect, months
later, after I returned to these proof-of-concept commits to finalize
them.  I'll try my best to have things make sense in a historical context
based on my notes.

The choice to focus on m1v1 was based on taking survey of the shape of
classifications in our largest rating system.  m1v*, and specifically m1v1,
was the largest by far, followed by v1s1.  Here's an example program used
for a UI:

  $ grep -h 'internal: [svm][0-9]\+[svm][0-9]\+ ' run*.log > result
  $ cut -d' ' -f2 result | sort | uniq -c | sort -rn
    10056 m1v1
     1788 m1v2
      473 v1s1
       18 v2s1
       13 v1s5
        8 v1s3
        7 v1s2
        4 v2s5
        2 v4s4
        2 v4s2
        2 v2s8
        2 v2s6
        2 v1s9
        2 v1s4
        1 v7s7
        1 v6s2
        1 v5s7
        1 v5s5
        1 v5s4
        1 v5s2
        1 v4s9
        1 v4s7
        1 v4s3
        1 v3s9
        1 v3s7
        1 v3s5
        1 v3s2
        1 v3s1
        1 v33s21
        1 v2s60
        1 v2s4
        1 v2s3
        1 v2s2
        1 v28s1
        1 v23s8
        1 v22s9
        1 v1s8
        1 v1s6
        1 v18s24
        1 v15s14
        1 v14s6
        1 v14s5
        1 v13s7
        1 v13s6
        1 v12s6
        1 v11s1
        1 m76v7
        1 m3v1
        1 m1v3
        1 m1374v1

The excessively large ones (like the last one) are aggregate classifications
that are generated by a template.  But note the first count.

Here's another example, one of the raters:

   8812 m1v1
    311 v1s1
     17 v2s1
     14 v1s5
      4 v2s5
      4 v1s6
      4 v11s10
      3 v3s1
      3 v1s8
      2 v5s14
      2 v4s7
      2 v3s9
      2 v3s5
      2 v2s4
      2 v1s9
      2 v1s4
      2 v1s2
      1 v8s7
      1 v7s7
      1 v7s15
      1 v6s4
      1 v6s2
      1 v6s10
      1 v5s8
      1 v5s7
      1 v5s4
      1 v5s2
      1 v53s9
      1 v4s9
      1 v4s4
      1 v4s3
      1 v4s2
      1 v4s11
      1 v3s8
      1 v3s7
      1 v3s20
      1 v3s2
      1 v3s19
      1 v3s15
      1 v2s8
      1 v2s60
      1 v2s6
      1 v2s2
      1 v2s12
      1 v29s20
      1 v28s1
      1 v23s8
      1 v1s3
      1 v15s23
      1 v13s6
      1 v13s20
      1 v12s6
      1 v12s10
      1 v11s1
      1 m1v2
      1 m1s1

Given these examples, m1v1 is an easy first choice for this commit.

The general pattern for this commit and those that follow is to match on a
specific shape of classification that we're optimizing for, falling back to
the old anyValue-based system for all other cases, with the intent of
eventually removing it.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 5a816a4701 Ensure all params are numeric
This has long been a curse, and I don't know why I didn't resolve it sooner.

This makes explicit some of the odd things that this is doing, to maintain
the previous behavior.  Changing that behavior would be ideal, but ought to
be done separately and put behind a feature flag.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 250c230d94 Revert "REMOVE ME: Use variables in place of object for generated class yields"
This reverts commit e2d9467633bb75d79dbc8fe9f8971bfa412ea59f.

BUT: it does cause more data to be returned, perhaps unnecessarily.  See if
that may offset the slight increase in GC cost.

Further, we may end up getting rid of some of these generated values; check
after we do some class optimizations.
2021-06-22 15:00:57 -04:00
Mike Gerwitz ec196146e2 REMOVE ME: Use variables in place of object for generated class yields
This was a waste of time; it actually reduces performance slightly and increased
GC, unintuitively enough.

Leaving commit here and reverting to keep it for reference.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 9784ef9326 Remove unused lv:assuming
This was going to be a feature to permit testing (I think?), but it has
never been used and was abandoned long ago.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 05736abe23 compiler/js (lv:classify): Extract @yields dest name into function
I will be changing how this work shortly.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 6512ea245a Omit _CMATCH_ generation if no predicates, alias if one
I would like for _CMATCH_ to eventually go away entirely, but this is an
improvement in the meantime.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 17db2d0df8 Use more concise var refs in generated code to reduce byte count 2021-06-22 15:00:57 -04:00
Mike Gerwitz 3c47858c73 Extract empty classify into own template
Simplify main template.
2021-06-22 15:00:57 -04:00
Mike Gerwitz ce0f51db2f compiler/js-calc: Make unknown calculation type a compile-time error
When the Summary Page was _first written_ (the first part of TAME), it was
compiled in the browser---development consisted of refreshing the page,
which was familiar to how we wrote PHP at the time.  No compile process.

In that situation, we couldn't have the XSLT stylesheet failing to
translate.  But of course those days are long since gone, and this must be a
compile-time error.

It shouldn't ever get to this point, granted.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 9ed6d40386 compiler/js (lv:classify): Remove unused noclass
This existed back when the classifier was compiled separately from the
rate function; they are now one and the same.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 894f7ffab8 compiler/js (lv:classify): Remove unused $ignores 2021-06-22 15:00:57 -04:00
Mike Gerwitz d0532fe75a Simplify predmatch and eliminate when no predicate 2021-06-22 15:00:57 -04:00
Mike Gerwitz 8d25d60c60 Significantly reduce parenthesis and whitespace in output
The intent here is simply to reduce byte count, as well as make the
generated code easier to read and find patterns in for future
optimizations.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 3434efcdef Remove unnecesary ||0 defaults 2021-06-22 15:00:57 -04:00
Mike Gerwitz 1c07968375 Remove unused result intermediate value 2021-06-22 15:00:57 -04:00
Mike Gerwitz 80e3029fa0 Remove function wrapper from c:when 2021-06-22 15:00:57 -04:00
Mike Gerwitz 603e9fb342 Optimize single-true-match classes into aliases
Single-predicate classifications matching on TRUE can be optimized into
aliases.  These sometimes occur in hand-written code, but can also be
generated by templates.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 3eca3cf8dc Modernization of some runtime JS functions
We still can't use arrow functions, since the output must be ES5-compatible.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 525d138d33 Remove function wrapper from generic c:* 2021-06-22 15:00:57 -04:00
Mike Gerwitz 459a25e943 Replace toFixed to truncate rate blocks
toFixed required converting to a string and back, which had miserable
performance.  This avoids that cost.
2021-06-22 15:00:57 -04:00
Mike Gerwitz d27cedc70c Remove lv:rate function wrapper 2021-06-22 15:00:57 -04:00
Mike Gerwitz ef5a7c58d8 Remove function wrapper from class blocks
These are unneeded.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 74f2849e8b Do not execute unnecessary code paths
Benchmarking showed virtually no benefit, surprisingly.  But this can be
used in conjunction with other optimizations in the future.
2021-06-22 15:00:57 -04:00
Mike Gerwitz 9c63337fc6 README.md: Mention Rust in upper paragraph alongside XSLT 2021-06-22 12:17:33 -04:00
Mike Gerwitz 8b5053d475 README.md: Mention TAMER 2021-06-22 12:15:50 -04:00
Mike Gerwitz 93fb1f1bdd tamer: Rust v1.{48=>53}.0 for rustdoc tool lints
A previous commit used a rustdoc tool lint, but that support wasn't added
until 1.52.0 (2021-05-06).

Note that this represents the minimum _required_ version to build TAMER; you
can use a later version.
2021-06-22 09:07:53 -04:00
Mike Gerwitz 716556c39f tamer: Rust 1.{42=>48}.0 for stable intra-doc links without nightly 2021-06-21 13:10:00 -04:00
Mike Gerwitz 96ea0302cc tamer: Cargo.lock: Dependency updates
This project has been on pause for over a year.
2021-06-21 12:46:38 -04:00
Mike Gerwitz 416676f1ab build-aux/progtest-runner: Deterministically concatenate files by name 2021-06-09 16:10:52 -04:00
Mike Gerwitz 645c4da541 RELEASES.md: Add _use-new-classification_system_ mention 2021-06-09 16:09:09 -04:00
Mike Gerwitz cdb2e876ab core/test/class: Begin classification system test cases
These are incomplete, but a start.
2021-06-09 13:33:11 -04:00
Mike Gerwitz 1e620e1e96 core/base (_use-new-classification-system): New template
This template prepares for the introduction of the new classification
system, which is a full rewrite that is both more performant and more
correct in its behavior.  Unfortunately, the corrections will cause problems
with old code that may be relying on certain cases, particularly where
undefined values are implicitly treated as zero.

Consequently, the legacy and new systems will exist side-by-side, able to be
toggled on as desired so people can verify that behavior is correct before
we switch it on by default.  This template allows switching on the system
for an entire package (if it's placed at the toplevel), or portions of a
package, though the latter should only be used in exceptional circumstances.

See the test cases in commits to follow for more information.
2021-06-09 13:32:46 -04:00
Mike Gerwitz d4dc1e651b core/base: Section _yield_ and _rate-each_ 2021-06-08 13:26:49 -04:00
Mike Gerwitz bf399c0370 core/aggregate: Remove package
This package is not used today.  See RELEASES.md for more information;  This
is a dangerous package that never should have existed.

This also fixes the test suite.
2021-06-08 12:00:45 -04:00