Commit Graph

1238 Commits (1ad2fb1dc8ed85814811e50d2a344134d1858a1d)

Author SHA1 Message Date
Mike Gerwitz dd432d249d RELEASES.md: Update with compiler optimizations 2021-06-23 12:46:37 -04:00
Mike Gerwitz 4e859148c0 tools/pkg-graph: Debugging tool to output graph of package dependencies 2021-06-23 11:44:36 -04:00
Mike Gerwitz e9598b7cb5 Correct short runtime var declarations
They were not actually defined before being aliased.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 6f2b4090cd Correct behavior of matrix matching with separate index sets in new system
This behavior was largely correct, but was not commutative if the size of
the matrices (rows or columns) was smaller than a following match.
2021-06-23 11:44:36 -04:00
Mike Gerwitz e90ebd226c Remove arrow functions from classifier runtime
We need to support as far back as IE11, unfortunately, which is ES5.
2021-06-23 11:44:36 -04:00
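A minimal sketch of the kind of rewrite that the arrow-function removal above implies. The helper name `toFlags` is hypothetical and is not taken from the classifier runtime; it only illustrates replacing an arrow callback with an ES5 function expression.

```javascript
// Hypothetical helper, written in ES5 only: no arrow functions, no const/let.
// An ES2015 version of the callback would read: xs.map(x => +(x > 0))
function toFlags(xs) {
    return xs.map(function (x) {
        // Unary plus converts the boolean into 0 or 1.
        return +(x > 0);
    });
}

var flags = toFlags([3, 0, -1, 7]);
```

`Array.prototype.map` is itself part of ES5, so only the callback syntax has to change.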
Mike Gerwitz 934824b2ee Reintroduce legacy classification system, place new behind flag
This largely reintroduces the legacy classification system, but there are a
number of things that are not affected by the flag.  For example:

  1. Alias classifications are still optimized when the flag is off;
  2. Classifications without predicates emit slightly different code than
     before, though their functionality has not changed;
  3. There's been a lot of refactoring and minor optimizations that are
     unaffected by the flag;
  4. lv:match/@pattern will now emit a warning; and
  5. Cleaning and casting of input data is not gated.

This allows us to incrementally migrate to the new system where behavior may
be different, but this is admittedly a bit dangerous in that the new system
was aggressively tested and reasoned about, so reintroducing the legacy
system may combine in unexpected ways.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 5f6cb4cf51 .rev-xmlo: Bump version
The old and new classification systems are currently incompatible, but if the
old is reintroduced, this commit can go away.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 7dbb653624 Inline intermediate any/all classifications
This is another significant milestone.

The next logical step with classification optimization is to inline all of
those intermediate classifications generated from any and all blocks, since
there are so many of them.  This means having the parent classification
absorb all dependencies; not output dependencies for the classification; not
compile the assignments for those classifications; and to inline them at the
match site.  They’re used only once, since they’re generated for each
individual block.

We need to keep the actual classification generation around (and just inline
them) for now, probably until TAMER, because we depend upon their symbol for
determining their dimensionality, which we need for the optimization work we
just did---we must inline them into the proper group (matrix, vector, or
scalar).

The optimization work done up to this point had inlining in mind---only a
little bit of work was needed to make sure that every classification can
simply be stripped of its assignment and be a valid expression that can be
inlined in place of the original reference.

The result of that was predictably significant for the `ui/package` program
that I've been testing with:

  - 4,514 classifications were inlined;
  - The file size dropped to 7.5MiB (from 8.2MiB previously---remember that
    we started at 16MiB); and
  - GC ticks were cut in half, from 67->31.

Unfortunately, this optimization added nearly 1m of time to the compilation
of that program.  Speaking from the future: the UI build optimizations in
liza-proguic were introduced to offset this difference (and provide a net
gain in performance).
2021-06-23 11:44:36 -04:00
Mike Gerwitz 97caefab1b Extract classify/@terminate into own template
Note that next-match does not cause a return from the template, as odd as it
looks.
2021-06-23 11:44:36 -04:00
Mike Gerwitz 1517a03994 Combine all class optimizations into one 2021-06-23 11:44:36 -04:00
Mike Gerwitz d1dae3e1b1 Explicit types for match raising 2021-06-23 11:44:36 -04:00
Mike Gerwitz 5adf1b7589 Combine all m* optimizations
With the recent refactoring, it's clear that these are the same thing.
2021-06-23 11:44:35 -04:00
Mike Gerwitz a563c3ce62 Remove lv:match checks from class optimization checks
We handle all cases now, and have prohibited @pattern, which was not handled.
2021-06-23 11:44:35 -04:00
Mike Gerwitz e3fd9388bb Abstract function wrapping for class type raising
This will let us clean up the implementation a bit more.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 10089659b1 Extract lv:classify compilation into function
To support following commits for inlining.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 0cd6d40dd9 compiler: Remove whitespace from vector/matrix constants 2021-06-23 11:44:35 -04:00
Mike Gerwitz 9dbda93b4f {precision=>p} to reduce byte count 2021-06-23 11:44:35 -04:00
Mike Gerwitz f14417f32a Remove unused domains var 2021-06-23 11:44:35 -04:00
Mike Gerwitz e0907c6db2 compiler: Do not output whitespace between nodes 2021-06-23 11:44:35 -04:00
Mike Gerwitz 4ee050323a Apply hoisting optimization to classify/@any
This converts disjunctive classifications into conjunctive and places an
<any> within it.

This ends up handling all the generated qwhen classifications from proguic,
which were probably converted into <any> by a previous optimization pass.

The UI program I've been using to test these compiler optimizations has
decreased in size to 8.2MiB since the beginning of this branch; we
started at ~16MiB.
2021-06-23 11:44:35 -04:00
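One reading of the hoisting described above, sketched as plain boolean logic. It assumes the disjunctive classification has the shape any(all(common, b), all(common, c)), which the log does not spell out; the function names are hypothetical.

```javascript
// Assumed shape before hoisting: any( all(common, b), all(common, c) ).
function beforeHoist(common, b, c) {
    return (common && b) || (common && c);
}

// After hoisting the shared predicate: all( common, any(b, c) ).
function afterHoist(common, b, c) {
    return common && (b || c);
}

// Exhaustively compare both forms over every boolean input.
function equivalent() {
    var bools = [false, true];
    return bools.every(function (common) {
        return bools.every(function (b) {
            return bools.every(function (c) {
                return beforeHoist(common, b, c) === afterHoist(common, b, c);
            });
        });
    });
}
```

The hoisted form evaluates the shared predicate once, and the remaining disjunction becomes the nested <any>.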
Mike Gerwitz 658e55f2fa Hoist any-all common predicate for binary conjunctive classifications
See comments.  This is meant to help mitigate the damage done by one of our
code generation systems.  The benefit is significant, allowing the code
generator to remain simple.  By placing this optimization within the
compiler, hand-written and template-generated code also benefit.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 25d500fec5 Generalized value list optimization
Note that this was also broken for vectors and scalars by the commit that
expanded non-TRUE @value.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8e457dab34 Strip single-predicate any/all instead of extracting
Rather than extracting every any/all into their own classifications,
eliminate them (and replace them with their body) if they contain only one
predicate.  This is most likely to happen after template expansion, and
there were an alarming number of them in our system.

Stripping them out of one of our programs saved ~0.2MiB of output, and
removed many intermediate classifications.  It removed ~1,075 lines, which
should correspond closely to the actual number of classifications.

Discovering this required stripping the template barriers, which was done in
a previous commit.

Unfortunately, the performance improvement from this wasn't significant,
largely because of the nondeterminism of GC, which can easily mask the
gains.  But a new line `v8::internal::FixedArray::set(int,
v8::internal::Object)` appeared in the profiler output, making me wonder
whether the JIT is starting to understand more interesting properties of the
system.

`mprotect` and `v8::internal::heap_internals::GenerationalBarrier` also
appeared, which are related to GC.
2021-06-23 11:44:35 -04:00
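A rough sketch of stripping single-predicate any/all blocks, as described in the commit above. The tree representation (`type`, `children`) and the sample predicates are hypothetical; the real compiler operates on XML in XSLT.

```javascript
// Hypothetical tree: { type: 'any' | 'all' | 'match', children: [...] }.
function stripSingle(node) {
    if (node.type === 'match') {
        return node;
    }
    var children = node.children.map(stripSingle);
    // An any/all wrapping exactly one predicate is equivalent to that predicate.
    if (children.length === 1) {
        return children[0];
    }
    return { type: node.type, children: children };
}

// Illustrative input: the inner <any> holds a single match.
var tree = {
    type: 'all',
    children: [
        { type: 'any', children: [
            { type: 'match', on: 'state', value: 'STATE_NY' }
        ] },
        { type: 'match', on: 'premium', value: 'TRUE' }
    ]
};

var stripped = stripSingle(tree);
```

After stripping, the outer block contains the two matches directly, so no intermediate classification is needed for the inner block.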
Mike Gerwitz 2d519947f7 Strip template barriers from expanded classifications
The barriers deeply frustrate static analysis.
2021-06-23 11:44:35 -04:00
Mike Gerwitz f8b166a42d Remove lv:join
This is a long-forgotten and long-unused feature that has been
long-superseded by symbol table introspection in inline-template.
2021-06-23 11:44:35 -04:00
Mike Gerwitz c191af8d53 Remove anyValue and related code
!!!

(Message from the future: this ends up being reintroduced and the new
classification system being placed behind a feature toggle.  But it will be
eliminated eventually.)
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8147bec24f m*v*s*!
This is a major milestone for class optimization---the old anyValue-based
system is no longer in use; the classification system has been wholly
rewritten.

The ticks in the sampling profiler are now where they should be, open to
further optimization with a much more solid foundation.

  [JavaScript]:
     ticks  total  nonlib   name
        5    0.6%    3.0%  LazyCompile: *vu [...]/ui/package.strip.js:25191:16
        5    0.6%    3.0%  LazyCompile: *M [...]/ui/package.strip.js:25267:15
        3    0.4%    1.8%  LazyCompile: *vmu [...]/ui/package.strip.js:25144:17
        3    0.4%    1.8%  LazyCompile: *ve [...]/ui/package.strip.js:25204:16
        2    0.2%    1.2%  LazyCompile: *precision [...]/ui/package.strip.js:25137:23
        2    0.2%    1.2%  LazyCompile: *me [...]/ui/package.strip.js:25178:16
        2    0.2%    1.2%  LazyCompile: *cmatch [...]/ui/package.strip.js:25495:20
        2    0.2%    1.2%  LazyCompile: *ceq [...]/ui/package.strip.js:25273:17
        1    0.1%    0.6%  LazyCompile: *init_defaults [...]/ui/package.strip.js:25624:27
        1    0.1%    0.6%  LazyCompile: *MM [...]/ui/package.strip.js:25268:16
        1    0.1%    0.6%  LazyCompile: *E [...]/ui/package.strip.js:25239:15
        1    0.1%    0.6%  LazyCompile: *<anonymous> [...]/ui/package.strip.js:25184:13
        1    0.1%    0.6%  LazyCompile: *<anonymous> [...]/ui/package.strip.js:25171:13

Much better than the 102 ticks that anyValue was taking some time ago!

A lot of time used to be spent compiling functions as well, a lot of which
was removed by previous commits, bringing us to:

 [C++]:
   ticks  total  nonlib   name
     50    5.9%   30.5%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
     20    2.4%   12.2%  write
      9    1.1%    5.5%  node::native_module::NativeModuleEnv::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
      6    0.7%    3.7%  __pthread_cond_timedwait
      4    0.5%    2.4%  mmap

All of this work has simplified the output enough that it has made obvious a
slew of other optimizations that can be done in future work, though a lot of
that may wait for TAMER, since performing them in XSLT will be difficult and
not performant; the compiler is slow enough as it is.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 542ff46b6d m*v*s0 optimization
...getting close!
2021-06-23 11:44:35 -04:00
Mike Gerwitz 606a3fe987 m1v*s0 optimization 2021-06-23 11:44:35 -04:00
Mike Gerwitz a63eb4c5e6 m1v1s0*: Remove cmp args and support c:*/@anyOf
This supports all currently-optimized transformations, whereas previously we
were permitting only lv:match[@value].
2021-06-23 11:44:35 -04:00
Mike Gerwitz b735d91955 m0v*s* optimization
Building up to finalizing m*v*s*!

For context, here is the survey prior to this commit:

   3476 m0v1s1
   3385 m1v2s0
    582 m1v1s0
    531 m2v0s0
    225 m3v0s0
    171 m0v2s1
    169 m4v0s0
    135 m5v0s0
    102 m1v0s0
     85 m0v1s5
     71 m6v0s0
     67 m14v0s0
     57 m7v0s0
     50 m0v2s5
     48 m8v0s0
     41 m9v0s0
     39 m1v0s1
     39 m10v0s0
     34 m0v1s2
     26 m0v1s9
     22 m12v0s0
     20 m11v0s0
     20 m0v2s4
     20 m0v1s3
     19 m15v0s0
     19 m0v4s7
     17 m0v5s7
     17 m0v3s1
     17 m0v1s6
     16 m13v0s0
     16 m0v4s9
     16 m0v2s8
     16 m0v1s4
     15 m0v5s4
     15 m0v4s3
     15 m0v3s9
     15 m0v3s5
     15 m0v2s6
     14 m0v12s10
     13 m0v5s14
     13 m0v3s8
     12 m18v0s0
     12 m0v4s4
     12 m0v4s2
     12 m0v3s7
     12 m0v3s2
     12 m0v2s2
     12 m0v12s6
     11 m0v7s7
     11 m0v6s2
     11 m0v5s2
     11 m0v53s9
     11 m0v2s60
     11 m0v28s1
     11 m0v23s8
     11 m0v13s6
     10 m17v0s0
      9 m0v2s3
      8 m0v11s10
      7 m85v0s0
      7 m20v0s0
      7 m0v4s5
      7 m0v1s8
      6 m87v0s0
      6 m35v0s0
      6 m33v0s0
      6 m30v0s0
      6 m19v0s0
      6 m16v0s0
      6 m0v5s6
      5 m21v0s0
      5 m0v7s9
      5 m0v3s10
      4 m53v0s0
      4 m50v0s0
      4 m28v0s0
      4 m114v0s0
      4 m0v6s10
      4 m0v5s8
      4 m0v3s6
      4 m0v3s3
      4 m0v1s7
      4 m0v13s10
      3 m86v0s0
      3 m24v0s0
      3 m23v0s0
      3 m0v6s4
      3 m0v5s5
      3 m0v4s6
      3 m0v3s19
      3 m0v2s12
      3 m0v1s11
      3 m0v11s9
      3 m0v11s1
      2 m99v0s0
      2 m97v0s0
      2 m95v0s0
      2 m79v0s0
      2 m74v0s0
      2 m71v0s0
      2 m60v0s0
      2 m5v18s7
      2 m55v0s0
      2 m49v0s0
      2 m419v0s0
      2 m374v0s0
      2 m34v0s0
      2 m32v0s0
      2 m31v0s0
      2 m27v0s0
      2 m201v0s0
      2 m1v1s1
      2 m1v13s3
      2 m161v0s0
      2 m159v0s0
      2 m157v0s0
      2 m151v0s0
      2 m144v0s0
      2 m142v0s0
      2 m0v9s7
      2 m0v9s11
      2 m0v8s9
      2 m0v8s7
      2 m0v8s19
      2 m0v7s12
      2 m0v6s6
      2 m0v6s11
      2 m0v5s9
      2 m0v5s3
      2 m0v5s11
      2 m0v5s1
      2 m0v4s8
      2 m0v4s11
      2 m0v3s4
      2 m0v3s20
      2 m0v3s15
      2 m0v3s12
      2 m0v2s7
      2 m0v2s16
      2 m0v2s11
      2 m0v29s20
      2 m0v19s7
      2 m0v19s3
      2 m0v17s12
      2 m0v16s16
      2 m0v15s23
      2 m0v15s10
      2 m0v13s9
      2 m0v13s15
      2 m0v11s8
      2 m0v10s15
      1 m94v0s0
      1 m93v0s0
      1 m92v0s0
      1 m90v0s0
      1 m81v0s0
      1 m76v7s0
      1 m76v0s0
      1 m70v0s0
      1 m68v0s0
      1 m66v11s11
      1 m64v0s0
      1 m58v0s0
      1 m54v0s0
      1 m51v0s0
      1 m514v20s19
      1 m4v4s7
      1 m48v0s0
      1 m481v20s14
      1 m451v0s0
      1 m44v0s0
      1 m43v0s0
      1 m42v0s0
      1 m3v16s7
      1 m38v4s6
      1 m38v0s0
      1 m370v0s0
      1 m2v2s3
      1 m2v2s0
      1 m2v25s25
      1 m29v0s0
      1 m26v0s0
      1 m25v0s0
      1 m22v0s0
      1 m213v0s0
      1 m1v3s0
      1 m1454v3215s1422
      1 m13v11s37
      1 m1374v1s0
      1 m131v0s0
      1 m10v30s23
      1 m102v0s0
      1 m0v9s9
      1 m0v9s8
      1 m0v9s12
      1 m0v8s12
      1 m0v7s4
      1 m0v7s15
      1 m0v7s11
      1 m0v6s9
      1 m0v6s8
      1 m0v6s5
      1 m0v6s20
      1 m0v6s16
      1 m0v6s12
      1 m0v4s17
      1 m0v4s10
      1 m0v4s1
      1 m0v46s23
      1 m0v3s17
      1 m0v3s16
      1 m0v33s21
      1 m0v32s38
      1 m0v2s9
      1 m0v2s10
      1 m0v23s30
      1 m0v22s9
      1 m0v22s31
      1 m0v20s29
      1 m0v18s24
      1 m0v18s10
      1 m0v17s26
      1 m0v17s14
      1 m0v16s9
      1 m0v16s27
      1 m0v15s20
      1 m0v15s14
      1 m0v15s11
      1 m0v14s6
      1 m0v14s5
      1 m0v14s13
      1 m0v13s7
      1 m0v13s20
      1 m0v12s9
      1 m0v12s8
      1 m0v11s11
      1 m0v10s17
      1 m0v10s14
      1 m0v10s11
      1 m0v10s10

There are some horridly large ones in there!  They were missing from output
in previous commits because of how I was gathering information.

Those large ones come from liza-proguic's __proguiClasses.
2021-06-23 11:44:35 -04:00
Mike Gerwitz 8045c9d99a Remove v{u,e} second argument; always match truthful
Add optimization notes, note the impact on FALSE with implicit 0 (see mega
commit).
2021-06-23 11:44:35 -04:00
Mike Gerwitz 5ae5c226f9 lv:match/c:* optimizations for v* and s*
This will make m1v*s0 worth doing now.
2021-06-23 11:44:35 -04:00
Mike Gerwitz db88f6aba5 div function 2021-06-23 11:44:33 -04:00
Mike Gerwitz fc96880b85 lv:match/c:* optimizations
A large number of classification optimizations were being thwarted by my not
handling this case.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 73696657fc Optimize @anyOf m0v0s* 2021-06-22 15:00:58 -04:00
Mike Gerwitz 5d9970c853 Optimize @anyOf m0v*s0
This sets the foundation for applying this optimization to the others as
well.
2021-06-22 15:00:58 -04:00
Mike Gerwitz f86eaf6aa2 More concise anyOf checks
These also use unary functions, which will be able to be composed
for upcoming changes.
2021-06-22 15:00:58 -04:00
Mike Gerwitz e59a3b3ff5 Remove unnecessary debug output (writes are very slow)
This shaves ~1m off of the total build time for our largest system.  Output
is impressively slow.

Around this point in time, we have the following profile from V8's sampling
profiler:

  [JavaScript]:
     ticks  total  nonlib   name
       36    2.8%   10.7%  LazyCompile: *anyValue [...]/ui/package.strip.new.js:31020:22
        3    0.2%    0.9%  LazyCompile: *m1v1u [...]/ui/package.strip.new.js:30941:19
        2    0.2%    0.6%  LazyCompile: *precision [...]/ui/package.strip.new.js:30934:23
        1    0.1%    0.3%  LazyCompile: *vu [...]/ui/package.strip.new.js:30964:16
        1    0.1%    0.3%  LazyCompile: *init_defaults [...]/ui/package.strip.new.js:31341:27
2021-06-22 15:00:58 -04:00
Mike Gerwitz d828ad6a1f Extract optimized vec and scalar matches into functions
The vector one will be reused by m1v1 to become m1v*.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 917977effc Use Em instead of destructuring for m1v1
Similar to previous commit.
2021-06-22 15:00:58 -04:00
Mike Gerwitz 3a6695c873 Use E instead of destructuring for v{u,e} functions
This also has an added benefit: it's ES5-compatible, aside from the arrow
functions that need to be removed in future commits.
2021-06-22 15:00:58 -04:00
Mike Gerwitz cfbdc35a55 m0v*s0 single-distinct-@on optimization
I have been wanting to do this for many years.  This is quite
gratifying.  Here is some example output:

  c['foo']=E(A['fooState']=A['state'].map(s => +[2,7,8,9,10,11,19,20,21,22,26,28,31,32,35,39,40,41,46,47,44].includes(s)));

Previously, it looked like this:

  classes['foo'] = (function(){var result,tmp;  tmp = anyValue(
  args['state'], 2, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1924'] || ( debug['d1124644e1924'] = [] ) ).push( tmp
  );/*!-*/ result = tmp; tmp = anyValue( args['state'], 7, args['fooState'],
  false, false ) ;/*!+*/( debug['d1124644e1925'] || ( debug['d1124644e1925'] =
  [] ) ).push( tmp );/*!-*/ result = result || tmp; tmp = anyValue(
  args['state'], 8, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1926'] || ( debug['d1124644e1926'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 9,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1927'] || (
  debug['d1124644e1927'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 10, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1928'] || ( debug['d1124644e1928'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 11,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1929'] || (
  debug['d1124644e1929'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 19, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1930'] || ( debug['d1124644e1930'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 20,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1931'] || (
  debug['d1124644e1931'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 21, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1932'] || ( debug['d1124644e1932'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 22,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1933'] || (
  debug['d1124644e1933'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 26, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1934'] || ( debug['d1124644e1934'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 28,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1936'] || (
  debug['d1124644e1936'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 31, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1937'] || ( debug['d1124644e1937'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 32,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1938'] || (
  debug['d1124644e1938'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 35, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1939'] || ( debug['d1124644e1939'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 40,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1940'] || (
  debug['d1124644e1940'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 41, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1941'] || ( debug['d1124644e1941'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; tmp = anyValue( args['state'], 46,
  args['fooState'], false, false ) ;/*!+*/( debug['d1124644e1942'] || (
  debug['d1124644e1942'] = [] ) ).push( tmp );/*!-*/ result = result || tmp;
  tmp = anyValue( args['state'], 44, args['fooState'], false, false ) ;/*!+*/(
  debug['d1124644e1943'] || ( debug['d1124644e1943'] = [] ) ).push( tmp
  );/*!-*/ result = result || tmp; return tmp;})();

The source XML is:

  <classify as="foo" yields="fooState"
            desc="Foo">
    <any>
      <match on="state" value="STATE_AL" />
      <match on="state" value="STATE_CT" />
      <match on="state" value="STATE_DC" />
      <match on="state" value="STATE_DE" />
      <match on="state" value="STATE_FL" />
      <match on="state" value="STATE_GA" />
      <match on="state" value="STATE_LA" />
      <match on="state" value="STATE_MA" />
      <match on="state" value="STATE_MD" />
      <match on="state" value="STATE_ME" />
      <match on="state" value="STATE_MS" />
      <match on="state" value="STATE_NC" />
      <match on="state" value="STATE_NH" />
      <match on="state" value="STATE_NJ" />
      <match on="state" value="STATE_NY" />
      <match on="state" value="STATE_PA" />
      <match on="state" value="STATE_RI" />
      <match on="state" value="STATE_SC" />
      <match on="state" value="STATE_VA" />
      <match on="state" value="STATE_VT" />
      <match on="state" value="STATE_TX" />
    </any>
  </classify>
2021-06-22 15:00:58 -04:00
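A runnable sketch of the optimized form from the commit above, with a shortened value list. The definition of `E` is not shown in the log, so `anyTruthy` is a stand-in that assumes it reduces a vector of 0/1 flags to a single boolean; the input vector is also made up.

```javascript
// Stand-in for the runtime's E, whose definition is not shown in the log;
// assumed here to reduce a vector of 0/1 flags to a single boolean.
const anyTruthy = flags => flags.some(flag => flag === 1);

const A = { state: [2, 5, 44] };   // hypothetical input vector
const c = {};

// One membership test per element replaces one anyValue call per constant.
c['foo'] = anyTruthy(
    A['fooState'] = A['state'].map(s => +[2, 7, 8, 44].includes(s))
);
```

With this input, `A['fooState']` holds one flag per element of `A['state']`, and `c['foo']` is set if any of them matched.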
Mike Gerwitz a2f846f9c4 {gen,}classes name reduction to reduce byte count 2021-06-22 15:00:58 -04:00
Mike Gerwitz a880605511 Optimal m0v0s* single-distinct-@on scalar match
See comments for more information.

This will require a polyfill for Array.prototype.includes for IE11, if we
stick with it.
2021-06-22 15:00:58 -04:00
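A minimal sketch of what an ES5 substitute for Array.prototype.includes could look like, as mentioned in the commit above. It is a free-standing helper with a hypothetical name rather than a prototype patch, and it is not the polyfill the project ended up using, if any.

```javascript
// Minimal ES5 stand-in for Array.prototype.includes (hypothetical helper).
// Unlike indexOf, includes treats NaN as equal to NaN, so handle it explicitly.
function includes(xs, needle) {
    for (var i = 0; i < xs.length; i++) {
        var x = xs[i];
        // x !== x is true only for NaN.
        if (x === needle || (x !== x && needle !== needle)) {
            return true;
        }
    }
    return false;
}
```

For numeric constants such as the state codes above, `indexOf(value) !== -1` would behave the same; the NaN handling only matters if NaN can reach the input.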
Mike Gerwitz 1f72f756ca m0v0s* optimization 2021-06-22 15:00:57 -04:00
Mike Gerwitz d352919807 m0v*s0 optimization 2021-06-22 15:00:57 -04:00
Mike Gerwitz 736d9278bf Temporarily output mvs lengths for unoptimized classifications
This allows us to easily see their shape looking at the compiled code.  See
the previous commit for more of an explanation and examples.  And future
commits.

This allows us to analyze the compiler runlog and determine the frequency of
certain shapes to prioritize optimization efforts.
2021-06-22 15:00:57 -04:00
Mike Gerwitz d9bbf0282e m1v1 classification optimizations
This is a proof-of-concept.  It also contains arrow functions, which do not
exist in ES5.

The notation m#v#s# refers to matrix, vector, and scalar counts of a
classification.  This optimization therefore focuses on classifications with
a single vector and a single matrix.

I'd like to note that this commit message was written in retrospect, months
later, after I returned to these proof-of-concept commits to finalize
them.  I'll try my best to have things make sense in a historical context
based on my notes.

The choice to focus on m1v1 was based on taking a survey of the shape of
classifications in our largest rating system.  m1v*, and specifically m1v1,
was the largest by far, followed by v1s1.  Here's an example program used
for a UI:

  $ grep -h 'internal: [svm][0-9]\+[svm][0-9]\+ ' run*.log > result
  $ cut -d' ' -f2 result | sort | uniq -c | sort -rn
    10056 m1v1
     1788 m1v2
      473 v1s1
       18 v2s1
       13 v1s5
        8 v1s3
        7 v1s2
        4 v2s5
        2 v4s4
        2 v4s2
        2 v2s8
        2 v2s6
        2 v1s9
        2 v1s4
        1 v7s7
        1 v6s2
        1 v5s7
        1 v5s5
        1 v5s4
        1 v5s2
        1 v4s9
        1 v4s7
        1 v4s3
        1 v3s9
        1 v3s7
        1 v3s5
        1 v3s2
        1 v3s1
        1 v33s21
        1 v2s60
        1 v2s4
        1 v2s3
        1 v2s2
        1 v28s1
        1 v23s8
        1 v22s9
        1 v1s8
        1 v1s6
        1 v18s24
        1 v15s14
        1 v14s6
        1 v14s5
        1 v13s7
        1 v13s6
        1 v12s6
        1 v11s1
        1 m76v7
        1 m3v1
        1 m1v3
        1 m1374v1

The excessively large ones (like the last one) are aggregate classifications
that are generated by a template.  But note the first count.
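
A sketch of the m#v#s# notation and of the frequency survey that the shell pipeline above performs. The function names are hypothetical, and treating an absent component as a count of 0 is an assumption based on the two-part labels in the listing.

```javascript
// Parse a shape label such as "m1v2s0" into counts.  Components that are
// absent (as in two-part labels like "m1v1") are assumed to be 0.
function parseShape(label) {
    var shape = { m: 0, v: 0, s: 0 };
    var re = /([mvs])(\d+)/g;
    var part;
    while ((part = re.exec(label)) !== null) {
        shape[part[1]] = parseInt(part[2], 10);
    }
    return shape;
}

// Tally how often each label occurs, most frequent first
// (the equivalent of: sort | uniq -c | sort -rn).
function survey(labels) {
    var counts = {};
    labels.forEach(function (label) {
        counts[label] = (counts[label] || 0) + 1;
    });
    return Object.keys(counts)
        .map(function (label) { return [label, counts[label]]; })
        .sort(function (a, b) { return b[1] - a[1]; });
}
```

For example, `parseShape('m1374v1')` describes a classification matching on 1374 matrices and one vector.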

Here's another example, one of the raters:

   8812 m1v1
    311 v1s1
     17 v2s1
     14 v1s5
      4 v2s5
      4 v1s6
      4 v11s10
      3 v3s1
      3 v1s8
      2 v5s14
      2 v4s7
      2 v3s9
      2 v3s5
      2 v2s4
      2 v1s9
      2 v1s4
      2 v1s2
      1 v8s7
      1 v7s7
      1 v7s15
      1 v6s4
      1 v6s2
      1 v6s10
      1 v5s8
      1 v5s7
      1 v5s4
      1 v5s2
      1 v53s9
      1 v4s9
      1 v4s4
      1 v4s3
      1 v4s2
      1 v4s11
      1 v3s8
      1 v3s7
      1 v3s20
      1 v3s2
      1 v3s19
      1 v3s15
      1 v2s8
      1 v2s60
      1 v2s6
      1 v2s2
      1 v2s12
      1 v29s20
      1 v28s1
      1 v23s8
      1 v1s3
      1 v15s23
      1 v13s6
      1 v13s20
      1 v12s6
      1 v12s10
      1 v11s1
      1 m1v2
      1 m1s1

Given these examples, m1v1 is an easy first choice for this commit.

The general pattern for this commit and those that follow is to match on a
specific shape of classification that we're optimizing for, falling back to
the old anyValue-based system for all other cases, with the intent of
eventually removing it.
2021-06-22 15:00:57 -04:00
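The match-then-fall-back pattern described at the end of the commit above can be sketched as a dispatch on shape. All names here are hypothetical, and the returned strings merely stand in for generated code.

```javascript
// Hypothetical optimizers keyed by shape label; only m1v1 is handled here.
var optimizers = {
    m1v1: function (classification) {
        return 'optimized:' + classification.name;
    }
};

// Stand-in for the legacy anyValue-based code generation.
function legacyCompile(classification) {
    return 'legacy:' + classification.name;
}

// Use a shape-specific optimizer when one exists; otherwise fall back.
function compile(classification) {
    var shape = classification.shape;
    if (Object.prototype.hasOwnProperty.call(optimizers, shape)) {
        return optimizers[shape](classification);
    }
    return legacyCompile(classification);
}
```

Each later optimization adds another entry, so the fallback handles fewer and fewer shapes until it can be removed.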
Mike Gerwitz 5a816a4701 Ensure all params are numeric
This has long been a curse, and I don't know why I didn't resolve it sooner.

This makes explicit some of the odd things that this is doing, to maintain
the previous behavior.  Changing that behavior would be ideal, but ought to
be done separately and put behind a feature flag.
2021-06-22 15:00:57 -04:00