Use experimental TCO for heavily recursive portion of table lookup

This was urgently needed for a project using TAME. Somehow, we have gone
all these years without a table whose first predicate fails to filter out
enough rows to keep us under the stack limit.

Each recursive step of mrange before inlining and TCO, at the time of
writing, was adding eight stack frames. This is because each let (and many
other things) compiles into a self-applying function. Since mrange is
invoked once for every single row for a given value, we quickly run out of
stack space.

For example, consider this table:

  1, $a, $b
  2, $a, $b
  2, $b, $c
  2, $c, $d
  3, $a, $b

If we were to filter the first column on the value 2, it would first bisect
to find the middle row, backtrack to the first, and then move forward to
the last, producing:

  2, $a, $b
  2, $b, $c
  2, $c, $d

This is at least three mrange calls, for a potential total of 8*3=24 stack
frames, depending on implementation details of the query system that I
don't quite recall at the moment. We had over 1000 rows after applying the
first predicate; the stack was exhausted before it could even reach the
last row.

Tail call optimization (TCO) is the process of turning recursive calls in
tail position into jumps. So, rather than the stack growing on a recursive
call, it stays constant. A common way to accomplish this in stack-based
languages is using a trampoline.

In our case, we enclose the entirety of the function in a `do` loop, and at
the top of each iteration we clear a flag that indicates whether a tail
call took place. When we reach a recursive tail call, we set that flag.
Then, instead of invoking the function again, we _overwrite the original
arguments_ with their new values, and simply return 0. When the function
hits the end of the loop, it will see that the flag is set, and jump back
to the beginning of the function, starting all over with the new values.

Compiling in this functionality is not difficult. Tracking whether a given
call is in tail position, however, is a bit of a pain given how the XSLT
code is currently written. Given that this is all being replaced with
TAMER, it's difficult to stomach making too many changes to the compiler
when we can do it properly in the future with TAMER. But we need the
feature now.

As a compromise, I call this implementation "guided" TCO: we rely on a
human to indicate that a call is in tail position by setting an
experimental flag manually. That frees us from having to have the compiler
do it, but does create some nasty problems if the human is wrong.
Consequently, this should only be used in core, and people should not use
it unless they know what they're doing.

Using this feature currently outputs a warning so that, if there are
problems, people have some idea of where to look. The warning will be
removed in the future after this has been in production for some time
(granted, our test suite passes).

Once again: TAMER will implement proper tail calls automatically, without
the need for a human to intervene.

For more information on tail calls:

  - https://en.wikipedia.org/wiki/Tail_call
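
The trampoline described above can be sketched in plain JavaScript. This is a hand-written illustration under stated assumptions, not actual compiler output: `firstMatchFrom` and its `matrix`, `col`, `val`, and `start` parameters are hypothetical names chosen for the example, and only the `__experimental_guided_tco` flag, the `fresult` variable, and the `do`/`while` shape are taken from the change itself.

```javascript
// Hypothetical function: index of the first row at or after `start` whose
// column `col` equals `val`, or -1 if there is none.  Written in the shape
// the commit message describes for a function compiled with guided TCO.
function firstMatchFrom(matrix, col, val, start, __experimental_guided_tco) {
  var fresult;

  do {
    // Top of the trampoline: clear the "tail call took place" flag.
    __experimental_guided_tco = 0;

    fresult = (
      start >= matrix.length
        ? -1                        // ran out of rows
        : matrix[start][col] === val
          ? start                   // match: an ordinary result
          // Guided tail call: instead of calling firstMatchFrom again,
          // overwrite the argument, set the flag, and yield a dummy 0.
          : (start = start + 1, __experimental_guided_tco = 1, 0)
    );

    // Bottom of the trampoline: if the flag was set, run the body again
    // with the new argument values rather than growing the stack.
  } while (__experimental_guided_tco);

  return fresult;
}
```

Because the "recursive" step only reassigns `start` and loops, the stack depth stays constant no matter how many rows are skipped. Calling `firstMatchFrom(rows, 0, 2, 0)` scans `rows` from the top for the first row whose first column is 2.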
commit 6784090cf0

RELEASES.md (28 changes)
@@ -14,6 +14,34 @@ commits that introduce the changes. To make a new release, run
 `tools/mkrelease`, which will handle updating the heading for you.


+NEXT
+====
+This release adds support for experimental human-guided tail call
+optimizations (TCO) to resolve issues of stack exhaustion during runtime for
+tables with a large number of rows after having applied the first
+predicate. This feature should not be used outside of `tame-core`, and will
+be done automatically by TAMER in the future.
+
+`tame-core`
+-----------
+- `vector/filter/mrange`, used by the table lookup system, has had its
+  mutually recursive function inlined and now uses TCO.
+- This was the source of stack exhaustion on tables whose predicates were
+  unable to filter rows sufficiently.
+
+----
+Compiler
+--------
+- Experimental guided tail call optimizations (TCO) added to XSLT-based
+  compiler, allowing a human to manually indicate recursive calls in tail
+  position.
+- This is undocumented and should only be used by `tame-core`. The
+  experimental warning will be removed in future releases if the behavior
+  proves to be sound.
+- TAMER will add support for proper tail calls that will be detected
+  automatically.
+
+
 v17.4.3 (2020-07-02)
 ====================
 This release fixes a bug caused by previous refactoring that caused

@@ -246,6 +246,8 @@
 <param name="seq" type="boolean" desc="Is data sequential?" />
 <param name="op" type="integer" desc="Comparison operator" />

+<param name="__experimental_guided_tco" type="float" desc="Experimental guided TCO" />
+
 <c:let>
   <c:values>
     <c:value name="curval" type="float" desc="Current value">
@@ -296,22 +298,113 @@


     <c:otherwise>
-      <c:apply name="_mrange_cmp" matrix="matrix" col="col" val="val"
-               start="start" end="end" seq="seq" op="op">
-        <c:arg name="cur">
-          <c:value-of name="matrix">
-            <!-- current row -->
-            <c:index>
-              <c:value-of name="start" />
-            </c:index>
+      <c:let>
+        <c:values>
+          <c:value name="cur" type="float"
+                   desc="Current value">
+            <c:value-of name="matrix">
+              <!-- current row -->
+              <c:index>
+                <c:value-of name="start" />
+              </c:index>

-            <!-- requested column -->
-            <c:index>
-              <c:value-of name="col" />
-            </c:index>
-          </c:value-of>
-        </c:arg>
-      </c:apply>
+              <!-- requested column -->
+              <c:index>
+                <c:value-of name="col" />
+              </c:index>
+            </c:value-of>
+          </c:value>
+        </c:values>
+
+        <c:let>
+          <c:values>
+            <c:value name="found" type="boolean"
+                     desc="Whether comparison matches">
+              <c:cases>
+                <c:case label="Equal">
+                  <t:when-eq name="op" value="CMP_OP_EQ" />
+
+                  <c:value-of name="TRUE">
+                    <t:when-eq name="cur" value="val" />
+                  </c:value-of>
+                </c:case>
+
+                <c:case label="Less than">
+                  <t:when-eq name="op" value="CMP_OP_LT" />
+
+                  <c:value-of name="TRUE">
+                    <t:when-lt name="cur" value="val" />
+                  </c:value-of>
+                </c:case>
+
+                <c:case label="Less than or equal to">
+                  <t:when-eq name="op" value="CMP_OP_LTE" />
+
+                  <c:value-of name="TRUE">
+                    <t:when-lte name="cur" value="val" />
+                  </c:value-of>
+                </c:case>
+
+                <c:case label="Greater than">
+                  <t:when-eq name="op" value="CMP_OP_GT" />
+
+                  <c:value-of name="TRUE">
+                    <t:when-gt name="cur" value="val" />
+                  </c:value-of>
+                </c:case>
+
+                <c:case label="Greater than or equal to">
+                  <t:when-eq name="op" value="CMP_OP_GTE" />
+
+                  <c:value-of name="TRUE">
+                    <t:when-gte name="cur" value="val" />
+                  </c:value-of>
+                </c:case>
+              </c:cases>
+            </c:value>
+          </c:values>
+
+          <c:cases>
+            <!-- if values matches, cons it -->
+            <c:case>
+              <t:when-eq name="found" value="TRUE" />
+
+              <c:cons>
+                <c:value-of name="matrix">
+                  <c:index>
+                    <c:value-of name="start" />
+                  </c:index>
+                </c:value-of>
+
+                <c:recurse>
+                  <c:arg name="start">
+                    <c:sum>
+                      <c:value-of name="start" />
+                      <c:const value="1" desc="Check next element" />
+                    </c:sum>
+                  </c:arg>
+                </c:recurse>
+              </c:cons>
+            </c:case>
+
+
+            <!-- no match, continue recursion using TCO so that we
+                 do not exhaust the stack (this is an undocumented,
+                 experimental feature that requires explicitly
+                 stating that a recursive call is in tail position) -->
+            <c:otherwise>
+              <c:recurse __experimental_guided_tco="TRUE">
+                <c:arg name="start">
+                  <c:sum>
+                    <c:value-of name="start" />
+                    <c:const value="1" desc="Check next element" />
+                  </c:sum>
+                </c:arg>
+              </c:recurse>
+            </c:otherwise>
+          </c:cases>
+        </c:let>
+      </c:let>
     </c:otherwise>
   </c:cases>
 </c:let>
@@ -319,108 +412,6 @@
 </function>


-<!-- mutually recursive with _mrange -->
-<function name="_mrange_cmp" desc="mrange helper for value comparison">
-  <param name="matrix" type="float" set="matrix" desc="Matrix to filter" />
-  <param name="col" type="integer" desc="Column index to filter on" />
-  <param name="val" type="float" desc="Column value to filter on" />
-  <param name="start" type="integer" desc="Starting index (aka current index)" />
-  <param name="end" type="integer" desc="Ending index" />
-  <param name="seq" type="integer" desc="Is data sequential?" />
-  <param name="op" type="integer" desc="Comparison operator" />
-  <param name="cur" type="float" desc="Current value" />
-
-
-  <c:let>
-    <c:values>
-      <c:value name="found" type="boolean"
-               desc="Whether comparison matches">
-        <c:cases>
-          <c:case label="Equal">
-            <t:when-eq name="op" value="CMP_OP_EQ" />
-
-            <c:value-of name="TRUE">
-              <t:when-eq name="cur" value="val" />
-            </c:value-of>
-          </c:case>
-
-          <c:case label="Less than">
-            <t:when-eq name="op" value="CMP_OP_LT" />
-
-            <c:value-of name="TRUE">
-              <t:when-lt name="cur" value="val" />
-            </c:value-of>
-          </c:case>
-
-          <c:case label="Less than or equal to">
-            <t:when-eq name="op" value="CMP_OP_LTE" />
-
-            <c:value-of name="TRUE">
-              <t:when-lte name="cur" value="val" />
-            </c:value-of>
-          </c:case>
-
-          <c:case label="Greater than">
-            <t:when-eq name="op" value="CMP_OP_GT" />
-
-            <c:value-of name="TRUE">
-              <t:when-gt name="cur" value="val" />
-            </c:value-of>
-          </c:case>
-
-          <c:case label="Greater than or equal to">
-            <t:when-eq name="op" value="CMP_OP_GTE" />
-
-            <c:value-of name="TRUE">
-              <t:when-gte name="cur" value="val" />
-            </c:value-of>
-          </c:case>
-        </c:cases>
-      </c:value>
-    </c:values>
-
-    <c:cases>
-      <!-- if values matches, cons it -->
-      <c:case>
-        <t:when-eq name="found" value="TRUE" />
-
-        <c:cons>
-          <c:value-of name="matrix">
-            <c:index>
-              <c:value-of name="start" />
-            </c:index>
-          </c:value-of>
-
-          <c:apply name="mrange" matrix="matrix" col="col" val="val"
-                   end="end" seq="seq" op="op">
-            <c:arg name="start">
-              <c:sum>
-                <c:value-of name="start" />
-                <c:const value="1" desc="Check next element" />
-              </c:sum>
-            </c:arg>
-          </c:apply>
-        </c:cons>
-      </c:case>
-
-
-      <!-- no match, continue (mutual) recursion -->
-      <c:otherwise>
-        <c:apply name="mrange" matrix="matrix" col="col" val="val"
-                 end="end" seq="seq" op="op">
-          <c:arg name="start">
-            <c:sum>
-              <c:value-of name="start" />
-              <c:const value="1" desc="Check next element" />
-            </c:sum>
-          </c:arg>
-        </c:apply>
-      </c:otherwise>
-    </c:cases>
-  </c:let>
-</function>
-
-
 <section title="Bisecting">
   Perform an~$O(lg n)$ bisect on a data set.


@@ -854,7 +854,7 @@

   @return generated function application
 -->
-<template match="c:apply" mode="compile-calc">
+<template match="c:apply" mode="compile-calc" priority="5">
   <variable name="name" select="@name" />
   <variable name="self" select="." />

@@ -904,6 +904,92 @@
 </template>


+<!--
+  Whether the given function supports tail call optimizations (TCO)
+
+  This is an experimental feature that must be explicitly requested.
+-->
+<function name="compiler:function-supports-tco" as="xs:boolean">
+  <param name="func" as="element( lv:function )" />
+
+  <sequence select="exists( $func/lv:param[
+                      @name='__experimental_guided_tco' ] )" />
+</function>
+
+
+<!--
+  Whether a recursive function application is marked as being in tail
+  position within a function supporting TCO
+
+  A human must determined if a recursive call is in tail position, and
+  hopefully the human is not wrong.
+-->
+<function name="compiler:apply-uses-tco" as="xs:boolean">
+  <param name="apply" as="element( c:apply )" />
+
+  <variable name="ancestor-func" as="element( lv:function )?"
+            select="$apply/ancestor::lv:function" />
+
+  <sequence select="exists( $apply/c:arg[ @name='__experimental_guided_tco' ] )
+                    and $apply/@name = $ancestor-func/@name
+                    and compiler:function-supports-tco( $ancestor-func ) " />
+</function>
+
+
+<!--
+  Experimental guided tail call optimization (TCO)
+
+  When the special param `__experimental_guided_tco' is defined and set to a
+  true value, the recursive call instead overwrites the original function
+  arguments and returns a dummy value. The function's trampoline is then
+  responsible for re-invoking the function's body.
+
+  Note that this only applies to self-recursive functions; mutual recursion
+  is not recognized.
+
+  By forcing a human to specify whether a recursive call is in tail
+  position, we free ourselves from having to track tail position within this
+  nightmare of a compiler; we can figure this out properly in TAMER.
+-->
+<template mode="compile-calc" priority="7"
+          match="c:apply[ compiler:apply-uses-tco( . ) ]">
+  <variable name="name" select="@name" />
+  <variable name="self" select="." />
+
+  <message select="concat('warning: ', $name, ' recursing with experimental guided TCO')" />
+
+  <variable name="arg-prefix" select="concat( ':', $name, ':' )" />
+
+  <!-- reassign function arguments -->
+  <for-each select="
+      root(.)/preproc:symtable/preproc:sym[
+        @type='func'
+        and @name=$name
+      ]/preproc:sym-ref
+    ">
+
+    <variable name="pname" select="substring-after( @name, $arg-prefix )" />
+    <variable name="arg" select="$self/c:arg[@name=$pname]" />
+
+    <!-- if the call specified this argument, then use it -->
+    <if test="$arg">
+      <sequence select="concat( '/*TCO*/', $pname, '=' )" />
+      <apply-templates select="$arg/c:*[1]" mode="compile" />
+      <text>,</text>
+    </if>
+  </for-each>
+
+  <!-- return value, which doesn't matter since it won't be used -->
+  <text>0</text>
+
+  <!-- don't support c:when here; not worth the effort -->
+  <if test="./c:when">
+    <message terminate="yes"
+             select="'c:when unsupported on TCO c:apply: ', ." />
+  </if>
+</template>
+
+
 <template match="c:when" mode="compile-calc">
   <!-- note that if we have multiple c:whens, they'll be multiplied together by
        whatever calls this, so we're probably fine -->

@@ -1022,6 +1022,10 @@
   will return the result of its expression (represented by a calculation in the
   XML).

+  If the special param __experimental_guided_tco is defined, recursive calls
+  to the same function can set it to a true value to perform tail call
+  optimization (TCO). See js-calc.xsl for more information.
+
   @return generated function
 -->
 <template match="lv:function" mode="compile">
@@ -1041,13 +1045,32 @@

   <text>) {</text>

-  <text>return ( </text>
+  <variable name="tco" as="xs:boolean"
+            select="compiler:function-supports-tco( . )" />
+
+  <if test="$tco">
+    <message select="concat('warning: ', @name, ' enabled experimental guided TCO')" />
+  </if>
+
+  <!-- top of this function's trampoline, if TCO was requested -->
+  <if test="$tco">
+    <text>do{__experimental_guided_tco=0;</text>
+  </if>
+
+  <text>var fresult = ( </text>
   <!-- begin calculation generation (there should be only one calculation node
        as a child, so only it will be considered) -->
   <apply-templates select="./c:*[1]" mode="compile" />
   <text> );</text>

-  <text>} </text>
+  <!-- bottom of this function's trampoline, if TCO was requested; if the
+       flag is set (meaning a relevant tail call was hit), jump back to
+       the beginning of the function -->
+  <if test="$tco">
+    <text>}while(__experimental_guided_tco);</text>
+  </if>
+
+  <text>return fresult;} </text>
 </template>



@@ -247,6 +247,7 @@
     not(
       @name=$overrides/@name
       or @name=$self/@*/local-name()
+      or starts-with( @name, '__experimental_' )
     )
   ]
 ">