depgen (preproc:symtable): Use for-each-group for deduplication

And here's the optimization that I had wanted to perform all along, but it
took some time to confidently get to this point.

`preceding-sibling` has been used for deduplication since the XSLT 1.0 days,
before `for-each-group` existed.  It works just fine for small inputs, but
each check has to look back over the preceding siblings, and we're doing
this many thousands of times in larger packages, which really adds up.  (In
this case, the change shaves ~8s off of one of our large packages with ~20k
symbols in play.)
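
For illustration only, here is a minimal sketch of the two idioms side by
side.  The `deps`/`ref` element names and the `old`/`new` wrappers are
hypothetical stand-ins, not the real `preproc:*` structure, and the sketch
assumes an XSLT 2.0 processor:

```xml
<?xml version="1.0"?>
<!-- Hypothetical input:
       <deps><ref name="a"/><ref name="b"/><ref name="a"/></deps> -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />

  <xsl:template match="/deps">
    <out>
      <!-- XSLT 1.0 idiom: keep a ref only if no earlier sibling has the
           same name; every ref compares itself against its preceding
           siblings -->
      <old>
        <xsl:for-each
            select="ref[ not( @name = preceding-sibling::ref/@name ) ]">
          <ref name="{@name}" />
        </xsl:for-each>
      </old>

      <!-- XSLT 2.0: one iteration per distinct @name; the context item
           in the body is the first ref of each group -->
      <new>
        <xsl:for-each-group select="ref" group-by="@name">
          <ref name="{current-grouping-key()}" />
        </xsl:for-each-group>
      </new>
    </out>
  </xsl:template>
</xsl:stylesheet>
```

Both branches should emit one `ref` per distinct name, in order of first
appearance; the difference is only in how much work the processor has to do
to get there.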

DEV-15114
Mike Gerwitz 2023-10-19 10:59:30 -04:00
parent abc37ef0cc
commit 959ff06539
1 changed file with 11 additions and 21 deletions


@@ -107,41 +107,31 @@
 <for-each select="$deps, $deps//preproc:sym-dep">
   <variable name="sym-name" as="xs:string"
             select="@name" />
   <variable name="cursym" as="element( preproc:sym )?"
             select="$symtable-map( $sym-name )" />
   <if test="not( $cursym )">
+    <message select="." />
     <message terminate="yes"
              select="concat( 'internal error: ',
                              'cannot find symbol in symbol table: ',
                              '`', $sym-name, '''' )" />
   </if>
-  <!-- do not output duplicates (we used to not output references
-       to ourselves, but we are now retaining them, since those
-       data are useful) -->
-  <variable name="uniq" select="
-    preproc:sym-ref[
-      not( @name=preceding-sibling::preproc:sym-ref/@name )
-    ]
-  " />
-  <!-- symbols must not have themselves as their own dependency -->
-  <if test="$uniq[ not( $cursym/@allow-circular = 'true' )
-                   and ( @name = $cursym/@name
-                         or @parent = $cursym/@name ) ]">
-    <message terminate="yes"
+  <preproc:sym-dep name="{@name}">
+    <!-- @tex provided an non-empty, or function -->
+    <for-each-group select="preproc:sym-ref"
+                    group-by="@name">
+      <!-- symbols must not have themselves as their own dependency -->
+      <if test="not( $cursym/@allow-circular = 'true' )
+                and ( @name = $cursym/@name
+                      or @parent = $cursym/@name )">
+        <message terminate="yes"
                  select="concat( '[preproc] !!! fatal: symbol ',
                                  $cursym/@name,
                                  ' references itself ',
                                  '(circular dependency)' )" />
       </if>
-  <preproc:sym-dep name="{@name}">
-    <!-- @tex provided an non-empty, or function -->
-    <for-each select="$uniq">
       <variable name="name" select="@name" />
       <variable name="sym" as="element( preproc:sym )?"
                 select="$symtable-map( $name )" />
@@ -181,7 +171,7 @@
                        number( $sym/@dim ),
                        position() )" />
       </preproc:sym-ref>
-    </for-each>
+    </for-each-group>
   </preproc:sym-dep>
 </for-each>
 </preproc:sym-deps>