These have been a pain in the ass since TAMER began.
It seemed like a good idea at the time to have static code generated in this
way, but the lack of explicit dependencies just makes this a mess and works
against the operating theory of the system.
Furthermore, the _same_ static fragments were generated for each and every
map package.
There is still a post-link step (standalones) handled in XSLT; the
previously-static code has been moved there. This will eventually be
integrated into tameld itself, once TAMER has facilities for JS generation.
(This was discovered while trying to parent identifiers to packages.)
DEV-13162
preproc:sym/preproc:from is used for generating `knownFields` using the
_input_ map, so this has no use for return map values; the map still
produces edges to its dependencies.
The issue is that there are return map entries in some of our systems that
are producing multiple `preproc:from`, but I somewhat-recently modified the
system to support only a single map, to remove dynamic allocation. This
resolves that problem.
With that said, `knownFields` was created for Liza to know when the
classifier ought to be invoked, to save time. Back when it was first
introduced ~10y ago, this provided significant savings, however the
structure of our system now is such that nearly every single field invokes
the classifier.
Furthermore, these details should remain encapsulated; if we wanted to make
that determination, we should be provided with a delta, which we could also
use to do incremental classification in the future, if there's an ROI there
after other improvements have been made.
So, eventually, preproc:sym/preproc:from will go away entirely.
DEV-10936
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
TAMER rejects this, because we shouldn't be using anything but UTF-8. My
use of this encoding is ancient, from over a decade ago, that was apparently
just copied around.
DEV-10936
This was broken by the previous fix, because I had cast to a numeric value
before invoking `set_defaults`, which needs the empty string retained so
that it knows whether a default ought to be applied.
This also ensures that `set_values` will always return a numeric value when
that default is applied.
DEV-10484
This has been a problem for...ever, but the old classification system (and
calculations) had `||0` for ever variable reference, whereas the new one
does not; NaNs result in undefined behavior in the new classification
system, since those values are not expected to exist.
This ought to have automated tests, but it will be rewritten in TAMER.
DEV-10484
This was incorrect to begin with---it does not make sense that an input
mapping should depend upon the identifier that it maps to, in the sense that
we make use of these dependencies. If we add weak symbol references in the
future, then this can be reintroduced.
By removing this, we free tameld from having to perform the check itself.
.rev-xmlo bumped to force rebuilding of object files since the linker now
expects that no such dependencies will exist within them.
This ordering will simplify streaming processing of xmlo files in
TAMER. Specifically, we know that symbols will have been declared by the
time dependencies are added to the graph (and so we should only be creating
edges to existing nodes); and we can halt reading as soon as the closing
fragments tag is encountered, avoiding parsing the entirety of these massive
XML files.
On one particularly large program, this cuts time down from ~0.333s to
~0.300 in the POC linker.
This now uses year ranges, which I'll update annually.
This also renames "R-T Specialty" to "Ryan Specialty Group". The latter is
the parent company of the former. I was originally employed under the
former when LoVullo Associates was purchased, by I now work for the parent
company.
This is a significant performance improvement for dependency
generation (which is responsible for building the dependency graph for a
package).
The previous algorithm ran in O(n²) time: it would iterate over the given
symbol table, and for _each_ symbol, do a linear scan of the entire document
to search for the corresponding source block. This resulted in explosive
depgen time for larger packages.
This makes the algorithm run in O(n) by:
- Using an XSLT 3 map for the symbol table for O(1) lookups; and
- Iterating over the _document_ a single time rather than the symbol
table, referencing the symbol table as needed (in O(1) time).
There are other parts of the system that can benefit from these same
improvements. This is important, since we need to be able to handle many
thousands of symbols efficiently.
* src/current/compiler/linker.xsl (l:depgen-sym): Recognize smybol `no-deps'
property, permitting missing dependencies. This allows us to avoid
creating nonsense nodes just to satisfy the linker, while still allowing
the linker to perform essential checks to defend against compiler bugs.
* src/current/compiler/map.xsl (lvmc:stub-symtable): Set @no-deps on
`___head' and `___tail' symbols.
(lvmc:mapsym): Set `no-deps' as appropriate on map symbols.
(preproc:depgen)[lvm:map[@from]]: Generate `preproc:sym-dep' node, which
is now expected by the depgen process.
(preproc:depgen)[lvm:map[*]]: Likewise.
(preproc:depgen)[*[@lvmc:type='retmap']//lvmm:map[@from]]: Remove
unnecessary template.
(preproc:symtable)[lvm:map[@value]]: Pass `no-deps' to `lvmc:mapsym'.
* src/current/include/depgen.xsl (preproc:depgen)[preproc:symtable]: Create
and use XSLT 3 map in place of `preproc:symtable' tree. This allows for
constant-time lookups. Provide to templates via tunnelling. Use it in
place of exiting tree references. Process source tree rather than
iterating over symbol table.
(preproc:depgen)[lv:rate, c:sum[@generates], c:product[@generates],
lv:classify, lv:function/lv:param, lv:function, lv:typedef]: Produce
`preproc:sym-dep' nodes (which was previously done while iterating
over the symbol table).
(preproc:depgen)[preproc:sym]: Remove all such processing, since we no
longer iterate over the symbol table.
(preproc:depgen)[c:value-of]: Use symtable map.
(preproc:depgen-match): Likewise.
(preproc:depgen)[lv:union]: Modify to handle changes to lv:typedef
template.
(preproc:depgen)[text()]: Remove and replace with `node()'.
* src/current/include/preproc/package.xsl (preproc:resolv-syms): Remove
logging of symbol resolution. This has a slight performace impact since
there is a lot of output.
* src/current/include/preproc/symtable.xsl
(lv:function/lv:param, c:let/c:Values/c:value): Set `no-deps'.
* src/symtable/symbols.xsl: Add documentation of `no-deps'.
(preproc:symtable)[lv:meta]: Set `no-deps'.
A better option is to pre-process all inputs, but I need a quick
fix to my stupidity. 0||""==="".
* src/current/compiler/map.xsl (lvmc:compile)[lvm:map//lvm:from[*]]: Correct oval default.
The previous length check existed as a really bad array check (before
Array.isArray was a thing). This has been broken since Nov 2012.
The problem manifests itself when you want an empty array. We then have:
[] => [[]] => [DEFAULT_VALUE]
* src/current/compiler/map.xsl (lvmc:compile)[lvm:map//lvm:from[*]]: Use
`Array.isArray' in place of length check.
* src/current/compiler/map.xsl:
(lvmc:gen-input-default): Add argument.
[dim]: New param, defaulting to `$sym/@dim'.
(lvmc:compile)[lvm:map//lvm:from[*]]: Provide appropriate dimension value
to `set_defaults'. Provide compile-time error if nesting of `from'
nodes exceeds what is appropriate for the symbol dimensions.
This was a bit of a nasty one. Fortunately, this was only used as a
validation, so the code that the compiler produced was still correct.
The problem was that a version of Saxon sometime between 9.5 and 9.8 added
an optimization to eliminate conditionals with no body. Consequently, the
kluge to force the variable to be evaluated was optimized away,
`lvmc:get-symbol' was never called, and no error was ever produced.
This would be best refactored, but that's not something I have time to take
up at the moment priority-wise. This should be future-proof since this
would never be a noop.
* src/current/compiler/map.xsl (lvmc:compile)[lvm:map//lvm:from[*]]: Force
evaluation of `$sym' by ensuring that the condition will not be a noop.
* src/current/compiler/map.xsl
(lvmc:get-method-func, lvmc:value-ref, lvmc:transformation-wrap): New
functions, partyl extracted from below.
(lvmc:compile)[lvm:map//lvm:from]: Use `lvmc:value-ref'.
[lvm:map//lvm:from/lvm:translate]: Add `[@key]' to match.
[lvm:map//lvm:transform]: New match. Ignore node entirely.
(lvmc:concat-compile): Propagate symtable to `lvmc:compile'.
This is all really confusing because this doesn't use the same import
specification as packages; maps got stuck in a partial transition. So,
let's provide some helpful errors rather than silently failing.
* src/current/compiler/map.xsl (preproc:symtable)[lvm:import]:
Error if missing `@path'. Provide more information if `@package' was
provided to help clarify.
In particular, I want it to handle absolute import paths. It does
this for the return map; I forgot to add it here.
* src/current/compiler/map.xsl (lvmc:compile) [lvm:program-map]:
Preprocess nodes.
This is a backwards-incompatible change that, like the input map,
requires the use of symbols in the return map. This will allow us to
forego the use of @keep and will have the return map be the authority
of what gets linked (all of its dependencies).
* src/current/compiler/map.xsl: Add symbol support to return-map.
This allows for the proper importing of symbols into the package
generated by the map compiler, which in turn allows for processing
their default values.
(Copyright headers will be added in the next commit; these are the
original files, unaltered in any way.)
The internal project name at LoVullo is simply "Calc DSL". This
liberates the entire thing. If anything was missed, I'll be added
later.
To continue building at LoVullo with this move, symlinks are used for
the transition; this is the exact code that is used in production.
There is a lot here---over 25,000 lines. Much of it is in disarray from
the environment surrounding its development, but it does work well for
what it was intended to do.
(LoVullo folks: fork point is 65723a0 in calcdsl.git.)