preproc:symtable-process-symbols is run on each pass (e.g. during initial
processing and after each template expansion) to introduce new symbols into
the symbol table from imports and newly discovered symbols.
This processing was previously optimized a bit using maps to reduce the cost
of symbol table lookups, but the processing was still inefficient, relying
on XSLT1-style processing (as originally written) for deduplication. This
now uses `for-each-group` and `perform-sort` to offload the expensive
computation onto Saxon, which is much more efficient.
Symbol table processing has long been a culprit, but I hadn't attempted to
optimize further in recent months because of TAMER work. Since TAMER has
been on pause for a few months with other things needing my attention, I
needed to provide a short-term performance improvement to keep up with
increasing build times.
DEV-11716
This provides logging that can be used to analyze jobs. See `tamed --help`
for some examples. More to come.
You'll notice that one of the examples reprents package build time in
_minutes_. This is why TAMER is necessary; as of the time of writing, the
longest-building package is nearly five and a half minutes, and there are a
number of packages that take a minute or more. But, there are potentially
other optimizations that can be done. And this is _after_ many rounds of
optimizations over the years. (TAME was not originally built for what it is
currently being used for.)
This is something that I've wanted to do for quite some time, but for good
reason, have been avoiding.
`tamed --report` is fairly basic right now, but allows you to see what each
of the runners are doing. This will be expanded further to gather data for
further analysis.
The thing that I was avoiding was a status line during the build to
summarize what the runners are doing, since it's nearly impossible to do so
from the build output with multiple runners. This will not only allow me to
debug more easily, but will keep the output plainly visible to developers at
all times in the hope that it can help them improve the build times
themselves in certain cases.
It is currently gated behind TAMED_TUI, since, while it works well overall,
it is imperfect, and will cause artifacts from build output partly
overwriting the status line, and may even occasionally clobber the PS1 by
erasing the line. This will be improved upon in the future; something is
better than nothing.
This macro is used to consume whitespace so that the following sentence can
start on the next line without producing any whitespace in the output. Its
argument is, therefore, whitespace.
This used to work in earlier versions of Texinfo, but around 6.{6,7} it
began failing because an argument was provided when it wasn't defined with
one.
This clears the buffers used by quick_xml, which was apparently forgotten
during initial development (I think I expected it to re-use the previously
allocated space automatically).
This has significant effects in some cases. For example, one of our UI
builds drops from ~9KiB to ~5KiB peak memory usage. Other builds for larger
suppliers are only slightly effected because of some of their massive
fragments.
This was incorrect to begin with---it does not make sense that an input
mapping should depend upon the identifier that it maps to, in the sense that
we make use of these dependencies. If we add weak symbol references in the
future, then this can be reintroduced.
By removing this, we free tameld from having to perform the check itself.
.rev-xmlo bumped to force rebuilding of object files since the linker now
expects that no such dependencies will exist within them.
This is something that changed when the TAMER POC was initially created, as
I was learning Rust. I don't recall the original reason why this was moved,
but it could have been moved back long ago.
In our systems, constants can hold tables (as matrices) with tens or
hundreds of thousands of rows, and there are a number of them in certain
projects. As an example, the YAML-based test cases for one of our systems
went from ~2m30s to ~45s after this change was made. Much of the cost
savings comes from saving GC.
This can occur in generated code (e.g. from proguic if a question-based
predicate inherits a predicate already specified). This commit does not
change anything that's emitted; it merely allows proceeding.
TAMER can be smarter about this; I don't want to invest more time into
generalizing deduplication of predicates.
There was a bug whereby TRUE matches would keep whatever value was being
matched on, even if it was not a boolean. That was an oversight from the
proof-of-concept code, and this fixes it; that's why this is behind a flag!
This also adjusts the class aliasing optimization so that it doesn't check
for a `TRUE` symbol name, which was a bad idea to begin with.
This change also ends up expanding `lv:match[@value="TRUE"]` into the long
form, where it didn't previously; this will result in slightly larger xmlo
files in some cases, but it's nothing significant, and it does not impact
compilation times.
This is a nearly-10-year-old bug that was introduced when the Summary Page
was modified to use the then-new symbol table. The compiler previously
concatenated all packages into a single XML tree and processed that, so no
package resolution was necessary here before.
A long time ago (about a decade), package names were required, but they are
now generated by the compiler relative to the root path. The name here was
incorrect, which was generating an incorrect path for the linked symbols,
which was causing problems with the Summary Page.
This package is not used today. See RELEASES.md for more information; This
is a dangerous package that never should have existed.
This also fixes the test suite.
The classification system rewrite removed the debug value collection that
previously existed. It didn't make a whole lot of sense anyway, given that
that compiler rearranges matches.
This falls back to showing the value of the @on, which should be good
enough, and is honestly better than what we had before.
%.xml{=>o}: %csvo rater/core/vector/table.xmlo
That is: we'll only build an object file when we try to build another object
file. This was causing problems with dependency generation, because it will
triggering compilation early.
This should have been done many years ago. This will determine if any of
the dependencies have changed for the included suppliers.mk and regenerate
it as needed, without the developer having to do so manually when imports
change.
* build-aux/Makefile.am (suppliers.mk): Invoke ant with `-q` to eliminate
"processing" messages for each and every file. This also speeds up
operation slightly.
* build-aux/gen-make: Remove information echos for each file.
These changes will allow for suppliers.mk to be regenerated automatically
without being so invasive.
The intent originally was to try to keep developers to a reasonable name
length, but generated identifiers can easily exceed this, and we further do
not support namespacing.
This can be handled at a template level instead for enforcing naming
conventions.
This had gotten quite out of date from the actual rater.xsd, which existed
outside of this repository, that is used during our build process. That was
an unintended artifact from moving files around.
That file has been removed and symlinked to this one.
Note: this really belongs in liza-proguic, and should be moved in the near
future.
liza-proguic is being modified to generate step-level packages, which are
significantly faster to build than larger ones (XSLT TAME scales
terribly). These changes handle those new dependencies.
One important thing to note with this change is that suppliers.mk now
requires proguic to have run before generation so that those generated
dependencies can be properly examined. This is a quick operation, so that
is not problematic.
This also depends on the .version.xml change that was previously made: when
the timestamp changed every time, we got into an infinite build loop.
First thing to note: this belong in liza-proguic, not here. But it's here
right now, so for now I'm making the change. The relationship between TAME
and proguic is awkward and will hopefully be improved upon in the near
future.
As for this actual change: step-level fragments will be concatenated such
that the imports will appear at the step level rather than the root.
This will be generated automatically by the Makefile. It's not appropriate
to generate in the configure script, and I do not recall why I did
so---possibly to work around the issue of delayed tab completion when it
needs regeneration?
This removes suppmk-gen in favor of more generic Makefile targets---in this
case, having `%.tdat` depend upon `rater/core/tdat.xml`, even though that's
not quite true (the %.xml file generated from it needs it). But these files
are going away soon; a pending TAME optimization branch removes support for
the underlying pattern primitive entirely; CSVMs should be used instead.
The timestamp of the file will now only be updated if the hash (version)
_actually_ changes. This allows this to be used as a target dependency
without forcing a rebuild each and every time.
We were still having issues with this function when taking the positive
branch, when predicates cause many matches within tables. This was causing
us to hit stack limits in certain browsers on the Summary Page.
This converts it to an iterator so that all branches are tail-recursive, and
then enables TCO on them.
I was disappointed to find that there's little performance or memory benefit
in running our test suite.
I did say it was _experimental_ guided TRO.
This waits to perform the actual argument reassignment until after
processing the expressions associated with the new arguments, since they
will otherwise be replaced when their original values are still needed.