design/tpl: The Tame Programming Language initial concept

There's a lot of change that's likely going to take place with this thing,
but it's a start.  The abstract summarizes the purpose of this---to formally
define TAME in terms of algebra, first-order logic, and [ZFC] set theory.

This came about while working on compiler changes and optimizations, since
it's difficult to ensure correctness (and discover further optimizations)
without being able to formally define the language.  The focus at the moment
is the classification system rewrite, which can be expressed in terms of
first order logic and set theory.

This commit contains essentially a POC with some carefully chosen
mathematical foundations (abstractions of which are subject to change) and a
basic representation of a subset of the classification system for scalars.
Mike Gerwitz 2021-04-30 09:16:01 -04:00
parent 685549f06b
commit bd454f7a7c
11 changed files with 899 additions and 6 deletions

.gitignore vendored
View File

@ -10,12 +10,12 @@
# should be added using autoreconf -i

View File

@ -27,12 +27,25 @@ build:
- tamer/target
expire_in: 30 min
stage: build
- cd design/tpl/
- make
- design/tpl/tpl.pdf
expire_in: 30 min
stage: deploy
- mkdir -p public/doc
- mv doc/tame.html/* doc/tame.pdf doc/ public/
- mv tamer/target/doc public/tamer/
- mkdir -p public/design
- mv design/tpl/tpl.pdf public/design/
- public/

design/tpl/.gitignore vendored 100644
View File

@ -0,0 +1,26 @@
# Ignored files for The TAME Programming Language Git repository
# Targets
# (La)TeX
# Glossary
# Index
# latexmk

View File

@ -0,0 +1,12 @@
add_cus_dep('glo', 'gls', 0, 'run_makeglossaries');
add_cus_dep('acn', 'acr', 0, 'run_makeglossaries');
sub run_makeglossaries {
if ( $silent ) {
system "makeglossaries -q '$_[0]'";
else {
system "makeglossaries '$_[0]'";

View File

@ -0,0 +1,14 @@
inputs := tpl.tex tpl.sty glossary.tex \
$(wildcard sec/*.tex)
.PHONY: pdf
pdf: tpl.pdf
tpl.pdf: tpl.tex $(inputs)
latexmk --pdf $<
latexmk -c

View File

@ -0,0 +1,18 @@
# The TAME Programming Language: Design and Implementation
This is a living document providing a formal definition of the TAME
programming language.
## Dependencies
See [`tpl.sty`](tpl.sty) for the specific LaTeX packages that are
needed. If you use a Debian-based system, the following command should be
sufficient to install all necessary dependencies:
$ apt install --no-recommends \
make latexmk \
texlive-latex-extra texlive-fonts-extra texlive-science
## Building
Simply run `make`. The output is `tpl.pdf`.

View File

@ -0,0 +1,169 @@
% The TAME Programming Language glossary
% Copyright (C) 2021 Ryan Specialty Group, LLC.
% Licensed under the Creative Commons Attribution-ShareAlike 4.0
% International License.
\newacronym{tamer}{\textsc{Tamer}}{\tame{} in Rust}
\newglossaryentry{free variable}
name={free variable},
description={a variable that is not a \gls{bound variable}}
\newglossaryentry{bound variable}
name={bound variable},
description={a value of \gls{true} or \gls{false}},
description={boolean value representing ``true''},
description={boolean value representing ``false''},
description={logical conjunction (``and'')},
description={logical disjunction (``or'')},
description={number of elements in some set~$S$},
description={a set sharing the same \gls{index set}},
symbol={\ensuremath{\{A_j\}_{j\in J}}}
\newglossaryentry{index set}
name={index set},
description={a set whose members index members of another set; see also
description={type $A$ is castable to type $B$ if there exists some
\gls{surjective} function $A\rightarrow B$}
description={$\forall y\in Y : \exists x\in X : f(x) = y$},
description={an equivalence relation is a reflexive, symmetric, and
transitive binary operation},
\newglossaryentry{logical equivalence}
name={logical equivalence},
description={$p$ and $q$ are logically equivalent ($p\equiv q$) \gls{iff}
both $q$ and~$p$ are~\true or both are~\false},
\newglossaryentry{logical implication}
name={logical implication},
description={if and only if},
name={universal quantification},
description={expresses a predicate that must be satisfied for every
element in a \gls{domain}},
name={existential quantification},
description={expresses a predicate that must be satisfied for some
element in a \gls{domain}},
name={domain of discourse},
description={set of elements over which variables of interest may range},
description={set of all integers},
\newglossaryentry{empty set}
name={empty set},
description={set of zero elements},

View File

@ -0,0 +1,132 @@
\section{Classification System}\seclabel{class}
A \gls{classification} is a user-defined abstraction that describes
(``classifies'') arbitrary data.
Classifications can be used as predicates, generating functions, and can be
composed into more complex classifications.
Nearly all conditions in \tame{} are specified using classifications.
\index{first-order logic!sentence}
All classifications represent \emph{first-order sentences}---%
that is,
they contain no \glspl{free variable}.
this means that all variables within a~classification are
\emph{tightly coupled} to the classification itself.
This limitation is mitigated through use of the template system.
For example,
consider the following classification \tameclass{cost-exceeded}.
Let~\tameparam{cost} be a scalar parameter.
<classify as="cost-exceeded" desc="Cost of item is too expensive">
<t:match-gt on="cost" value="100.00" />
is then equivalent to the proposition
\tameclass{cost-exceeded} \equiv \tameparam{cost} > 100.00.
A classification is either \glssymbol{true} or~\glssymbol{false}.
Let $\tameparam{cost}=150.00$.
\tameclass{cost-exceeded} & \equiv \tameparam{cost} > 100.00 \\
& \equiv 150.00 > 100.00 \\
& \equiv \true.
Each \xmlnode{match} of a classification is a~\gls{predicate}.
Multiple predicates are by default joined by \gls{conjunction}:
<classify as="pool-hazard" desc="Hazardous pool">
<match on="diving_board" />
<t:match-lt on="pool_depth_ft" value="8" />
is equivalent to the proposition
\tameclass{pool-hazard} \equiv \tameparam{diving\_board}
\logand \tameparam{pool\_depth\_ft} < 8.
\begin{definition}[Universal Classification]\dfnlabel{classu}
A classification~$c$ by default performs \gls{conjunction} on its match
expressions $M_0\ldots M_n$.
&\xml{<classify as="}&&c\xml{" desc="$\ldots$">} \\
&\quad M_0 \\
&\quad \vdots \\
&\quad M_n \\
&&\equiv c\in\Bool \\
& &&\equiv \exists\left( M_0 \logand \ldots \logand M_n \right).
\begin{definition}[Existential Classification]\dfnlabel{classe}
A classification~$c$ with the attribute \xpath{@any="true"} performs
\gls{disjunction} on its match expressions $M_0\ldots M_n$.
&\xml{<classify as="} &&c\xml{" any="true" desc="$\ldots$">} \\
&\quad M_0 \\
&\quad \vdots \\
&\quad M_n \\
&&\equiv c\in\Bool \\
& &&\equiv \exists\left( M_0 \logor \ldots \logor M_n \right).
\begin{definition}[Match Equality]
\xml{<match on="$x$" value="$y$" />} \equiv x = y.
\begin{definition}[Match Equality Short Form]
\xml{<match on="$x$" />}
\equiv \xml{<match on="$x$" value="TRUE" />}.
\begin{definition}[Match Equality Long Form]
\xml{<match on="$x$" value="$y$" />}
&\equiv {}&&\xml{<match on="$x$">} \\
& &&\quad \xml{<c:eq>} \\
& &&\quad\quad \xml{<c:value-of name="$y$">} \\
& &&\quad \xml{</c:eq>} \\
& &&\xml{</match>} \\
&\equiv {}&&\xml{<t:match-eq on="$x$" value="$y$" />}.
\begin{definition}[Match Membership Equivalence]
When $T$ is a type defined with \xmlnode{typedef},
\xml{<match on="$x$" anyOf="$T$" />} \equiv x \in T.

View File

@ -0,0 +1,319 @@
\section{Notational Conventions}
This section provides a fairly terse overview of the foundational
mathematical concepts used in this paper.
While we try to reason about \tame{} in terms of algebra,
first-order logic;
and set theory;
notation varies even within those branches.
To avoid ambiguity,
especially while introducing our own notation,
core operators and concepts are explicitly defined below.
This section begins its numbering at~0.
This is not only a hint that \tame{} (and this paper) use 0-indexing,
but also because equations; definitions; theorems; corollaries; and the
like are all numbered relative to their section.
When you see any of these prefixed with ``0.'',
this sets those references aside as foundational mathematical concepts
that are not part of the theory and operation of \tame{} itself.
\subsection{Propositional Logic}
\index{integer (\Int)}%
We reproduce here certain axioms and corollaries of propositional logic for
convenience and to clarify our interpretation of certain concepts.
The use of the symbols $\logand$, $\logor$, and~$\neg$ are standard.
The symbol $\vdash$ means ``infer''.
We use $\implies$ in place of $\rightarrow$ for implication,
since the latter is used to denote the mapping of a domain to a codomain
in reference to functions.
We further use $\equiv$ in place of $\leftrightarrow$ to represent material
\begin{definition}[Logical Conjunction]
$p,q \vdash (p\logand q)$.
\begin{definition}[Logical Disjunction]
$p \vdash (p\logor q)$ and $q \vdash (p\logor q)$.
\begin{definition}[Law of Excluded Middle]
$\vdash (p \logor \neg p)$.
\begin{definition}[Law of Non-Contradiction]
$\vdash \neg(p \logand \neg p)$.
\begin{definition}[De Morgan's Theorem]
$\neg(p \logand q) \vdash (\neg p \logor \neg q)$
and $\neg(p \logor q) \vdash (\neg p \logand \neg q)$.
\index{equivalence!material (\ensuremath{\equiv})}
\begin{definition}[Material Equivalence]
$p\equiv q \vdash \big((p \logand q) \logor (\neg p \logand \neg q)\big)$.
$\equiv$ denotes a logical identity.
it'll often be used as a definition operator.
\begin{definition}[Logical Implication]
$p\implies q \vdash (\neg p \logor q)$.
\begin{definition}[Truth Values]\dfnlabel{truth-values}
$\vdash\true$ and $\vdash\neg\false$.
\subsection{First-Order Logic and Set Theory}
The symbol $\emptyset$ represents the empty set---%
the set of zero elements.
We assume that the axioms of ZFC~set theory hold,
but define $\in$ here for clarity.
\index{set!membership@membership (\ensuremath{\in})}
\begin{definition}[Set Membership]
$x \in S \equiv \Set{x} \cap S \not= \emptyset.$
\index{quantification|see {fist-order logic}}
\index{first-order logic!quantification (\ensuremath{\forall, \exists})}
$\forall$ denotes first-order universal quantification (``for all''),
and $\exists$ first-order existential quantification (``there exists''),
over some \gls{domain}.
\index{disjunction|see {first-order logic}}
\index{first-order logic!disjunction (\ensuremath{\logor})}
\begin{definition}[Existential Quantification]\dfnlabel{exists}
$\Exists{x\in X}{P(x)} \equiv
\true \in \Set{P(x) \mid x\in X}$.
\index{conjunction|see {first-order logic}}
\index{first-order logic!conjunction (\ensuremath{\logand})}
\begin{definition}[Universal Quantification]\dfnlabel{forall}
$\Forall{x\in X}{P(x)} \equiv \neg\Exists{x\in X}{\neg P(x)}$.
\index{set!empty (\ensuremath{\emptyset, \{\}})}
\begin{remark}[Vacuous Truth]
By Definition~7, $\Exists{x\in\emptyset}P \equiv \false$
and by \dfnref{forall}, $\Forall{x\in\emptyset}P \equiv \true$.
And so we also have the tautologies $\vdash \neg\Exists{x\in\emptyset}P$
and $\vdash \Forall{x\in\emptyset}P$.
\begin{definition}[Boolean/Integer Equivalency]\dfnlabel{bool-int}
$\Set{0,1}\in\Int, \false \equiv 0$ and $\true \equiv 1$.
\tamefalse{} and~\tametrue{} are constants in \tame{} mapping to the
\gls{integer} values $\{0,1\}\in\Int$.
\dfnref{bool-int} relates these constants to their
\gls{boolean} counterparts so that they may be used in numeric contexts
and vice-versa.
The notation $f = x \mapsto x' : A\rightarrow B$ represents a function~$f$
that maps from~$x$ to~$x'$,
where $x\in A$ (the domain of~$f$) and $x'\in B$ (the co-domain of~$f$).
A function $A\rightarrow B$ can be represented as the Cartesian
product of its domain and codomain, $A\times B$.
For example,
$x\mapsto x^2 : \Int\rightarrow\Int$ is represented by the set of ordered
pairs $\Set{(x,x^2) \mid x\in\Int}$, which looks something like
The set of values over which some function~$f$ ranges is its \emph{image},
which is a subset of its codomain.
In the example above,
both the domain and codomain are the set of integers~$\Int$,
but the image is $\Set{x^2 \mid x\in\Int}$,
which is clearly a subset of~$\Int$.
We therefore have
A \rightarrow B &\subset A\times B, \\
f : A \rightarrow B &\vdash f \subset A\times B, \\
f = \alpha \mapsto \alpha' : A \rightarrow B
&= \Set{(\alpha,\alpha')
\mid \alpha\in A \logand \alpha'\in B}, \\
f[D\subseteq A] &= \Set{f(\alpha) \mid \alpha\in D} \subset B, \\
f[] &= f[A].
And ordered pair $(x,y)$ is also called a \emph{$2$-tuple}.
an \emph{$n$-tuple} is used to represent an $n$-ary function,
where by convention we have $(x)=x$.
So $f(x,y) = f((x,y)) = x+y$.
If we let $t=(x,y)$,
then we also have $f(x,y) = ft$.
Binary functions are often written using \emph{infix} notation;
for example, we have $x+y$ rather than $+(x,y)$.
fx \in \Set{b \mid (x,b) \in f}
\subsubsection{Binary Operations On Functions}
Consider two unary functions $f$ and~$g$,
and a binary relation~$R$.
We introduce a notation~$\bicomp R$ to denote the composition of a binary
function with two unary functions.\footnote{%
The notation originates from~$\circ$ to denote ordinary function
as in $(f\circ g)(x) = f(g(x))$.}
f &: A \rightarrow B \\
g &: A \rightarrow D \\
R &: B\times D \rightarrow F \\
f \bicomp{R} g &= \alpha \mapsto f(\alpha)Rg(\alpha) : A \rightarrow F
Note that $f$ and~$g$ must share the same domain~$A$.
In that sense,
this is the mapping of the operation~$R$ over the domain~$A$.
This is analogous to unary function composition~$f\circ g$.
A scalar value~$x$ can be mapped onto some function~$f$ using a constant
For example,
consider adding some number~$x$ to each element in the image of~$f$:
f \bicomp+ (\_\mapsto x) = \alpha \mapsto f(\alpha) + x.
The symbol~$\_$ is used to denote a variable that is never referenced.
For convenience,
we also define $\bicompi{R}$,
which recursively handles combinations of function and scalar values.
This notation is used to simplify definitions of the classification system
(see \secpref{class})
when dealing with vectors
(see \secref{vec}).
\alpha \bicompi{R} \beta =
\gamma \mapsto \alpha(\gamma) \bicompi{R} \beta(\gamma)
&\text{if } (\alpha : A\rightarrow B) \logand (\beta : A\rightarrow D),\\
\gamma \mapsto \alpha(\gamma) \bicompi{R} (\_ \mapsto \beta)
&\text{if } (\alpha : A\rightarrow B) \logand (\beta \in\Real),\\
\alpha R \beta &\text{otherwise}.
Note that we consider the bracket notation for the image of a function
$(f:A\rightarrow B)[A]$ to itself be a binary function.
Given that, we have $f\bicomp{[]} = f\bicomp{[A]}$ for functions returning
functions (such as vectors of vectors in \secref{vec}),
noting that $\bicompi{[]}$ is \emph{not} a sensible construction.
\subsection{Vectors and Index Sets}\seclabel{vec}
\tame{} supports scalar, vector, and matrix values.
its implementation history leaves those concepts a bit tortured.
A vector is a sequence of values, defined as a function of
an~\gls{index set}.
Let $J\subset\Int$ represent an index set.
A \emph{vector}~$v\in\Vectors^\Real$ is a totally ordered sequence of
elements represented as a function of an element of its index set:
v = \Vector{v_0,\ldots,v_j}^{\Real}_{j\in J}
= j \mapsto v_j : J \rightarrow \Real.
This definition means that $v_j = v(j)$,
making the subscript a notational convenience.
We may omit the superscript such that $\Vectors^\Real=\Vectors$
and $\Vector{\ldots}^\Real=\Vector{\ldots}$.
Let $J\subset\Int$ represent an index set.
A \emph{matrix}~$M\in\Matrices$ is a totally ordered sequence of
elements represented as a function of an element of its index set:
M = \Vector{M_0,\ldots,M_j}^{\Vectors^\Real}_{j\in J}
= j \mapsto M_j : J \rightarrow \Vectors^\Real.
The consequences of \dfnref{matrix}---%
defining a matrix as a vector of independent vectors---%
are important.
This defines a matrix to be more like a multidimensional array,
with no requirement that the lengths of the vectors be equal.
\begin{corollary}[Matrix Row Length Variance]\corlabel{matrix-row-len}
$\vdash \Exists{M\in\Matrices}{\neg\Forall*{j}{\Forall{k}{\len{M_j} = \len{M_k}}}}$.
\corref{matrix-row-len} can be read ``there exists some matrix~$M$ such that
not all row lengths of~$M$ are equal''.
In other words---%
the inner vectors of a matrix can vary in length.
Since a vector is a function,
a vector or matrix can be converted into a set of unique elements like so:
&\mapsto &&\bigcup\Vector{\Vector{0,1}\![],\Vector{2,2}\![],\Vector{2,0}[]}\![] \\
&\mapsto &&\bigcup\Vector{\Set{0,1},\Set{2},\Set{2,0}}\![] \\
&\mapsto &&\bigcup\Set{\Set{0,1},\Set{2},\Set{2,0}} \\
&= &&\Set{0,1,2}.
We can also add two vectors, and scale them:
1 \bicomp{+} \Vector{1,2,3} \bicomp{+} \Vector{4,5,6}
&= \Vector{1+1,\, 2+1,\, 3+1} \bicomp{+} \Vector{4,5,6} \\
&= \Vector{2,3,4} \bicomp{+} \Vector{4,5,6} \\
&= \Vector{2+4,\, 3+5,\, 4+6} \\
&= \Vector{6, 8, 10}.
\subsection{XML Notation}
The grammar of \tame{} is XML.
Equivalence relations will be used to map source expressions to an
underlying mathematical expression.
For example,
\xml{<match on="$x$" value="$y$" />} \equiv x = y
defines that pattern of \xmlnode{match} expression to be materially
equivalent to~$x=y$---%
anywhere an equality relation appears,
you could equivalently replace it with that XML representation without
changing the meaning of the mathematical expression.

design/tpl/tpl.sty 100644
View File

@ -0,0 +1,125 @@
% The TAME Programming Language glossary
% Copyright (C) 2021 Ryan Specialty Group, LLC.
% Licensed under the Creative Commons Attribution-ShareAlike 4.0
% International License.
% Concrete Mathematics fonts
% Note that we force draft=false so hyperlinks always appear
% For displaying source code (e.g. XML)
% Definitions, theorems, proofs, etc
% Line spacing instead of indentation for paragraphs
% To aid in defining star commands
% Creating custom symbols
% Colors from Tango Icon Theme
% TAME is typeset in smallcaps
% TODO: highlighting
\newcommand\Fam[3]{\ensuremath{\left\{#1_{#2}\right\}_{#2\in #3}}}
\newcommand\VFam[3]{\ensuremath{\Vector{#1_{#2}}_{#2\in #3}}}
% Allows us to switch between styles, if need be
% (e.g. I used the colon style before adopting notation from type theory
% where the colon became ambiguous)
\newcommand\@Forall[2]{\forall #1\left(#2\right)}
\newcommand\@Forallstar[2]{\forall #1 #2}
\newcommand\@Exists[2]{\exists #1\left(#2\right)}
% Without \left and \right. You'll also need this if you use `&' in an
% `align' environment within an argument.
\newcommand\@Existsstar[2]{\exists #1 #2}
\newcommand\pref[1]{\ref{#1} on page~\pageref{#1}}
% Binary function composition

design/tpl/tpl.tex 100644
View File

@ -0,0 +1,65 @@
% The TAME Programming Language Living Document
% Copyright (C) 2021 Ryan Specialty Group, LLC.
% Licensed under the Creative Commons Attribution-ShareAlike 4.0
% International License.
\title{The TAME Programming Language}
\subtitle{Design and Implementation (Living Document)}
\author{Mike Gerwitz}
\date{April 2021}% TODO dynamic
% Copyright notice for bottom of first page
\tiny Copyright \textcopyright{} 2021 Ryan Specialty Group, LLC.
CC-BY-SA 4.0.]{\thepage}
% Begin section numbering at 0 to emphasize that it's foundational material
% not directly related to TAME itself
\tame{} is The Algebraic Metalanguage, a programming language and
collection of tools designed to aid in the development, understanding,
and maintenance of systems performing numerous calculations on a
complex graph of dependencies, conditions, and a large number of
inputs. \tame{} has existed for over a decade, and while its initial
design was successful and still in active use today, it does suffer
from inconsistencies and tradeoffs that introduce certain impediments
to users of the language, and compromise future optimizations and
language evolution. It also lacks documentation not just of the
language itself, but also of the underlying principles and
This document is an attempt to formally consider certain parts of
\tame{} as it undergoes redesign and reimplementation as part of the
\tamer{} project. It is considered a living document---it is not
likely to ever be a finished work.