tamer: asg::graph::visit::topo: Cut cycles

This commit includes plenty of documentation, so you should look there.

It's desirable to describe the sorting that TAME performs as a topological
sort, since that's the end result we want.  The sort uses the graph's
ontology to determine what to do when a cycle is encountered.  So
technically we're sorting a graph with cycles, but you can equivalently view
this as first transforming the graph to cut all cycles and then sorting it.
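
As a rough illustration of that equivalence, here is a standalone sketch.
It is not TAMER's implementation: the `Graph` alias and `topo_sort_cutting`
are hypothetical names, and unlike the real sort, every back edge is cut
here without consulting an ontology.

```rust
use std::collections::HashMap;

/// Hypothetical toy graph: node name -> the nodes it depends on.
type Graph = HashMap<&'static str, Vec<&'static str>>;

/// Post-order depth-first sort that emits dependencies before dependents.
///
/// An edge whose target is still on the current path (a back edge) is
/// skipped, which is the "cut".
/// Assumption: every cycle is permitted; there is no error reporting.
fn topo_sort_cutting(
    graph: &Graph,
    roots: &[&'static str],
) -> Vec<&'static str> {
    fn visit(
        graph: &Graph,
        node: &'static str,
        on_path: &mut Vec<&'static str>,
        finished: &mut Vec<&'static str>,
    ) {
        if finished.contains(&node) {
            return; // already emitted via some other path
        }

        on_path.push(node);
        if let Some(deps) = graph.get(&node) {
            for &dep in deps {
                if on_path.contains(&dep) {
                    continue; // cut: following this edge would form a cycle
                }
                visit(graph, dep, on_path, finished);
            }
        }
        on_path.pop();

        finished.push(node);
    }

    let mut on_path = Vec::new();
    let mut finished = Vec::new();
    for &root in roots {
        visit(graph, root, &mut on_path, &mut finished);
    }
    finished
}
```

Sorting the mutually recursive pair `func_a -> func_b -> func_a` this way
gives the same order as sorting the acyclic graph that has the
`func_b -> func_a` edge removed up front.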

For the sake of trivia, the term "cut" is used for two reasons: (1) it's an
intuitive visualization, and (2) the term "cut" has precedent in logic
programming (e.g. Prolog), where it (`!`) is used to prevent
backtracking.  We're also preventing backtracking, via a back edge, which
would produce a cycle.

DEV-13162
main
Mike Gerwitz 2023-04-28 14:33:48 -04:00
parent c2c1434afe
commit 9b53a5e176
4 changed files with 221 additions and 25 deletions

@ -980,12 +980,17 @@ object_rel! {
/// Opaque identifiers at the time of writing are used by the linker
/// which does not reason about cross edges
/// (again at the time of writing).
+///
+/// Identifiers representing functions are able to produce cycles,
+/// representing recursion.
+/// This is a legacy feature expected to be removed in the future;
+/// see [`ObjectRel::can_recurse`] for more information.
Ident -> {
tree Ident,
tree Expr,
tree Tpl,
tree Meta,
-}
+} can_recurse(ident) if matches!(ident.kind(), Some(IdentKind::Func(..)))
}
impl ObjectIndex<Ident> {

@ -58,6 +58,7 @@ macro_rules! object_rel {
$from:ident -> {
$($ety:ident $kind:ident,)*
}
+$(can_recurse($rec_obj:ident) if $rec_expr:expr)?
) => {paste::paste! {
/// Subset of [`ObjectKind`]s that are valid targets for edges from
#[doc=concat!("[`", stringify!($from), "`].")]
@ -73,10 +74,10 @@ macro_rules! object_rel {
}
impl ObjectRel<$from> for [<$from Rel>] {
-fn narrow<OB: ObjectRelFrom<$from> + ObjectRelatable>(
-self,
+fn narrow_ref<OB: ObjectRelFrom<$from> + ObjectRelatable>(
+&self,
) -> Option<ObjectIndex<OB>> {
-match self {
+match *self {
$(Self::$kind(oi) => oi.filter_rel(),)*
}
}
@ -98,6 +99,15 @@ macro_rules! object_rel {
),
}
}
+$(
+fn can_recurse(&self, asg: &Asg) -> bool {
+self.narrow_ref::<$from>()
+.map(|oi| oi.resolve(asg))
+.map(|$rec_obj| $rec_expr)
+.unwrap_or(false)
+}
+)?
}
impl ObjectRelatable for $from {
@ -319,6 +329,29 @@ impl<S> DynObjectRel<S, ObjectIndex<Object>> {
ty_cross_edge!(Root, Pkg, Ident, Expr, Tpl, Meta, Doc)
}
+/// Dynamically determine whether this edge represents a permitted
+/// cycle.
+///
+/// A cycle is permitted in certain cases of recursion.
+/// See [`ObjectRel::can_recurse`] for more information.
+pub fn can_recurse(&self, asg: &Asg) -> bool {
+macro_rules! ty_can_recurse {
+($($ty:ident),*) => {
+match self.source_ty() {
+$(
+ObjectRelTy::$ty => {
+self.narrow_target::<$ty>().is_some_and(
+|rel| rel.can_recurse(asg)
+)
+},
+)*
+}
+}
+}
+ty_can_recurse!(Root, Pkg, Ident, Expr, Tpl, Meta, Doc)
+}
}
impl<T> DynObjectRel<ObjectIndex<Object>, T> {
@ -502,6 +535,16 @@ pub trait ObjectRel<OA: ObjectKind + ObjectRelatable>:
/// query for edges of particular kinds.
fn narrow<OB: ObjectRelFrom<OA> + ObjectRelatable>(
self,
+) -> Option<ObjectIndex<OB>> {
+self.narrow_ref()
+}
+/// Attempt to narrow into the [`ObjectKind`] `OB`.
+///
+/// This method is the same as [`Self::narrow`],
+/// but taking a reference instead of ownership.
+fn narrow_ref<OB: ObjectRelFrom<OA> + ObjectRelatable>(
+&self,
) -> Option<ObjectIndex<OB>>;
/// Attempt to narrow into the [`ObjectKind`] `OB`,
@ -624,6 +667,49 @@ pub trait ObjectRel<OA: ObjectKind + ObjectRelatable>:
/// that,
/// once all use cases are clear.
fn is_cross_edge<S, T>(&self, rel: &DynObjectRel<S, T>) -> bool;
+/// Whether the provided relationship represents a valid recursive
+/// target.
+///
+/// It is expected that this method will be consulted only when the
+/// provided [`ObjectIndex`] would produce a cycle when added to some
+/// path.
+/// This means that the source and target object will be identical.
+///
+/// It is expected that a cycle should be able to be "cut" at this point
+/// while still producing a valid topological ordering of the graph.
+/// For example,
+/// consider two mutually recursive functions `A` and `B`,
+/// as shown here:
+///
+/// ```text
+/// A -> B
+/// ^----'
+/// ```
+///
+/// There are two cycles that might be encountered:
+///
+/// - `A -> B -> A`, which would cut to `A -> B`; and
+/// - `B -> A -> B`, which would cut to `B -> A`.
+///
+/// In both cases,
+/// since both `A` and `B` are functions,
+/// a valid ordering is produced.
+///
+/// Failure to uphold this invariant when designing the graph's ontology
+/// will result in an invalid ordering of the graph,
+/// which will compile a program that does not behave according to
+/// its specification.
+/// That is:
+/// proper ordering is a requirement to uphold soundness.
+///
+/// Recursion will continue to be limited as TAMER progresses,
+/// migrating to a more APL-like alternative to solving
+/// otherwise-recursive problems and restricting remaining recursion
+/// to that which can provably terminate.
+fn can_recurse(&self, _asg: &Asg) -> bool {
+false
+}
}
/// An [`ObjectIndex`]-like object that is able to relate to
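
The opt-in shape of `can_recurse` (a default of `false` on the trait,
overridden only where the ontology permits recursion) can be sketched
outside of the macro machinery.
This is a simplified illustration under assumed names: `Kind`, `Ident`,
`Expr`, and the `CanRecurse` trait below are hypothetical stand-ins rather
than TAMER's actual types, and the real method also receives an `&Asg`.

```rust
/// Hypothetical, much-simplified identifier kinds.
enum Kind {
    Func,
    Const,
}

/// Hypothetical stand-in for an identifier object.
struct Ident {
    kind: Kind,
}

/// Hypothetical stand-in for an expression object.
struct Expr;

/// Hypothetical trait mirroring the default-`false` design:
/// an object kind permits recursion only if it explicitly opts in.
trait CanRecurse {
    fn can_recurse(&self) -> bool {
        false
    }
}

/// Identifiers opt in, but only when they represent functions.
impl CanRecurse for Ident {
    fn can_recurse(&self) -> bool {
        matches!(self.kind, Kind::Func)
    }
}

/// Expressions keep the default and so never permit a cycle.
impl CanRecurse for Expr {}
```

With a default like this, a newly added object kind cannot permit cycles by
accident; each exception has to be written down explicitly.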

@ -40,6 +40,35 @@
//! so any additional information would provide an incomplete picture;
//! this sort is _not_ intended to provide information about all paths
//! to a particular object and cannot be used in that way.
+//!
+//! Cutting Of Cycles
+//! =================
+//! A _cycle_ is a path that references another object earlier in the path,
+//! as if it loops in on itself.
+//! Cycles are generally not permitted,
+//! as they would require that a value would have to be computed before it
+//! could compute itself.
+//! This almost certainly represents an error in the program's specification.
+//!
+//! Cycles are permitted for recursion.
+//! More information can be found in [`ObjectRel::can_recurse`].
+//!
+//! A topological ordering is defined only for graphs that do not contain
+//! cycles.
+//! To order a graph _with_ cycles,
+//! the depth-first search performs a _cut_,
+//! whereby the edge that would have led to the cycle is omitted,
+//! as if cutting a loop of string at the point that it is tied.
+//! An example of such a cut can be found in [`ObjectRel::can_recurse`].
+//!
+//! This is done in two scenarios:
+//!
+//! 1. An unsupported cycle is an error.
+//! A cut is performed as a means of error recovery so that the process
+//! may continue and discover more errors before terminating.
+//!
+//! 2. A cycle representing allowed recursion performs a cut since the
+//! path taken thus far already represents a valid ordering.
use super::super::{Asg, ObjectIndex};
use crate::{
@ -50,7 +79,7 @@ use fixedbitset::FixedBitSet;
use std::{error::Error, fmt::Display, iter::once};
#[cfg(doc)]
-use crate::span::Span;
+use crate::{asg::graph::object::ObjectRel, span::Span};
pub fn topo_sort(
asg: &Asg,
@ -77,7 +106,7 @@ pub struct TopoPostOrderDfs<'a> {
/// Each iterator pops a relationship off the stack and visits it.
///
/// The inner [`Result`] serves as a cycle flag set by
-/// [`Self::flag_if_cycle`].
+/// [`Self::flag_or_cut_cycle`].
/// Computing the proper [`Cycle`] error before placing it on the stack
/// would not only bloat the size of each element of this stack,
/// but also use unnecessary memory on the heap.
@ -167,7 +196,7 @@ impl<'a> TopoPostOrderDfs<'a> {
/// this determination is made by consulting [`Self::finished`].
///
/// Each object that is pushed onto the stack will be checked by
-/// [`Self::flag_if_cycle`];
+/// [`Self::flag_or_cut_cycle`];
/// see that function for more information.
/// It is important that each cycle be flagged individually,
/// rather than returning an error from this function,
@ -210,13 +239,14 @@ impl<'a> TopoPostOrderDfs<'a> {
fn push_neighbors(&mut self, src_oi: ObjectIndex<Object>) {
self.asg
.edges_dyn(src_oi)
-.map(|dyn_oi| *dyn_oi.target())
-.filter(|&oi| !self.finished.contains(oi.into()))
-.map(|oi| Self::flag_if_cycle(&self.visited, oi))
+.filter(|dyn_oi| !self.finished.contains((*dyn_oi.target()).into()))
+.filter_map(|dyn_oi| {
+Self::flag_or_cut_cycle(&self.visited, self.asg, dyn_oi)
+})
.collect_into(&mut self.stack);
}
-/// Determine if the provided [`ObjectIndex`] would introduce a cycle if
+/// Determine if the provided relation would introduce a cycle if
/// appended to the current path and flag it if so.
///
/// This should be called only after having checked [`Self::finished`],
@ -229,6 +259,17 @@ impl<'a> TopoPostOrderDfs<'a> {
/// If so,
/// then introducing it again would produce a cycle.
///
+/// Cycles are permitted under limited circumstances,
+/// where the edge represents a recursive target.
+/// This determination is made utilizing the graph's ontology via
+/// [`DynObjectRel::can_recurse`].
+/// If the cycle ends up being permitted,
+/// then we perform a cut by filtering out the edge entirely,
+/// as if it did not exist.
+/// It is up to the graph's ontology to ensure that all such cuts will
+/// result in a valid ordering.
+/// (Cuts also occur during error recovery for unsupported cycles.)
+///
/// We use [`Result`] where `E` is [`ObjectIndex`] to simply flag the
/// object as containing a cycle;
/// this allows us to defer computation of the cycle and allocation
@ -238,14 +279,21 @@ impl<'a> TopoPostOrderDfs<'a> {
///
/// See [`Self::find_cycle_path`] for the actual cycle computation that
/// will eventually be performed.
-fn flag_if_cycle(
+fn flag_or_cut_cycle(
visited: &FixedBitSet,
-oi: ObjectIndex<Object>,
-) -> Result<ObjectIndex<Object>, ObjectIndex<Object>> {
+asg: &Asg,
+dyn_oi: DynObjectRel,
+) -> Option<Result<ObjectIndex<Object>, ObjectIndex<Object>>> {
+let oi = *dyn_oi.target();
if visited.contains(oi.into()) {
-Err(oi)
+if dyn_oi.can_recurse(asg) {
+None // cut
+} else {
+Some(Err(oi))
+}
} else {
-Ok(oi)
+Some(Ok(oi))
}
}
@ -254,7 +302,7 @@ impl<'a> TopoPostOrderDfs<'a> {
/// leaving it on the stack.
///
/// If the object atop of the stack has been flagged as a cycle by
-/// [`Self::flag_if_cycle`],
+/// [`Self::flag_or_cut_cycle`],
/// then the actual path associated with the cycle will be computed
/// by [`Self::find_cycle_path`] and a [`Cycle`] returned.
///
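
The three outcomes of `flag_or_cut_cycle` above (keep, flag as a cycle, or
cut) map naturally onto `Option<Result<_, _>>` combined with `filter_map`.
The following is a minimal standalone sketch of that pattern only:
`flag_or_cut`, `push_neighbors`, the `u32` node ids, and the boolean
`can_recurse` flag are all hypothetical simplifications, and stable
`Vec::extend` is used in place of `collect_into`.

```rust
/// Classify an edge target relative to the current path.
///
/// - `Some(Ok(t))`: not on the path, so visit it as usual.
/// - `Some(Err(t))`: on the path and recursion is not permitted,
///   so flag it as a cycle to be reported later.
/// - `None`: on the path but recursion is permitted, so cut the edge.
fn flag_or_cut(
    on_path: &[u32],
    target: u32,
    can_recurse: bool,
) -> Option<Result<u32, u32>> {
    if on_path.contains(&target) {
        if can_recurse {
            None // cut
        } else {
            Some(Err(target))
        }
    } else {
        Some(Ok(target))
    }
}

/// Push the classified targets of `edges` onto `stack`.
///
/// Each edge is a `(target, can_recurse)` pair in this sketch.
/// `filter_map` drops the `None` (cut) edges as if they never existed.
fn push_neighbors(
    on_path: &[u32],
    edges: &[(u32, bool)],
    stack: &mut Vec<Result<u32, u32>>,
) {
    stack.extend(
        edges
            .iter()
            .filter_map(|&(target, rec)| flag_or_cut(on_path, target, rec)),
    );
}
```

Because a cut edge yields `None`, it never reaches the stack, so later
stages of the traversal need no special case for permitted recursion.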

@ -21,9 +21,12 @@ use super::*;
use crate::{
asg::{
air::{Air, AirAggregate},
-graph::object::{self, ObjectTy, Pkg},
-ExprOp,
+graph::object::{
+self, ObjectKind, ObjectRelFrom, ObjectRelatable, ObjectTy, Root,
+},
+ExprOp, IdentKind,
},
+num::{Dim, Dtype},
parse::{util::SPair, ParseState},
span::{dummy::*, Span, UNKNOWN_SPAN},
};
@ -54,11 +57,12 @@ fn topo_report_only(
.collect()
}
-fn topo_report<I: IntoIterator<Item = Air>>(
+fn topo_report<O: ObjectKind + ObjectRelatable, I: IntoIterator<Item = Air>>(
toks: I,
) -> Vec<Result<(ObjectTy, Span), Vec<(ObjectTy, Span)>>>
where
I::IntoIter: Debug,
+O: ObjectRelFrom<Root>,
{
let mut parser = AirAggregate::parse(toks.into_iter());
assert!(parser.all(|x| x.is_ok()));
@ -68,7 +72,7 @@ where
topo_report_only(
asg,
-oi_root.edges_filtered::<Pkg>(asg).map(ObjectIndex::widen),
+oi_root.edges_filtered::<O>(asg).map(ObjectIndex::widen),
)
}
@ -151,7 +155,7 @@ fn sorts_objects_given_single_root() {
// `topo_sort` via `topo_report`.
(Pkg, m(S1, S14) ),
]),
-topo_report(toks).into_iter().collect(),
+topo_report::<object::Pkg, _>(toks).into_iter().collect(),
);
}
@ -211,7 +215,7 @@ fn sorts_objects_given_single_root_more_complex() {
(Pkg, m(S1, S17) ),
]),
-topo_report(toks).into_iter().collect(),
+topo_report::<object::Pkg, _>(toks).into_iter().collect(),
);
}
@ -336,7 +340,7 @@ fn sorts_objects_given_multiple_roots() {
(Ident, S8),
(Pkg, m(S6, S10)),
]),
-topo_report(toks).into_iter().collect(),
+topo_report::<object::Pkg, _>(toks).into_iter().collect(),
);
}
@ -402,6 +406,59 @@ fn unsupported_cycles_with_recovery() {
Ok((Pkg, m(S1, S11))),
],
-topo_report(toks).into_iter().collect::<Vec<_>>(),
+topo_report::<object::Pkg, _>(toks)
+.into_iter()
+.collect::<Vec<_>>(),
);
}
+// TAME supports cycles in certain contexts,
+// as a component of the graph's ontology.
+// A topological sort of a graph containing permitted cycles should be
+// viewed as sorting a graph that first "cuts" those cycles,
+// filtering out the edge that would have caused the cycle to occur.
+// It is the responsibility of the ontology to ensure that all such cuts
+// will result in a topological sort.
+#[test]
+fn supported_cycles() {
+let id_a = SPair("func_a".into(), S3);
+let id_b = SPair("func_b".into(), S8);
+let kind = IdentKind::Func(Dim::Scalar, Dtype::Integer);
+#[rustfmt::skip]
+let toks = vec![
+PkgStart(S1),
+// Two mutually recursive functions.
+IdentDecl(id_a, kind.clone(), Default::default()), // <--.
+IdentDep(id_a, id_b), // -. |
+// | |
+IdentDecl(id_b, kind.clone(), Default::default()), // <' |
+IdentDep(id_b, id_a), // ---'
+// Root so that `topo_report` will find them.
+IdentRoot(id_a),
+IdentRoot(id_b),
+PkgEnd(S11),
+];
+// TODO: Template recursion was not part of the ontology at the time of
+// writing.
+use ObjectTy::*;
+assert_eq!(
+#[rustfmt::skip]
+Ok(vec![
+// The order in which the above functions will be visited is
+// undefined;
+// this is the ordering that happens to be taken by the
+// implementation based on the definition and stack
+// ordering.
+(Ident, S8),
+(Ident, S3),
+]),
+topo_report::<object::Ident, _>(toks)
+.into_iter()
+.collect::<Result<Vec<_>, _>>(),
+);
+}