tame/tamer/src
Mike Gerwitz bfe46be5bb tamer: xir::tree::attr_parser_from: Integrate AttrParser
This begins to integrate the isolated AttrParser.  The next step will be
integrating it into the larger XIRT parser.

There's been considerable delay in getting this committed, because I went
through quite the struggle with myself trying to determine what balance I
want to strike between Rust's type system; convenience with parser
combinators; iterators; and various other abstractions.  I ended up being
confounded by trying to maintain the current XmloReader abstraction, which
is fundamentally incompatible with the way the new parsing system
works (streaming iterators that do not collect or perform heap
allocations).

There'll be more information on this to come, but there are certain things
that will be changing.

There are a few problems highlighted by this commit (not in code, but
conceptually):

  1. Introducing Option here for the TokenParserState doesn't feel right, in
     the sense that the abstraction is inappropriate.  We should perhaps
     introduce a new variant Parsed::Done or something to indicate intent,
     rather than leaving the reader to have to read about what None actually
     means.
  2. This turns Parsed into more of a statement influencing control
     flow/logic, and so should be encapsulated, with an external equivalent
     of Parsed that omits variants that ought to remain encapsulated.
  3. TokenStreamState is accurate as a name, but these really are the
     actual parsers; TokenStreamParser is more of a coordinator, and helps
     to abstract away
     some of the common logic so lower-level parsers do not have to worry
     about it.  But calling it TokenStreamState is both a bit
     confusing and is an understatement---it _does_ hold the state, but it
     also holds the current parsing stack in its variants.
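
To make points 1 and 2 concrete, here is a minimal sketch of what an explicit
Done variant and an external counterpart of Parsed might look like.  This is
an illustration only, not the code in this commit: the variant names, the
ParsedObject type, and the externalize function are all hypothetical.

```rust
/// Internal parse result (hypothetical shape).  It includes variants
/// that only the coordinating parser needs to see.
#[derive(Debug, PartialEq)]
enum Parsed<T> {
    /// More tokens are needed before an object can be emitted.
    Incomplete,
    /// An object was produced.
    Object(T),
    /// The parser has finished; states the intent that a bare `None`
    /// would otherwise leave to documentation.
    Done,
}

/// External equivalent (hypothetical) that omits the variants that
/// ought to remain encapsulated.
#[derive(Debug, PartialEq)]
enum ParsedObject<T> {
    Incomplete,
    Object(T),
}

/// Hypothetical coordinator step: `Done` never reaches the caller and
/// is instead expressed as the end of iteration.
fn externalize<T>(parsed: Parsed<T>) -> Option<ParsedObject<T>> {
    match parsed {
        Parsed::Incomplete => Some(ParsedObject::Incomplete),
        Parsed::Object(obj) => Some(ParsedObject::Object(obj)),
        Parsed::Done => None,
    }
}

fn main() {
    assert_eq!(
        externalize(Parsed::Object(1)),
        Some(ParsedObject::Object(1))
    );
    assert_eq!(externalize::<u8>(Parsed::Done), None);
}
```

The point of the split is that control-flow variants stay private to the
coordinator, while callers only ever match on what they can act upon.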

Another thing that is not yet entirely clear is whether this AttrParser
ought to care about detection of duplicate attributes, or if that should be
done in a separate parser, perhaps even at the XIR level.  The same can be
said for checking for balanced tags.  By pushing it to TokenStream in XIR,
we would get a guaranteed check regardless of what parsers are used, which
is attractive because it reduces the (almost certain-to-otherwise-occur)
risk that individual parsers will not sufficiently check for semantically
valid XML.  But it does _potentially_ make error recovery more
complicated.  At the same time, perhaps more specific parsers ought not
care about recovery at that level.
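
As a rough sketch of what pushing the duplicate-attribute check down to the
token stream could look like, the adapter below wraps any token iterator so
that every downstream parser gets the check for free.  The Token and
StreamError types, DedupCheck, and check are all hypothetical and far simpler
than XIR's; the HashSet is used only for brevity and allocates on the heap,
which the streaming design described above is trying to avoid.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified token type (not XIR's actual `Token`).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Open(String),
    AttrName(String),
    AttrValue(String),
    Close,
}

/// Hypothetical error type for stream-level checks.
#[derive(Debug, PartialEq)]
enum StreamError {
    DuplicateAttr(String),
}

/// Wraps a token iterator and reports attribute names that repeat
/// within a single element.
struct DedupCheck<I> {
    inner: I,
    seen: HashSet<String>,
}

impl<I: Iterator<Item = Token>> Iterator for DedupCheck<I> {
    type Item = Result<Token, StreamError>;

    fn next(&mut self) -> Option<Self::Item> {
        let tok = self.inner.next()?;
        match &tok {
            // A new element begins; its attribute names start fresh.
            Token::Open(_) => self.seen.clear(),
            Token::AttrName(name) => {
                // `insert` returns false if the name was already present.
                if !self.seen.insert(name.clone()) {
                    return Some(Err(StreamError::DuplicateAttr(name.clone())));
                }
            }
            _ => {}
        }
        Some(Ok(tok))
    }
}

/// Convenience helper: run the check over a whole token list.
fn check(tokens: Vec<Token>) -> Vec<Result<Token, StreamError>> {
    DedupCheck { inner: tokens.into_iter(), seen: HashSet::new() }.collect()
}

fn main() {
    let out = check(vec![
        Token::Open("a".into()),
        Token::AttrName("x".into()),
        Token::AttrValue("1".into()),
        Token::AttrName("x".into()),
        Token::Close,
    ]);
    assert_eq!(out[3], Err(StreamError::DuplicateAttr("x".into())));
}
```

Note that this sketch keeps yielding tokens after reporting the error, which
leaves the recovery decision to whichever parser consumes the stream.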

Anyway, point being, more to come, but I am disappointed how much time I'm
spending considering parsing, given that there are so many things I need to
move on to.  I just want this done right and in a way that feels like it's
working well with Rust while it's all in working memory, otherwise it's
going to be a significant effort to get back into.

DEV-11268
2021-12-10 14:25:08 -05:00
..
asg tamer: {ir::=>}{asg, xir} 2021-11-04 16:13:27 -04:00
bin tamer: frontend: Begin basic XML parsing 2021-07-27 00:37:13 -04:00
frontend tamer: frontend: Begin basic XML parsing 2021-07-27 00:37:13 -04:00
iter tamer: iter::collect::TryCollect::try_collect_ok: Doc fix 2021-11-16 12:26:05 -05:00
ld tamer: xir: Remove Attr::Extensible 2021-12-06 14:26:58 -05:00
obj tamer: xir::Token::span: New method 2021-12-06 14:48:55 -05:00
sym tamer: obj::xmlo: Extract error types into own module 2021-11-16 15:47:52 -05:00
test tamer: tameld: Skip fragment unescaping only to re-escape on write 2021-08-18 11:39:06 -04:00
tpwrap tamer: Introduce tpwrap module to contain quick_xml::Error adapter 2021-07-23 23:23:55 -04:00
xir tamer: xir::tree::attr_parser_from: Integrate AttrParser 2021-12-10 14:25:08 -05:00
convert.rs tamer: convert: Add missing method-level docs 2021-09-08 16:12:53 -04:00
fs.rs Copyright year update 2021 2021-07-22 15:00:15 -04:00
global.rs tamer: Remove Ix generalization throughout system 2021-09-23 14:52:54 -04:00
iter.rs tamer: iter::TryCollect::try_collect_ok: New method 2021-11-10 09:09:07 -05:00
ld.rs tamer: {ir::=>}{asg, xir} 2021-11-04 16:13:27 -04:00
lib.rs tamer: {ir::=>}{asg, xir} 2021-11-04 16:13:27 -04:00
span.rs tamer: Replace all &'static str in errors with SymbolId 2021-10-11 15:39:53 -04:00
xir.rs tamer: xir::Token::span: New method 2021-12-06 14:48:55 -05:00