rebirth: Add define-macro

This provides preliminary support for traditional Lisp macros, _not_ Scheme
macros as we know them today; this implementation is easy to implement, and
gives us a great foundation for moving forward.

There are caveats to this simple and naive implementation, documented
ad nauseam.  It's probably worth a read if you're studying Rebirth in any
level of detail for whatever reason.

This is a pretty exciting change---it liberates us from rigid compiler
changes and will allow us to rewrite fnmap into macros almost as-is.  The
change is fairly elegant, all things considered---the amount of code is
minimal; most of the change consists of comments describing it and its
caveats.  This is a defining step in Rebirth, and brings it a step closer to
being an actual Lisp rather than a fragment of one.  (Though I still don't
know if Rebirth Lisp will ever actually be a full Lisp.  Please hold the
arguments about Scheme not being a Lisp or I'll respond very immaturely with
"your mom is not a Lisp", and we'll both be very confused and somewhat
offended by one-another, with a net loss overall.  Oh, wait, you're actually
reading this?)

Anyway, changes:

* build-aux/bootstrap/rebirth.scm: Some rephrasing of toplevel comments,
    and addition of macro comments.
  (_macros): New ECMAScript variable.
  (cdfn-macro, macro-compile-result, list->ast): New procedures.  Little
    full of comments.  Lots of sap.
  (parse-lisp): Fix typo.  Add third argument to `cdfn-proc' (#f).
  (cdfn-proc): Add third argument `id-override'.  Use it in place of token
    value, if set.  Remove semicolon from generated ES function (it was
    unneeded to begin with) so that it can be used in macro ES expressions.
  (macro?): New procedure, conditional based on availability of
    `string->es'.
  (apply-proc-or-macro): New procedure.  Conditionally apply macro during
    compiler runtime or compile output for a procedure application.
  (fnmap)[define-macro]: Apply `cdfn-macro'.  This will try to apply it even
    if the procedure isn't defined yet (e.g. first Rebirth pass), so don't
    call it until then!
    [else]: Use `apply-proc-or-macro'.
master
Mike Gerwitz 2017-12-11 22:43:29 -05:00
parent 0a1e530a76
commit 1d6756a709
Signed by: mikegerwitz
GPG Key ID: 8C917B7F5DC51BA2
1 changed files with 276 additions and 21 deletions

View File

@ -34,7 +34,7 @@
;;;
;;; Of course, bootstrapping can't end there: we need a fully functioning
;;; Scheme compiler. Rebirth may as well be called Rerebirth, or
;;; Rererebirth, or Re*birth---it is a recursively self-hosting
;;; Rererebirth, or Re*birth, or Reⁿbirth---it is a recursively self-hosting
;;; compiler. It adds features to itself each time it compiles itself.
;;;
;;; Note that we're dealing with a small subset of Scheme here, so certain
@ -48,20 +48,23 @@
;;; first commit is the direct copy before actual code changes).
;; Here we define the libprebirth primitives. When we first compile Rebirth
;; with Birth, `string->es' is not yet available, because it is only
;; implemented in Rebirth. Further, Birth includes libprebirth in its
;; output, so we cannot blindly redefine the procedures without producing an
;; error.
;; The first step in the Rebirth process is to liberate ourselves from
;; libprebirth.
;; Here we define the libprebirth primitives. When we first compile
;; Rebirth with Birth, `string->es' is not yet available, because it is
;; only implemented in Rebirth. Further, Birth includes libprebirth in
;; its output, so we cannot blindly redefine the procedures without
;; producing an error.
;;
;; Once Rebirth is compiled with Birth, Rebirth can then compile
;; itself. Since Rebirth _does_ implement `string->es', and further _does not_
;; include libprebirth in its output, we can define the libprebirth
;; itself. Since Rebirth _does_ implement `string->es', and further _does
;; not_ include libprebirth in its output, we can define the libprebirth
;; primitives ourselves in Rebirth Lisp. Cut the cord.
;;
;; Some of these definitions aren't valid: variable arguments, for example,
;; aren't represented _at all_---the `define' form will be properly
;; implemented in the future to correct this.
;; Some of these definitions aren't valid: variable arguments, for
;; example, aren't represented _at all_---the `define' form will be
;; properly implemented in the future to correct this.
(cond-expand
(string->es
(define #t (string->es "true"))
@ -225,6 +228,228 @@
(string->es "fsdata[$$path]")))))
;; Without macro support, anything that involves producing code with
;; variable structure at compile-time must be hard-coded in the
;; compiler. Perhaps the greatest power in Lisp is the ability to extend
;; the language through its own facilities---its ability to parse itself
;; and treat itself as data.
;;
;; So we need to introduce macro support.
;;
;; This is not a trivial task: RⁿRS has a rich and powerful system that
;; would be quite a bit of work upfront to implement. Instead, we're
;; going to focus on traditional Lisp macros, which are conceptually
;; rather simple---they produce a list that, when expanded, is treated as
;; Lisp code as if the user had typed it herself.
;;
;; Macros hold the full power of Lisp---macro expansion _is_
;; compilation. This means that we need to compile macro expansions as
;; their own separate programs during the normal compilation process and
;; splice in the result. But to execute the macro, we need to execute
;; ECMAScript code that we just generated. In other words: the evil eval.
;;
;; ECMAScript has two ways of evaluating ES code contained in a string:
;; through the `eval' function and by instantiating `Function' with a
;; string argument representing the body of the function (or something
;; that can be cast into a string). Good thing, otherwise we'd find
;; ourselves painfully writing a Lisp interpreter in Rebirth Lisp.
;;
;; This implementation is very simple---there's very little code but a great
;; deal of comments. They describe important caveats and hopefully
;; enlighten the curious reader.
(cond-expand
(string->es
;; Stores macros for compiler runtime.
(string->es "const _macros = {}")
(define (cdfn-macro sexp)
(define (%make-macro-proc sexp)
;; The syntax for a macro definition is the same as a procedure
;; definition. In fact, that's exactly what we want, since a macro is
;; a procedure that, when applied, produces a list. But we want an
;; anonymous function, so override the id to the empty string.
(let* ((proc-es (cdfn-proc sexp "")))
;; Rather than outputting the generated ES function, we're going to
;; immediately evaluate it. This is a trivial task, but how we do
;; it is important: we need to maintain lexical scoping. This
;; means that we must use `eval'---`new Function' does not create a
;; closure.
;;
;; The only thing we need to do to ensure that eval returns a
;; function is to enclose the function definition in
;; parenthesis. This results in something along the lines of:
;; eval("(function(args){...})")
;;
;; If you're confused by the execution environment (compiler
;; runtime vs. compiler output), don't worry, you're not
;; alone. We're actually dealing with a number of things here:
;;
;; 1. Use `string->es' below to produce _compiler output_ for the
;; next version of a Rebirth Lisp compiler that will be
;; responsible for actually running the `eval'.
;; 2. That next version of the compiler will then compile
;; ECMAScript function definition from macro procedure source
;; using `cdfn-proc' as above.
;; 3. This will then be run by the compiler _at runtime_ by
;; running the `eval' statement below (which is part of the
;; program just as if it were Lisp).
;; 4. The result will be the procedure `proc-es' available to the
;; compiler at runtime rather than produced as compiler output.
;;
;; There's a lot of words here for so little code! We currently
;; lack the language features necessary to produce the types of
;; abstractions that would make this dissertation unnecessary.
(string->es "eval('(' + $$proc$_$es + ')')")))
;; We then store the macro by name in memory in `_macros'. When
;; invoked, it will apply the result of the above generated procedure
;; to `macro-compile-result' (defined below), which will produce the
;; ECMAScript code resulting from the macro application.
;;
;; There are consequences to this naive implementation. Rebirth is a
;; dumb transpiler that relies on features of ECMAScript to do its
;; job. In particular, we don't have any dependency graph or lexical
;; scoping or any of those necessary features---we let ECMAScript take
;; care of all of that. That means that we have no idea what is
;; defined or even what has been compiled; we just transpile and move
;; on blindly. Any errors resulting from undefined procedures, for
;; example, occur at runtime in the compiled output.
;;
;; These are features that will be implemented in Gibble Lisp; that's
;; not something to distract ourselves with now.
;;
;; So there are some corollaries:
;;
;; 1. Macros must be defined _before_ they are called. Order
;; matters.
;; 2. Macros can only make use of what is defined in the compiler
;; runtime environment---if a procedure is defined, it won't be
;; available to macros until the next compilation pass. This is
;; because we have no dependency graph and cannot automatically
;; eval dependencies so that they are available in the execution
;; context.
;; - To work around that, procedures can be defined within the
;; macro body. Of course, then they're encapsulated within it,
;; which is not always desirable.
;;
;; While this implementation is crippled, it does still provide good
;; foundation with which we can move forward. Our use of recursive
;; Reⁿbirth passes and `cond-expand' makes this less of an issue as
;; well, since we're recursing anyway.
(let ((macro-proc (%make-macro-proc sexp))
(macro-id (token-value (caadr sexp)))) ; XXX
(string->es
"_macros[$$macro$_$id] = function(){
return $$macro$_$compile$_$result(
$$macro$_$proc.apply(this,arguments))};")
;; Because the macro procedure was evaluated at runtime, it would
;; never actually itself be output. This makes debugging difficult,
;; so we'll output it as a comment. This is admittedly a little bit
;; dangerous, as we're assuming that no block comments will ever
;; appear in `macro-proc'. But at the time of writing, this
;; assumption is perfectly valid.
(string-append "/*macro " macro-id ": " macro-proc "*/")))
;; Compile the S-expression resulting from the macro application into
;; ECMAScript.
;;
;; This simply converts the given S-expression SEXP into an AST and
;; compiles it using the same procedures that we've been using for all
;; other code. See below for details.
(define (macro-compile-result sexp)
(sexp->es (list->ast sexp)))
;; Produce a Rebirth List AST from an internal list form.
;;
;; Up until this point, the only way to represent Rebirth Lisp was using
;; a typical Lisp form. With macros, however, we have bypassed that
;; source form---we're working with our own internal representation of a
;; list.
;;
;; The structure of the AST is already done---it mirrors that of the list
;; itself. What we need to do is map over the list, recursively, and
;; convert each item into a token.
;;
;; Consider the tokens processed by `toks->ast': comments,
;; opening/closing delimiters, strings, and symbols. We don't need to
;; worry about comments since we aren't dealing with source code. We
;; also don't need to worry about opening/closing delimiters since we
;; already have our list. This leaves only two token types to worry
;; about: strings and symbols.
;;
;; And then there's the fascinating case of macro arguments. When a
;; macro or procedure application are encountered during compilation, the
;; arguments are represented as tokens (see `apply-proc-or-macro'). As
;; just mentioned, the end goal is to convert our list SEXP into tokens
;; for the AST. But the arguments are _already_ tokens, so they need no
;; additional processing---we just splice them in as-is! This trivial
;; operation yields the powerful Lisp macro ability we're looking for:
;; the ability to pass around chunks of the AST.
;;
;; Consequently, we have Rebirth-specific syntax to deal with when
;; processing the AST within macros. Up until this point, in place of
;; macros, we have used `fnmap', which operates on tokens. That is the
;; case here as well: if a macro wishes to assert on or manipulate any
;; syntax it is given, it must use the Rebirth token API that the rest of
;; the system uses. For example, say we have a macro `foo' that asserts
;; on its first argument as a string:
;;
;; (foo "moo") => "cow"
;; (foo "bar") => "baz"
;;
;; This will _not_ work:
;;
;; (define-macro (foo x)
;; (if (string=? x "moo") "cow" "baz"))
;;
;; The reason is that `x' is not a string---it is a `token?'. Instead,
;; we must do this:
;;
;; (define-macro (foo x)
;; (if (string=? (token-value x) "moo") "cow" "baz"))
;;
;; Of course, if you do not need to make that determination at
;; compile-time, you can defer it to runtime instead and use `string=?':
;;
;; (define-macro (foo x)
;; (quasiquote (if (string=? (unquote x) "moo") "cow" "baz")))
;;
;; Simple implementation, complex consequences. Scheme uses syntax
;; objects; we'll provide that abstraction over our implementation at
;; some point.
;;
;; Okay! That's trivial enough, isn't it?
(define (list->ast sexp)
;; Anything that is not a string is considered to be a symbol
;; token. But note that a symbol token does not necessarily mean an
;; ECMAScript Symbol object.
(define (%list-item item)
(case (es:typeof item)
(("string")
(list "string" item))
(("symbol")
(list "symbol" (string->es "Symbol.keyFor($$item)")))
(else
(list "symbol" (string->es "''+$$item")))))
;; Recursively create tokens for each item. Note that we will not have
;; any useful source code or source location information---just use the
;; empty string and 0 for them, respectively.
;;
;; The lexeme will simply be the item converted into a string, whatever
;; that happens to be.
(if (token? sexp)
sexp
(if (list? sexp)
(map list->ast sexp)
(let* ((item-parts (%list-item sexp))
(type (car item-parts))
(lexeme (cadr item-parts)))
(car (make-token type lexeme "" 0))))))))
;; pair selection
(define (cadr xs)
(car (cdr xs)))
@ -382,7 +607,7 @@
;; and that EOF isn't reached before all sexps are closed) and produces an
;; AST that is an isomorphism of the original sexps.
(define (parse-lisp src)
;; accessor methods to make you and me less consfused
;; accessor methods to make you and me less confused
(define (ast-depth ast) (car ast))
(define (ast-tree ast) (cadr ast))
(define (ast-stack ast) (caddr ast))
@ -530,7 +755,7 @@
(define (cdfn t)
(if (token? (cadr t))
(cdfn-var t) ;; (define foo ...)
(cdfn-proc t))) ;; (define (foo ...) ...)
(cdfn-proc t #f))) ;; (define (foo ...) ...)
;; Compile variable definition into ES
@ -546,16 +771,17 @@
;; Compile procedure definition into an ES function definition
;;
;; This will fail if the given token is not a `define'.
(define (cdfn-proc t)
(define (cdfn-proc t id-override)
;; e.g. (define (foo ...) body)
(let* ((dfn (cadr t))
(id (tname->id (token-value (car dfn))))
(id (or id-override
(tname->id (token-value (car dfn)))))
(params (params->es (cdr dfn)))
(body (body->es (cddr t) #t)))
;; this is the final format---each procedure becomes its own function
;; definition in ES
(string-append
"function " id "(" params ")\n{\n" body "\n};")))
"function " id "(" params ")\n{\n" body "\n}")))
;; Quote an expression
@ -670,6 +896,36 @@
""))
;; Determine whether the given name NAME represents a macro.
;;
;; If `string->es' is not yet supported, then this procedure always
;; yields `#f'. Otherwise, the compiler runtime `_macros' is consulted.
;;
;; See `cdfn-macro' for more information.
(define (macro? name)
(cond-expand
(string->es
(string->es "_macros[$$name] !== undefined"))
(else #f)))
;; Determine if FN is a procedure or macro and apply it accordingly with
;; arguments ARGS.
;;
;; These actions represent two separate environments: If a macro, then the
;; call needs to be executed immediately within the context of the compiler
;; runtime. Otherwise, procedure applications are simply compiled to be
;; produced with the rest of the compiler output and will be run at a later
;; time within the context of the compiled program.
(define (apply-proc-or-macro fn args)
(if (macro? fn)
(string->es "_macros[$$fn].apply(null,$$args)")
;; Procedures are produced as part of the compiler output.
(let* ((idfn (tname->id fn))
(argstr (join ", " (map sexp->es args))))
(string-append idfn "(" argstr ")"))))
;; Function/procedure aliases and special forms
;;
;; And here we have what is probably the most grotesque part of this file.
@ -719,7 +975,8 @@
(("quote") (quote-sexp (car args)))
(("quasiquote") (quasiquote-sexp (car args)))
(("define") (cdfn t))
(("define") (cdfn t))
(("define-macro") (cdfn-macro t)) ; not defined until string->es cond
(("lambda")
(let ((fnargs (car args))
@ -828,10 +1085,8 @@
(val (cadr args)))
(string-append (sexp->es varid) " = " (sexp->es val))))
;; normal procedure application
(else (let* ((idfn (tname->id fn))
(argstr (join ", " (map sexp->es args))))
(string-append idfn "(" argstr ")")))))
;; procedure or macro
(else (apply-proc-or-macro fn args))))
;; Convert s-expressions or scalar into ECMAScript