\section{Encapsulating the Hacks}
\label{sec:encap}
Imagine jumping into a project to make a simple modification, only to be
confronted with the code in \jsref{lst:prot-share}. This is a far cry from the
simple protected member declarations of traditional classical object-oriented
languages. Indeed, there comes a point where the hacks discussed in the
previous sections become unmaintainable messes, adding a great deal of
boilerplate code that does little other than distract from the actual
software itself.

However, we do not have to settle for those messy implementations. We can
arrive at fairly elegant and concise solutions by encapsulating the hacks we
have discussed into a classical object-oriented framework, library or set of
simple helper functions. Let us not get ahead of ourselves, though; we will
begin with basic helper functions before diving into a full, reusable
framework.

This section is intended for educational and experimental purposes. Before using
these examples to develop your own class system for ECMAScript, ensure that none
of the existing systems satisfies your needs; your effort is better spent on
the advancement of existing projects than on the segregation caused by the
introduction of additional, specialty frameworks.\footnote{That is not to
discourage experimentation. Indeed, one of the best, most exciting and fun ways
to learn about these concepts is to implement them yourself.} Existing systems
are discussed a bit later.
\subsection{Constructor/Prototype Factory}
\label{sec:ctor-factory}
Section~\ref{sec:extending} offered one solution to the problem of creating an
extensible constructor, allowing it to be used both to instantiate new objects
and as a prototype. Unfortunately, as \jsref{lst:ctor-extend} demonstrated, the
solution adds a bit of noise to the definition that will also be duplicated for
each constructor. The section ended with the promise of a cleaner, reusable
implementation. Perhaps we can provide that.
Consider once again the issue at hand. The constructor, when called
conventionally with the \operator{new} operator to create a new instance, must
perform all of its construction logic. However, if we wish to use it as a
prototype, it is unlikely that we want to run \emph{any} of that logic --- we
are simply looking to have an object containing each of its members to use as a
prototype without the risk of modifying the prototype of the constructor in
question. Now consider how this issue is handled in other classical languages:
the \keyword{extends} keyword.

ECMAScript has no such keyword, so we will have to work on an implementation
ourselves. We cannot use the name \func{extends()}, as it is a reserved
word;\footnote{Perhaps for future versions of ECMAScript.} as such, we will
start with a simple \func{Class} factory function with which we can create new
``classes'' without supertypes. We can then provide a \func{Class.extend()}
method to define a ``class'' \emph{with} a supertype.
\lstinputlisting[%
label=lst:ctor-factory,
caption=Constructor factory,
lastline=60,
]{lst/ctor-factory.js}
\jsref{lst:ctor-factory} demonstrates one such possible implementation of a
constructor factory. Rather than thinking of ``creating a class'' and ``creating
a class with a supertype'' as two separate processes, it is helpful to consider
them one and the same; instead, we can consider the former to be ``creating a
class \emph{with an empty supertype}''. As such, invoking \func{Class()} simply
calls \func{Class.extend()} with \keyword{null} for the base (on line 6),
allowing \func{Class.extend()} to handle the creation of a new constructor
without a supertype.
Both \func{Class()} and \func{Class.extend()} accept a \var{dfn} argument, which
we will refer to as the \dfn{definition object}; this object is to contain each
member that will appear on the prototype of the new constructor. The \var{base}
parameter, defined on \func{Class.extend()}, denotes the constructor from which
to extend (the constructor that will be instantiated and used as the prototype).
Line 11 will default \var{base} to an empty function if one has not been
provided (mainly, to satisfy the \func{Class()} call on line 6).
With that, we can now continue onto creating our constructor, beginning on line
16. Section~\ref{sec:extending} introduced the concept of using an
\var{extending} flag to let the constructor know when to avoid all of its
construction logic if being used only as a prototype (see
\jsref{lst:ctor-extend}). The problem with this implementation, as discussed,
was that it required \emph{each} constructor that wishes to use this
pattern to implement it itself, violating the DRY\footnote{``Don't repeat
yourself'', \emph{The Pragmatic Programmer}.} principle. There were two main
areas of code duplication in \jsref{lst:ctor-extend} --- the checking of the
\var{extending} flag in the constructor and the setting (and resetting) of the
flag in \func{F.asPrototype()}. In fact, we can eliminate the
\func{asPrototype()} method altogether once we recognize that its entire
purpose is to set the flags before and after instantiation.
To address the first code duplication issue --- the checking of the flag in the
constructor --- we must remove the need to perform the check manually for each
and every constructor. The solution, as demonstrated in
\jsref{lst:ctor-factory}, is to separate our generic constructor logic (shared
between all constructors that use the factory) from the logic that can vary
between each constructor. \var{ctor} on line 16 accomplishes this by first
performing the \var{extending} check (lines 19--22) and then forwarding all
arguments to a separate function (\func{\_\_construct()}), if defined, using
\func{Function.apply()} (lines 25--28). One could adopt any name for the
constructor method; it is not significant.\footnote{The \code{\_\_construct}
name was taken from PHP.} Note that the first argument to
\func{Function.apply()} is important, as it will ensure that \keyword{this} is
properly bound within the \func{\_\_construct()} method.
To address the second code duplication issue and remove the need for
\func{asPrototype()} in \jsref{lst:ctor-extend} entirely, we can take advantage
of the implications of \func{Class.extend()} in \jsref{lst:ctor-factory}. The
only time we wish to use a constructor as a prototype and skip
\func{\_\_construct()} is during the process of creating a new constructor. As
such, we can simply set the \var{extending} flag to \keyword{true} when we begin
creating the new constructor (see line 14, though this flag could be placed
anywhere before line 31) and then reset it to \keyword{false} once we are done
(line 38). With that, we have eliminated the code duplication issues associated
with \jsref{lst:ctor-extend}.
The remainder of \jsref{lst:ctor-factory} is simply an abstraction around the
manual process we have been performing since section~\ref{sec:proto} --- setting
the prototype, properly setting the constructor and extending the prototype
with our own methods. Recall section~\ref{sec:prot} in which we had to manually
assign each member of the prototype for subtypes in order to ensure that we did
not overwrite the existing prototype members (e.g. \func{M.prototype.push()} in
\jsref{lst:prot-share}). The very same issue applies here: line 31 first sets
the prototype to an instance of \var{base}. If we were to then set
\code{ctor.prototype = dfn}, we would entirely discard the benefit gained from
specifying \var{base}. To automate the manual assignment of each additional
prototype member of \var{dfn}, \func{copyTo()} is provided (defined on line 43
and called on line 34). It accepts two arguments: a destination object
\var{dest} and an object \var{members}, each member of which is copied to
\var{dest}.
Like the examples provided in section~\ref{sec:hack-around}, we
use a self-executing function to hide the implementation details of our
\func{Class} function from the rest of the world.
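
The pieces described above can be condensed into a minimal sketch. It is a
reconstruction from the prose rather than a copy of \jsref{lst:ctor-factory},
so its line numbers will not match those cited in the text; in particular, it
narrows the window in which \var{extending} is set to the single instantiation
of \var{base}. The internal name \var{C} and the \var{Animal} and \var{Dog}
classes are assumptions made purely for illustration.

\begin{verbatim}
// Sketch only: reconstructed from the prose, not the original listing.
var Class = ( function( extending )
{
    // Class( dfn ) is "a class with an empty supertype"
    var C = function( dfn )
    {
        return C.extend( null, dfn );
    };

    C.extend = function( base, dfn )
    {
        // default to an empty function when no base is given
        base = base || function() {};

        // generic constructor logic shared by every generated class
        var ctor = function()
        {
            // instantiated only for use as a prototype: skip construction
            if ( extending )
            {
                return;
            }

            // forward all arguments, keeping `this' bound to the instance
            if ( typeof this.__construct === 'function' )
            {
                this.__construct.apply( this, arguments );
            }
        };

        // use an instance of base as the prototype without constructing it
        extending = true;
        ctor.prototype = new base();
        extending = false;

        ctor.prototype.constructor = ctor;
        copyTo( ctor.prototype, dfn );

        return ctor;
    };

    // copy each own member of `members' onto `dest'
    function copyTo( dest, members )
    {
        var hasOwn = Object.prototype.hasOwnProperty;

        for ( var name in members )
        {
            if ( hasOwn.call( members, name ) )
            {
                dest[ name ] = members[ name ];
            }
        }
    }

    return C;
} )( false );

// Hypothetical classes used only to exercise the sketch
var Animal = Class(
{
    __construct: function( name ) { this.name = name; },
    speak:       function() { return this.name + " makes a sound"; }
} );

var Dog = Class.extend( Animal,
{
    speak: function() { return this.name + " barks"; }
} );
\end{verbatim}

In this sketch, \code{new Dog( "Rex" )} is an instance of both \var{Dog} and
\var{Animal}, and \func{\_\_construct()} is invoked only for the instance
itself, never for the \var{Animal} object that serves as the prototype of
\var{Dog}.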
To demonstrate use of the constructor factory, \jsref{lst:ctor-factory-ex}
defines two classes\footnote{The reader should take care in noting that the term
``class'', as used henceforth, will refer to a class-like object created using
the systems defined within this article. ECMAScript does not support classes, so
the use of the term ``class'' in any other context is misleading.} --- \var{Foo}
and \var{SubFoo}. Note how, by placing the curly braces on their own lines,
we can create the illusion that \func{Class()} is a language construct:
\lstinputlisting[%
label=lst:ctor-factory-ex,
caption=Demonstrating the constructor factory,
firstline=62,
firstnumber=last
]{lst/ctor-factory.js}
The reader should note that an important assertion has been omitted for brevity
in \jsref{lst:ctor-factory}. Consider, for example, what may happen in the case
of the following:
\begin{verbatim}
Class.extend( "foo", {} );
\end{verbatim}
It is apparent that \code{"foo"} is not a function and therefore cannot be used
with the \keyword{new} keyword. Given that, consider line 31, which blindly
invokes \code{base()} without consideration for the very probable scenario that
the user mistakenly (due to their own unfamiliarity or a simple bug) provided us
with a non-constructor for \var{base}. The user would then be presented with a
valid, but not necessarily useful error --- did the error occur because of user
error, or due to a bug in the factory implementation?
To avoid confusion, it would be best to perform a simple assertion before
invoking \var{base} (or wrap the invocation in a try/catch block, although doing
so is not recommended in case \func{base()} throws an error of its own):
\begin{verbatim}
if ( typeof base !== 'function' )
{
    throw TypeError( "Invalid base provided" );
}
\end{verbatim}
Note also that, although this implementation will work with any constructor as
\var{base}, only those created with \func{Class()} will have the benefit of
being able to check the \var{extending} flag. As such, when using
\func{Class.extend()} with third-party constructors, the issue of extensible
constructors may still remain and is left instead in the hands of the developer
of that base constructor.
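
The caveat can be seen with any constructor that performs work when invoked.
In the sketch below, \func{Widget()} is a hypothetical third-party constructor
that knows nothing of the \var{extending} flag; instantiating it for use as a
prototype, as \func{Class.extend()} does with \var{base}, runs its construction
logic regardless.

\begin{verbatim}
// Hypothetical third-party constructor with a visible side-effect
var created = 0;

function Widget( id )
{
    created++;
    this.id = id;
}

function SubWidget() {}

// the same step the factory performs with `base': Widget's logic runs
SubWidget.prototype = new Widget();
SubWidget.prototype.constructor = SubWidget;

// `created' is now 1 and the prototype carries an `id' property (with the
// value undefined), even though no widget was ever requested
\end{verbatim}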
\subsubsection{Factory Conveniences}
Although our constructor factory described in section~\ref{sec:ctor-factory} is
thus far very simple, one should take the time to realize what a powerful
abstraction has been created; it allows us to inject our own code in any part of
the constructor creation process, giving us full control over our class-like
objects. Indeed, this abstraction will be used as a strong foundation going
forward throughout all of section~\ref{sec:encap}. In the meantime, we can take
advantage of it in its infancy to provide a couple of additional conveniences.
First, consider the syntax of \func{Class.extend()} in \jsref{lst:ctor-factory}.
It requires the extending of a constructor to be done in the following manner:
\begin{verbatim}
var SubFoo = Class.extend( Foo, {} );
\end{verbatim}
Would it not be more intuitive to instead be able to extend a constructor in the
following manner?
\begin{verbatim}
var SubFoo = Foo.extend( {} );
\end{verbatim}
The above two statements are semantically equivalent --- they define a subtype
\var{SubFoo} that extends from the constructor \var{Foo} --- but the latter
example is more concise and natural. Adding support for this method is trivial,
involving only a slight addition to the \func{C.extend()} method of
\jsref{lst:ctor-factory}, perhaps around line 30:
\lstinputlisting[%
label=lst:ctor-factory-sextend,
caption=Adding a static \func{extend()} method to constructors,
firstnumber=31
]{lst/ctor-factory-sextend.js}
Of course, one should be aware that this implementation is exploitable in that,
for example, \func{Foo.extend()} could be reassigned at any point. As such,
using \func{Class.extend()} is the safe implementation, unless you can be
certain that such a reassignment is not possible. Alternatively, in ECMAScript 5
and later environments, one can use \func{Object.defineProperty()}, as discussed
in sections~\ref{sec:encap-naive} and \ref{sec:encap-proper}, to make the method
read-only.
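
A minimal sketch of the \func{Object.defineProperty()} approach follows.
\func{extendFrom()} is a hypothetical stand-in for \func{Class.extend()}, and
\var{Foo} is an ordinary hand-written constructor; only the property
definition is of interest here. An ECMAScript 5 or later environment is
assumed.

\begin{verbatim}
// Hypothetical stand-in for Class.extend(); ignores `dfn' for brevity
function extendFrom( base, dfn )
{
    function ctor() {}
    ctor.prototype = Object.create( base.prototype );
    ctor.prototype.constructor = ctor;
    return ctor;
}

function Foo() {}

Object.defineProperty( Foo, 'extend', {
    value: function( dfn )
    {
        return extendFrom( Foo, dfn );
    },
    writable:     false,   // assignment cannot replace the method
    configurable: false,   // nor can it be deleted or redefined
    enumerable:   false
} );

var original = Foo.extend;

try
{
    // silently ignored in non-strict code; a TypeError in strict mode
    Foo.extend = function() { return null; };
}
catch ( e ) {}

// Foo.extend is still the original method
var SubFoo = Foo.extend( {} );
\end{verbatim}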
Now consider the instantiation of our class-like objects, as was demonstrated in
\jsref{lst:ctor-factory-ex}:
\begin{verbatim}
var inst = new Foo( "Name" );
\end{verbatim}
We can make our code even more concise by eliminating the \keyword{new} keyword
entirely, allowing us to create a new instance as such:
\begin{verbatim}
var inst = Foo( "Name" );
\end{verbatim}
Of course, our constructors do not yet support this, but why might we want such
a thing? Firstly, for consistency: many of the core ECMAScript constructors do
not require the use of the keyword, as has been demonstrated throughout this
article with the various \var{Error} types. Secondly, the omission of the
keyword would allow us to jump immediately into calling a method on an object
without dealing with awkward precedence rules: \code{Foo( "Name" ).getName()}
vs. \code{( new Foo( "Name" ) ).getName()}. However, those reasons exist mostly
to offer syntactic sugar; they do little to persuade those who want, or do not
mind, the \keyword{new} keyword.
The stronger argument against requiring the \keyword{new} keyword is what
happens should someone \emph{omit} it, which would not be at all uncommon since
the keyword is not required for many of the core ECMAScript constructors. Recall
that \keyword{this}, from within the constructor, is bound to the new instance
when invoked with the \keyword{new} keyword. As such, we expect to be able to
make assignments to properties of \keyword{this} from within the constructor
without any problems. What, then, happens if the constructor is invoked
\emph{without} the keyword? \keyword{this} would instead be bound (according to
the ECMAScript standard\footnote{See ECMAScript Language Specification,
ECMA-262 5.1 Edition, Section 10.4.3 on pg 58.}) to ``the global
object'',\footnote{In most browser environments, the global object is
\var{window}.} unless in strict mode. This is dangerous:
\lstinputlisting[%
label=lst:new-global,
caption=Introducing unintended global side-effects with constructors
]{lst/new-global.js}
Consider \jsref{lst:new-global} above. Function \func{Foo()}, if invoked with
the \keyword{new} keyword, results in an object with a \var{Boolean} property
equal to \keyword{true}. However, if we were to invoke \func{Foo()}
\emph{without} the \keyword{new} keyword, this would end up \emph{overwriting
the built-in global \var{Boolean} object reference}. To solve this problem,
while at the same time providing the consistency and convenience of being able
to either include or omit the \keyword{new} keyword, we can add a small block of
code to our generated constructor \var{ctor} (somewhere around line 23 of
\jsref{lst:ctor-factory}, after the extend check but before
\func{\_\_construct()} is invoked):
\lstinputlisting[%
label=lst:new-global-fix,
caption=Allowing for omission of the \keyword{new} keyword,
firstnumber=24
]{lst/new-global-fix.js}
The check, as demonstrated in \jsref{lst:new-global-fix}, is as simple as
ensuring that \keyword{this} is properly bound to a \emph{new instance of our
constructor \var{ctor}}. If not, the constructor can simply return a new
instance of itself through a recursive call.
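
As a standalone illustration of this check, consider the sketch below. It
applies the technique to a hand-written constructor taking a single, assumed
\var{name} parameter rather than to the generated \var{ctor}, which would
additionally have to forward an arbitrary argument list.

\begin{verbatim}
function Foo( name )
{
    // `this' is not a new instance of Foo: we were called without `new'
    if ( !( this instanceof Foo ) )
    {
        return new Foo( name );
    }

    this.name = name;
}

Foo.prototype.getName = function()
{
    return this.name;
};

// both forms now yield an instance of Foo
var a = new Foo( "Name" );
var b = Foo( "Name" );
\end{verbatim}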
Alternatively, the reader may decide to throw an error instead of automatically
returning a new instance. This would require the use of the \keyword{new}
keyword, while still ensuring the global scope will not be polluted with
unnecessary values. If the constructor is in strict mode, then the error would
help to point out bugs in the code. However, for the reason that the keyword is
optional for many core ECMAScript constructors, the author recommends the
implementation in \jsref{lst:new-global-fix}.
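
Should the reader prefer the error, a sketch of that alternative might look as
follows. It again uses a hand-written constructor in place of the generated
\var{ctor}, and the error message is an assumption.

\begin{verbatim}
function Foo( name )
{
    // refuse to run unless bound to a new instance of Foo
    if ( !( this instanceof Foo ) )
    {
        throw new TypeError( "Foo must be invoked with the new keyword" );
    }

    this.name = name;
}

var caught = null;

try
{
    Foo( "Name" );   // missing `new'
}
catch ( e )
{
    caught = e;      // the TypeError thrown above
}
\end{verbatim}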