\section{Hacking Around Prototypal Limitations}
Section~\ref{sec:class-like} demonstrated how one would work within the
limitations of conventional ECMAScript to produce class-like objects using
prototypes. For those coming from other classical object-oriented languages,
these features are insufficient. In order to address many of the remaining
issues, more elaborate solutions are necessary.
It should be noted that all the hacks in this section will, in some way or
another, introduce additional overhead, although it should be minimal in
comparison with the remainder of the software that may implement them.
Performance considerations will be mentioned where the author finds it to be
appropriate. Do not let this concern deter you from using these solutions in
your own code --- always benchmark to determine where the bottleneck lies in
your software.
\subsection{Extensible Constructors: Revisited}
\label{sec:extending}
Section~\ref{sec:ext-ctor} discussed improving constructor design to allow for
extensibility and to improve performance. However, the solution presented did
not provide a consistent means of creating extensible constructors with, for
example, optional argument lists.
The only way to ensure that the constructor will bypass validation and
initialization logic only when used as a prototype is to somehow indicate that
it is being used as such. Since prototype assignment is in no way different from
any other assignment, no distinction can be made. As such, we must create our
own.
\begin{lstlisting}[%
label=lst:ctor-extend,
caption=Working around prototype extending issues
]
var Foo = ( function( extending )
{
    var F = function( name )
    {
        if ( extending ) return;

        // name is optional, but must be a string if provided
        if ( ( name !== undefined )
            && ( typeof name !== 'string' )
        )
        {
            throw TypeError( "Invalid name" );
        }

        this.name = name || "Default";

        // hypothetical; impl. left to reader
        this.hash = createHash();
    };

    F.asPrototype = function()
    {
        extending = true;
        var proto = new F();
        extending = false;

        return proto;
    };

    F.prototype = {
        // getName(), etc...
    };

    return F;
} )( false );

function SubFoo() { /* ... */ }
SubFoo.prototype = Foo.asPrototype(); // OK

// ...

var foo1 = new Foo();
foo1.getName(); // "Default"
foo1.hash; // "..."

var foo2 = new Foo( "Bar" );
foo2.getName(); // "Bar"
foo2.hash; // "..."
\end{lstlisting}
One solution, as demonstrated in \jsref{lst:ctor-extend}, is to use a variable
(e.g. \var{extending}) to indicate to a constructor when it is being used to
extend a prototype. The constructor, acting as a closure, can then check the
value of this flag to determine whether or not to immediately return, avoiding
all construction logic. This implementation would allow us to return only a
prototype, which is precisely what we are looking for.
It is unlikely that we would want to expose \var{extending} directly for
modification, as this would involve manually setting the flag before requesting
the prototype, then remembering to reset it after we are done. Should the user
forget to reset the flag, all future calls to the constructor would continue to
ignore all constructor logic, which could lead to confusing bugs in the
software. To work around this issue, \jsref{lst:ctor-extend} offers an
\func{asPrototype()} method on \var{Foo} itself, which will set the flag, create
a new instance of \var{Foo}, reset the flag and return the new
instance.\footnote{In classical terms, \func{asPrototype()} can be thought of as
a static factory method of \var{Foo}.}
In order to cleanly encapsulate our extension logic, \var{Foo} is generated
within a self-executing function (using much the same concept as privileged
members in section~\ref{sec:privileged}, with a slightly different
application).\footnote{Self-executing functions are most often used to introduce
scope, allowing for the encapsulation of certain data. In this case, we
encapsulate our extension logic and return our constructor (assigned to \var{F}
within the self-executing function), which is then assigned to \var{Foo}. Note
the parentheses immediately following the anonymous function, which invoke it
with a single argument to give \var{extending} a default value of \code{false}.
This pattern of encapsulation and exporting specific values is commonly referred
to as the \dfn{Module Pattern}.} This gives \var{Foo} complete control over when
its constructor logic should be ignored. Of course, one would not want to
duplicate this mess of code for each and every constructor they create.
Factoring this logic into a common, re-usable implementation will be discussed a
bit later as part of a class system (see section~\ref{sec:ctor-factory}).
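
As a brief illustration of what \func{asPrototype()} buys us, the sketch below
condenses \jsref{lst:ctor-extend} and checks that the object it returns carries
none of the per-instance state. The \func{getName()} body, the placeholder hash
value and the \var{SubFoo} constructor are assumptions made purely for the sake
of the example.

```javascript
var Foo = ( function( extending )
{
    var F = function( name )
    {
        // skip all construction logic when creating a prototype
        if ( extending ) return;

        this.name = name || "Default";
        this.hash = "abc123"; // placeholder for a real hash
    };

    F.asPrototype = function()
    {
        extending = true;
        var proto = new F();
        extending = false;

        return proto;
    };

    F.prototype = {
        getName: function() { return this.name; }
    };

    return F;
} )( false );

// hypothetical subtype that reuses the parent constructor
function SubFoo( name ) { Foo.call( this, name ); }
SubFoo.prototype = Foo.asPrototype();
SubFoo.prototype.constructor = SubFoo;

var sub = new SubFoo( "Bar" );

// the prototype itself was never initialized
var proto_has_state = SubFoo.prototype.hasOwnProperty( 'name' )
    || SubFoo.prototype.hasOwnProperty( 'hash' );  // false

var sub_name = sub.getName();  // "Bar"
```

Because the flag is reset before \func{asPrototype()} returns, ordinary
instantiation of \var{Foo} and \var{SubFoo} is unaffected.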
\subsection{Encapsulating Data}
\label{sec:encap}
We discussed a basic means of encapsulation with privileged members in
section~\ref{sec:privileged}. Unfortunately, the solution, as demonstrated in
\jsref{lst:privileged}, involves redeclaring methods that could have otherwise
been defined within the prototype and shared between all instances. With that
goal in mind, let us consider how we may be able to share data for multiple
instances with a single method definition in the prototype.
We already know from \jsref{lst:ctor-extend} that we can truly encapsulate data
for a prototype within a self-executing function. Methods can then, acting as
closures, access that data that is otherwise inaccessible to the remainder of
the software. With that example, we concerned ourselves with only a single piece
of data --- the \var{extending} flag. This data has no regard for individual
instances (one could think of it as static data, in classical terms). Using
\jsref{lst:ctor-extend} as a starting point, we can build a system that will
keep track of data \emph{per-instance}. This data will be accessible to all
prototype members.
\subsubsection{A Naive Implementation}
\label{sec:encap-naive}
One approach to our problem involves assigning each instance a unique
identifier (an ``instance id'', or \var{iid}). For our implementation, this
identifier will simply be defined as an integer that is incremented each time
the constructor is invoked.\footnote{There is, of course, a maximum
number of instances with this implementation. ECMAScript numbers are IEEE~754
double-precision values, so once \var{iid} reaches $2^{53}$, incrementing it by
one no longer changes its value and every subsequent instance would receive the
same id. This number, however, is rather large
($2^{53} = 9007199254740992$).} This instance id could be used as a key for a data
variable that stores data for each instance. Upon instantiation, the instance
id could be assigned to the new instance as a property (we'll worry about
methods of ``encapsulating'' this property later).
\begin{lstlisting}[%
label=lst:encap-naive,
caption=Encapsulating data with shared members (a naive implementation)
]
var Stack = ( function()
{
    var idata = [],
        iid = 0;

    var S = function()
    {
        // assign a unique instance identifier
        // to each instance
        this.__iid = iid++;

        idata[ this.__iid ] = {
            stack: []
        };
    };

    S.prototype = {
        push: function( val )
        {
            idata[ this.__iid ]
                .stack.push( val );
        },

        pop: function()
        {
            return idata[ this.__iid ]
                .stack.pop();
        }
    };

    return S;
} )();

var first = new Stack(),
    second = new Stack();

first.push( "foo" );
second.push( "bar" );

first.pop(); // "foo"
second.pop(); // "bar"
\end{lstlisting}
\jsref{lst:encap-naive} demonstrates a possible stack implementation using the
principles that have just been described. Just like \jsref{lst:ctor-extend}, a
self-executing function is used to encapsulate our data and returns the
\var{Stack} constructor.\footnote{The reader should take note that we have
omitted our extensible constructor solution discussed in
section~\ref{sec:extending} for the sake of brevity.} In addition to the
instance id, the instance data is stored in the array \var{idata} (an array is
appropriate here since \var{iid} is sequential and numeric). \var{idata} will
store an object for each instance, each acting in place of \keyword{this}. Upon
instantiation, the private properties for the new instance are initialized using
the newly assigned instance id.
Because \var{idata} is not encapsulated within the constructor, we do not need
to use the concept of privileged members (see section~\ref{sec:privileged}); we
need only define the methods in such a way that \var{idata} is still within
scope. Fortunately, this allows us to define the methods on the prototype,
saving us method redeclarations with each call to the constructor, improving
overall performance.
This implementation comes at the expense of brevity and diverges from
common ECMAScript convention when accessing data for a particular instance using
prototypes. Rather than having ECMAScript handle this lookup process for us, we
must do so manually. The only data stored on the instance itself (bound to
\keyword{this}) is the instance id, \var{iid}, which is used to look up the
actual members from \var{idata}. Indeed, this is the first concern --- this is a
considerable amount of boilerplate code to create separately for each prototype
wishing to encapsulate data in this manner.
An astute reader may raise concern over our \var{\_\_iid} assignment on each
instance. Firstly, although this name clearly states ``do not touch'' with its
double-underscore prefix,\footnote{Certain languages use a double underscore to
indicate something internal to the language or system. This also ensures the
name will not conflict with any private members that use the single-underscore
prefix convention.} the member is still public and enumerable.\footnote{The term
\dfn{enumerable} simply means that the property is visited by a
\keyword{for...in} loop.}
There is no reason why we should be advertising this internal data to the world.
Secondly, imagine what may happen if a user decides to alter the value of
\var{\_\_iid} for a given instance. Although such a modification would create
some fascinating (or even ``cool'') features, it could also wreak havoc on a
system and break encapsulation.\footnote{Consider that we know a stack is
encapsulated within another object. We could exploit this \var{\_\_iid}
vulnerability to gain access to the data of that encapsulated object as follows,
guessing or otherwise calculating the proper instance id: \code{( new Stack()
).\_\_iid = iid\_of\_encapsulated\_stack\_instance}.}
In environments supporting ECMAScript 5 and later, we can make the property
non-enumerable and read-only using \code{Object.defineProperty()} in place of
the \var{\_\_iid} assignment:
\begin{verbatim}
Object.defineProperty( this, '__iid', {
    value: iid++,

    writable: false,
    enumerable: false,
    configurable: false
} );
\end{verbatim}
The \var{configurable} property simply determines whether or not we can
re-configure a property in the future using \code{Object.defineProperty()}. It
should also be noted that each of the properties, with the exception of
\var{value}, defaults to \code{false}, so they may be omitted; they were included
here for clarity.
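
The effect of each flag can be observed directly. The following is a minimal
sketch that uses a throwaway object (\var{obj}, an assumption for illustration)
rather than a \var{Stack} instance. The assignment is wrapped in a
\keyword{try} block because it is silently ignored in non-strict code but
throws a \code{TypeError} in strict mode.

```javascript
var obj = {};

Object.defineProperty( obj, '__iid', {
    value: 5
    // writable, enumerable and configurable default to false
} );

// non-enumerable: for...in does not visit the property
var seen = [];
for ( var key in obj ) seen.push( key );

// read-only: the assignment has no effect
try
{
    obj.__iid = 10;
}
catch ( e ) { /* TypeError in strict mode */ }

var value = obj.__iid;  // 5

// non-configurable: attempting to redefine the value throws
var redefine_failed = false;
try
{
    Object.defineProperty( obj, '__iid', { value: 10 } );
}
catch ( e )
{
    redefine_failed = true;
}
```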
Of course, this solution leaves a couple of loose ends: it will work only on
ECMAScript 5 and later environments (that have support for
\code{Object.defineProperty()}) and it still does not prevent someone from
spying on the instance id should they know the name of the property
(\var{\_\_iid}) ahead of time. However, we do need the instance id to be a
member of the instance itself for our lookup process to work properly.
At this point, many developers would draw the line and call the solution
satisfactory. An internal id, although unencapsulated, provides little room for
exploitation.\footnote{You could get a total count of the number of instances of
a particular prototype, but not much else.} For the sake of discussion and the
development of a more concrete implementation, let us consider a potential
workaround for this issue.
For pre-ES5\footnote{Hereinafter, ECMAScript 5 and ES5 will be used
interchangeably.} environments, there will be no concrete solution, since all
properties will always be enumerable. However, we can make it more difficult by
randomizing the name of the \var{\_\_iid} property, which would require that the
user filter out all known properties or guess at the name. In ES5+ environments,
this makes the property considerably harder to find,\footnote{Of course,
debuggers are always an option, and \code{Object.getOwnPropertyNames()} lists
non-enumerable properties as well. There is also the possibility of exploiting
weaknesses in a random name implementation; one can never completely eliminate
the issue.} since the property name is neither enumerated nor known beforehand.
Consider, then, the addition of another variable within the self-executing
function --- \var{iid\_name} --- which we could set to some random value (the
implementation of which we will leave to the reader). Then, when initializing or
accessing values, one would use the syntax:
\begin{verbatim}
idata[ this[ iid_name ] ].stack // ...
\end{verbatim}
Of course, this introduces additional overhead, although it is likely to be
negligible in comparison with the rest of the software.
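
One possible way to generate such a name is sketched below. The helper
\func{getRandomName()}, its \code{\_\_} prefix and its fixed length are
assumptions chosen for illustration. \code{Math.random()} is not
cryptographically secure, so the result should be regarded as hard to guess
rather than secret. The sketch also shows that a non-enumerable property is
still listed by \code{Object.getOwnPropertyNames()}.

```javascript
// hypothetical helper: returns a hard-to-guess property name
// (the prefix and the length of 12 are arbitrary choices)
function getRandomName()
{
    var name = '__';

    // append base-36 digits of a random fraction, skipping the
    // leading "0.", until the name is long enough
    while ( name.length < 12 )
    {
        name += Math.random().toString( 36 ).substring( 2 );
    }

    return name.substring( 0, 12 );
}

var iid_name = getRandomName(),
    obj      = {};

Object.defineProperty( obj, iid_name, { value: 0 } );

// Object.keys() (like for...in) does not reveal the name...
var keys = Object.keys( obj );

// ...but Object.getOwnPropertyNames() does
var names = Object.getOwnPropertyNames( obj );
```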
With that, we have contrived a solution to our encapsulation problem.
Unfortunately, as the title of this section indicates, this implementation is
naive to a very important consideration --- memory consumption. The problem is
indeed so severe that this solution cannot possibly be recommended in practice,
although the core concepts have been an excellent experiment in ingenuity and
have provided a strong foundation on which to expand.\footnote{It is my hope
that the title of this section will help to encourage those readers that simply
skim for code to continue reading and consider the flaws of the design rather
than adopting them.}
\subsubsection{A Proper Implementation}
\label{sec:encap-proper}
Section~\ref{sec:encap-naive} proposed an implementation that would permit the
true encapsulation of instance data, addressing the performance issues
demonstrated in \jsref{lst:privileged}. Unfortunately, the solution offered in
\jsref{lst:encap-naive} is prone to terrible memory leaks. In order to
understand why, we must first understand, on a very basic level, how garbage
collection (GC) is commonly implemented in environments that support ECMAScript.
\dfn{Garbage collection} refers to an automatic cleaning of data (and
subsequent freeing of memory, details of which vary between implementations)
that is no longer ``used''. Unlike languages such as C, which require manual
allocation and freeing of memory, the various engines that implement ECMAScript
handle this process for us, allowing the developer to focus on the task at hand
rather than developing a memory management system for each piece of software.
Garbage collection can be a wonderful thing in most circumstances, but one must
understand how it recognizes data that is no longer being ``used'' in order to
ensure that the memory is properly freed. If data lingers in memory in such a
way that the software will not access it again and that the garbage collector is
not aware that the data can be freed, this is referred to as a \dfn{memory
leak}.\footnote{The term ``memory leak'' implies different details depending on
context --- in this case, it varies between languages. A memory leak in C is
handled much differently than a memory leak in ECMAScript environments. Indeed,
memory leaks in systems with garbage collectors could also be caused by bugs in
the GC process itself, although this is not the case here.}
One method employed by garbage collectors is reference counting; when an object
is initially created, the reference count is set to one. When a reference to
that object is stored in another variable, that count is incremented by one.
When a variable containing a reference to a particular object falls out of
scope, is deleted, or has the value reassigned, the reference count is
decremented by one. Once the reference count reaches zero, it is scheduled for
garbage collection.\footnote{What happens after this point is
implementation-defined.} The concept is simple, but is complicated by the use of
closures. When an object is referenced within a closure, or even has the
\emph{potential} to be referenced through another object, it cannot be garbage
collected.
In the case of \jsref{lst:encap-naive}, consider \var{idata}. With each new
instance, \var{iid} is incremented and an associated entry added to \var{idata}.
The problem is --- ECMAScript does not have destructor support. Since we cannot
tell when our object is GC'd, we cannot free the \var{idata} entry. Because each
and every object within \var{idata} has the \emph{potential} to be referenced at
some point in the future, even though our implementation does not allow for it,
it cannot be garbage collected. The reference count for each index of
\var{idata} will forever be $\geq 1$.
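
The growth of \var{idata} can be made visible with a small experiment. The
sketch below reduces \jsref{lst:encap-naive} to its constructor and adds a
hypothetical debugging aid, \func{countEntries()}, which is not part of the
original listing and exists only to let us peek at the otherwise encapsulated
array.

```javascript
var Stack = ( function()
{
    var idata = [],
        iid   = 0;

    var S = function()
    {
        this.__iid = iid++;
        idata[ this.__iid ] = { stack: [] };
    };

    // hypothetical debugging aid: reports how many entries
    // idata is holding on to
    S.countEntries = function()
    {
        return idata.length;
    };

    return S;
} )();

// create 1000 instances and immediately discard each of them
for ( var i = 0; i < 1000; i++ )
{
    new Stack();
}

// no instance is reachable any longer, yet every entry remains
var entries = Stack.countEntries();  // 1000
```

The instances themselves may be collected, but the objects stored in
\var{idata} remain reachable through the closure for as long as \var{Stack}
exists.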
To resolve this issue without altering this implementation, there is only one
solution --- to offer a method to call to manually mark the object as destroyed.
This defeats the purpose of garbage collection and is an unacceptable solution.
Therefore, our naive implementation contains a fatal design flaw. This extends
naturally into another question --- how do we work with garbage collection to
automatically free the data for us?
The answer to this question is already known from nearly all of our prior
prototype examples. Unfortunately, it is an answer that we have been attempting
to work around in order to enforce encapsulation --- storing the data on the
instance itself. By doing so, the data is automatically freed (if the reference
count is zero, of course) when the instance itself is freed. Indeed, we have hit
a wall due to our inability to explicitly tell the garbage collector when the
data should be freed.\footnote{There may be an implementation out there
somewhere that does allow this, or a library that can interface with the garbage
collector. However, it would not be portable.} The solution is to find a common
ground between \jsref{lst:privileged} and \jsref{lst:encap-naive}.
Recall our original goal --- to shy away from the negative performance impact of
privileged members without exposing each of our private members as public. Our
discussion has already revealed that we are forced to store our data on the
instance itself to ensure that it will be properly freed by the garbage
collector once the reference count reaches zero. Recall that
section~\ref{sec:encap-naive} provided us with a number of means of making our
only public member, \var{\_\_iid}, considerably more difficult to access, even
though it was not fully encapsulated. This same concept can be applied to our
instance data.
\begin{lstlisting}[%
label=lst:encap-inst,
caption=Encapsulating data on the instance itself (see also
\jsref{lst:encap-naive})
]
var Stack = ( function()
{
    // implementation left to reader
    var _privname = getRandomName();

    var S = function()
    {
        Object.defineProperty( this, _privname, {
            enumerable: false,
            writable: false,
            configurable: false,

            value: {
                stack: []
            }
        } );
    };

    S.prototype = {
        push: function( val )
        {
            this[ _privname ].stack.push( val );
        },

        pop: function()
        {
            return this[ _privname ].stack.pop();
        }
    };

    return S;
} )();
\end{lstlisting}
\jsref{lst:encap-inst} uses a random, non-enumerable property to make the
discovery of the private data considerably more difficult.\footnote{The property
is also read-only, but that does not necessarily aid encapsulation. It prevents
the object itself from being reassigned, but not any of its members.} The random
name, \var{\_privname}, is used in each of the prototype methods to look up the data on
the appropriate instance (e.g. \code{this[ \_privname ].stack} in place of
\code{this.stack}).\footnote{One may decide that the random name is unnecessary
overhead. However, note that static names would permit looking up the data if
the name is known ahead of time.} This has the same effect as
\jsref{lst:encap-naive}, with the exception that it is a bit easier to follow
without the instance management code and that it does not suffer from memory
leaks due to GC issues.
Of course, this implementation depends on features introduced in ECMAScript 5
--- namely, \code{Object.defineProperty()}, as introduced in
section~\ref{sec:encap-naive}. In order to support pre-ES5 environments, we
could define our own fallback \func{defineProperty()} method by directly
altering \var{Object},\footnote{The only circumstance in which I would recommend modifying
built-in objects/prototypes is to aid in backward compatibility; it is otherwise
a very poor practice that creates tightly coupled, unportable code.} as
demonstrated in \jsref{lst:defprop}.
\begin{lstlisting}[%
label=lst:defprop,
caption=A fallback \code{Object.defineProperty()} implementation
]
Object.defineProperty = Object.defineProperty
    || function( obj, name, config )
    {
        obj[ name ] = config.value;
    };
\end{lstlisting}
Unfortunately, a fallback implementation is not quite so simple. Certain
dialects may only partially implement \code{Object.defineProperty()}. In
particular, I am referring to Internet Explorer 8's incomplete
implementation.\footnote{IE8's dialect is JScript.} Surprisingly, IE8 only
supports this action on DOM elements, not all objects. This puts us in a
terribly awkward situation --- the method is defined, but the implementation is
``broken''. As such, our simple and fairly concise solution in
\jsref{lst:defprop} is insufficient. Instead, we need to perform a more
complicated check to ensure that not only is the method defined, but also
functional for our particular uses. This check is demonstrated in
\jsref{lst:defprop-check}, resulting in a boolean value which can be used to
determine whether or not the fallback in \jsref{lst:defprop} is necessary.
\begin{lstlisting}[%
label=lst:defprop-check,
caption=Working around IE8's incomplete \code{Object.defineProperty()}
implementation (taken from ease.js)
]
var can_define_prop = ( function()
{
    try
    {
        Object.defineProperty( {}, 'x', {} );
    }
    catch ( e ) { return false; }

    return true;
} )();
\end{lstlisting}
This function performs two checks simultaneously --- it first checks to see if
\code{Object.defineProperty()} exists and then ensures that we are not using
IE8's broken implementation. If the invocation fails, that will mean that the
method does not exist (or is not properly defined), throwing an exception which
will immediately return false. If attempting to define a property using this
method on a non-DOM object in IE8, an exception will also be thrown, returning
false. Therefore, we can simply attempt to define a property on an empty object.
If this action succeeds, then \code{Object.defineProperty()} is assumed to be
sufficiently supported. The entire process is enclosed in a self-executing
function to ensure that the check is performed only once, rather than a function
that performs the check each time it is called. Merging this result with
\jsref{lst:defprop} is trivial and is left to the reader.
It is clear from this fallback, however, that our property is enumerable in
pre-ES5 environments. At this point, a random property name would not be all
that helpful and the reader may decide to avoid the random implementation in its
entirety.
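
For readers who would like a starting point, one way to combine the check in
\jsref{lst:defprop-check} with the fallback in \jsref{lst:defprop} is sketched
below. Rather than overwriting \code{Object.defineProperty()}, it exposes a
wrapper named \func{defineProp()}; both that name and the decision to use a
wrapper are assumptions of this sketch. The fallback handles only the
\var{value} of a descriptor and ignores getters and setters.

```javascript
var can_define_prop = ( function()
{
    try
    {
        Object.defineProperty( {}, 'x', {} );
    }
    catch ( e ) { return false; }

    return true;
} )();

// hypothetical wrapper: use the native method when it is usable and
// fall back to plain (enumerable, writable) assignment otherwise
var defineProp = ( can_define_prop )
    ? Object.defineProperty
    : function( obj, name, config )
    {
        obj[ name ] = config.value;
        return obj;
    };

var obj = {};
defineProp( obj, 'secret', { value: 42 } );

var value = obj.secret;  // 42 in either case
```

Using a wrapper leaves the built-in object untouched, which avoids handing a
partial implementation to unrelated code that performs its own feature
detection.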
\subsubsection{Private Methods}
\label{sec:priv-methods}
Thus far, we have been dealing almost exclusively with the issue of
encapsulating properties. Let us now shift our focus to the encapsulation of
other private members, namely methods (although this could just as easily be
applied to getters/setters in ES5+ environments). Private methods are actually
considerably easier to conceptualize, because the data does not vary between
instances --- a method is a method and is shared between all instances. As such,
we do not have to worry about the memory management issues addressed in
section~\ref{sec:encap-proper}.
Encapsulating private members would simply imply moving the members outside of
the public prototype (that is, \code{Stack.prototype}). One would conventionally
implement private methods using privileged members (as in
section~\ref{sec:privileged}), but it is certainly pointless redefining the
methods for each instance, since \jsref{lst:encap-inst} provided us with a means
of accessing private data from within the public prototype. Since the
self-executing function introduces scope for our private data (instead of the
constructor), we do not need to redefine the methods for each new instance.
Instead, we can create what can be considered a second, private prototype.
\begin{lstlisting}[%
label=lst:method-priv,
caption=Implementing shared private methods without privileged members
]
var Stack = ( function()
{
    var _privname = getRandomName();

    var S = function()
    {
        // ... (see previous examples)
    };

    var priv_methods = {
        getStack: function()
        {
            return this[ _privname ].stack;
        }
    };

    S.prototype = {
        push: function( val )
        {
            var stack = priv_methods.getStack
                .call( this );
            stack.push( val );
        },

        pop: function()
        {
            var stack = priv_methods.getStack
                .call( this );
            return stack.pop();
        }
    };

    return S;
} )();
\end{lstlisting}
\jsref{lst:method-priv} illustrates this concept of a private
prototype.\footnote{Alternatively, to reduce code at the expense of clarity, one
could simply define functions within the closure to act as private methods
without assigning them to \var{priv\_methods}. Note that \func{call()} is still
necessary in that situation.} The object \var{priv\_methods} acts as a second
prototype containing all members that are private and shared between all
instances, much like the conventional prototype. \code{Stack.prototype} then
includes only the members that are intended to be public. In this case, we have
defined a single private method --- \func{getStack()}.
Recall how \keyword{this} is bound automatically for prototype methods (see
section~\ref{sec:proto}). ECMAScript is able to do this for us because of the
standardized \var{prototype} property. For our private methods, we have no such
luxury. Therefore, we are required to bind \keyword{this} to the proper object
ourselves through the use of \code{Function.prototype.call()} (or, alternatively,
\code{Function.prototype.apply()}). The first argument passed to \func{call()} is the
object to which \keyword{this} will be bound, which we will refer to as the
\dfn{context}. This, unfortunately, increases the verbosity of private method
calls, but successfully provides us with a private prototype implementation.
Since private members needn't be inherited by subtypes, no additional work needs
to be done.
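
The difference that \func{call()} makes can be seen in isolation. In the
sketch below, the method \func{whoAmI()} and the \var{instance} object are
hypothetical names introduced only to show which object \keyword{this} refers
to in each style of invocation.

```javascript
var priv_methods = {
    whoAmI: function()
    {
        return this;
    }
};

var instance = { name: "instance" };

// invoked as a property of priv_methods, this is priv_methods itself
var plain = priv_methods.whoAmI();

// call() and apply() bind this to the object we pass in
var called  = priv_methods.whoAmI.call( instance );
var applied = priv_methods.whoAmI.apply( instance );
```

Forgetting \func{call()} would therefore cause a private method to look for
instance data on \var{priv\_methods} rather than on the instance.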
\subsection{Protected Members}
We have thus far covered two of the three access modifiers (see
section~\ref{sec:encap}) --- public and private. Those implementations allowed
us to remain blissfully ignorant of inheritance, as public members are handled
automatically by ECMAScript and private members are never inherited by subtypes.
The concept of protected members is a bit more of an interesting study since it
requires that we put thought into providing subtypes with access to encapsulated
data, \emph{without} exposing this data to the rest of the world.
From an implementation perspective, we can think of protected members much like
private; they cannot be part of the public prototype, so they must exist in
their own protected prototype and protected instance object. The only difference
here is that we need a way to expose this data to subtypes. This is an issue
complicated by our random name implementation (see
section~\ref{sec:encap-proper}); without it, subtypes would be able to access
protected members of their parent simply by accessing a standardized property
name. The problem with that is --- if subtypes can do it, so can other,
completely unrelated objects. As such, we will focus on a solution that works in
conjunction with our randomized name (an implementation with a standardized
name is trivial).
In order for the data to remain encapsulated, the name must too remain
encapsulated. This means that the subtype cannot request the name from the
parent; instead, we must either have access to the random name or we must
\emph{tell} the parent what the name should be. The latter will not work
per-instance with the implementation described in
section~\ref{sec:encap-proper}, as the methods are not redefined per-instance
and therefore must share a common name. Let us therefore first consider the
simpler of options --- sharing a common protected name between the two classes.
\begin{lstlisting}[%
label=lst:prot-share,
caption=Sharing protected members with subtypes
]
var _protname = getRandomName();

var Stack = ( function()
{
    var _privname = getRandomName();

    var S = function()
    {
        // ... (see previous examples)

        Object.defineProperty( this, _privname, {
            value: { stack: [] }
        } );

        // a new stack contains no items
        Object.defineProperty( this, _protname, {
            value: { empty: true }
        } );
    };

    // a means of sharing protected methods
    Object.defineProperty( S, _protname, {
        value: {
            getStack: function()
            {
                return this[ _privname ].stack;
            }
        }
    } );

    S.prototype = {
        push: function( val )
        {
            var stack = S[ _protname ].getStack
                .call( this );
            stack.push( val );

            this[ _protname ].empty = false;
        },

        pop: function()
        {
            var stack = S[ _protname ].getStack
                .call( this );
            var val = stack.pop();

            this[ _protname ].empty =
                ( stack.length === 0 );

            return val;
        }
    };

    S.asPrototype = function()
    {
        // ... (see previous examples)
    };

    return S;
} )();

var MaxStack = ( function()
{
    var M = function( max )
    {
        // call parent constructor
        Stack.call( this );

        // we could add to our protected members
        // (in practice, this would be private, not
        // protected)
        this[ _protname ].max = +max;
    };

    // extend Stack before adding or overriding
    // methods; assigning the prototype afterward
    // would discard them
    M.prototype = Stack.asPrototype();
    M.prototype.constructor = M;

    // override push
    M.prototype.push = function( val )
    {
        var stack = Stack[ _protname ].getStack
            .call( this );

        if ( stack.length ===
            this[ _protname ].max
        )
        {
            throw Error( "Maximum reached." );
        }

        // call parent method
        Stack.prototype.push.call( this, val );
    };

    // add a new method demonstrating parent
    // protected property access
    M.prototype.isEmpty = function()
    {
        return this[ _protname ].empty;
    };

    return M;
} )();

var max = new MaxStack( 2 );
max.push( "foo" );
max.push( "bar" );
max.push( "baz" ); // Error
max.pop(); // "bar"
max.pop(); // "foo"
\end{lstlisting}
\jsref{lst:prot-share} makes an attempt to demonstrate a protected property and
method implementation while still maintaining the distinction between it and the
private member implementation (see section~\ref{sec:encap-proper}). The example
contains two separate constructors --- \var{Stack} and \var{MaxStack}, the
latter of which extends \var{Stack} to limit the number of items that may be
pushed to it. \var{Stack} has been modified to include a protected property
\var{empty}, which will be set to \code{true} when the stack contains no items,
and a protected method \func{getStack()}, which both \var{Stack} and its subtype
\var{MaxStack} may use to access the private property \var{stack} of
\var{Stack}.
The key part of this implementation is the declaration of \var{\_protname}
within the scope of both types (\var{Stack} and \var{MaxStack}).\footnote{One
would be wise to enclose all of \jsref{lst:prot-share} within a function to
prevent \var{\_protname} from being used elsewhere, exporting \var{Stack} and
\var{MaxStack} however the reader decides.} This declaration allows both
prototypes to access the protected properties just as we would the private
data. Note that \var{\_privname} is still defined individually within each type,
as this data is unique to each.
Protected methods, however, need additional consideration. Private methods, when
defined within the self-executing function that returns the constructor, work
fine when called from within the associated prototype (see
section~\ref{sec:priv-methods}). However, since they're completely encapsulated,
we cannot use the same concept for protected methods --- the subtype would not
have access to the methods. Our two options are to either declare the protected
members outside of the self-executing function (as we do \var{\_protname}), which
makes little organizational sense, or to define the protected members on the
constructor itself using \var{\_protname} and
\code{Object.defineProperty()}\footnote{See section~\ref{sec:encap-proper} for
\code{Object.defineProperty()} workarounds/considerations.} to encapsulate it
the best we can. We can then use the shared \var{\_protname} to access the
methods on \var{Stack}, unknown to the rest of the world.
An astute reader may realize that \jsref{lst:prot-share} does not permit the
addition of protected methods without also modifying the protected methods of
the supertype and all other subtypes; this is the same reason we assign new
instances of constructors to the \var{prototype} property. Additionally,
accessing a protected method further requires referencing the same constructor
on which it was defined. Fixing this implementation is left as an exercise to
the reader.
Of course, there is another glaring problem with this implementation --- what
happens if we wish to extend one of our prototypes, but are not within the scope
of \var{\_protname} (which would be the case if you are using
\jsref{lst:prot-share} as a library, for example)? With this implementation,
that is not possible. As such, \jsref{lst:prot-share} is not recommended unless
you intend to have your prototypes act like final classes.\footnote{A
\dfn{final} class cannot be extended.} As this will not always be the case, we
must put additional thought into the development of a solution that allows
extending class-like objects with protected members outside of the scope of the
protected name \var{\_protname}.
As we already discussed, we cannot request the protected member name from the
parent, as that will provide a means to exploit the implementation and gain
access to the protected members, thereby breaking encapsulation. Another
aforementioned option was \emph{telling} the parent what protected member name
to use, perhaps through the use of \func{asPrototype()} (see
section~\ref{sec:extending}). This is an option for protected \emph{properties},
as they are initialized with each new instance; however, it is not a clean
implementation for \emph{methods}, as they have already been defined on the
constructor with the existing \var{\_protname}. Passing an alternative name
would result in something akin to:
\begin{verbatim}
Object.defineProperty( S, _newname, {
value: S[ _protname ]
} );
\end{verbatim}
This would quickly accumulate many separate protected member references on the
constructor --- one for each subtype. As such, this implementation is also left
as an exercise for an interested reader; we will not explore it
further.\footnote{The reader is encouraged to attempt this implementation to
gain a better understanding of the concept. However, the author cannot recommend
its use in a production environment.}
The second option is to avoid exposing protected property names entirely. This
can be done by defining a function that can expose the protected method object.
This method would use a system-wide protected member name to determine what
objects to return, but would never expose the name --- only the object
references. However, this does little to help us with our protected properties,
as a reference to that object cannot be returned until instantiation. As such,
one could adopt part of the previously suggested approach, in which one provides
the protected member name to the parent(s). Since the protected methods
themselves would be returned, the duplicate reference issue will be averted.
The simplest means of demonstrating this concept is to define a function that
accepts a callback to be invoked with the protected method object. A more
elegant implementation will be described in future sections, so a full
implementation is also left as an exercise to the reader. \jsref{lst:prot-func}
illustrates a skeleton implementation.\footnote{Should the reader decide to take
up this exercise, keep in mind that the implementation should also work with
multiple supertypes (that is, type3 extends type2 extends type1).} The
\func{def} function accepts the aforementioned callback with an optional first
argument --- \var{base} --- from which to retrieve the protected methods.
\begin{lstlisting}[%
label=lst:prot-func,
caption=Exposing protected methods with a callback (brief illustration; full
implementation left as an exercise for the reader)
]
var def = ( function()
{
var _protname = getRandomName();
return function( base, callback )
{
var args = Array.prototype.slice.call(
arguments
),
callback = args.pop(),
base = args.pop() || {};
return callback( base[ _protname ] );
};
} )();
var Stack = def( function( protm )
{
// ...
return S;
} );
var MaxStack = def( Stack, function( protm )
{
// for properties only
var _protname = getRandomName();
// ...
// asPrototype() would accept the protected
// member name
M.prototype = S.asPrototype( _protname );
M.prototype.constructor = M;
return M;
} );
\end{lstlisting}
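One possible way to flesh out the skeleton in \jsref{lst:prot-func} is sketched
below. It is a minimal illustration under stated assumptions, not the
implementation the exercise has in mind: \func{getRandomName()} is a
hypothetical stand-in, the stack data is kept in a public \var{items} property
for brevity (so only the protected \emph{method} handling is shown), and
\code{Object.create()} is used in place of \func{asPrototype()}. Each call to
\func{def()} creates a protected method object that inherits from that of its
supertype, so a subtype may add protected methods without altering its parents,
and chains such as type3 extends type2 extends type1 resolve through the
prototype chain.

```javascript
// Hypothetical helper; stands in for the getRandomName() assumed by the text.
function getRandomName()
{
    return '__' + Math.random().toString( 36 ).slice( 2 );
}

var def = ( function()
{
    // shared by every def() call; never exposed to callers
    var _protname = getRandomName();

    return function( /* [base,] callback */ )
    {
        var args     = Array.prototype.slice.call( arguments ),
            callback = args.pop(),
            base     = args.pop();

        // inherit from the supertype's protected methods (if any) so that
        // additions made by a subtype do not alter the supertype's object
        var protm = Object.create(
            ( base && base[ _protname ] ) || Object.prototype
        );

        var ctor = callback( protm );

        // hide the protected method object on the returned constructor
        Object.defineProperty( ctor, _protname, { value: protm } );
        return ctor;
    };
} )();

var Stack = def( function( protm )
{
    // simplification: data is public here; see the text for private data
    var S = function() { this.items = []; };

    S.prototype.push = function( item ) { this.items.push( item ); };
    S.prototype.pop  = function() { return this.items.pop(); };

    protm.getStack = function() { return this.items; };
    return S;
} );

var MaxStack = def( Stack, function( protm )
{
    var M = function( max )
    {
        Stack.call( this );
        this.max = max;
    };

    M.prototype = Object.create( Stack.prototype );
    M.prototype.constructor = M;

    M.prototype.push = function( item )
    {
        // getStack() is found on Stack's protected object via inheritance
        if ( protm.getStack.call( this ).length >= this.max )
        {
            throw Error( "Maximum reached." );
        }

        Stack.prototype.push.call( this, item );
    };

    return M;
} );
```

Because the protected method object is defined with
\code{Object.defineProperty()} and default attributes, it is neither enumerable
nor reassignable on the constructor; the object's own members, however, remain
writable.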
\subsubsection{Protected Member Encapsulation Challenges}
Unfortunately, the aforementioned implementations do not change a simple fact
--- protected members are open to exploitation, unless the prototype containing
them cannot be extended outside of the library/implementation. Specifically,
there is nothing to prevent a user from extending the prototype and defining a
property or method to return the encapsulated members.
Consider the implementation described in \jsref{lst:prot-func}. We could define
another subtype, \var{ExploitedStack}, as shown in \jsref{lst:prot-exploit}.
This malicious type exploits our implementation by defining two methods ---
\func{getProtectedProps()} and \func{getProtectedMethods()} --- that return
the otherwise encapsulated data.
\begin{lstlisting}%
[label=lst:prot-exploit,
caption=Exploiting \jsref{lst:prot-func} by returning protected members.
]
var ExploitedStack = def( Stack, function( protm )
{
var _protname = getRandomName();
var E = function() { /* ... */ };
// assign the prototype first; assigning it after defining the
// methods below would discard them
E.prototype = Stack.asPrototype( _protname );
E.prototype.constructor = E;
E.prototype.getProtectedProps = function()
{
return this[ _protname ];
};
E.prototype.getProtectedMethods = function()
{
return protm;
};
return E;
} );
\end{lstlisting}
Fortunately, our random \var{\_protname} implementation will only permit
returning data for the protected members of that particular instance. Had we not
used random names, there is a chance that an object could be passed to
\func{getProtectedProps()} and have its protected properties
returned.\footnote{Details depend on implementation. If a global protected
property name is used, this is trivial. Otherwise, it could be circumstantial
--- a matching name would have to be guessed, known, or happen by chance.} As
such, this property exploit is minimal and would only hurt that particular
instance. There could be an issue if supertypes contain sensitive protected
data, but this is an implementation issue (sensitive data should instead be
private).
Methods, however, are a more considerable issue. Since the object exposed via
\func{def()} is \emph{shared} between each of the instances, much like its
parent prototype is, it can be used to exploit each and every instance (even if
the reader has amended \jsref{lst:prot-share} to resolve the aforementioned
protected member addition bug, since \code{Object.getPrototypeOf()} can be
used to work around this amendment). Someone could, for example, reassign
\code{Stack[ \_protname ].getStack} to do something else;
\code{Object.defineProperty()} in \jsref{lst:prot-share} only made the reference
\code{Stack[ \_protname ]} read-only. The object to which it refers, however, can be
modified. This can be amended by using \code{Object.defineProperty()} for each
and every protected method, which is highly verbose and cumbersome.
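The distinction between a read-only reference and a read-only object can be
seen in the short sketch below. The helpers \func{protect()} and
\func{tryHijack()} and the fixed name \var{\_protname} are hypothetical and
exist only for this illustration. The sketch also shows ES5's
\code{Object.freeze()} as one less verbose way to lock every member of the
protected method object at once; this is offered as an assumption about what
may suit the reader, not as part of \jsref{lst:prot-share}, and it shares the
pre-ES5 limitations of \code{Object.defineProperty()}.

```javascript
// hypothetical fixed name; the text would use a random one
var _protname = '__prot_demo';

// define a read-only reference to the protected methods on ctor,
// optionally freezing the object that the reference points to
function protect( ctor, methods, freeze )
{
    Object.defineProperty( ctor, _protname, {
        value: ( freeze ) ? Object.freeze( methods ) : methods
    } );
}

// attempt to replace a protected method, then report which one runs
function tryHijack( ctor )
{
    try
    {
        ctor[ _protname ].getStack = function() { return 'hijacked'; };
    }
    catch ( e )
    {
        // strict mode code throws a TypeError for frozen objects;
        // non-strict code silently ignores the assignment
    }

    return ctor[ _protname ].getStack();
}

var Open   = function() {},
    Frozen = function() {};

protect( Open,   { getStack: function() { return 'original'; } }, false );
protect( Frozen, { getStack: function() { return 'original'; } }, true );

var open_result   = tryHijack( Open ),    // 'hijacked'
    frozen_result = tryHijack( Frozen );  // 'original'
```

Note that \code{Object.freeze()} is shallow and also prevents subtypes from
adding members to the same object, so it fits best when each type receives its
own protected method object.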
Once we rule out the ability to modify protected method definitions,\footnote{Of
course, this doesn't secure the members in pre-ES5 environments.} we still must
deal with the issue of having our protected methods exposed and callable. For
example, one could do the following to gain access to the private \var{stack}
object:
\begin{verbatim}
( new ExploitedStack() ).getProtectedMethods()
.getStack.call( some_stack_instance );
\end{verbatim}
Unfortunately, there is little we can do about this type of exploit besides
either binding\footnote{See \func{Function.prototype.bind()}.} each method call (which
would introduce additional overhead per instance) or entirely preventing the
extension of our prototypes outside of our own library/software. By creating a
protected API, you are exposing certain aspects of your prototype to the rest of
the world; this likely breaks encapsulation and, in itself, is often considered
a poor practice.\footnote{\code{Stack.getStack()} breaks encapsulation because
it exposes the private member \var{stack} to subtypes.} An alternative is to
avoid inheritance altogether and instead favor composition, thereby evading this
issue entirely. That is a pretty attractive concept, considering how verbose and
messy this protected hack has been.
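As a rough illustration of the composition alternative, the sketch below
rebuilds \var{MaxStack} around a \var{Stack} instance rather than extending it.
Both constructors are simplified stand-ins for those in
\jsref{lst:prot-share}: private data is held in closures, and \var{MaxStack}
keeps its own count so that it needs nothing beyond the public \func{push()} and
\func{pop()} methods of \var{Stack}. No protected members are required.

```javascript
// simplified stand-in for Stack; items is private to each instance
var Stack = function()
{
    var items = [];

    this.push = function( item ) { items.push( item ); };
    this.pop  = function() { return items.pop(); };
};

// MaxStack *has a* Stack rather than *being* one
var MaxStack = function( max )
{
    var stack = new Stack(),
        count = 0;

    this.push = function( item )
    {
        if ( count >= max )
        {
            throw Error( "Maximum reached." );
        }

        stack.push( item );
        count++;
    };

    this.pop = function()
    {
        if ( count > 0 )
        {
            count--;
        }

        return stack.pop();
    };
};

var max = new MaxStack( 2 );
max.push( "foo" );
max.push( "bar" );
// max.push( "baz" ) would throw an Error here
```

The trade-off is that a \var{MaxStack} built this way is no longer an
\code{instanceof} \var{Stack}, and every public method that should be exposed
must be forwarded explicitly.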