@c This document is part of the Liza Data Collection Framework manual.
@c Copyright (C) 2017 R-T Specialty, LLC.
@c
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3
@c or any later version published by the Free Software Foundation;
@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
@c Texts. A copy of the license is included in the section entitled ``GNU
@c Free Documentation License''.
@node Bucket
@chapter Bucket
@helpwanted
@menu
* Value Assignment:Bucket Assignment. Writing data to the Bucket.
* Bucket Diff:: Representing bucket changes.
* Calculated Values:: Dynamic data derived from other values.
* Metabucket:: Bucket holding document metadata.
@end menu
@c TODO
@node Bucket Assignment
@section Bucket Value Assignment
@helpwanted
@node Bucket Diff
@section Bucket Diff
@cindex Bucket diff
Changes to the bucket are represented by an array with certain conventions:
@enumerate
@item A change to some index@tie{}@math{k} is represented by the same
index@tie{}@math{k} in the diff.
@item A value of @code{undefined} indicates that the respective index
has not changed.
Holes in the array (indexes not assigned any value) are treated in
the same manner as @code{undefined}.
@item A @code{null} in the last index of the vector marks a
truncation@tie{}point;
it is used to delete one or more indexes.
The vector will be truncated at that point.
Any preceding @code{null} values are treated as if they were
@code{undefined}.@footnote{
The reason for this seemingly redundant (and inconvenient) choice
is that JSON encodes @code{undefined} values as @code{null}.
Consequently, when serializing a diff, @code{undefined}s are lost.
To address this,
any @code{null} that is not in the tail position is treated as
@code{undefined}.
We cannot truncate at the first null,
because @samp{[null,null,null]} may actually represent
@samp{[undefined,undefined,null]}.}
@end enumerate
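The effect of JSON serialization described in the footnote can be
observed directly.
The following is a minimal sketch, not Liza's implementation;
@code{normalizeDiff} is a hypothetical helper introduced here only to
illustrate how every @code{null} outside the tail position may be read
back as @code{undefined}.

```javascript
// Hypothetical helper (not part of Liza): restore a diff after JSON
// decoding.  Any null that is not in the last index is treated as
// undefined; a null in the last index is kept as the truncation point.
function normalizeDiff(diff) {
    const last = diff.length - 1;

    return diff.map((value, i) =>
        (value === null && i !== last) ? undefined : value
    );
}

// JSON has no representation for undefined, so undefined array
// elements are serialized as null.
const wire = JSON.stringify([undefined, undefined, null]);
// wire === '[null,null,null]'

// After decoding, only the tail null still means "truncate here".
const restored = normalizeDiff(JSON.parse(wire));
// restored is [undefined, undefined, null]
```

Because the serialized forms of @samp{[undefined,undefined,null]} and
@samp{[null,null,null]} are identical,
only the tail position can be trusted to carry a real @code{null}.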
Diffs are only tracked at the vector (array of scalars) level@mdash{
}if there is a change to a nested structure assigned to an index,
that index of the outer vector will be tracked as modified.
Nested values are nevertheless compared recursively to determine whether a
modification @emph{has taken place}.@footnote{
See @srcrefjs{bucket,StagingBucket} method @code{#_parseChanges}.}
Examples appear in @ref{f:diff-ex}.
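Vector-level tracking with recursive comparison might be sketched as
follows.
This is an illustration under stated assumptions and is not the logic
of @code{#_parseChanges};
@code{deepEqual} and @code{diffVector} are hypothetical names,
and values are assumed to be scalars or nested arrays.

```javascript
// Hypothetical deep comparison for scalars and nested arrays.
function deepEqual(a, b) {
    if (Array.isArray(a) && Array.isArray(b)) {
        return a.length === b.length
            && a.every((item, i) => deepEqual(item, b[i]));
    }

    return a === b;
}

// Sketch: produce a diff of `updated` against `original`.  Unchanged
// indexes are left undefined; a changed index carries the entire new
// value, even when only a nested element differs.
function diffVector(original, updated) {
    const diff = updated.map((value, i) =>
        deepEqual(original[i], value) ? undefined : value
    );

    // A shorter vector is expressed with a tail null (truncation point).
    if (updated.length < original.length) {
        diff.push(null);
    }

    return diff;
}

const nestedDiff = diffVector([["a", "b"], "bar"], [["a", "c"], "bar"]);
// nestedDiff[0] is the whole nested value ["a", "c"];
// nestedDiff[1] is undefined because index 1 did not change.
```

The nested array is compared element by element to decide whether
index@tie{}0 changed,
but the diff records the change only at the outer index.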
@float Figure, f:diff-ex
@multitable @columnfractions 0.30 0.30 0.40
@headitem Original @tab Diff @tab Interpretation
@item @samp{["foo", "bar"]}
@tab @samp{["baz", "quux"]}
@tab Index@tie{}0 changed to @samp{baz}.
Index@tie{}1 changed to @samp{quux}.
@item @samp{["foo", "bar"]}
@tab @samp{[undefined, "quux"]}
@tab Index@tie{}0 did not change.
Index@tie{}1 changed to @samp{quux}.
@item @samp{["foo", "bar"]}
@tab @samp{[, "quux"]}
@tab Index@tie{}0 did not change.
Index@tie{}1 changed to @samp{quux}.
@item @samp{["foo", "bar"]}
@tab @samp{["baz", null]}
@tab Index@tie{}0 changed to @samp{baz}.
Index@tie{}1 was removed.
@item @samp{["foo", "bar", "baz"]}
@tab @samp{[undefined, null]}
@tab Index@tie{}0 was not changed.
Index@tie{}1 was removed.
Index@tie{}2 was removed.
@item @samp{["foo", "bar", "baz"]}
@tab @samp{[null, undefined, null]}
@tab Index@tie{}0 was not changed.
Index@tie{}1 was not changed.
Index@tie{}2 was removed.
@item @samp{["foo", "bar", "baz"]}
@tab @samp{[null, null, null]}
@tab Index@tie{}0 was not changed.
Index@tie{}1 was not changed.
Index@tie{}2 was removed.
@end multitable
@caption{Bucket diff examples.}
@end float
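The interpretations in @ref{f:diff-ex} can be reproduced with a short
function.
The sketch below is not taken from Liza;
@code{applyDiff} is a hypothetical helper that applies the diff
conventions to a copy of the original vector.

```javascript
// Hypothetical helper (not Liza's implementation): apply a diff to a
// vector following the bucket diff conventions, returning a new array.
function applyDiff(original, diff) {
    const result = original.slice();
    const last = diff.length - 1;

    for (let i = 0; i <= last; i++) {
        const value = diff[i];

        if (value === null && i === last) {
            // Tail null: truncate the vector at this index.
            result.length = Math.min(result.length, i);
        } else if (value !== undefined && value !== null) {
            // Holes read as undefined, and non-tail nulls are treated
            // as undefined, so only real values are assigned.
            result[i] = value;
        }
    }

    return result;
}

const changed   = applyDiff(["foo", "bar"], [, "quux"]);
const truncated = applyDiff(["foo", "bar", "baz"], [undefined, null]);
const tailOnly  = applyDiff(["foo", "bar", "baz"], [null, null, null]);
// changed   is ["foo", "quux"]
// truncated is ["foo"]
// tailOnly  is ["foo", "bar"]
```

Returning a copy keeps the sketch free of side effects;
whether the real bucket mutates in place is not something this example
attempts to show.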
Diffs are generated by @srcrefjs{bucket,StagingBucket}.
@code{null} truncation is understood both by @code{StagingBucket}
and by @srcrefjs{bucket,QuoteDataBucket}.
A diff is applied to the underlying bucket by invoking
@code{StagingBucket#commit}.
@node Calculated Values
@section Calculated Values
@helpwanted
@node Metabucket
@section Metabucket
@cindex Metabucket
The @dfn{metabucket} is a loosely-structured key/value store
separate from the data bucket.@footnote{
It is stored in the @code{meta} field on the Mongo document.}
It should be used to store data that must be accessible to the server
but never to the client.
The client has no means by which to access the metabucket.
Custom fields can be populated by server-side DataAPIs
(@pxref{Server-Side Data API Calls}).
Any fields prefixed with the string @samp{liza_} are reserved and are
populated automatically by the Server.
They are shown in @ref{t:liza-meta}.
@float Table, t:liza-meta
@table @code
@cindex Initial rated date
@item liza_timestamp_initial_rated
A Unix timestamp representing the first time a document was acted
upon by a rating service.
This value is set once and is never updated or cleared.
@end table
@caption{Metabucket fields populated automatically by the Server}
@end float