Runtime Dependency Tracking with Causality

Runtime Dependency Tracking
Using the Causality Design Pattern

Written 06/08/1999.
Revised 03/15/2000.
©1999 and ©2000 by Eliezer S. Yudkowsky

[This was an article, originally written for a programming journal, which I never got around to submitting. It appears here as part of the information on Flare. I've found the Causality design pattern to be extremely useful, and it will almost certainly be included in Flare.]

Pattern:

Consider the problem of change propagation in a system where the network of dependencies is not known to the programmer - where even the general pattern of dependencies is not known to the programmer. Perhaps the user puts active objects together in arbitrary order; or the code is part of a library that will be used by complete strangers; or the program is just too huge for any one human to understand, and continually changing as well (role-playing games, tax codes, billing systems).

The Causality design pattern solves this problem using three independent sets of local, modular declarations: Declare the result being computed; declare reliances on data; and declare changes to the data. The system tracks dependencies during the computation, creating links between the relied-on data and the currently computing object. After the computation is finished, the system has a record it uses to propagate changes in any piece of data to all dependent objects.

Walkthrough:

Example 1 shows how Causality can be used to implement a many-to-many Model-View-Controller system in nine lines of code. At this point in the program, the user is editing an instance of the class MyObject; this is the Model. Any editor that targets the Model displays the title of the object, perhaps as a window title or a caption. In this case, one class of editor is GUIObjectViewer; the instances of this class are the Viewers. Finally, some editors create a slider control that operates on the "mValue" instance member of the Model; these sliders are the Controls.

Example 1:

void GUIObjectViewer::Compute ()
{
   // A1: Make us the "currently computing object".
   globalEffectStack.push(this);

   // A2: Ask our Model to compute a user-readable title,
   // copying the value into one of our instance members.
   targetedObject->GetTitle(mCaptionString);

   // A5: Tell the UI system that this component needs
   // to be redrawn during the next update.
   RequestRedraw();
   
   // A6: Mark our causal state as "valid".
   SetEffectValid(true);
   
   // A7: Restore previous "currently computing object".
   globalEffectStack.pop();
   
} // end GUIObjectViewer::Compute

void MyObject::GetTitle (String& outString)
{
   // A3: Print our class and the value of an instance member.
   outString = "MyObject, value=" + tostr(mValue);

   // A4: Tell the global causality system that someone has just
   // performed a computation which relies on a certain area of memory.
   Causality::RelyOnPointer(&mValue);
   
} // end MyObject::GetTitle

void GUISliderControl::UserSetValue (float inNewValue)
{
   // A graphical slider control (which acts on an arbitrary
   // pointer to a floating-point number) is manipulated by the user,
   // resulting in a call to UserSetValue with the new value.

   // B1:  The actual number is set to the new value.
   *mValuePtr = inNewValue;

   // B2: Tell the global causality system that someone has just
   // altered a certain area of memory.
   Causality::AffectPointer(mValuePtr);    

} // end GUISliderControl::UserSetValue

GUIObjectViewer inherits from Effect. Effect is the base class for objects that can perform causally dependent computations, and the base class for objects that can be marked as invalid by the causality system. Compute() is the method that performs the computation; it is declared as a virtual method in the parent Effect class so that the causality system can queue invalidated objects and automatically call the Compute() method.

MyObject inherits from the generic Model class that defines the GetTitle() method, which computes the human-readable title. GUIObjectViewer can target any descendant of Model. MyObject doesn't need to inherit from any other class.

GUISliderControl doesn't need to inherit from any other class. In a many-to-many editing system it would probably inherit from Effect, so that changes to the model made by other editors would be reflected in the slider display.

The first flow of control, "A", begins in GUIObjectViewer::Compute(). This method will usually be called in one of two cases:

The GUIControlViewer, when asked to draw itself, checks to see if its state is currently valid, and if not, calls the Compute method to update itself.
The GUIControlViewer, when invalidated by the causality system due to a change in relied-on data, is added to a global queue of Effects. At some later point, the Causality package calls the Compute() method on each object in the queue; or rather, it calls the Compute() method on each object which remains in an invalid state.

The first statement in the Compute method, statement A1, pushes the GUIObjectViewer instance onto a global stack of pointers to Effect instances. When any reliance is noted, the causality system will attach that reliance to the top item on the stack. After pushing itself onto the stack, the Viewer instance will be the "currently computing object" unless it asks another Effect, in an invalid state, for a value which requires that Effect to call Compute() and validate itself, in which case that sub-Effect becomes the "currently computing object" until the sub-Effect finishes computing, pops itself off the global Effect stack and restores the Viewer instance as the current computing object.

This is admittedly not a robust method. Later in the article I talk about multithreading and exception handling. The code for a sample Causality package, available online, handles multithreading transparently, behaves properly in the case of exceptions thrown across Compute() methods, detects infinite recursion, maintains correct queue state despite multiply re-validated or re-invalidated objects, and uses separate Compute() and ComputeSelf() methods to avoid giving the programmer the chance to forget the call that marks the Effect as valid (a frequent headache from my early days).

Statement A2 begins the actual computation - it asks the targeted Model, which happens to be an instance of MyObject, to produce a user-readable title for the Viewer's caption. Note that this call does not have to be direct; because the causal variables are global, this call can pass through any number of intervening, non-Causality-aware functions. This is a very important characteristic which preserves encapsulation (and hence, modularity), much like C++'s exception-handling feature.

Statement A3 is where the actual dependency occurs. The title-string produced by the MyObject instance is dependent on the value of the instance member mValue.

Statement A4 declares this dependency by invoking the Causality::RelyOnPointer() function on a pointer to the instance variable "mValue". (Note that the dependency is on a pointer to the instance variable, not a pointer to the Model.) Causality::RelyOnPointer() retrieves a pointer to the "currently computing Effect" - globalEffectStack.top() - in this case the Viewer. The causality system then retrieves or creates an entry for the relied-on pointer - in this case, a pointer to "mValue" - in a global hashtable. The Effect (i.e., the Viewer) is added to the Effect list for the relied-on pointer, and the relied-on pointer is added to the Effect's (i.e., the Viewer's) list of reliances. (These two-way references prevent things from getting hosed when Effects are deleted, or when their dependencies are restructured.) In effect, a dependency has been noted between an address in memory and the Viewer. GetTitle() then returns.

Statement A5 is a UI call that invalidates the screen area occupied by the Viewer, causing a redraw during the next update. Note that the redraw does not occur immediately; an immediate redraw would probably check whether the Effect was in a valid state before drawing; since statement A6 has not yet occurred, this would cause another call to Compute() and infinite recursion would result. (This was another frequent "Gotcha!" in my early days; a robust implementation catches the error and throws an exception.)

Statement A6 marks the causal state as valid. (Forgetting to do this is a hard-to-detect bug, which is why a robust implementation relegates this action to the outer Compute() method instead of the inner, overridable ComputeSelf() method.)

Statement A7 pops the Viewer off the global effect stack, so that we are no longer the "currently computing object". If an exception is thrown, statement A7 will never be called, which is why a robust implementation uses a stack-based object whose constructor pushes the Viewer on the stack and whose destructor takes the Viewer off the stack; the destructor will be called even if an exception is thrown.

The second flow of control, "B", occurs whenever the user manipulates a slider, the Control.

Statement B1 is the actual alteration - a pointer to a floating-point number is dereferenced and set to the new value input by the User.

Statement B2 declares the alteration to the Causality package. Causality::AffectPointer() looks up the pointer in the global hashtable described earlier. If any dependent Effects are found, the change is propagated to them. In this case, the Viewer's Action() method is called, which, in the default implementation, marks the Viewer's state as invalid and adds it to the global queue described earlier. When the queue is processed, the Viewer's Compute() function will be called and flow "A" will begin, causing the string to be recomputed and the caption to update.

Encapsulation is preserved at every point. The Control is designed to operate on a generic float - it is not aware that the float is part of the Model. The Viewer displays a generic model's title; it is not aware that this particular Model's title is dependent on an instance member. The Model computes a title without being aware that the title is being used by the Viewer, or that the instance member is being modified by the Control. The controller class, model class, and viewer class could in principle have been designed by three independent programmers.

The elements are modular, and the network self-organizes. The editing system is automatically integrated with all the other editing systems in the application. (Some commercial applications have to think for a while before one window notices what's going on in another window! A unified Causality package gives the user instant feedback without extra code.)

Implementation challenges:

Hacking up a Causality package shouldn't take too long, but a robust, full-featured, fast, general, extensible, and pervasive Causality system will encounter a number of challenges. The problem is handling these challenges in such a way as to preserve encapsulation and simplicity - keeping the complexity in the library instead of burdening the application programmer. To cite two examples:

Dynamic scoping:

A Causality package needs dynamically scoped variables (as in Perl's "local(var)" declaration), since one Compute() call may cause further Compute() calls. For example, if Effect X computes using a value provided by Effect Y, then X->Effect() will probably call Y->Value(). Y->Value() will probably call Y->CheckValid(), which will check if Y's state is valid, and if not, call Y->Compute().

Thus the "currently computing object" is not a single value, but a stack; one object computing may cause another to compute. Furthermore, different threads may have different "currently computing objects", and thus, in C++ at least, you need a way to store thread-local "global" variables. The same holds true of the Change objects that are used to convey information for action-based causality (see below). One change may cause another change, creating a stack; also, different threads may contain different changes. See the sample Causality package for a MacOS implementation.

If you're hacking a quick Causality package and you don't need to dance around legacy code, you can duplicate dynamic scoping by passing the Causality globals as an argument to just about every function you'll ever write for the rest of your days. I don't recommend it.

Action-based causality:

Suppose you have a list X of integers, and a list Y where Y[i] == (X[i] + 1). How can you maintain this correspondence using Causality? In two ways: First, each time an integer is inserted into X, you can recompute the entire Y list. This is "data-based" or invalidate-and-recompute causality. Second, each time X changes (a move, an insert, a delete), the same structural change is made to Y; new or changed items in X invalidate the corresponding items in Y. This is "action-based" causality, duplicating the action on the dependent item.

For action-based causality you need three things: First, a Causality framework where an Action method is called on each dependent object within the scope of the function that made the original alteration.. Second, a dynamically scoped stack of polymorphic Change objects, to convey information about the alteration. Third (for list causality), a standard set of list actions that can be stored in a Change object. See the sample Causality package for an implementation.

Action-based causality looks faster than data-based causality, since you only have to recompute a few items, but you should bear in mind both Causality overhead and the cost of programming. I first implemented a Change system for lists, not to save recomputation time, but because the unnecessary invalidation of all list items was breaking some correspondences. Use action-based rules to change the program structure; don't spend a week saving a few milliseconds. (On the other hand, once I had a generalized action-based system, writing and maintaining causally dependent lists became considerably more intuitive.)

Maintaining any action-based correspondence more complicated than lists rapidly becomes extraordinarily difficult. The most intricate bloody piece of code I have ever written maintained a perfect correspondence between generalized high-level context-sensitive user-editable representations and generalized low-level C++ objects. Would I do it again? No.

Related Patterns:

Attachment Pattern:

The Attachment pattern, also known as Observer/Observable or Listener/Broadcaster, can be viewed as a special case of the Causality pattern. For example, the JavaBeans specification allows a PropertyChangeListener to hear any event affecting a bound property of a JavaBean. However, the Listener must still bind itself to the specific object and property - there is no "current listener" to which functions can pipe dependencies.

JavaBeans implement the "Affect" set of statements, but bundle together the "current computing object" and "RelyOn" statements, thus decreasing modularity. The result is one-way Causality. A JavaBean doesn't need to know in advance who's listening, but a Listener still needs to know who to listen to; an Attachable doesn't need to know about specific Attachments in advance, but the Attachment has to be designed for the specific Attachable. One-way innocence is enough to enable sharing of attachments, but true reusability requires no entanglements in either direction.

Situation Pattern:

The Causality pattern is an instance of the powerful Situation design pattern. A Situation occurs when a complex piece of data is assembled by the cooperative effort of many smaller functions (or objects), calling each other in arbitrary order, in such a way that any function can modify any aspect of the Situation - Situations are globally accessible. Situations are the reason I keep reusing my dynamic scoping package.

The primary advantage of a Situation is that the functions assembling it do not have to be aware of what other functions are doing. Trying to implement a text-based role-playing game (what I was doing when I discovered the Situation pattern, way back when) as a prestructured set of procedures inevitably collapses in checking every possible spell or magic amulet each time damage is computed. Making the outcome of an action a Situation allows the ubiquitous modifiers to have their day without every function being aware of all possible modifiers.

The Causality pattern consists simply of viewing the dependency network as a Situation, thus allowing it to be assembled by self-organization at runtime.

Conclusion:

What is a "higher-level language"? There are many benefits that accompanied the transition from machine code to assembler, from assembler to procedures, from procedures to objects. These benefits include better encapsulation, increased modularity, and concise code. All these benefits are manifestations of a single truth: Higher-level is closer to the human mind. The human mind intuitively tracks dependencies in our mental representation of cause and effect; the Causality package provides a crystalline imitation for computer programs.

I have found that using the Causality pattern can provide the same pure benefits associated with object-oriented language features, particularly exceptions. Like exceptions, causality can make local, concise, and modular what was previously global, tedious, and intertwined. Like exceptions, causality replaces vast amounts of argument-passing, intervening-function-rewriting, and special-case-checking. The result: Faster implementation, greater reuse, cleaner code. Perhaps, like exceptions, causality will someday become a part of most object-oriented languages.

Runtime Dependency Tracking Using the Causality Design Pattern