The words "Level Zero", "Level One", and so on, refer to a notion I had for specifying various levels of Flare conformance. A very early implementation might be "A conforming Flare Three interpreter with features up to Flare Eight"; a production version might be a "conforming Flare Seven interpreter". It has nothing to do with the old "Flare Zero", "Flare One", "Flare Two" milestones from 1999. "Level Zero" means that you can't leave this out and still have something which is "Flare" in any real sense of the word.
These features are currently in no particular order. They may eventually be sorted into categories.
Not written by Dmitriy Myshkin.
By default, all metadata has the ability to recognize special elements of the form <p-something>, where "something" is the name of the plane. A module may define a new plane or state that it operates in an existing plane. Within a module operating in a plane, code of the form foo.%bar automatically accesses the local plane. A module operating in a plane may define new (static) subelement types permitted in that plane, and otherwise modify the metadata of the plane <p-something>.
Possible recommendation for implementation: All the metadata of all the basic types inherit from a common ultimate base class that recognizes planes as subelements. This permits planes to be unrecognized in unusual cases, such as an XML variable type intended to contain arbitrary "untrusted" XML data. Use of the expression foo.("p-something").bar should still conceptually work as well as foo%something.bar; it may violate a program invariant but it should still work in terms of the underlying machinery. I think.
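As a toy model of the machinery above (not real Flare syntax; the dict representation of elements is purely an assumption of this sketch), planar access can be pictured as ordinary subelement access through a "p-"-prefixed tag:

```python
# Toy model of planar access: foo%something.bar reaches subelement
# "bar" of the plane element <p-something> inside foo. Elements are
# modeled as plain dicts here; that representation is assumed.
def planar_get(element, plane, tag):
    return element["p-" + plane][tag]

foo = {
    "p-debug": {"bar": 42},   # the <p-debug> plane of foo
    "bar": 1,                 # foo's ordinary <bar> subelement
}

print(planar_get(foo, "debug", "bar"))  # plane access, like foo%debug.bar
print(foo["bar"])                       # ordinary access, like foo.bar
```

Under this model, foo.("p-something").bar and foo%something.bar are trivially the same lookup, which is the underlying-machinery equivalence described above.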
All tags of the form "X-foobar", a single letter followed by a dash, are reserved for use as special names. (This includes the case of "p-something" indicating planes.) This reservation applies to variable names, property names, method names, annotation names, and so on.
All metadata (almost all) inherit from a common base type that recognizes planes as subelements. Alternatively, all metadata marked as "I recognize planes" have planes added to them at runtime. (May make a difference to the language - language decision.)
FlareCode codon for "planar access". A special codon that directly corresponds to '%' is probably preferable to automatically translating foo.p-something.bar to foo%something.bar. A special codon that directly corresponds to '.%' is probably preferable to automatically translating foo%our-plane.bar to foo.%bar.
FlareSpeak keywords and operators for: Saying which plane a module is in; defining new subelements for that plane; possibly altering the plane's metadata in other ways. Should look much like the idioms for defining/altering a class.
FlareCode codons for doing above. Again, should mostly look like the standard idioms for defining metadata.
Every Flare element has metadata associated with it. Subelements are assigned metadata in a way specified by the home element, or rather, the home element's metadata. The home element associates metadata with tags, i.e., the subelement identifiers. All tags named <something> must be assigned the same <something> metadata. The home element may declare a default metadata to be assigned to all subtags not otherwise specified. This default metadata will almost always be of class <var>, and this will probably already be declared in classes <num>, <str>, and so on. <struct> might leave it open, though. Annotating, in the absence of default metadata, without a locally specified class (if permitted), is an error.
Recommendation: That the element named <something> located in <foo>, foo.something, have the metadata contained or referenced in foo^.m-something, or the metadata contained in foo^.default if no element <m-something> exists, or cause an invariant violation if no element named default exists.
When a new subelement is added, or when a series of subelements are read in, metadata must be assigned on the basis of that subelement's tag, by reference to the metadata of the home element.
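The recommended lookup rule can be sketched as follows (a dict-based model; the "m-" prefix, the default entry, and the invariant violation come from the recommendation above, while the function and class names are invented):

```python
# Sketch of metadata assignment by tag: element <something> in foo
# gets foo^.m-something, else foo^.default, else an invariant violation.
class InvariantViolation(Exception):
    pass

def metadata_for(home_metadata, tag):
    """home_metadata models the home element's metadata as a dict."""
    specific = home_metadata.get("m-" + tag)
    if specific is not None:
        return specific
    default = home_metadata.get("default")
    if default is not None:
        return default
    raise InvariantViolation("no metadata assignable to tag <%s>" % tag)

head_meta = {"m-eye": "eye metadata", "default": "var metadata"}
print(metadata_for(head_meta, "eye"))    # the specific entry wins
print(metadata_for(head_meta, "nose"))   # falls back to default
```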
Prerequisite: Means of defining classes and so on.
Quoting an expression, as in `a + b`, results in a miniature Flare codelet - that is, a fragment of FlareCode that can be passed around. Conversion to a Value results in the expression being evaluated; the expression is reevaluated each time a Value is needed. When treated as an Operand, the expression itself is passed around. The transparency idiom is similar to that for references. (Need to write more about transparency, Values, and Operands.)
"Expression" may be another kind of basic type. It may be that expressions, generators, and functions are all the same basic type, except that functions (and generators?) are opaque by default and expressions are transparent by default. Like the type Reference, the expression type would be transparent in Value contexts and opaque in Operand contexts.
Dual quoting (that is, a quoted expression inside a quoted expression) - a FlareSpeak idiom for this is probably not needed, since it would be plain old lousy code. Use subfunctions instead. It might also be possible to write:
a + `b + c`
FlareSpeak idiom for ``
Ability to convert a FlareCode reference to an expression value into the corresponding quoted FlareSpeak expression.
Expression/function/FlareCode basic type, probably called "code" (?)
Have `+` or `+ 3` evaluate to a fragment of FlareCode, such as <add><left><unbound/></left><right><num>3</num></right></add>.
Another way to do this might be to have standard placeholder keywords; i.e. `$1 + 3` evaluates to <add><left><lambda_1/></left><right><num>3</num></right></add>.
`2` + `+ 3` yields `2 + 3`.
All three features are intended for use in code that will actually build up complex pieces of FlareCode through direct manipulation. There doesn't need to be any quick way to bind a lambda'd expression to a variable - this should be done using subfunctions or through self-modifying code, not as a way to make prosaic code less readable.
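The Value/Operand behavior of a quoted expression can be modeled like this (the re-evaluation-per-Value rule is from the text; the class and method names are assumptions of the sketch):

```python
# Sketch: a quoted expression is re-evaluated each time a Value is
# needed, but is passed around unevaluated when treated as an Operand.
class Quoted:
    def __init__(self, thunk):
        self.thunk = thunk          # the unevaluated expression body
    def value(self):
        return self.thunk()         # "Value context": evaluate now

b = [2]
expr = Quoted(lambda: 1 + b[0])     # plays the role of `1 + b`
print(expr.value())                 # evaluated now: 3
b[0] = 10
print(expr.value())                 # re-evaluated against new state: 11
```

Passing expr itself around corresponds to Operand context; calling value() corresponds to conversion to a Value.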
Functions, inside a method, that can be called without adding a new method call to the stack - possibly even called within the current block. A section of FlareCode, contained inside a semistatic local variable, which is interpreted directly when called, rather than a new method call being added to the stack. ("Semistatic local variable": Having a default value, but locally overridable by assignment - parenting rules.)
Recommendations: A subfunction which has no scope at all - which is executed directly inside the current block - should not accept any arguments, since these arguments would need to be newly defined variables.
FlareSpeak keyword for setting aside a subfunction inside a function block. Probably similar to the way functions (methods) are defined inside a class block.
Some way to have subfunctions with arguments that are assigned to variables inside the block. This is mostly what distinguishes a subfunction from a quoted expression.
Prerequisite: Distinction between functions and methods.
The ability to pass a subfunction or a quoted expression outside its originating method call, and have any invocation of that subfunction or quoted expression refer to local variables contained in the originating method call.
Closure should occur by default so that interesting expressions can be passed as arguments.
Have a way to mark a subfunction as having scope "closure block", "closure", "block", or "none". Have a way to mark a quoted expression as having scope "closure" or "none". Both subfunctions and quoted expressions should have closure by default. In FlareSpeak, a quoted expression with no closure should be defined inside an "expression" block, rather than through the `a + b` idiom.
A subfunction or a quoted expression with closure should contain a hard reference to the current method call when referenced or created. Thus, passing a subfunction or expression with closure outside the method call will result in an error if the reference is still around when the method pops off the stack.
This may be implemented by parenting, use of location, and the distinction between a method parent that contains a reference to a subfunction or expression, and a method parent that contains the body of the subfunction or expression directly. However, this is just an idea.
If a reference to a subfunction/expression with closure still exists when a method call pops off the stack, an error occurs, just as if any other reference to a variable with automatic scope still existed.
FlareSpeak and FlareCode idioms for describing nondefault closure.
FlareCode evaluation which is sensitive to the "home" of the executed FlareCode, or which can "jump" to the home context of the FlareCode when executing an expression or subfunction; this includes transparent execution of a "code" type.
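A rough model of closure over the originating method call (hypothetical names throughout; note that per the text, real Flare would raise the error when the method call pops while a reference survives, whereas this simplified sketch only checks at invocation time):

```python
# Sketch: a subfunction with closure holds a hard reference to its
# originating method call; using it after that call is gone is an error.
class MethodCall:
    def __init__(self):
        self.locals = {}
        self.on_stack = True        # set False when popped off the stack

class Subfunction:
    def __init__(self, frame, body):
        self.frame = frame          # hard reference to the method call
        self.body = body
    def invoke(self):
        if not self.frame.on_stack:
            raise RuntimeError("closure outlived its originating call")
        return self.body(self.frame.locals)

call = MethodCall()
call.locals["x"] = 5
sub = Subfunction(call, lambda env: env["x"] * 2)
print(sub.invoke())                 # 10: reads the originating locals
call.on_stack = False               # the method call pops
# sub.invoke() would now raise RuntimeError
```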
Instead of having a single main() function, permit any file to contain static code which will be executed when the program runs. Python does this, but without initialization priorities, which are a rather important feature: without them it is hard to tell whether static code is there for setup or for application logic, and it is very hard in practice to write an application with distributed main threads.
Given initialization priorities, and a clean, clear way to distinguish between setup actions and application logic actions, and plenty of innocent code, the existence of static application code is a major contributor to the ability to drag in a module and automatically have that functionality added to the application. With static code, the "main" function or equivalent thereof can be innocent of functionality added by drag-and-drop. For example, a module drag-and-dropped in can have static code which registers tools in a repository, or takes periodical actions, or opens a window displaying memory usage, and so on, without main() needing to know about it.
A way for FlareCode files to contain, in addition to classes, statements (FlareCode blocks) which are called automatically at some point after the file is parsed.
A FlareSpeak idiom for writing such static statements inside a file/module.
For the purpose, see "static code" above.
Let priority be a floating-point number. There are other uses for priority, and usually priority "zero" will be standard/default. Higher priorities execute sooner. For initialization priorities, "0" is application main logic. Priority "10" is the standard for application setup. Class initialization has priority "20" by default.
There should be some way to execute actions that point out other application files that need reading.
FlareSpeak/FlareCode idioms for annotating a priority on static code, and for that matter, on class blocks (in case static code wants to use a class block and therefore needs the class to initialize earlier, or in case some classes need to init before others).
Some way to resolve circular dependencies (class A wants to declare a subelement that is a statically typed reference to class B, and vice versa). Recommendation: Metadata objects be created and named (as a very early action) before the metadata content is processed.
A central subsystem that accumulates initialization/running actions, sorts them by priority in descending order, and executes them in order.
Level Seven: A way to automatically execute a static codelet or initialize a class, from code, out of the normal execution order, without the code being run a second time. The ability to run initialization actions with identical priorities in parallel.
Optional: A program invariant capable of complaining if application logic is dependent on the order in which items with identical priorities are executed. (Also holds true of interceptions and so on.)
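The central subsystem can be sketched as a registry that sorts accumulated actions by descending float priority (the 20/10/0 values come from the text above; the class and method names are invented):

```python
# Sketch of the init subsystem: accumulate actions, sort by priority
# in descending order, execute in order. Higher priority runs sooner.
class InitRegistry:
    def __init__(self):
        self.actions = []
    def register(self, priority, action):
        self.actions.append((priority, action))
    def run(self):
        for priority, action in sorted(self.actions,
                                       key=lambda pair: -pair[0]):
            action()

log = []
registry = InitRegistry()
registry.register(0.0, lambda: log.append("main logic"))    # app logic
registry.register(20.0, lambda: log.append("class init"))   # classes
registry.register(10.0, lambda: log.append("app setup"))    # setup
registry.run()
print(log)   # ['class init', 'app setup', 'main logic']
```

Since Python's sort is stable, actions with identical priorities run in registration order here; per the invariant above, application logic should not rely on that.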
Level Five: A way for static code defined in library classes to be, optionally, used only if classes from that file are actively referenced by the primary application. Requires a way to distinguish between dead code and live code before the program runs, and detect if a section of code marked as "dead" is referenced anyway. This includes initialization actions such as creating objects held in static variables.
Prerequisite: Static code.
See 3.3.1: Priorities.
Part of the general principle of annotation is distant processes, even processes that are innocent of each other, working together to assemble a single object. In terms of modules, this means that multiple files should be able to belong to the same module - for example, multiple FlareCode files may contain data for a single class - or a single global object. The idea would be that, on reading in a FlareCode file, the FlareCode file may specify subelements to be added to a target element - where that target may be a class, a module, et cetera. If that target element does not yet exist, it is created - this prevents self-assembly from being dependent on order of execution. (Alternatively, initialization priorities can be used.)
Conceptually, prioritized initialization can be thought of as having FlareCode files add elements to a central "Initialization Code" object, which then sorts the elements by priority and executes them. This is a good example of when distributed assembly is important.
Requirements: A way to specify the "target" element to add subelements to. A way to regard FlareCode files, or Flare object files, as specifying subelements to be added to a target.
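Order-independent self-assembly can be sketched with a create-if-absent target table (the dict model and function name are assumptions of this sketch):

```python
# Sketch of distributed assembly: each file adds subelements to a
# named target; if the target does not yet exist, it is created, so
# the result does not depend on the order in which files are read.
targets = {}

def add_to_target(name, tag, element):
    target = targets.setdefault(name, {})       # create if absent
    target.setdefault(tag, []).append(element)

# Two separate "files", innocent of each other, contribute methods
# to the same class:
add_to_target("BigClass", "method", "from_file_one")
add_to_target("BigClass", "method", "from_file_two")
print(targets["BigClass"]["method"])  # both contributions assembled
```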
I'm not currently sure whether all threads should have a common parent. In a way, this seems like a bad idea on several grounds. If all threads have a common parent, then all static code should execute in a thread that is a child of this standard parent. If not all threads have a common parent, then static code should execute in a thread with no parent. Either way, a bit of static code should be (language design requirement) a good place to put a bit of miscellaneous code that executes in a slow low-priority thread, or that executes periodically once every number of seconds (or microseconds or femtoseconds or whatever).
Requirements: Threads with differing priorities; threads that wake up periodically; a good FlareSpeak idiom for declaring a "periodical" static function that is marked as executing every 10 seconds or whatever.
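A "periodical" static function might be modeled with a daemon thread that wakes at a fixed interval (the interval handling, names, and stop mechanism are all assumptions of this sketch):

```python
# Sketch of a "periodical" function: run fn every `interval` seconds
# in a background daemon thread until stop_event is set.
import threading
import time

def run_periodically(interval, fn, stop_event):
    def loop():
        while not stop_event.wait(interval):   # sleep, then check stop
            fn()
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread

ticks = []
stop = threading.Event()
run_periodically(0.01, lambda: ticks.append(time.time()), stop)
time.sleep(0.1)
stop.set()
print(len(ticks) >= 1)   # fired at least once during the sleep
```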
Level checking allows someone with a Level Eight Flare interpreter that includes several high-level Adaptations to, e.g., write a Flare program that is guaranteed to run on any Level Six or better conforming interpreter. Note that Level N conforming Flare interpreters may optionally include higher-level features, but must include all features of Level N or less, and must run all conforming Flare programs of Level N or less.
Certain Adaptations of the Flare language may also be assigned a standard name, and a standard level indicating how large an adaptation it is. For example, a Level Nine Adaptation would be enormously complex, something on the order of a small embedded AI, indicating to one and all that there is basically no hope of taking a program written in Flare Eight with Level Nine Adaptations and running it in anything except another interpreter with exactly the same Adaptations.
Flare is a language that is so temptingly easy to extend and adapt that some degree of modularity in the language definition itself is required to ensure any hope of standardization.
Requirements: Ability to examine a FlareCode file and get a list of which Flare language features are used and which Flare language libraries are invoked; annotation of such language features and language libraries with a numeric "level" mark; ability of a Flare IDE to detect and flag the use of a language feature of higher level than has been declared acceptable.
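Level checking might be sketched as a scan of a file's used features against a level table (the table contents below are invented placeholders, not real Flare feature levels):

```python
# Sketch of level checking: flag any language feature whose level
# exceeds the level declared acceptable for the program.
FEATURE_LEVELS = {          # hypothetical level annotations
    "add": 0,
    "plane-access": 2,
    "interception": 4,
}

def flag_too_high(features_used, declared_level):
    return [feature for feature in features_used
            if FEATURE_LEVELS.get(feature, 0) > declared_level]

used = ["add", "plane-access", "interception"]
print(flag_too_high(used, 3))   # ['interception'] needs a higher level
```

An IDE would run this check against the declared conformance level and flag the offending constructs in the display.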
FlareSpeak implies the ability to maintain a one-to-one correspondence between a visual representation (archetypally, a Pythonlike plaintext representation) and the FlareCode modules (stored as XML, i.e. extensible tree structures). A conforming FlareSpeak IDE must be unambiguous and complete; it must be capable of displaying any conforming FlareCode file, such that the UI manipulation of the IDE representation into an identical state by the user would result in the production of an identical FlareCode file, with the exception of timestamps, author stamps, and other data which is supposed to be dependent on the specific IDE, author, and time. In other words, an "Open" followed by a "Copy", a "Paste" into a new file, and a "Save" should result in an identical FlareCode file except for attribution stamps (specifically marked as such). However, if the FlareCode is conforming but unusual (not the sort of FlareCode that would be produced by a standard UI), there is no requirement that the FlareSpeak displayed be either compact, natural, or even all that readable; just that the FlareCode be displayable and that a Cut&Paste of the displayed information preserve the underlying FlareCode.
Extraction of FlareSpeak from FlareCode, or FlareCode from FlareSpeak, is metaphorically similar to layered feature extraction in a sensory modality. To extract FlareCode from FlareSpeak plaintext, one parses the text into identifiers and operators, then parses the identifiers and operators into expressions and statements. Similarly, extracting FlareSpeak from FlareCode proceeds by recognizing FlareCode features that can be transformed into FlareSpeak elements (usually the correspondence should be very local and straightforward).
Several FlareSpeak idioms may not translate directly into FlareCode - that is, there may not be a one-to-one correspondence between FlareSpeak elements and FlareCode elements, even though there is (by requirement) a one-to-one correspondence between the complete representations. When a FlareSpeak element is transformed into multiple FlareCode elements, there must be an inverse transformation that recognizes the multiple elements produced by the IDE and encapsulates them into the single FlareSpeak element for the GUI display. This is all that is needed for an interpreter to read and recognize its own code, which is the minimal level of functionality needed for a Flare One prototype. (Actually, Flare One also requires that the FlareSpeak GUI read legal FlareCode files that may not have been produced by the IDE, but the IDE can display nonunderstandable sections as direct FlareCode XML if necessary!)
However, production-quality Flare interpreters should be more tolerant and should be able to read code produced by other interpreters as easily as code produced locally, and should handle most departures from the norm in a suave and tolerant manner (i.e., without producing ugly FlareSpeak).
The XML data type, which allows text to be annotated, is a good way to write a Flare IDE in Flare, or failing that, a good metaphor.
Ability to translate a FlareCode file into FlareSpeak.
Ability to translate any legal FlareCode file into FlareSpeak.
Ability to translate a FlareSpeak representation into FlareCode.
Ability to preserve things like semantic comments with dating and attribution information across "Copy" and "Paste", even if the dating and attribution is not directly visible in the text (note that this implies a trans-plaintext editor). Semantic comments are Flare Three, but the need to preserve opaque information is universal.
IDE Level Six (not a language feature)
The correspondence between FlareCode and FlareSpeak should be maintained as much as possible during editing, and exposed through an API, so that interpreter actions can contain Flare fragments which will look at the FlareCode to decide what to do. For example, going to a Find/Replace box and writing a bit of FlareCode that searches for an assignment operator whose right-hand operand is an addition expression, then highlights the text of the addition expression.
Extraction of FlareSpeak from FlareCode, or FlareCode from FlareSpeak, is metaphorically similar to layered feature extraction in a sensory modality. To extract FlareCode from FlareSpeak plaintext, one parses the text into identifiers and operators, then parses the identifiers and operators into expressions and statements. Given a really good system, it is possible to change a single bit of text and break only that statement, so that only that statement, and not any of the nearby and enclosing statements, needs to be reinterpreted. (Yes, I've once implemented such a system. Yes, it's a lot of work. No, you don't need this feature for a Flare prototype.) Similarly, extracting FlareSpeak from FlareCode proceeds by recognizing FlareCode features that can be transformed into FlareSpeak elements (usually the correspondence should be very local and straightforward); again, given a good system, local changes in FlareCode should translate into local changes in FlareSpeak, without changes propagating across the entire FlareSpeak display or the entire program.
As I know from experience, it is very hard (but possible!) to do this on the level of individual identifiers and expressions. However, that is an unnecessary level of elegance. Reparsing an entire procedure each time a keystroke is entered should still be a negligible computational expenditure, as long as the parsing is limited strictly to that local procedure. Greater elegance will be required only if, for example, interpreter features hold persistent references to particular parsed FlareCode elements, references which would be destroyed by re-parsing and the generation of de novo identical FlareCode elements.
Given Flare language support for changing FlareCode in realtime, this should make it possible to edit Flare programs in realtime during debugging. It should also lower the computational cost of continuously maintaining the correspondence between FlareSpeak and FlareCode, not just doing so during "Open" and "Save".
Managing the degree to which changes propagate, and reducing the amount of code "broken", when a single keystroke is entered into a FlareSpeak IDE.
Parsing algorithms which act by locally reparsing unparsed sections, rather than sweeping through an entire program file.
I assure you that this is possible but I do not claim that it is easy. It requires a totally different design for the IDE. Managing change propagation means understanding Causality and explicitly knowing which local identifications are dependent on program structure, and which program structures are dependent on local identifications. In Java, which is where I originally implemented this, a keystroke that puts a space between the two characters in /* breaks the /* open-comment identifier and can thus have almost unlimited effects; everything inside the comment needs to be reparsed. FlareSpeak, with its semantic whitespacing, should be much more manageable.
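The change-propagation idea can be sketched as a per-statement parse cache: only lines whose text actually changed are reparsed, so edits stay local (the cache keying and the stand-in "parse" are assumptions of this toy model, which also ignores cross-statement dependencies like the /* example above):

```python
# Sketch of local reparsing: keep parses keyed by (line number, text);
# an edit invalidates only the statements whose text changed.
def reparse_locally(old_cache, lines):
    new_cache, reparsed = {}, []
    for number, text in enumerate(lines):
        key = (number, text)
        if key in old_cache:
            new_cache[key] = old_cache[key]     # parse object survives
        else:
            new_cache[key] = ("stmt", text)     # stand-in for parsing
            reparsed.append(number)
    return new_cache, reparsed

cache, _ = reparse_locally({}, ["a = 1", "b = 2", "c = 3"])
cache, touched = reparse_locally(cache, ["a = 1", "b = 99", "c = 3"])
print(touched)   # [1]: only the edited statement was reparsed
```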
IDE Level Five (not a language feature)
The use of "FlareSpeak sugar" - extensions to FlareCode with no semantic value and intended solely to help the IDE reconstruct its own FlareSpeak - is ABSOLUTELY PROHIBITED because it breaks the ability of IDEs to read each other's code, and in fact, breaks the entire FlareSpeak idiom. It is for this reason that, for example, all whitespacing is semantic whitespacing; if non-semantic whitespacing existed, there would be no way to represent it in FlareCode and therefore no way to store it without extending the FlareCode to represent information about FlareSpeak. As stated, this is absolutely prohibited. If the user wants to be able to align a series of similar variable assignments - for example
foobar = 1
baz    = 2
quux   = 3

- this is not accomplished by typing in extra spaces, since these spaces could not be represented in FlareCode except through prohibited sugar. Instead, the Flare IDE should contain Preferences options for:

[x] Allow slop of % for aligning code.
[x] Allow slop of % for similar lines.

Thus, all code viewed, whether created by that particular user and IDE, or not, will have the user's preferred form and style; any code written by the user will similarly be viewable by all other IDEs and will have the style preferred by those other viewers.
Recommendations: If the user has "align variable assignments" turned on, it should still be possible for the user to type in unaligned variable assignments which will automatically become aligned variable assignments when parsed; probably, the text will change when the user hits return and moves on to the next line.
Options in the Preferences screen.
Ability to display styled code in a semantically sensitive way (see Realtime FlareSpeak above).
<author>Baron Hans Nidrach von Pompzidaize</author>
<content>Add <i>three</i> to "theVar".</content>
theVar += 3 # Add three to "theVar".
Moving the cursor over # will show the date and author information.
FlareSpeak style preferences govern whether the comment comes above the statement, or alongside, or runs onto multiple lines, et cetera.
Semantic comments can contain a limited subset of HTML. Very limited, like limited to <i> and <b>, is just fine! The requirement here is "If you're going to allow styled text in comments, use an HTML format", not "Display arbitrary HTML 4.0."
When "commenting out" code, preserve it as FlareCode inside a <comment>, not FlareSpeak inside a <comment><content>.
Timestamp and author-stamp comments when they are created.
On parsing FlareCode, make the stamps "part of" the comment. Preserve them across Cut&Paste. I.e., any Cut&Paste that grabs the # and not just three should get the stamps as well.
Have some way of examining the stamps, even if they do not (by default) show up in the comment directly. Such as hovering the mouse over the #.
Have FlareSpeak style preferences if stylistic issues arise, such as when a comment should be placed alongside, on top, across multiple lines, and so on.
Record an MP3 of a spoken programmer statement. Parse it into text, possibly using speaker-dependent voice recognition. Display the text as the content, but retain the MP3 as well, and allow playback of it. This ensures that unrecognized information is not lost information.
A way to catch remarks addressed to the interpreter and recognize comments.
Ability to store embedded MP3 data in a FlareCode file. This may be inline or may take the form of a reference to an outside file; I would recommend the latter.
Two-way references allow for safe dynamic deletion and safe stack allocation.
Use a two-way linked list to track the references. The ->next and ->prev members should probably be FlareReference->next and FlareReference->prev, of type FlareReference*, meaning that the linked-list links are located on the References themselves. I'm not sure how to handle the endpoints, but one example would be to have a reference to the first FlareReference in the chain, and then treat that as a special case, detectable because FlareReference->prev == 0.
Given the existence of soft references, if an item is supposed to be garbage-collected, then track the number of hard references in a counter, rather than iterating through the list to count the hard references each time. This may someday be obviated by a more sophisticated mark-and-sweep algorithm. If a hard reference exists to a subelement of an element with GC scope, then that may prevent the superelement from being deleted, and vice versa; thus, hard references to subelements with "subelement of GC" scope should increment the reference counter of the enclosing element or super-superelement with GC scope.
The "reference" data type may not be the only place that trackable references exist, so make sure the FlareReference class is divorced from all these.
Lock the reference list separately from the object itself. (This comes at almost no cost if we use James Roger's suggestion, which, I think, would let us lock arbitrary pointers.) Threads outside the object may create and destroy references to an object without wanting to lock the object itself, and actions on an object may occur without needing to lock the reference list as well. Only creating or destroying a reference with a reference-scoped invariant would require a lock on the object.
Preserving scalability may require that certain objects have "permanent, do not track" scope, meaning that they are never deleted and that it is not necessary to track references to them. This prevents thread lockup given an object that is very frequently referenced (metadata, for example; particularly the metadata for class "num"). This suggestion may be obviated if someone comes up with an algorithm that would create only local locks and prevent most collisions. (For example, deleting a reference from the middle of a two-way linked list requires locking only the next and previous references, plus a lock of course on the reference being deleted, plus a lock that would prevent someone else from iterating through the list (but not someone else from creating or deleting a reference), for a total of four locks. I wouldn't do it.)
The list of references is defined to be unordered - one should not rely on a reference list being maintained in a particular chronological or prioritized order.
Object persistence in XML should use id="foobar" attributes to define the target of the class. A persistent object does not need to contain a list of the references to it - this is maintained invisibly by the interpreter. The two-wayness of references is not, conceptually, a characteristic of the object; it is, conceptually, a property of references in the Flare ontology; references are physical arrows that can be seen from both ends.
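The machinery described above might look like this (a toy model: the FlareReference name, the prev/next links, and the hard-reference counter come from the recommendations above, while the release/delete methods and all other details are assumptions of the sketch):

```python
# Sketch of two-way references: each object heads an unordered,
# doubly-linked list of FlareReference links, plus a hard-reference
# counter, so deletion can detect dangling hard references and set
# the soft ones to null.
class FlareReference:
    def __init__(self, target, hard=True):
        self.target, self.hard = target, hard
        self.prev = None
        self.next = target.first_ref      # push onto the target's list
        if target.first_ref is not None:
            target.first_ref.prev = self
        target.first_ref = self
        if hard:
            target.hard_count += 1

    def release(self):
        # Unlink from the two-way list; prev == None means list head,
        # the special case mentioned above.
        if self.prev is not None:
            self.prev.next = self.next
        else:
            self.target.first_ref = self.next
        if self.next is not None:
            self.next.prev = self.prev
        if self.hard:
            self.target.hard_count -= 1

class FlareElement:
    def __init__(self):
        self.first_ref = None    # head of the reference list
        self.hard_count = 0      # counter instead of re-iterating

    def references(self):
        ref, out = self.first_ref, []
        while ref is not None:
            out.append(ref)
            ref = ref.next
        return out

    def delete(self):
        if self.hard_count:
            raise RuntimeError("dangling hard references on deletion")
        for ref in self.references():
            ref.target = None    # soft references are set to null

obj = FlareElement()
soft = FlareReference(obj, hard=False)
hard = FlareReference(obj, hard=True)
print(len(obj.references()))   # 2: both references are trackable
hard.release()                 # drop the hard reference...
obj.delete()                   # ...now deletion succeeds
print(soft.target)             # None: the soft reference was nulled
```

Locking, GC scope, and "subelement of GC" counting are omitted; they would layer on top of this structure.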
Two-way linked list class with FlareReference structures as the links in the list.
Separate lock for the list of references.
Reference metadata which distinguishes between "hard reference" and "soft reference". (Level Two.)
Testing for remaining hard references on deletion of a stack-scope or dynamic-scope object. Setting of soft references to null. Note that this procedure also needs to be performed for subelements of a deleted element.
Reference-counting GC (Level One), "GC scope". Ability to increment or decrement the reference counter on creation or destruction of a hard reference to a subelement with "subelement of GC" scope.
Mark-and-sweep GC (handles circular references). Level Five.
Builtin function which returns a list of the references to a Flare element.
Performing another check for dangling references after any "finalization" method is called; if dangling references exist it is an error.
An element should be able to have multiple subelements of a particular type, such as <eye>; a <head> element with three eyes might have three distinct <eye> elements. An element may also declare that all elements with list type are bound together into a single ordered list - in effect, everything except planes, or any other cases statically declared, would be bound together in a single ordered list.
It should be possible to declare that a list type is actually an "array" with predetermined size, treatable both as a Flare invariant and as an interpreter optimization. Level Two.
The expression head.eye is a list operand.
FlareSpeak keywords for declaring list subelement or list superelement.
A FlareSpeak idiom for declaring an "array of X" class on the fly (Level One).
A way to declare a sized array. (Level Two.)
A way to declare list subtypes - for example, a list whose underlying interpreter representation is a linked list instead of an array. This should not affect the semantics. (Level Six.)
Semantics for handling list operands.
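One possible semantics for list operands is to broadcast member access over every element, as in this model (the ListOperand class and the broadcasting rule are assumptions of the sketch, not settled Flare semantics):

```python
# Sketch of a list operand: head.eye names all <eye> subelements at
# once, and further access broadcasts over the whole list.
class ListOperand(list):
    def __getattr__(self, name):
        return ListOperand(getattr(item, name) for item in self)

class Eye:
    def __init__(self, color):
        self.color = color

class Head:
    def __init__(self, eyes):
        self.eye = ListOperand(eyes)   # three distinct <eye> elements

head = Head([Eye("red"), Eye("green"), Eye("blue")])
print(head.eye.color)   # ['red', 'green', 'blue']
```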
As described in Causality (yes, you need to read the paper), there are essentially two kinds of causality: action-based and validity-based. Causality occurs when element/object B makes a computation that uses data from element/object A, and when the application logic requires the maintenance of a correspondence (mapping) between element A and element B. Validity-based causality is when a change to A causes B to be marked as having invalid state, in which case B needs to recompute before it can be reused. Action-based causality involves maintaining the correspondence in real time; when A is changed, B is notified in time for the corresponding change to be carried out on B.
I need to write up a paper about causality, actions, parallel locking, noding, and transactions. Until then...
Causality can be implemented as application logic, through interceptions, if necessary. However, the fact that Causality can be implemented with interceptions does not mean that Causality should be implemented with interceptions; you can do anything with Flare interceptions. I tend to think that Causality should have direct support in the interpreter.
Causality is one of the most important language features of Flare.
Requirements (see also subsections below):
Support (through interceptions or interception-like properties) for individual Flare elements (such as objects and properties) which are declared to be "dependent": any use of such an element while there is a currently computing object causes that currently computing object to become "dependent" on the element, and any subsequent change to the element causes the action or invalidation to propagate along those dependency links.
A way to explicitly declare dependencies (invalidate on change) and correspondences (look at actions and decide whether to mimic the action, or invalidate, or both). Correspondence declarations can also be used to innocently create general "watching" patterns. In the libraries I wrote, there are separate flags for an "action" (a modification has occurred) and a "shock" (a non-modifying event has occurred).
Language library support for "Effect" classes which can be dependent.
Language libraries which ubiquitously use Causality internally - there is a network effect involved.
Some standard method names on the Effect class:
(I need to go over my old code and make sure my memory is correct on this.)
Compute() first checks whether the object is in the process of deletion, whether the object is marked as currently computing, and whether the object is already marked as valid. It then marks the object as currently recomputing, makes the element the currently computing object to which dependencies will accrue, calls ComputeSelf(), removes the "currently recomputing" mark, and marks the current state as valid if ComputeSelf() returns a null or true value. (But not if a false value is returned.)
Action() marks the object as currently propagating an action. Calls Invalidate() if the effect is being propagated along a simple dependency; calls ActionSelf() if the action is propagating along a correspondence. Marks self as no longer propagating the action and returns.
ccheck() ("Check Causality") calls Compute() if the current state is not valid. This would go into the "setup" code of a complex method if a FlareSpeak dialect wants to declare a "valid" keyword that applies to methods, or would be a prosaic ccheck() call otherwise. This is a small name so it can be ubiquitous.
Queue() calls QueueSelf() and marks the object as queued if QueueSelf() returns true. Some objects may want to override this to create separate queues (to ensure that recomputation happens in a particular thread, for example).
Invalidate() marks the state as invalid and calls Queue().
(One may wish to prefix most of these names with an 'F' or an 'L', indicating that they are Flare or Library method names.)
NEW - maybe all this Compute() and ComputeSelf() should be implemented as structured methods - i.e., complex methods - containing "wrapper" submethods? Wrappers with priorities? It's all, conceptually, the same method, and this would obviate a number of inheritance headaches. I, too, need to start thinking in Flare!
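As a concrete (and non-authoritative) illustration, here is a Python sketch of the Effect machinery. The valid/computing/queued flags, the stack of currently computing objects, and the propagate-at-invalidation-time rule follow the text; everything else, including the exact names and signatures, is guesswork.

```python
class Effect:
    """Minimal sketch of the Effect methods described above."""
    computing_stack = []   # the "currently computing object" (a stack)
    queue = []

    def __init__(self):
        self.valid = False
        self.computing = False
        self.queued = False
        self.deleting = False
        self.dependents = set()

    def compute_self(self):
        """Override in subclasses; return True or None on success."""
        return True

    def compute(self):
        if self.deleting:
            raise RuntimeError("causality event on an object being deleted")
        if self.computing:
            raise RuntimeError("circular computation")
        if self.valid:
            return
        self.computing = True
        Effect.computing_stack.append(self)
        try:
            result = self.compute_self()
        finally:
            Effect.computing_stack.pop()
            self.computing = False
        if result is None or result is True:
            self.valid = True

    def ccheck(self):
        """'Check Causality': recompute only if the state is invalid."""
        if not self.valid:
            self.compute()

    def depend(self):
        """Accrue a dependency to whatever is currently computing."""
        if Effect.computing_stack:
            self.dependents.add(Effect.computing_stack[-1])

    def invalidate(self):
        if not self.valid:
            return  # already invalid; stops circular propagation
        self.valid = False
        self.enqueue()
        for dep in list(self.dependents):
            dep.invalidate()

    def enqueue(self):
        if not self.queued:
            self.queued = True
            Effect.queue.append(self)

source = Effect()
source.valid = True

class Doubler(Effect):
    def compute_self(self):
        source.depend()   # using 'source' accrues a dependency
        return True

eff = Doubler()
eff.ccheck()              # eff computes and becomes valid
source.invalidate()       # invalidation propagates immediately
print(eff.valid, eff.queued)  # → False True
```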
My experience with validity-based causality suggests the following:
It is important to separately track whether an effect is currently valid, and whether it is currently queued for recomputation. If an effect is dynamically called to recompute while it is still queued, it should remain in the queue; when a queued item arrives at the head of the queue, it should be checked for validity and simply discarded from the queue if already valid.
It is important to separately track whether an effect is currently valid and whether an effect is currently computing, in order to detect and prevent circular computations.
If a causality event strikes an object that is in the process of being deleted, this should cause an error.
The recomputation of a queued effect should not invalidate other effects; if it did, circular propagation would become possible. Any propagation of the invalidation of an object to the invalidation of other dependent objects should occur at the time the original invalidating action occurred, thus queueing all the affected objects simultaneously and preventing circular propagation. This should probably be some kind of program invariant, or at least a module invariant that can be switched off by particularly daring programmers.
One effect, in the process of recomputing, may be dependent on another effect; if so, that effect will need to recompute. Any dependencies tracked during the process of recomputation will then accrue to that object, until it finishes computing. Thus, in Causality the "current computing object" is a stack-based variable, requiring the use of a Situation.
In terms of Parallel Transactional Noded Causality (which I have not yet had the pleasure of implementing), I suggest that an action which invalidates an object first achieve a lock on all other objects which will be invalidated by the propagation of causality, thus ensuring that it is safe to invalidate the object at the time the action begins. This would of course be automatic, tracked by the Causality system rather than the programmer.
The effect queue can be as important as the event-handling loop. This will probably hold doubly true for ubiquitously parallel applications - one can simultaneously compute as many Effects as there are available processors.
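The queue-handling rule above (a queued item that has already become valid is simply discarded when it reaches the head of the queue) can be sketched in Python; `QueuedEffect` and `drain` are illustrative names only.

```python
from collections import deque

class QueuedEffect:
    def __init__(self, name):
        self.name = name
        self.valid = False
        self.queued = False

def drain(queue, log):
    """Process the effect queue: an item that became valid while
    waiting is discarded rather than recomputed again."""
    while queue:
        eff = queue.popleft()
        eff.queued = False
        if eff.valid:
            continue  # already recomputed out-of-band; discard
        eff.valid = True  # stands in for the actual recomputation
        log.append(eff.name)

q = deque()
a, b = QueuedEffect("a"), QueuedEffect("b")
for eff in (a, b):
    eff.queued = True
    q.append(eff)
a.valid = True  # 'a' was dynamically recomputed while still queued
log = []
drain(q, log)
print(log)  # only 'b' actually recomputes
```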
Maintaining action-based correspondence is considerably more complex than invalidating and recomputing, but the effect is often worth it. List actions (see below) are probably the best example.
Actions save on computational power by making local actions have local effects. For example, if List B is equal to List A plus 1 (`B[@] = A[@] + 1`), say B = [2, 5, 7] and A = [1, 4, 6], then using A.move(2, -1) can have two possible effects:
First, list B is invalidated and the entire list recomputes. This would occur if a dependency had been explicitly stated, or if a dependency had been implicitly stated by evaluating an expression that refers to an element marked as "dependent". (Again, the latter would be implemented through interceptions or an interception-like idiom.)
Second, the statement A.move(2, -1) sets up an Action. Before the action is carried out, locks are achieved on both A and B. B (and any other objects with a declared pre-change-notification correspondence to A) are first notified that the change is about to occur; then the change is made to A; then B is notified that the change has occurred. Finally, the locks on both A and B are released. (The locking part on this is speculative, since I have not implemented a parallel-safe locking system. The need for notifications both before and after the change is not speculative.) In this specific case, B would be notified that A's third element was about to move to the second position (and consequently, that the second element would move to the third position). B would do nothing on the advance notification, but on being notified that the change had actually occurred, would apply the same action - move(2, -1) - to its own elements. When the action had completed, A would equal [1, 6, 4] and B would equal [2, 7, 5]. In this instance relatively little computation has been saved (a net loss, actually), but with more complex elements in A and B, or with elements that have Causality implications in their own right, it becomes much more important to control change propagation by having limited actions cause limited results instead of invalidating entire objects.
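Assuming notifications work roughly as described, the worked example translates into Python like this (class names invented; locking omitted, since that part is speculative anyway):

```python
class CorrespondingList:
    """A, with listeners that maintain a correspondence by
    mimicking A's local actions rather than recomputing."""
    def __init__(self, items):
        self.items = list(items)
        self.listeners = []

    def move(self, index, offset):
        # Pre-change notification.
        for listener in self.listeners:
            listener.pre_move(index, offset)
        item = self.items.pop(index)
        self.items.insert(index + offset, item)
        # Post-change notification: listeners apply the same action.
        for listener in self.listeners:
            listener.post_move(index, offset)

class MimicListener:
    """B, where B[i] == A[i] + 1 is maintained by mimicry."""
    def __init__(self, items):
        self.items = list(items)
    def pre_move(self, index, offset):
        pass  # advance notification: nothing to do in this example
    def post_move(self, index, offset):
        item = self.items.pop(index)
        self.items.insert(index + offset, item)

a = CorrespondingList([1, 4, 6])
b = MimicListener([2, 5, 7])
a.listeners.append(b)
a.move(2, -1)            # third element moves to second position
print(a.items, b.items)  # → [1, 6, 4] [2, 7, 5]
```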
Circular correspondences should be caught at runtime, and should cause errors. If A is derived from B and B is derived from A, then you need to break down the problem and have element A.1 rely on B.1 and have element B.2 rely on A.2, or use validity-based Causality.
Have a way of stating correspondences. This should require an explicit statement and should be rarer than simple dependency. Dependencies can be completely innocent; but if A has a complex correspondence to B, then A is likely to know at least a little about what kind of object B is. The correspondence should state whether notification occurs before the action, after the action, or both; default is that notification occurs once, after the action.
Work out the locking mechanism and notification mechanism:
An action occurs to A.
First, A and all objects with a declared correspondence to A are locked.
Then, all objects with a declared pre-action-notification correspondence to A are notified of the impending change. In parallel Flare, this action may be carried out over all objects in parallel.
All objects with a simple dependency on A are locked.
All objects with a simple dependency on A are invalidated. In parallel Flare, this action may be carried out over all objects in parallel.
The action is carried out on A.
All objects with a simple dependency on A are unlocked. (Note - this should work even if the dependency list changes in the meanwhile, so store the list of what you locked. Use the standard statement-block idiom so that the objects are unlocked even if an exception occurs. Note also that if an exception occurs during invalidation, all objects marked as valid are still valid since A has not changed; if an exception occurs while A is changing, all objects dependent on A have been marked as invalidated. Although correspondences may be left in an inconsistent state.)
All objects with a declared post-action-notification correspondence to A are notified of the change that has occurred. The object may duplicate the change locally to maintain the correspondence, recompute completely, invalidate itself, queue to recompute later, ignore the change as irrelevant, and so on.
Actions may also be declared to be "events" rather than "changes", in which case objects dependent on A are not locked.
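A sequential, single-threaded sketch of the ordering above, with locking simulated by log entries. The treatment of "event" actions (skipping both the locking and the invalidation of dependents) is my assumption about the intent of the last rule, and all data-structure names are invented.

```python
def propagate_action(a, action, is_event=False):
    """Record the lock/notify/invalidate/act sequence for an
    action on element 'a' (a dict standing in for uberdata)."""
    log = []
    corr = sorted(set(a["pre_corr"]) | set(a["post_corr"]))
    log += [("lock", x) for x in [a["name"]] + corr]
    log += [("pre_notify", x) for x in a["pre_corr"]]
    deps = [] if is_event else list(a["deps"])  # snapshot what we lock
    log += [("lock", x) for x in deps]
    log += [("invalidate", x) for x in deps]
    log.append(("act", action))
    # Unlock the same snapshot, even if the dependency list changed.
    log += [("unlock", x) for x in deps]
    log += [("post_notify", x) for x in a["post_corr"]]
    log += [("unlock", x) for x in [a["name"]] + corr]
    return log

A = {"name": "A", "pre_corr": ["B"], "post_corr": ["B"], "deps": ["C"]}
log = propagate_action(A, "move(2, -1)")
```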
Code determines the current action by looking at the current situation. I.e., sit%causality.action or whatever.
Have a set of standard libraries declaring a standard set of actions. This should include all the actions that can be undertaken in simple FlareCode expressions - for example, list manipulations, addition, subtraction, assignment, et cetera.
Know how to undo standard actions. If you can keep track of all the actions taken by a function and maintain all the information needed to undo those actions - for example, undoing an assignment requires saving the previous value - you can theoretically rewind that function. This has nothing to do with causality but it's fun. This should not be done by default; it's just an optional self-examination thing. (Flare Eight, maybe.)
Have standard ActionSelf() methods for the basic types. (It'll never get called unless a correspondence is declared - this represents no performance hit for ordinary types.)
I have some code for this... written for Metrowerks PowerPlant, unfortunately, but at least sufficient to show how list-based Causality works.
There are two kinds of list actions:
This is the Set/Add/Remove/Replace action:
If mInserting == mDeleting, then that many list items are being set to new values. If mDeleting == 0, items are being inserted. If mInserting == 0, list items are being deleted. If mInserting and mDeleting are both nonzero and unequal, then a range of list items is being replaced.
This is the Move action:
There's also a whole bunch of other complex stuff, for which I really need to post the code. Nonetheless, this suffices to give the flavor of what it means to specify an action.
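A hedged sketch of how the Set/Add/Remove/Replace classification might read in code, keeping the mInserting/mDeleting names from the old PowerPlant-era fields:

```python
def classify_list_action(m_inserting, m_deleting):
    """Classify a Set/Add/Remove/Replace action from its two
    counts, following the rules stated above."""
    if m_inserting == m_deleting:
        return "set"        # that many items set to new values
    if m_deleting == 0:
        return "insert"
    if m_inserting == 0:
        return "delete"
    return "replace"        # both nonzero and unequal

print(classify_list_action(3, 3))  # → set
print(classify_list_action(2, 0))  # → insert
print(classify_list_action(0, 2))  # → delete
print(classify_list_action(2, 5))  # → replace
```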
Given the frequency with which metadata needs to be checked, it may be wise to parse a Flare object of class Metadata into a C++ object, containing as C++ data all the information which has a standard interpretation and which needs to be checked directly by the Flare interpreter.
If runtime changes to metadata are permitted (rather a Level Five or even Level Seven feature, one expects, at least if it is to be done safely), then those changes would need to be echoed to the "uberdata". If changes to metadata are impossible (or at least, changes to the part of the metadata the interpreter cares about, rather than user-defined annotations), then threads don't need to acquire "read" locks on metadata objects. Perhaps, though, this optimization should wait on finding out how much of a slowdown, if any, the lock acquisition actually represents.
If changes to certain parts of the metadata are not permitted, then those Flare elements would need to be declared (automatically declared?) as constant, and attempted changes detected and disallowed.
A significant part of the functionality that would be gained by changing metadata objects can be acquired by allowing new metadata objects to be created, but not altered once they have been used.
Something that parses Flare metadata into a C++ object.
Something that detects modifications of the metadata and either screams "invariant!", or acquires a "write" lock on the metadata (which may take a while) and makes the corresponding modification.
And, of course, interpreter code that uses the uberdata.
The basic types of interception (that I currently know of, anyway):
redirect_lookup (Level Four)
Note that, in all cases, the interception applies directly to the specific element intercepted, and not to properties, methods, or other subelements of that element. In Python, __getattr__() and __setattr__() work for any property on the object. This is because properties in Python are not independent objects. In Flare, however, the properties are independent elements; to intercept a property one should use an interception placed on that property, rather than an interception placed on the property's location. <redirect_lookup> can be used for most of the dynamic, nonspecific interception idioms that would be implemented by __getattr__ or __setattr__ in Python. Note also that <redirect_lookup> will break a chain of property accesses, just like following a weak reference. There is no such thing as foo.&bar if <redirect_lookup> is called on <bar> and chooses to intercept (i.e., a reference is produced to whatever element is returned by <redirect_lookup>).
Example: Causality can be implemented using a watch_access and a watch_set. watch_access would declare a dependency and watch_set would declare an alteration. These interceptions might be implemented as elements in %causality marked as "interceptions" with the above properties, rather than needing to occur in %intercept... in other words, what makes something an interception is the metadata of the interception, not a unique name in a unique plane.
If elements can be acted upon (for example, moving items in lists and so on, or "add 3" considered as an action rather than just the new value being set), then there might be watch_action, modify_action, and replace_action interceptions. (Probably Level Four again.) Note that even after watch_action or modify_action occurs, watch_set or modify_set might still be called at the end, although not if replace_action is fired and returns true (thus indicating that the action has been successfully replaced). Similarly, watch_set is called after modify_set, and neither watch_set nor modify_set is called if replace_set exists, fires, and returns true. Only one replace_access or replace_set can return successfully - the first one to indicate success obviates the need to call any others.
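The firing order just described can be pinned down with a small Python sketch. Only the ordering rules are taken from the text; the handler signatures and the dict-based registry are invented.

```python
def fire_set(intercepts, element, value):
    """Firing order for a 'set' operation: replace_set handlers
    fire first, and the first one to return True obviates
    everything else; otherwise modify_set handlers run, then
    watch_set handlers, then the set itself."""
    for replace in intercepts.get("replace_set", []):
        if replace(element, value):
            return "replaced"   # neither modify_set nor watch_set fires
    for modify in intercepts.get("modify_set", []):
        value = modify(element, value)
    for watch in intercepts.get("watch_set", []):
        watch(element, value)
    element["value"] = value
    return "set"

elem = {"value": 0}
seen = []
intercepts = {
    "modify_set": [lambda e, v: v * 2],            # doubles incoming values
    "watch_set": [lambda e, v: seen.append(v)],    # e.g. Causality's watcher
}
print(fire_set(intercepts, elem, 21), elem["value"], seen)  # → set 42 [42]
```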
Trapping unfound property accesses (i.e., foo.bar but there is no <bar>) should probably take the form of a specially named method (see below), as with operator overloading, rather than being implemented as an interception. But I am not entirely sure of that.
Interceptions should be cumulative and should have priorities. If two or more interceptions exist with the same priority, or without any priority, then an error occurs - there must always be a deterministic ordering. Interceptions, including the sort by priority, should probably be preprocessed into uberdata.
watch_access, replace_set, etc. (Level Three)
redirect_lookup, watch_action, etc. (Level Four)
Priorities (Level Four). Without priorities, at Level Three, only watch_ interceptions can be multiple.
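A sketch of the priority preprocessing step, assuming higher priority means firing earlier (the text does not say which end goes first) and treating a missing priority as zero:

```python
def preprocess_interceptions(intercepts):
    """Sort interceptions into a deterministic order, as the
    uberdata preprocessing would. Two interceptions sharing a
    priority (or both lacking one) are an error."""
    keys = [i.get("priority", 0) for i in intercepts]
    if len(keys) != len(set(keys)):
        raise ValueError("interceptions without a deterministic ordering")
    return sorted(intercepts, key=lambda i: -i.get("priority", 0))

ordered = preprocess_interceptions(
    [{"name": "log", "priority": 1.0}, {"name": "cache", "priority": 2.5}])
print([i["name"] for i in ordered])  # → ['cache', 'log']
```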
An interception - in canonical form, an element in %intercept aka <p-intercept> - is a subelement (or, in this case, sub-subelement) that changes the behavior of the enclosing element, the location or superlocation. Rechecking for an interception every time any element is accessed would tend to bring the interpreter to its knees, and for that matter, cause an infinite recursion error if done the obvious way (i.e., if the FLHasSubElement("redirect_lookup") call also automatically tries checking for the <redirect_lookup> interception...).
The need for preprocessing is even more urgent when one considers that an interception may be located on a parent.
Tracking interceptions in a computationally feasible way adds complexity to the interpreter, not just because it's one more thing for the Flare interpreter to check, but because interception involves an interaction between elements; the existence of one element may change the behavior of another. If interception involves a change in uberdata, then it means that the creation, deletion, or moving of one element may change the uberdata of another. This adds complexity; it is, however, useful complexity, and I have a feeling it's complexity that we would end up implementing eventually, bit by bit, whether or not we planned for it in advance. Interceptions are not the only case where one element affects the uberdata of another.
In the general case, the behavior of a Flare element is likely to be implemented by a C++ structure containing a set of pointers to C++ functions which implement the behaviors. For the language-default behaviors on a type, the C++ functions implement the behaviors directly. An object marked as intercepted gets new C++ uberdata, containing pointers to C++ functions which delegate the intercepted behavior to the appropriate interception. This C++ uberdata might also contain auxiliary, element-specific data, containing pointers to the specific interceptions and prioritized lists of those interceptions. If so, this list will need to be kept in consistent state with the Flare elements representing the actual interceptions, and operations on Flare elements flagged as interceptions will need to invalidate the state of the intercepted element. (Note that, in parallel terms, this requires a write lock on the intercepted element before an invalidating action on the interception can be carried out.) It also requires that the metadata of an interception indicate the intercepted element; the only two cases I know of are the location and the superlocation (the location of the location); if these are assumed to be the only possible cases then a single bit suffices to distinguish between them.
Complex actions on interceptions may make it desirable that the intercepted element be locked and placed in invalid state while a series of complex manipulations are carried out, rather than recomputing the interception state (which may not even be consistent) after each individual action. This, however, is part of the general theory of causality and transactions.
A program which uses no interceptions at all should be precisely as fast as a program written in a language in which interceptions never existed. The interception elements have interception metadata, implying C++ function lists which maintain the state of the intercepted element. The intercepted element acquires new C++ function lists which maintain the interception logic. In the absence of interceptions or interceptees, the default C++ functions are used in all cases, and the speed is the same as before.
Other alterations of behavior may also exist. This implies some solution to the problem of "N orthogonal behavior-altering features exist; it is therefore necessary to write 2^N separate C++ functions in which these behaviors are switched on or switched off." An example of this solution is to have behaviors that implement each of the N features alone, plus a behavior that checks all of the N features but implements only those that are actually used; this latter behavior would be used whenever two or more behavior-altering features were used simultaneously.
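A toy Python version of this N-versus-2^N scheme, with two invented features ("trace" and "guard") standing in for real behavior-altering features; the C++ function-pointer tables become plain Python functions here.

```python
# One specialized behavior per single feature, plus one generic
# behavior that checks all flags; the generic one is used only
# when two or more features are active on the element.

def plain_get(elem):
    return elem["value"]

def traced_get(elem):
    elem["log"].append("get")
    return elem["value"]

def guarded_get(elem):
    if elem.get("locked"):
        raise RuntimeError("element is locked")
    return elem["value"]

def generic_get(elem):
    # Pays a check per feature, but only when features are combined.
    if elem.get("trace"):
        elem["log"].append("get")
    if elem.get("guard") and elem.get("locked"):
        raise RuntimeError("element is locked")
    return elem["value"]

def pick_get_behavior(elem):
    active = [f for f in ("trace", "guard") if elem.get(f)]
    if not active:
        return plain_get
    if active == ["trace"]:
        return traced_get
    if active == ["guard"]:
        return guarded_get
    return generic_get   # two or more features: one combined checker

elem = {"value": 7, "trace": True, "guard": True, "log": []}
getter = pick_get_behavior(elem)
print(getter is generic_get, getter(elem), elem["log"])
```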
Dynamically adding or removing interceptions from an element implies either that the element uses a universal behavior list that implements the "check intercept" C++ functions for all operations, or that the element uses a private behavior list that implements the "check intercept" behaviors only for intercepted operations. The use of auxiliary data for intercepted behaviors automatically implies that the data is private to the intercepted element. If class parents define interceptions that will be shared by all class members - for example if an instance property is declared "dependent" under Causality semantics - then it may be desirable to compute the behavior list and auxiliary data for the class parent and share it out among all descendants.
Given the ability to manipulate behavior lists, the presence of a bit for "intercepted" or "interception" may be superfluous; however, it may still turn out to be a good idea. Elements or metadata are almost certain to wind up having a list of flags at some point.
The behaviors for an interception must maintain the state of the intercepted element. (It therefore follows that all program atomic actions which could affect the state of an interception, in a way which affects the intercepted element, must pass through the interception's behavior list.)
The intercepted element must possess behaviors which implement the language logic of interception, presumably by delegating the intercepted operation to the interception element.
It may be desirable for elements to have auxiliary data which maintains a sorted list of the interceptions, thus obviating the need to repeat a Flare subelement (or sub-subelement) lookup on each occasion.
The behavior lists, and/or the auxiliary data if any, may be local to the intercepted object, or local to an intercepted class parent.
See Expressions for a discussion of why type lifting is considered harmful, and is replaced with type priorities and operator delegation to operands. In fact, see Expressions for a better discussion of all these semantics.
In the expression `a + b`, the actual addition is delegated to one of the operands. Which operand the task is delegated to is determined by the priority of the class/type/metadata for that operand. Thus, "priority" is a subelement of the Metadata class, which is translated into a floating-point or fixed-point number in the C++ representation of the uberdata. The higher-priority operand is asked to handle the operator first; if that fails, the interpreter attempts to delegate the operation to the other operand; if that fails, the expression fails. If the two priorities are equal, the left-hand operand is asked first. Except in cases of operator overloading, the actual operation will presumably be carried out by a C++ function which implements that operator. The C++ uberdata for any type/class/metadata will contain a pointer to a structure containing the function pointers for the language-standard implementations of that operation on that type, or to the C++ functions which implement the delegation of operator overloading to Flare methods. The right-hand addition operator may differ from the left-hand addition operator (addition is not always commutative - it is not commutative for strings, for example). In other words, `2 + foo` may call a different function on `foo` than `foo + 2`.
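Since Flare has no implementation to run, here is a minimal Python sketch of the delegation order just described: higher-priority operand first, ties to the left operand, and a distinct right-hand handler. The class names (`Num`, `Matrix`) and their priorities are invented for illustration.

```python
def delegate_binary_op(left, right, op_name):
    """Delegate a binary operator to the higher-priority operand;
    the other operand's right-hand handler is the fallback."""
    l_pri = type(left).priority
    r_pri = type(right).priority
    if r_pri > l_pri:
        order = [(right, "r" + op_name, left), (left, op_name, right)]
    else:  # ties go to the left-hand operand
        order = [(left, op_name, right), (right, "r" + op_name, left)]
    for owner, name, other in order:
        handler = getattr(owner, name, None)
        if handler is not None:
            result = handler(other)
            if result is not NotImplemented:
                return result
    raise TypeError("no operand could handle " + op_name)

class Num:
    priority = 1.0
    def __init__(self, v):
        self.v = v
    def add(self, other):
        return ("Num.add", self.v + other.v)
    def radd(self, other):
        return ("Num.radd", other.v + self.v)

class Matrix:
    priority = 5.0
    def __init__(self, v):
        self.v = v
    def add(self, other):
        return ("Matrix.add", self.v + other.v)
    def radd(self, other):
        return ("Matrix.radd", other.v + self.v)

# Matrix has the higher priority, so its right-hand handler fires first.
print(delegate_binary_op(Num(2), Matrix(3), "add"))  # → ('Matrix.radd', 5)
```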
In the case of augmented assignment (+=, etc.) the left-hand operand is always checked first, regardless of type priorities. The order of execution in this case is as follows: Check the left-hand operand for an in-place operator; if that fails, try to expand the `x op= y` expression to `x = x op y`, with the additional requirement that the current type and subtype of x not be changed by the assignment; in other words, the result of `x op y` must be x's current type. Thus, `x = 3; x += 1.5` is not legal; if this capability is important, x should be declared rational or float in advance, or the expression `x = x + 1.5` should be used. Aside from that, in the attempted calculation of `x op= y` as `x = x op y`, `x op y` is carried out using the usual, priority-based order of execution for a binary operator.
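The `x op= y` expansion rule, minus the in-place-handler lookup, can be sketched directly; Python's `operator` module stands in for the C++ behavior functions, and the type-preservation requirement becomes an explicit check.

```python
import operator

def augmented_assign(x, op_name, y):
    """Expand `x op= y` to `x = x op y`, with the added rule that
    the assignment may not change x's current type. (The in-place
    operator lookup that would come first is elided here.)"""
    result = getattr(operator, op_name)(x, y)
    if type(result) is not type(x):
        raise TypeError("augmented assignment may not change the type of x")
    return result

print(augmented_assign(3, "add", 1))   # fine: stays an int
try:
    augmented_assign(3, "add", 1.5)    # `x = 3; x += 1.5` is not legal
except TypeError as err:
    print("rejected:", err)
```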
In the case of direct assignment, the left-hand operand is again checked first; the default language behaviors on most built-in types will probably implement assignment by attempting to coerce the right-hand operand to an acceptable value, and fail if an exact coercion is not possible.
The "priority" property in a metadata element must be reflected directly within the uberdata, for efficiency. Changing this subelement at runtime is dangerous, and permitting it to be done safely is a higher-level feature.
Operators - left-hand operators, right-hand operators, in-place operators, and assignment operators alike - must be implemented as C++ functions, the function pointers to which are located inside a C++ structure, pointed to by the uberdata for an element or operand. (These functions are referred to as "behaviors".)
The order of execution for non-assignment binary operators must be determined by type priorities. (Note that order of execution is not the same as order of evaluation! Delegation of operator handling occurs after both operands have been evaluated; otherwise, there is nothing to delegate to.)
Note: Dmitriy says that strings should never coerce exactly to numeric types - i.e., "4" does not coerce exactly to 4. This is because `num x; x = 2; x += "4"` introduces a behavior that depends on the content of the string, and this behavior may be unexpected.
It is conceivable that, for reasons of speed, floating-point priorities will be limited to 4 decimal places of precision, enabling FMetadata->priority to be represented as a signed integer type (priority * 10,000). I think this will be a trivial speed boost, but if you're using 5 decimal places your code probably sucks anyway, so making this a program invariant / language constraint is probably reasonable. Priorities should probably also be limited to the range [-100,000.0000..+100,000.0000] to ensure that they fit in a 32-bit integer. (Note that this is stored on a C++ metadata structure, not a C++ representation of the Flare element, so conserving space is even less of an issue.) This is probably an acceptable general constraint wherever priorities are discussed, including interceptions, cumulative wrappers, and so on.
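A sketch of the proposed fixed-point packing, with the 4-decimal-place and range limits enforced as program invariants; the exact floating-point tolerance used here is an invented detail.

```python
SCALE = 10_000            # 4 decimal places of precision
LIMIT = 100_000 * SCALE   # |priority| <= 100,000.0000 fits in 32 bits

def pack_priority(priority):
    """Store a priority as a scaled 32-bit-safe integer, rejecting
    values with too much precision or out of range."""
    scaled = round(priority * SCALE)
    if abs(scaled - priority * SCALE) > 1e-6:
        raise ValueError("more than 4 decimal places of precision")
    if not -LIMIT <= scaled <= LIMIT:
        raise ValueError("priority out of range")
    return scaled

print(pack_priority(2.5))      # → 25000
print(pack_priority(-3.1415))  # → -31415
```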
It is a good thing if the IDE has a way to globally browse all priorities, and especially to browse all priorities that apply to a certain category (such as type priorities, for example). Some kind of enum-like idiom is also needed so that priorities don't have to be flat numbers, although since priorities must often be unique, having a set of identical priorities will sometimes be of little use.
Operator overloading in some sense resembles interception, in that it involves subelements (the methods, in particular) which change the behaviors implemented by the underlying uberdata. However, most operator overloading will probably be implemented by classes (class parents, rather). Allowing operator overloading to change at runtime is a dangerous and hence higher-level feature; but, if operator overloading does change at runtime, it will probably have more or less the same semantics as interception, described above.
Recommendations: The special names for overloaded operators should be <op-add>, <op-right-add>, <op-inplace-add>, and so on. It should be easy to overload <op-add> and <op-right-add> using the same method.
The presence of a method with a certain special name, indicating an overloaded operator, should change the underlying behaviors to delegate that operator to the Flare method.
See Expressions. Transparency means that in certain contexts (most contexts, actually) references are followed automagically. This has been described in Expressions at much greater length.
On reflection, however, I am not sure that quoted expressions should be evaluated automagically in the same way that references are automagically dereferenced; programmers are much more likely to be thinking of quoted expressions as quoted expressions when using them.
Transparent operators should be controlled for side effects, if control of side effects is available; the act of following a transparent operand should not change application state. (This is especially important if operator overloading comes into play.) This is because following transparency may sometimes be a tentative action, taken back if it proves to be useless.
A Flare operand should - if asked to by an operator or operand behavior - find any transparent operands lurking underneath the exterior. Since this transparency operation may be attempted multiple times, the result should be cached on the original operands.
The act of following a transparency or series of transparencies may sometimes be sensitive to context; in particular, the context may determine the type of the expected result. If it is unambiguously known that a result of type num* is desired, then a pointer of type num* should not be transparently dereferenced further. (This needs further definition.)
Control of side effects is the logic underlying the use of "functional programming languages". Functional programming languages distinguish between "functions" and "procedures"; under this usage (not standard in this document, BTW) a "function" returns an output based on its inputs and has no other effects; a "procedure", used by evil procedural programming languages, makes changes to global state. While I have never tried to write a program in a truly functional programming language (Haskell, for example), I can see that controlling side effects would be a powerful optional way of enforcing cleaner programs. C++'s ability to declare "const" methods, "const" arguments, and "const" instance members is another case of enforcing control of side effects.
As a language feature, control of side effects requires that the interpreter be capable of detecting violations at compile time (i.e., in the IDE) or at runtime. Detection in the IDE is preferred; detection at runtime would have the usual semantics for detecting invariant violation (the more computing power you invest, the more invariants can be checked).
Control of side effects has the following flavors (names are suggestions only):
Pure function or "functional":
Output is strictly a function of input; a given set of inputs is guaranteed to always result in the same output. This is the strongest form of control. Math libraries would be the most obvious example.
In terms of a program invariant, this means that a dependency on any piece of data that was not passed as an input is an error. Use of any non-constant global variable is an error. Accessing `location(anInput)` is an error; subelements of inputs can be accessed, but not superelements. Following a reference is acceptable. (Backtracing along a two-way reference should logically be unacceptable but it probably isn't worth bothering to check; the system is meant to prevent accidental, not deliberate, breakage.) Local variables can be declared and modified, but an assignment to anything other than a local variable - including an assignment to the referent of a reference given as input - is an error. Accessing non-constant static variables on an input should logically be an error, although again it may not be worth it to check.
If an instance method is declared as a pure function, then the location of invocation - the `this` value - is treated as if it had been an input, which is to say that dependency is allowed, but modification is not, and non-constant global or static variables are still inaccessible. (A class method which is inherited through parenting, and is therefore locally overridable in theory, can be logically treated as an accessible subelement of an input rather than a forbidden global variable.)
A pure function cannot call any method or instance method that is not also a pure function.
A pure function is forbidden to rely on Situation variables, and there is no circumstance under which a pure function could usefully create local Situation variables.
Const instance method / const function / const function arguments:
Given a const function argument, modification of that argument, or of any subelement or referent of that argument, is an error. A const instance method treats the value of `this` as a const function argument. (Note that `this` is not, as it is in other languages, a function argument itself.) This can be enforced either through compile-time checking, or through a tainting mechanism applied to expression operands and local variables. (IDE checking implies that some local variables are also declared as a "reference to const SomeClass" or a "const SomeClass*", as in C++; in other words, constness is treated as a statically typed property which can be traced and checked.)
Declaring a function as "const" is the same as declaring all of its arguments as "const", although the IDE may wish to expand this shorthand by showing every argument explicitly declared as "const".
An instance method or function declared as "const" may access global variables, including static class variables, but may not modify them. A function or instance method not declared as const, but having some const arguments, may both access and modify global variables.
As in C++, a method that obtains a const value is forbidden to pass that value anywhere where it may not be treated as const. An instance method declared as const is forbidden to pass `this`, or subelements and referents derived from `this`, as non-const arguments to procedures, or to invoke non-const instance methods on those elements. A function declared as const is similarly forbidden to pass global variables to non-const procedures.
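Python has no declared constness, but a read-only view gives a rough feel for the contract (the `report` function here is a hypothetical example): reads succeed, while any attempted modification through the const view is an error.

```python
from types import MappingProxyType


def report(config):
    # Treats `config` as const: only reads it, never writes.
    return "%s:%d" % (config["host"], config["port"])


settings = {"host": "localhost", "port": 8080}
view = MappingProxyType(settings)  # read-only "const" view of the dict
# report(view) works; view["port"] = 1 raises TypeError.
```

The difference from the C++ model is that this is enforced dynamically at the point of violation, rather than statically traced through declarations.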
While the above will probably be the most common forms of side effect control, other combinations are logically possible, since the components of side effect control are orthogonal.
There are some common design patterns that involve temporary breaks in an otherwise controlled set of side effects; for example, caching, in which a pure function has a strictly internal global variable that matches previously computed results to previously encountered arguments. (Output is still a strict function of input; only the amount of time necessary to compute the result is changed as a result of cache state.) Profilers need to be able to change certain pieces of local state, for example, a counter showing the number of times a function has been called, even if the function is declared const. Stack-based modifications temporarily change the global state, but then restore it.
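The caching pattern can be illustrated with Python's `functools.lru_cache`, which is exactly this kind of strictly internal memo table: the cache is hidden global state, but the output remains a strict function of the input.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    # Output remains a strict function of input; the memo table
    # (hidden internal state) changes only how long the answer
    # takes to compute, not what the answer is.
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

A Flare equivalent would presumably need language blessing to remain declared "functional" despite the internal mutation, which is the controlled exception discussed below.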
Creating or destroying a reference, in Flare, requires modifying the uberdata list of references maintained on the target object, but this is certainly not something that side effect control should regard as a "modification", even though it may require obtaining a write lock on the reference list. (In Flare, the two-wayness of references is considered as a global ontological characteristic of references rather than a characteristic of the referenced object. Even adding a new reference-scoped invariant is considered as not representing a "modification" as long as the invariant check is debugging/error rather than application logic/exception.)
C++ allows loopholes by letting constness be cast away, or by letting certain variables be declared "mutable"; this kind of unbounded exception introduces added complexity, and significantly decreases elegance in a number of ways. Furthermore, allowing uncontrolled exceptions to side effect control may break the interpreter's ability to make useful assumptions - for example, that a const instance method will never need to acquire a write lock on an object (write locks for objects and for reference lists need not have the same behaviors). Thus, idioms for allowing controlled exceptions to side effect control would need to be higher-level language features, perhaps Level Five or Level Six, and would need to be tightly controlled to maintain consistency.
If constness is treated as static typing it can introduce further clashes with common patterns; for example, it makes sense to control a lookup operator or <redirect_lookup> interception for side effects, but the element reference returned should not logically be treated by the calling function as const unless the object itself was being treated as const when the lookup occurred. Under the C++ idiom, however, trying to return a non-const pointer to a submember at the end of a const instance function will result in an error. This idiom results in functions often needing to be overloaded with identical const and non-const versions. I would regard this as a buried flaw in the C++ statically typed constness mechanism - a flaw that breaks innocence in a very subtle way.
Side effect control means that a const instance method must not change the object, and that a function called with a const argument must not change the argument. Using static typing to control constness, however, means that a function called with a const argument must not only refuse to change the argument, but, if returning a reference derived from that argument, must return a result that the calling function is forced to treat as const - whether or not the calling function itself promised not to modify that argument. In C++, this is necessary; supposing that the calling function had promised to treat an object as const, passing it to a const function and getting back a non-const reference would create the potential for violation of side effect control. But it violates innocence: whether the side-effect-controlled function treats the argument as const affects whether the calling function must treat references derived from that argument as const.
The ideal solution consists of derivation tracking. To prevent violation of innocence, the calling function needs to know which argument the return value is derived from, thus enabling the calling function to track constness locally - maintaining, in fact, the same idiom that would be used if no other function calls were involved. When `foo.bar` is written, without operator overloading being involved, the function in which that statement appears knows that if `foo` is const, it implies that `foo.bar` is const. Whether `foo.bar`, as an expression, has any effect on `foo` (it shouldn't) is an entirely separate issue from knowing that `foo.bar` is derived from `foo` and will have the same constness or non-constness as `foo`.
The first apparent idiom for dynamic tracking of side effect control, as opposed to the statically typed tracking seen in C++, is for constness to be dynamically represented as a %const.something annotation which is contagious across assignments, lookups, dereferences, and expression invocations. That is, if `foo` is const (has an annotation <p-const><something/></p-const>), then in `bar = &foo`, `bar = foo.&baz`, `bar = *foo`, and `bar = foo()` must all cause variable `bar` to inherit constness. Constness is not just a binary value, in this formulation; the plane %const actually tracks derivation through the use of specific tags. foo(const var a1, const var a2) will distinguish between constness derived from a1 and constness derived from a2.
# Legal Flare!
var funcMod(var a1, const var a2)
    ref b1 = funcLookup(a1, a2)
    *b1 += 2

const ref funcLookup(const var c1, const var c2)
    return c1.&(c2)  # result has constness of c1

What happens in this series of calls is that funcMod receives its arguments as <a1><ref>someobject</ref></a1> and <a2><ref>a string containing the subtag name<p-const>oldTag</p-const></ref></a2>. On passing the arguments to funcLookup, funcMod passes the existing constness for a2 to the argument c2, but funcMod creates a new "disposable" constness tag for the argument c1. Thus, funcLookup receives c1 as <c1><ref>someobject<p-const>newTag</p-const></ref></c1> and receives c2 as <c2><ref>sub tag name<p-const>oldTag</p-const></ref></c2>. The return value is <ref>subelement<p-const>newTag</p-const></ref>. Since the constness tag <p-const>newTag</p-const> is known to be "disposable" by the calling function, funcMod, the return value is no longer treated as being const.
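A rough Python model of the tag machinery (all names hypothetical, standing in for the %const plane): constness is a set of tags carried alongside a value, lookups contagiously propagate the tags, and a const parameter stamps a fresh "disposable" tag that the caller can later discard.

```python
import itertools

_tag_counter = itertools.count(1)  # unique constness tags, app-wide


class Tagged:
    """A value plus the set of constness tags it has inherited."""
    def __init__(self, value, tags=frozenset()):
        self.value, self.tags = value, frozenset(tags)


def make_const(tagged):
    # Binding to a const parameter: stamp a fresh disposable tag.
    tag = next(_tag_counter)
    return Tagged(tagged.value, tagged.tags | {tag}), tag


def lookup(tagged, key):
    # Lookup is contagious: the result inherits the operand's tags.
    return Tagged(tagged.value[key], tagged.tags)


# funcMod passes a1 through a const parameter c1 of funcLookup:
a1 = Tagged({"sub": 40})
c1, disposable = make_const(a1)
result = lookup(c1, "sub")

# Back in funcMod, the disposable tag is discarded; with no tags
# left, the returned element may be modified again.
effective = result.tags - {disposable}
```

The incrementing counter matches the suggestion below that tags be made unique across the whole application.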
This algorithm appears to be viably recursive. I don't know how much it will consume in the way of additional resources, but functions that make no use of constness have no reason to track it. If individual elements have flags (which they probably will), a flag for HAS_CONST might be a good idea. Since side effect control is conceptually a debugging/error invariant, it should be possible to switch off constness without affecting application logic, just as with any other invariant.
And before you ask: No, I can't see any situation where a value has two different const tags. I do, however, find it quite easy to imagine that tracking constness might extend to dynamically tracking other kinds of derivation.
Like an instance method, a quoted expression must have inherent constness before it can be invoked at all by a const method. Most quoted expressions will probably be pure functional, but even so, a quoted expression must be originally declared as const before a const method can invoke it.
Even so, invoking a quoted expression with closure, or a subfunction with closure, is a tricky matter. A quoted expression with closure, or a subfunction with closure, may refer to variables that have constness from much farther up the calling stack; this implies either some clever tracking, or that constness tags are unique on the calling stack. (The latter is preferable for a number of reasons, and can be done very easily by incrementing a long integer (bignum) each time a new tag is needed; in fact, constness tags could easily be made unique across the entire application, which would be a good idea.)
But this means that if a function fails to recognize the constness tag of an argument handed back, it should assume that it represents a constness tag from higher up the stack that got in through a quoted expression invocation or through the use of a subfunction with closure. This feels a little fragile. It should work, but it still feels fragile, in the sense that if a nonsensical const tag somehow got in, it would never be found out. (It would be possible to verify this for debugging purposes by crawling back up the stack, looking for the declared use of a constness tag that matches the one found.)
If a quoted expression has no closure, then it can be executed as if it were a statement found in the ordinary code. Assuming the quoted expression is const, of course.
Since a quoted expression has no arguments, constness consists of promising not to change local variables higher on the stack and promising not to change global variables. If a quoted expression has no closure, constness consists purely of promising not to change global variables, with constness on local variables being handled as if the quoted expression had been found directly in the FlareCode.
Creating or destroying a reference may require a write lock on the reference list, but it is not, conceptually, a modification of the referenced object. This may turn out to require that all references to const objects automatically be soft references, so that GC behavior is not affected, but it looks to me like this case is very unlikely to arise in practice; the reference passed had to come from somewhere, after all. Maybe the rule should be that following a soft reference on a const object automatically gives rise to another soft reference, but it seems to me that this is likely to violate the programmer's intentions as often as conform to them.
Caching seems to require allowing a type of controlled exception that could conceivably be used to turn an allegedly strict function of input into a sequence; i.e., a "functional" function that returns f(1) == 1, f(1) == 1, f(1) == 2, f(1) == 3, f(1) == 5, f(1) == 8, and so on through the Fibonacci sequence. In other words, allowing caching of "functional" methods means trusting the programmer, with the only possible means of verification being a test suite. Thus, caching, or rather the feature that enables caching, should be a high-Level feature and one marked as dangerous.
Objects that are const during most of program execution may not be const during program setup. For example, a static constant value that is computed during startup, assigned to during startup, and which then becomes constant. C++ handles this by allowing constant statics to be assigned to during their initialization statement, which seems to me like a good idiom, as long as there's a place for a non-default priority if one is needed.
Profilers and logging, again, seem to require a type of controlled exception that could be very easily abused in the general case, since it requires certain watching threads or certain method calls to be excluded from the realm of "application logic" and placed in the realm of "debugging logic", like invariants. At the least, logging calls should never, ever block - the actual write to a file, if any, should take place in some completely different thread, and the call that passes a message to that thread should return immediately.
In the very long term, it seems like there is conceptually a separate realm for "debugging" things and "application" things, and that things that are controlled for application side effects are not necessarily controlled for debugging side effects, but that all debugging things are strictly controlled for application side effects. But such fine-grained control of side effects is a serious extra order of complexity, and it is acceptable to implement profiling and logging via ugly hacks initially.
Since invariants are supposed to be external to the application logic, and all invariant violations treatable as errors rather than exceptions, it should go without saying that invariants themselves are controlled for side effects!
There are some invariant behaviors that reinforce the idea of a separate realm for debugging information - for example, it may be desirable for an invariant to compare an argument on entrance with the return value, which implies that the initial value is stored in the realm of debugging, which implies that invariants are not controlled for side effects on the realm of debugging, since storing a value is a side effect. Initially, however, it is acceptable to just have special keywords for `enter.varname` or whatever, or even to rule that postconditions that look at values on entrance are a higher-Level feature.
If elements have flags, a flag for HAS_CONST, to save some lookups in the default case.
A way of dynamically tracking constness, including different constness tags. (Note that constness belongs on local variables and on expression operands - contagious constness, the kind of constness that requires a tag to track, should never wind up on an application object!)
Behaviors for lookup, dereference, and assignment that infect the returned value with the constness (if any) of the operand.
The behavior for assignment can be handled by labeling <p-const> as a "content" subelement. (There are other planar elements that may wish to be transmitted through assignment as well.)
The behavior for lookup and dereference can be implemented through the underlying behavior functions, and by requiring that any operator overloads for lookup and dereference be "const" (meaning that they will automatically track any pre-existing constness handed them). If there is ever allowed a non-const operator for lookup or dereference, it will be a special case and a dangerous feature, since that type of behavior is highly unexpected.
The behavior for expression invocation is more complicated, so invoking an expression with constness is a Level Five feature. When this feature is not present, quoted expressions have no constness, and const methods can never invoke them; methods with a few const arguments cannot invoke quoted expressions found on those arguments, i.e., quoted expressions can never be invoked if they have been infected with a constness tag.
See "Elements of Flare".
Methods, in today's programming languages, are flat. There are a few growing exceptions to this rule; some languages permit a function to be defined inside a function, although there is often the added assumption that such subfunctions have closure with respect to a particular function call; similarly, some languages allow classes to be defined inside classes, but with the assumption that the subclass belongs to a particular instance of the enclosing class.
While both of these are interesting language constructs, especially the idea of a subfunction with closure, I'm more interested in subfunctions (and subclasses, for that matter) as a way of neatly packing functionality in modular ways. If a certain simple instance method is used only by another, more complex instance method, then there is no reason why the simple instance method shouldn't be defined within the namespace of the more complex instance method. It helps to keep things neatly packaged, puts the code where it belongs, and reduces the number of immediately visible functions when looking at the class. The problem with making subfunctions or subclasses special language features that do neat stuff with closure is that they are then reserved for special occasions, rather than being used ubiquitously; furthermore, IDEs with built-in class browsers may fail to have features for browsing subfunctions or subclasses.
The real guts of human-written methods will probably always be flat, having most of their structure in the mind of the programmer who thought of the algorithm. Even so, there are several common structures, patterns if you like, that could be explicitly represented within methods, but usually aren't today.
There are several benefits to making this structure explicit. (There are almost always benefits to making things explicit or declarative.) The first benefit is that it becomes easier to write IDE tools that observe or modify functions in particular ways, with AspectJ being an example of a language that implements a very simple form of method structures. The second benefit is that breaking functions into smaller pieces makes it easier to write a small piece of code that can be reused - either through cut-and-paste, or better yet through explicit reference. The third benefit is that it becomes much easier to override or extend exactly the piece of the method that you want to override and extend, without needing to cut and paste and modify the whole original method. The fourth benefit is that certain forms of self-modification, self-analysis, and code-generating code all become much easier. And the fifth benefit is that breaking up methods into smaller (annotatable!) pieces often enables the language to easily do things that would otherwise be nearly impossible. (Prioritized wrappers, for example, or correctly handling cumulative content under multiple inheritance.)
Remember that the term "submethod" can mean a full method, or just a function. That is, you can use a method structure inside a method structure, but usually you'll just use the flat block of code that is a subfunction.
(Worth looking at: The language "Beta". I'd provide a link but my Internet connection is down and I can't google for it. I'm not even sure I'm recalling the name of the language correctly.)
One of the most common patterns in any programming language is a function - or, more likely, an instance method - whose sole purpose is to "wrap" another. For example, in C++, I often found myself using X() and XSelf() methods; the idea is usually that X() is defined on the base class and contains a fairly constant set of setup and cleanup actions, while the XSelf() function performs the actual meat of the method and is often overridden by derived classes.
Another common usage of the Wrapper pattern is when a derived class overrides a method to perform some extra setup or cleanup actions, but still passes the arguments to the base class's overridden method and gets the final result back from that method. (This should not be confused with the case of overriding the base class to perform some extra actions, but still calling the base method to perform the rest of the actions. Wrappers are typically used for setup and cleanup, or for modifying function arguments before continuing, or for ensuring that the content function can compactly return a value from anywhere in the function and that some final action will be performed with that return value, or for performing some polymorphic action that can succeed or fail and taking different actions based on that success or failure.)
A structured method can contain any number of "wrapper" submethods. A wrapper submethod must have the same method signature as the wrapped method. This may seem like an annoying and innocence-breaking requirement, since X() and XSelf() methods often have different signatures; the purpose is to ensure that multiple wrappers can be innocently stacked. In other words, when a wrapper submethod finally makes the statement `return pass(arg1, arg2)` or `result = pass(arg1, arg2)`, that `pass` statement may either pass to the content submethod, or it may pass to another wrapper. This ensures that polymorphic derived types can add new wrappers at any point.
Wrappers are sorted by priority. Higher priorities go first. If no priority is declared (assuming the FlareSpeak IDE allows no priority to be declared), the default priority is zero. If two wrappers have the same priority, it is an error. See 3.3.1: Priorities. Thus, a polymorphic derived class can declare a wrapper that comes between two previously declared wrappers.
Wrappers will most often be simple functions rather than complex methods, at least if you don't want your code to be insane.
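The wrapper chain can be sketched in Python (a hypothetical harness, not Flare's actual machinery): wrappers are sorted by descending priority, and each receives a `pass_` callable with the same signature as the wrapped method, so wrappers stack innocently without knowing whether `pass_` reaches another wrapper or the content.

```python
def run_with_wrappers(wrappers, content, *args):
    # wrappers: list of (priority, wrapper) pairs; higher priority
    # goes first. Each wrapper is called as wrapper(pass_, *args).
    chain = sorted(wrappers, key=lambda w: -w[0])

    def call(i, *a):
        if i == len(chain):
            return content(*a)       # innermost layer: the content
        _, wrapper = chain[i]
        return wrapper(lambda *a2: call(i + 1, *a2), *a)

    return call(0, *args)


def setup_wrapper(pass_, x):
    return pass_(x * 2)              # setup: modify the argument


def cleanup_wrapper(pass_, x):
    return pass_(x) + 1              # cleanup: adjust the result


result = run_with_wrappers(
    [(10, setup_wrapper), (0, cleanup_wrapper)],
    lambda x: x + 100,               # the content submethod
    5,
)
```

A polymorphic derived class adding a wrapper at priority 5 would slot between the two existing layers with no changes to either.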
Wrappers, parenting, and inheritance:
Polymorphic derived classes should be able to add new wrappers to the list.
Polymorphic derived classes should also be able to override existing wrappers. Thus, wrappers must have names as well as priorities; otherwise, they will not be overridable. A wrapper's name is defined within the namespace of the method that defines it (wrappers on different methods cannot conflict!).
This occurs in much the same way, and for much the same reasons, that polymorphic derived classes need to be able to add new methods or override existing methods.
In terms of the parenting semantics defined in "Elements", this can be implemented as follows: A singleton <wrapper> element inside the method object, which is automatically parented to the <wrapper> object (if any) on the parent of that method object. The <wrapper> object may contain any number of named subelements, and the default metadata for these subelements is the metadata of a method with the same signature as the wrapped method. That is, the <wrapper> element has the parenting type "single element - parent", and the subelements of <wrapper> have the type "set element - accumulate and replace".
Sometimes, overriding a method is used to add new actions to a method, rather than actually overriding anything. The new method looks superficially like a wrapper - calling super.meth() or inherited::meth() or pass(args) - not because the arguments have been modified, or because something is being done with the return value, but just to carry out the previously implemented behaviors, unmodified.
This way of implementing cumulative content works most of the time, but often runs into trouble in languages that permit multiple inheritance. If you have an inheritance diamond, where A is the parent of B and C, and D inherits from C and B, then - in most languages - both B and C may override a behavior meth(), such that B.meth() calls A.meth() and C.meth() also calls A.meth(). This introduces at least two problems. First, simply declaring the class D will often force you to override D.meth(), just because a call to D.meth() is otherwise ambiguous (does it call the inherited B.meth() or C.meth()?). Second, when you do override D.meth(), you find that passing to the inherited B.meth() and C.meth() results in two calls to the original function A.meth().
The first time I ran into this problem in C++, I was forced - if I recall correctly - to actually cut and paste from A, B, and C in order to write a polymorphic override of D that did the right thing. This is, of course, utterly unacceptable, not just because of the implicit dependency, but because I was using a system where there were a lot of inheritance diamonds, quite often for surface-level application classes that were supposed to be ubiquitous, and I didn't want to have to define D.meth() at all, let alone cut and paste. (Yes, I had to use inheritance diamonds. Inheritance diamonds are often exactly the right way of expressing something, and it's a pity that present-day languages don't know how to handle it.) The eventual solution I used involved some black magic that took a pointer to a virtual member function and called all implementations of that function on all base classes. I couldn't have done it without an existing reflectivity system, but it worked like a charm once I had it set up. And yes, I testify that cumulative content is useful.
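For comparison, Python's C3 method resolution order already linearizes exactly this diamond: cooperative `super()` calls visit each override once, A.meth runs exactly once, and D need not override anything. A sketch:

```python
class A:
    def meth(self, log):
        log.append("A")


class B(A):
    def meth(self, log):
        log.append("B")
        super().meth(log)   # follows the MRO, not a fixed base


class C(A):
    def meth(self, log):
        log.append("C")
        super().meth(log)


class D(B, C):
    pass  # no override needed; the MRO is D -> B -> C -> A


log = []
D().meth(log)
```

Here B's `super()` call dispatches to C rather than directly to A, which is what prevents the double call to A.meth that plagues the C++ version.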
One important rule, though, is that cumulative content can only be called automatically by the system if there is no return value. To use cumulative content with a return value requires a programmer-written coordinating function that calls the pieces of cumulative content and does something interesting with the return values (for example, stopping when a piece of cumulative content returns true). There are thus two ways to use cumulative content. One way is to define a function that has no return value. In this case, all pieces of cumulative content should have the same method signature as the greater method, and all pieces of cumulative content will be called sequentially with the same arguments. I want to make this Parallel By Default but I think in most cases that won't be what the programmer wants, so I think instead cumulative content should use the same ordering used in parenting for accumulating list subelements (this is, in fact, the underlying idiom), or prioritized under the standard rules if any piece of cumulative content possesses a priority.
The second alternative would be to have the main content of the method contained in a <body> element; the <body> method must possess the same signature as the greater method. If a <body> method exists, it is responsible for obtaining a list of the pieces of cumulative content and calling them. In this case, all pieces of cumulative content must have the same signature and this signature must be declared in advance even if no content is present, but the signature need not be the same as the signature of the calling method.
Unlike wrappers, it is probably not a good thing if cumulative content can be overridden - if you need to override something, then cumulative content is the wrong idiom to use.
Cumulative content should be implemented through the use of the "list subelement - accumulate" parenting behavior on subelements of type <content>.
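A rough Python sketch of the underlying idiom (names hypothetical; Flare would express this through parenting on <content> subelements): walk the hierarchy base-most first and call every accumulated piece with identical arguments, no return values involved.

```python
class A:
    on_close = [lambda log: log.append("A: flush")]


class B(A):
    on_close = [lambda log: log.append("B: release lock")]


class C(B):
    on_close = [lambda log: log.append("C: notify")]


def run_cumulative(cls, name, log):
    # Call every piece contributed anywhere in the hierarchy, in
    # parenting order (base-most first), with the same arguments.
    # vars(klass) sees only pieces defined on that class itself,
    # so nothing is double-counted through inheritance.
    for klass in reversed(cls.__mro__):
        for piece in vars(klass).get(name, ()):
            piece(log)


log = []
run_cumulative(C, "on_close", log)
```

A coordinating <body> method, in this model, would simply replace the loop with programmer-written logic over the same collected list.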
Another common pattern is to have certain sections of a method that are there just to perform the setup of local variables, or sections that just close up and shut down after the method is finished.
There is no overriding structural reason to have separate <setup> and <cleanup> elements, but it's still nice to be able to do so - it makes explicit what was previously implied. Explicit formalization always has benefits, even if they're small. For example, it might be easier to have an IDE that enables you to attach common pieces of setup or cleanup code to an existing function; it might be easier to override the body of a function and leave the setup and cleanup code in place; explicitly declaring the purpose might make it easier to take in a method at a glance, especially if there are subtle user interface cues; someone might invent invariants or forms of side effect control that work especially well with setup and cleanup code... and so on.
For example, the Causality library might define a keyword (optional, used only if Causality is imported) which declares that a function requires the object to be in valid state; if the function is called when the object is not in valid state, the object will recompute before the function body is entered. This might be implemented as a small <setup> block, or a reference to a standard version of that block, which when present causes the FlareSpeak IDE to show the "valid" keyword - if Causality is being used, anyway.
Setup and cleanup blocks cannot be submethods; they must be subfunctions that act within the scope of the body of the main method. This is because setup and cleanup blocks should have access to local variables.
Thus, if "setting up the local variables" looks like a distinct chunk of code, and one likely to remain constant as the body of the method changes in derived types, it may make sense to put this in a <setup> block.
Setup and cleanup blocks should be cumulative, and should have the ordering defined for cumulative list subelements, which is what <setup> and <cleanup> should be. I don't think they should have priorities because that strikes me as being Just Too Much. Likewise for making setup blocks overridable. Use scopeless subfunctions or wrappers for this sort of thing.
Some of the most important parts of a method are the preconditions and postconditions, invariants that apply to the function arguments or function return; or, in an instance method, to the state of the object as well; or even to global state. Invariants may access function arguments, the object, or global state, but may not modify any of these things; invariants, as described above, are subject to side effect control - not least because some implementations may turn off some invariants except during debugging. Violation of an invariant is thus always an error, not an exception that might form part of the application logic. (A Level Five feature: Allow a variant of an invariant that is always called and that is permitted to throw an exception. It probably still shouldn't be possible to allow an invariant that has side effects; that is simply abusing the idiom.)
A Level Three feature is to allow postconditions to examine the state of the function arguments and compare them to the return result. This implies that the initial values of the function arguments are automatically saved when the method is invoked, in some private little place that the postcondition can then examine. This feature, which introduces some extra overhead, need not be enabled if there are no postconditions present - even if postconditions are present, they still may not need to look at the initial argument values. Perhaps there will be a <save_entry/> flag in methods that indicates whether this needs to be done.
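A minimal sketch of these ideas in Python (the `contract` decorator is hypothetical, not Flare machinery): the precondition sees the arguments, and entry values are deep-copied so the postcondition can compare them against the result, as with the <save_entry/> flag described above.

```python
import copy
import functools


def contract(pre=None, post=None):
    # Design-by-contract sketch. Violations are raised as errors,
    # never as application-logic exceptions.
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args):
            if pre is not None:
                assert pre(*args), "precondition violated"
            entry = copy.deepcopy(args)   # saved entry values
            result = fn(*args)
            if post is not None:
                assert post(entry, result), "postcondition violated"
            return result
        return inner
    return wrap


@contract(pre=lambda xs: len(xs) > 0,
          post=lambda entry, result: result in entry[0])
def largest(xs):
    return max(xs)
```

The deep copy is the overhead the text mentions, which is why saving entry values should be switched on only when some postcondition actually needs them.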
Most preconditions should be called before all wrappers, and most postconditions should be called after all wrappers. I can think of conditions under which I'd want it to work the other way - call preconditions after all wrappers, and postconditions before all wrappers - but this way should probably be the default; this is because the whole idea of "design by contract" implies that an invariant is a promise to the external world. Of course, if a wrapper is a submethod and not just a subfunction, it can have its own invariants, but this will usually constitute Flare Abuse.
If a precondition or postcondition is ever a submethod rather than a simple subfunction or expression, it probably implies Flare Abuse. Whether a subfunction or an expression, preconditions and postconditions - like all invariants - must return a boolean truth value, and should return true.
Invariants are cumulative and cannot be overridden. A derived type may add new invariants, but may not relax existing invariants, because an invariant is a promise. (Although lately I've been thinking that the rule should actually be that an overriding method may relax preconditions, but not add new ones, and add new postconditions, but not relax existing ones. This would follow the design by contract paradigm, and ensure that a function that can be safely called and safely used on a base class can be safely called and safely used using any derived type.)
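The cumulative rule can be sketched in Python by walking the class's method resolution order and requiring every invariant declared anywhere along the chain; a derived class can add invariants but has no way to cancel the base class's promise. The helper name and the `_invariants` attribute are illustrative assumptions:

```python
def check_invariants(obj):
    # Gather every _invariants list along the inheritance chain (MRO)
    # and require all of them: invariants accumulate, never relax.
    for cls in type(obj).__mro__:
        for inv in vars(cls).get("_invariants", []):
            if not inv(obj):
                raise AssertionError("invariant violated in " + cls.__name__)

class Account:
    _invariants = [lambda self: self.balance >= 0]
    def __init__(self, balance):
        self.balance = balance

class BoundedAccount(Account):
    # Adds a new invariant; the base invariant still applies.
    _invariants = [lambda self: self.balance <= 1000]
```

A `BoundedAccount` with a negative balance fails the *base* class's invariant even though its own invariant passes, which is the point of invariants being a promise.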
Companion to the idea of invariants, a test suite is composed of a method or methods that test out a method, class, or module. A method can be tested by checking that outputs match what are expected from inputs; classes can be tested by creating an object and putting it through its paces; modules can be tested when multiple classes or modules are supposed to relate in certain ways. Modules, classes, and methods can all contain test suites, used for debugging to verify that the behavior of a module, class, or method still matches what was promised.
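As a rough analogue, here is what an attached test suite looks like using Python's `unittest`, standing in for Flare's suites: a class is tested by creating an object and putting it through its paces, verifying that behavior still matches what was promised. The `Counter` example is invented for illustration.

```python
import unittest

class Counter:
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1
        return self.n

class CounterSuite(unittest.TestCase):
    # Test suite attached (conceptually) to the Counter class: check that
    # outputs match what is expected from inputs.
    def test_bump_returns_new_count(self):
        c = Counter()
        self.assertEqual(c.bump(), 1)
        self.assertEqual(c.bump(), 2)
```

The difference in Flare's design is that the suite would live inside the module, class, or method it tests, rather than in a separate test file.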
Finally, of course, there are submethods (usually, simple subfunctions) defined inside a method and existing in the namespace of that method. These submethods can be overridden, without overriding the rest of the method, in a derived class. I put this idiom last because it's so easy to do all the other things described wrongly by using this one language feature - the features described earlier are explicit, and declarative, and can have language support for things like priorities and multiple inheritance and so on; overridable submethods are the proverbial 50% right way of doing it. The 50% right way may get implemented before the 90% right way or the single perfect gem, but it's usually better to hear about the perfect gem first so that you know what you're approximating.
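The "50% right" approximation available in ordinary OO languages looks like this sketch: the overridable submethod becomes a helper instance method that a derived class replaces without redefining the enclosing method. (In Flare the submethod would live in the enclosing method's own namespace instead of polluting the class's namespace.) All names are hypothetical.

```python
class Report:
    def render(self):
        # The enclosing method; only it is meant to call _format_line.
        return "\n".join(self._format_line(x) for x in ["a", "b"])

    def _format_line(self, item):       # the overridable "submethod"
        return "* " + item

class NumberedReport(Report):
    def _format_line(self, item):       # overridden without touching render
        return "1. " + item
```

`Report().render()` yields `* a` / `* b` while `NumberedReport().render()` yields `1. a` / `1. b`, with `render` itself never overridden.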
If the prototype can have full support for the other kinds of method structures described here, if it's not too expensive, then great.
Subfunctions, submethods, wrappers, et cetera, can all be called only from within the enclosing method. If there's ever a reason to call an overridable submethod from outside the enclosing method, make it a regular instance method! That's what everyone has to do in existing languages. Part of the whole point of having and using submethods is to reduce the clutter of top-level methods and to protect from inappropriate use a method that properly belongs to another - for example, a method that is only called by another method while the object is in a certain inconsistent state.
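Python's nested functions already give the scoping described here: a subfunction defined inside a method simply cannot be called from outside it. A minimal sketch with invented names:

```python
class Parser:
    def parse(self, text):
        def split_words(s):            # subfunction, invisible outside parse
            return s.split()
        return len(split_words(text))
```

`split_words` does not exist as an attribute of `Parser` or its instances, so no outside caller can misuse it; what Python lacks is any way for a subclass to *override* it, which is the extra feature Flare proposes.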
blocks with stack-based stuff
variable invariants (metadata)
variable static typing (metadata)
try... except... finally
subfunctions with closure; Situation handlers
active stacks and yankback
method call issues
value of "this"
how to preserve (a) instance method references and (b) references as instance methods - answer: use parenting for latter
expressions - how assignment works
expressions - list of statements
Parenting and stuff
Non-transparent types (Pointer, List, etc.)
...and much much more; I have a checklist somewhere...
PEP 253 - Subtyping Built-in Types
Lemburg's reasoning for PEP 208
Fixing the JVM Memory Model