Flare Operators

Not written by Dmitriy Myshkin.
Date: 8/13/01

Where this document conflicts with FlareSpeak or FlareCode, this document supersedes them.

1: Flare operators

If some of these operators seem ambiguous, remember that all whitespace in FlareSpeak is significant. To distinguish '%' the arithmetic remainder from '%' the planar access: arithmetic % has whitespace on both sides, while planar % has whitespace on neither side.
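
As a rough illustration of the whitespace rule, here is a minimal Python sketch. The function name `classify_percent` and the "unspecified" outcome for one-sided whitespace are assumptions; the rule as stated only covers the two symmetric cases.

```python
def classify_percent(source: str, index: int) -> str:
    """Classify the '%' at source[index] by the whitespace around it."""
    if source[index] != "%":
        raise ValueError("no '%' at that index")
    space_before = index > 0 and source[index - 1].isspace()
    space_after = index + 1 < len(source) and source[index + 1].isspace()
    if space_before and space_after:
        return "remainder"    # e.g. `a % b`
    if not space_before and not space_after:
        return "planar"       # e.g. `f%b`
    return "unspecified"      # one-sided whitespace: not covered by the rule


print(classify_percent("a % b", 2))
print(classify_percent("f%b", 1))
```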

The familiar:
+    arithmetic add, string concatenate, list concatenate
-    arithmetic subtract, list set removal, unary negate
*    arithmetic multiply, unary dereference, string repeat?, list repeat?
/    arithmetic true divide
//   arithmetic floor divide
%    arithmetic remainder, also planar access (if there's a conflict, the arithmetic remainder goes first, or is replaced with /%)
**   arithmetic power operator
<<   integer left shift (overloaded for stream)
>>   integer right shift (overloaded for stream)
>>>  logical (unsigned) right shift
&    integer and, list intersection?, list hasmember?, unary reference?
|    integer or, list union?
^    integer xor, post-unary metadata (foo^)
~    unary integer negate (invert bits)? Would prefer this to mean "completely follow pointer".
!    boolean negate (inverse truth value)
&&   boolean and (return first, or second if first true)
||   boolean or (return first, or second if first false)
<    arithmetic compare (may also indicate start of XML literal)
>    arithmetic compare (or XML meaning, depending on whitespace)
>=   arithmetic compare
<=   arithmetic compare
<>   rich arithmetic compare (-1, 0, or 1)
==   universal compare
!=   universal compare
$==  non-transparent universal compare (see below)
Quotes:
""   string quote
``   quoted expression
''   string quote?
List:
l[x]      list access
l[x:y]    list slice
l?[x]     test list access
l?[x:y]   test list slice
l[@]      synchronized mapping operator, parallel by default
l@[x]     map access across list of lists (the other kind of slice)
l$[x]     non-transparent list access (see below)
          Note: x and y must both be integers, or must be exactly
          coercible to integers. If x or y is negative, it is
          interpreted the same way as in Python.
Access:
    Normal accesses:
f.b       ordinary access
.b        targeted access (in an instance method, equals `this.b`)
f%b       named planar access
%b        targeted planar access (`this%b`)
f.%b      default planar access (use default plane for this module)
.%b       targeted default planar access
f.(b)     find element whose name is the value of b - if b is not a string, b can make an intelligent search.
    Access adornments (for '.' below, substitute any access or some adorned accesses):
$.        non-transparent access (see below)
.&        produce address of accessed element
@.        mapping operator (produces list value)
.?        access test (true if subelement exists, doesn't go with .& or @.)
@?.       tentative mapping operator (produce only those values that exist, removing failures from list result)
Assignment:
=         ordinary assignment
in-place assignments:
+=, -=, *=, /=, //=, %=, **=, <<=, >>=, >>>=, &=, |=, ^=
(Note that, unlike Python, an in-place assignment never changes the type of the left-hand variable. This may result in an error.)
+=        string concatenation, list addition?
|=        set addition?
-=        list remove?
*=        string repeat?, list repeat?
Assignment adornments (may substitute any of above or below for '='):
=?        tentative assign, assign if right-hand expression does not fail
?=        assign if left-hand expression does not fail
?=?       assign if neither expression fails; trap failures either side
=$        non-transparent assignment (see below)
Procedure invocation:
f()       invoke f (works for methods + subfunctions + scopeless subfunctions + quoted expressions + generators)
(f)()     as above
f(b)      invoke f with argument b (methods + subfunctions + coroutines)
f(b, z)   invoke f with arguments b, z (methods + subfunctions)
f(b=z)    invoke f with argument z bound to keyword b?
          Distinguish between assignment and keywording through use of whitespace?
          Use ':' instead of '=' ?
Invocation adornments:
f@()      map access and produce list result
f@?()     map access and produce list result, ignoring failures
f$()      non-transparent invocation (see below)
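
The mapping adornments (`@.` and `@?.`) can be sketched in Python as follows. This is only a model: elements are plain dicts, a failed access is represented by a hypothetical `Failure` exception, and the function names are made up for illustration.

```python
class Failure(Exception):
    """Stand-in for a failed Flare expression (hypothetical)."""


def access(element, name):
    """Ordinary access `f.b`, modelled as a dict lookup."""
    try:
        return element[name]
    except KeyError:
        raise Failure(name)


def map_access(elements, name):
    """`l@.b`: access b on every element; any single failure fails the whole map."""
    return [access(e, name) for e in elements]


def tentative_map_access(elements, name):
    """`l@?.b`: keep only the values that exist, dropping failures from the result."""
    result = []
    for e in elements:
        try:
            result.append(access(e, name))
        except Failure:
            pass
    return result


items = [{"b": 1}, {"c": 2}, {"b": 3}]
print(tentative_map_access(items, "b"))
```

The same pattern would presumably apply to `f@()` and `f@?()`, with invocation in place of access.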

Issues:

Convert * to post-unary operator so that foo*.bar means (*foo).bar? (Though usually this shouldn't be necessary, thanks to transparency.)

Make both & and * work as unary or post-unary operators, i.e. foo& == &foo?

$ definitely needs to work as both unary and post-unary.

2: Transparency for references, expressions, and generators

References in Flare are transparent under certain circumstances. That is:

bar = 4
foo = &bar
foo
=> (a <ref> to element <bar>)
*foo
=> 4
foo + 8
=> 12

There are certain operators that do not generally expect a reference as operand. For example, `+` will work on numbers and strings and lists, but has no meaning when applied to a reference. Therefore, if a reference is used as one of the operands to `+`, the reference will be transparently followed, without a `*` dereference being necessary. Similarly, quoted expressions will be transparently evaluated. Subfunctions with no arguments and a return value will be transparently invoked. Generators will be transparently invoked.

If the result of the dereference or invocation is another reference, expression, subfunction with no arguments, or generator, that result will be transparently dereferenced or invoked as well. The chain continues until a nontransparent value is found (or, under some circumstances, until a static type is matched; see below).
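
A minimal Python sketch of this chain-following, assuming a toy model in which variables live in a dict and `Ref` and `Quoted` are hypothetical stand-ins for Flare references and quoted expressions:

```python
class Ref:
    """A reference to a named slot in an environment (hypothetical model)."""
    def __init__(self, env, name):
        self.env, self.name = env, name

    def deref(self):
        return self.env[self.name]


class Quoted:
    """A quoted expression, modelled as a zero-argument callable."""
    def __init__(self, thunk):
        self.thunk = thunk

    def evaluate(self):
        return self.thunk()


def transparent(value):
    """Follow references and evaluate quoted expressions until a plain value remains."""
    while True:
        if isinstance(value, Ref):
            value = value.deref()
        elif isinstance(value, Quoted):
            value = value.evaluate()
        else:
            return value


def flare_add(left, right):
    """`+` is transparent on both sides."""
    return transparent(left) + transparent(right)


env = {"bar": 4}
env["foo"] = Ref(env, "bar")
env["baz"] = Ref(env, "foo")                      # a reference to a reference
print(flare_add(env["foo"], 8))                   # foo + 8
print(flare_add(env["baz"], Quoted(lambda: 8)))   # chain followed on both sides
```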

The operators that result in transparent invocation or dereference are these:

binary operators resulting in transparency on both sides: +, -, *, /, //, %, **, <, >, <=, >=, <>, ==, !=, <<, >>, >>>, &, |, ^, .
f + b     transparent on both sides
f = b     transparent on right side only - use *f = b to assign.
f *= b    transparent on right side only
f =$ b    not transparent on either side
f = $b    same as above, `$b` or `b$` produces non-transparent operand `b`
f == b    transparent on both sides
f != b    transparent on both sides
f $== b   not transparent on either side (replace with =$= ?)
f $!= b   not transparent
f()       f is transparent
f$()      f is not transparent
f[x]      f, x are transparent
f$[x:y]   f is not transparent, x and y are transparent
f.b       f is transparent
f$.b      f is not transparent
f.&b      f is transparent, produces <ref>to f.b</ref>, but this ref is not transparent when used as an operand
&f        f is not transparent, produces <ref>f</ref>, this ref is not transparent when used as an operand
$f        produces f, f is not transparent when used as an operand
f$        as above - works on both sides.
          f$.b or f$() is probably two distinct operators
f $+ b    Does not exist - should never be needed in good code.
f !+ b    Do not add f and b (joke! this doesn't exist)
f && b    f and b are transparent
f $&& b   f and b are not transparent
!f        f is transparent
!$f       f is not transparent
*f        Dereference f once, f is not transparent, result is not transparent when used as an operand
~f        Would like this to mean "fully dereference f". May override integer negate if necessary (~i would become negate(i) - note that the potential ambiguity lies in the FlareSpeak, not the FlareCode or the language).
          Produces a value, not an operand; ~f cannot be assigned to. (Iffy language decision there.)

An object that overloads <op-invoke> or <op-dereference> does not automatically become transparent. A separate overload for <op-transparent> is needed for that (albeit this may just be a reference to the function for <op-dereference>). Since transparency is not automatic, it is possible to construct opaque "pointer" types which are never transparent and which will only dereference when * or ~ is applied, by overloading <op-dereference> but not <op-transparent>.
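
The distinction can be sketched in Python. The method names `op_dereference` and `op_transparent` stand in for the <op-dereference> and <op-transparent> overloads; the classes and helper functions are hypothetical.

```python
class TransparentRef:
    def __init__(self, target):
        self.target = target

    def op_dereference(self):
        return self.target

    # Transparency is just a second name for the dereference function.
    op_transparent = op_dereference


class OpaquePointer:
    def __init__(self, target):
        self.target = target

    def op_dereference(self):
        return self.target
    # No op_transparent: this type is never followed implicitly.


def as_operand(value):
    """What a transparent operator sees: follow op_transparent while it exists."""
    while hasattr(value, "op_transparent"):
        value = value.op_transparent()
    return value


def star(value):
    """Explicit `*value`: dereference exactly once."""
    return value.op_dereference()


p = OpaquePointer(4)
print(as_operand(TransparentRef(4)) + 8)   # followed implicitly
print(as_operand(p) is p)                  # left alone
print(star(p))                             # only explicit * reaches the target
```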

A final general rule is that a transparent operand targeted at a statically declared non-variable type stops being transparent as soon as it matches that type. For example:

a = 2
b = &a
c = &b
int* d = c # d now equals &a, i.e. the value of b. c is tested against the type that d accepts, doesn't fit, and so becomes transparent once, is dereferenced once, then is tested again, and now fits the appropriate type.
e = 3
f = &e
d = $f # changes d to point to e instead of a - does not assign e's value to a. d = f is an error if d is declared int*, or produces d = 3 if d is an undeclared local variable.
*d = 4 # changes value of e to 4

This requires that the interpreter have knowledge (explicitly represented in FlareCode) of when a possibly transparent operand is expected to have a specific type which would normally be transparent. As far as I can tell, this situation should arise only for assignment and argument passing (and returning). Furthermore, unless the only types accepted by the operand/operator are ones which would normally be transparent, transparency carries on as before. In other words:

a = 2 # a is now <num>2</num>
b = &a # b is now <ref>...a...</ref>
var c = b # c now contains <num>2</num>, even though type var can also contain <ref>a</ref>. This helps prevent the unwanted proliferation of chains of references.
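
A Python sketch of the stopping rule only, under the assumption that a static type can be modelled as a predicate (`is_int_ptr` below) and that `var` or an undeclared target means "no predicate". The `Ref` class and `resolve` function are hypothetical, and the sketch does not try to model `$`.

```python
class Ref:
    """A reference to a named slot in an environment (hypothetical model)."""
    def __init__(self, env, name):
        self.env, self.name = env, name

    def deref(self):
        return self.env[self.name]


def is_int_ptr(value):
    """Model of the static type int*: a reference whose target is an integer."""
    return isinstance(value, Ref) and isinstance(value.deref(), int)


def resolve(value, accepts=None):
    """Value actually stored by an assignment into a target of the given type."""
    if accepts is None:
        # `var` or undeclared target: ordinary full transparency.
        while isinstance(value, Ref):
            value = value.deref()
        return value
    # Statically typed target: dereference only until the value fits.
    while not accepts(value):
        if not isinstance(value, Ref):
            raise TypeError("value does not fit the declared type")
        value = value.deref()
    return value


env = {"a": 2}
env["b"] = Ref(env, "a")
env["c"] = Ref(env, "b")
d = resolve(env["c"], is_int_ptr)   # int* d = c
print(d is env["b"])                # d holds &a, the value of b
print(resolve(env["b"]))            # var c = b stores the number itself
```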

3: Type priorities replace type coercion

(Please take a moment to read at least the "Introduction" section to Lemburg's reasoning for PEP 208, which becomes especially relevant when considering operator overloading.)

Suppose that a float is being multiplied by an integer. One common method of resolving the situation is to lift both arguments to the least common type that can represent both; i.e., in this case, a float. However, as Lemburg describes, lifting both arguments to a common type has several problems; not least, that it negates many good uses for operator overloading in which operators act on different types. Coercion to a common type is at best a means for dealing with numeric arguments.

Another approach is to make one operand responsible for implementing the operation on the other operand. This is the approach that Python uses. The Python uberdata for a type includes a set of operand implementations - that is, there's a C struct that contains function pointers to the functions that implement, for example, the arithmetic operators. The actual Python operator is dispatched using the function pointer contained on one of the operands.

So which operand gets to handle the other? One approach is to always call the left-hand operand to handle the right-hand operand, or to check to see if the left-hand operand can handle the right-hand operand before checking to see if the right-hand operand can handle the left-hand operand. Python, I believe, currently uses a hybrid approach, in which, whenever an object is used as an operand along with a basic type, the object is checked first; otherwise (two objects or two basic types) the left-hand operand is checked first. (Of course, this totally breaks Python's ongoing alleged unification of classes and basic types.) So, in addition to __add__ operator overloading, Python also defines __radd__ and __iadd__, handling right-side operations and in-place operations.

# In Python:
f.__add__(2)     overloads f + 2
f.__radd__(2)    overloads 2 + f
f.__iadd__(2)    overloads f += 2

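I am in basic agreement that a binary operator can have three different operator overloads. Python uses intelligent defaulting for the in-place case: if f.__add__ is defined but f.__iadd__ is not, then `f += 2` === `f = f + 2`. (Python does not extend this to the right-hand case: without f.__radd__, `2 + f` raises a TypeError rather than falling back to `f + 2`, although that default seems equally reasonable.) I am in basic agreement with this kind of defaulting as well.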
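
For reference, the defaulting that current Python 3 applies can be checked directly. In the sketch below (the class name `Thing` is hypothetical), only `__add__` is defined: `g += 2` falls back to `g = g + 2`, while `2 + f` does not fall back to `f + 2` and raises a TypeError.

```python
class Thing:
    """Defines only __add__, to show which fallbacks Python supplies."""
    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        return Thing(self.amount + other)


f = Thing(5)
left_result = (f + 2).amount     # dispatches to f.__add__(2)

g = Thing(5)
g += 2                           # no __iadd__, so Python evaluates g = g + 2
inplace_result = g.amount

try:
    2 + f                        # int declines, and Thing has no __radd__
    reflected_ok = True
except TypeError:
    reflected_ok = False

print(left_result, inplace_result, reflected_ok)
```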

But since Flare is an annotative programming language, classes can have explicit operand priorities that determine which operand is first checked to see if it handles the other. (However, augmented assignments always check the left operand for an augmented assignment handler first; see below.)

Priorities are always floats, though they will usually have integer values. Default operand priorities for builtin types are as follows:

Numbers:
10.0 integer
20.0 long
30.0 rational
40.0 float
50.0 complex integer
51.0 complex long
52.0 complex rational
53.0 complex float
100.0 default for most classes
110.0 strings
120.0 list types
200.0 default for classes that overload at least one operator

Again, note that Flare does not use type lifting. float + complex long == complex float. The priorities indicate which type is first called on to handle arguments. It does not declare a hierarchy that determines result types.

Note that everything from an integer through a float still fits in class <num> - these priorities apply to the content subtypes, not just the explicit class. So too with the complex class and the complex subtypes. (I'm not currently sure whether complex numbers should be unified with numbers... Python does this, I think, but it strikes me as a bad idea. Open language decision.)

Note that `"foo" + 2` equals "foo2", because the string is of a higher priority and is thus supposed to know how to handle lower-priority types like integers. Furthermore, whether the expression is `"foo" + 2` or `2 + "foo"`, the string will still be called on to handle the operation, because the string has the higher priority. Most objects, when added to a string, will be handled by the string, which can thus choose to call object.toString() or whatever; however, objects that implement operator overloading will get a chance to handle strings, rather than vice versa.

And that is how an annotative language handles the problem.
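
A toy Python sketch of priority-first dispatch for `+`. The priorities come from the table above; the handler functions, the `HANDLERS` table, and the choice that the string handler simply stringifies both operands in order (so `2 + "foo"` gives "2foo") are assumptions made for illustration.

```python
PRIORITY = {int: 10.0, float: 40.0, str: 110.0, list: 120.0}
DEFAULT_PRIORITY = 100.0


def priority_of(value):
    return PRIORITY.get(type(value), DEFAULT_PRIORITY)


def num_add(left, right):
    """Numeric handler: declines (returns None) unless both operands are numbers."""
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right
    return None


def str_add(left, right):
    """String handler: copes with lower-priority operands by stringifying them."""
    return str(left) + str(right)


HANDLERS = {int: num_add, float: num_add, str: str_add}


def flare_add(left, right):
    # The higher-priority operand is asked first; ties go to the left operand.
    if priority_of(left) >= priority_of(right):
        order = (left, right)
    else:
        order = (right, left)
    for operand in order:
        handler = HANDLERS.get(type(operand))
        if handler is not None:
            result = handler(left, right)
            if result is not None:
                return result
    raise TypeError("failed expression")


print(flare_add("foo", 2))
print(flare_add(2, "foo"))
print(flare_add(2, 1.5))
```

Note that the result type comes from whatever the handler returns, not from the priority ordering itself.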

As always, a priority can be explicitly declared - the defaults are just that, defaults. Since priorities are very frequently accessed, they should be compiled into the C++ uberdata. Similarly, operator overloading would change the underlying uberdata (the C/C++ struct with all the function pointers) to replace the function pointer for a given operation (possibly null) with a function pointer to a function that looks up the method name which overloads that operator. This function (as a piece of C++ code) may also try to coerce the other operand if the operator overload declares a static type or types.

Note: Need to decide how operator overloading interacts with transparency. In fact, this issue arises in some cases even without operator overloading:

a = 2
b = &a
int*[] c
c += b # error? requires $b? Language decision.

3.1: Augmented assignment

Augmented assignment cannot change the type, or subtype, of the left-hand operand. This works as follows:

a = 2
a += 1.5 # error; augmented assignment cannot change type or subtype of left-hand operand.
a = 2
a += 1.0 # succeeds; 1.0 can be exactly coerced to 1
a = 2
a *= 1.5 # succeeds; 3.0 can be exactly coerced to 3
a = 2.0
a += 1.5 # succeeds; a's type is already float
a = 6
a /= 4 # error; result is a rational number
a = 6
a /= 2 # succeeds; result can be exactly coerced to an integer
a = 6
a //= 4 # succeeds; this is floor division, and result is integer
a = 6.0
a //= 2 # succeeds; result can be exactly coerced from 3 to 3.0
a = 6
a = a / 4 # succeeds; augmented assign not used, result is rational 3/2
a = 2
a += "4"   # quite possibly succeeds, even though `2 + "4"` is usually "24".
           # NEW: Dmitriy thinks this should fail. On reflection,
           # I agree with him. Use num("4") or something.
a += "4.0" # quite possibly succeeds - not!
a += "4.5" # definitely fails
a += "f4" # definitely fails

In an assignment or augmented assignment, therefore, the left-hand operand is always first called on to handle the right-hand operand; never the other way around. The assumption is that the left-hand operand will coerce the other operand to something it knows how to handle, or will try to ask the other operand to handle the binary operation, then exactly coerce the result to its own type. If both of these attempts go awry, the result is a failed expression.

Note the order of execution:

1. In-place assignment (in Python terms, __iadd__) is called on the left-hand operand.
   The function that performs in-place assignment will probably attempt to exactly coerce the right-hand operand into something it can handle.
   This in-place assignment, if it fails, or if no applicable operator exists, may attempt to delegate the task to a binary operation on the two operands, meaning that:
2. The higher-priority operand is called upon to handle the binary operation. (If the two priorities are equal, the left-hand operand gets first crack at handling the binary operation.)
3. If the higher-priority operand can't handle the binary operation, the lower-priority operand is called upon to handle the binary operation.
4. The result of the binary operation, if any, is exactly coerced to the type of the left-hand operand - not just the static type, such as "num" or "var", but the current subtype, such as "integer". If the coercion cannot be performed, or the coercion is not exact, the expression fails.
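
The numeric examples earlier in this section can be reproduced with a small Python sketch. It skips step 1 and goes straight to the binary operation followed by exact coercion (steps 2 to 4), which gives the same answers for these cases. The names `augmented_assign`, `coerce_exact`, and `FailedExpression` are hypothetical, and Python's `Fraction` stands in for Flare's rational subtype.

```python
import operator
from fractions import Fraction


class FailedExpression(Exception):
    """Stand-in for a failed Flare expression (hypothetical)."""


def coerce_exact(value, target_type):
    """Coerce value to target_type, failing if any information would be lost."""
    try:
        coerced = target_type(value)
    except (TypeError, ValueError, OverflowError):
        raise FailedExpression("cannot coerce %r" % (value,))
    if coerced != value:
        raise FailedExpression("coercion of %r is not exact" % (value,))
    return coerced


def true_divide(a, b):
    """Model of `/`: integer operands give a rational, as in the examples above."""
    if isinstance(a, int) and isinstance(b, int):
        return Fraction(a, b)
    return a / b


OPS = {"+": operator.add, "*": operator.mul,
       "/": true_divide, "//": operator.floordiv}


def augmented_assign(left, op, right):
    """Do the binary operation, then coerce the result back to left's subtype."""
    try:
        result = OPS[op](left, right)
    except TypeError:
        raise FailedExpression("no handler for these operands")
    return coerce_exact(result, type(left))


print(augmented_assign(2, "+", 1.0))
print(augmented_assign(6, "/", 2))
print(augmented_assign(6.0, "//", 2))
```

Under this sketch `a = 2; a += "4"` fails, in line with the later decision noted in the examples.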

A possible alternate language design decision would be that augmented assignment can change the type or subtype of the item assigned to, as long as it still matches any static type declared by the container. In which case `a = 2; a += "4"` would yield "24".

3.2: Explicit priority declarations

Inside a FlareSpeak file, an explicit priority declaration (shouldn't be necessary all that often) would probably look like:

class foo
      priority: 250.0   # extra whitespace; usual FlareSpeak idiom
                        # for "attribute, not content".
    var someInstanceMethod(int arg)
        return arg + 3
    # ... usual class stuff

Note, incidentally, that in the case where at least one operator is overloaded, the Flare IDE should automatically write a FlareCode file where the class has a "priority" value of 200.0. On reading this file, the Flare IDE should notice that this is the default value for a class with at least one overloaded operator, and should not add a "priority: 200.0" statement in the generated FlareSpeak. If a non-default value such as 250.0 is encountered, the Flare IDE should show "priority: 250.0" in the generated FlareSpeak.

Because priority needs to be checked with every operation, priority must be processed into the uberdata - i.e., the C++ FMetadata object corresponding to the Flare metadata must have an FMetadata->priority instance member.

It is conceivable that, for reasons of speed, floating-point priorities will be limited to 4 decimal places of precision, enabling FMetadata->priority to be represented as a signed integer type (priority * 10,000). I think this will be a trivial speed boost, but if you're using 5 decimal places your code probably sucks anyway, so making this a program invariant / language constraint is probably reasonable. Priorities should probably also be limited to the range [-100,000.0000..+100,000.0000] to ensure that they fit in a 32-bit integer. (Note that this is stored on a C++ metadata structure, not a C++ representation of the Flare element, so conserving space is even less of an issue.)
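
A quick Python sketch of the scaled-integer encoding, to check that the proposed range really fits in 32 bits. The function name `encode_priority` and the rounding tolerance are assumptions.

```python
SCALE = 10_000                  # four decimal places
LIMIT = 100_000 * SCALE         # |priority| <= 100,000.0000
INT32_MAX = 2**31 - 1


def encode_priority(priority: float) -> int:
    """Encode a priority as a scaled integer, rejecting values the format cannot hold."""
    scaled = priority * SCALE
    fixed = round(scaled)
    # Tolerance absorbs binary floating-point noise; anything larger means
    # the priority had more than four decimal places.
    if abs(scaled - fixed) > 1e-6:
        raise ValueError("priority has more than four decimal places")
    if abs(fixed) > LIMIT:
        raise ValueError("priority out of range")
    return fixed


print(encode_priority(250.0))
print(LIMIT <= INT32_MAX)       # the largest encoded value fits in a signed 32-bit int
```

Comparing two encoded priorities is then a plain integer comparison, which is all the dispatch code needs.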

4: So how does the interpreter actually evaluate an operator?

The handling of an operator is delegated to the operand. In the event that a binary operator (1) is used, the first opportunity goes to the operand with the higher priority.

Priority is represented as a <priority> subelement on metadata objects, and is parsed into C++ uberdata as an FMetadata->priority instance member.

The handler for an operand is a function pointer located on the metadata, or accessible through the metadata. For built-in types, the work will be done within this C/C++ function.

Operator overloading within a class causes that class's metadata to have a new table of function pointers. In this new table, the function pointer for the overloaded operator is replaced by a standard C/C++ function which looks up the appropriate Flare method on the Flare object and invokes that specialized method - <op-add>, <op-right_add>, or <op-inplace_add>, and so on.
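
The table replacement might look something like the following Python sketch, where a dict stands in for the C/C++ table of function pointers and Python methods named `op_add` and `op_right_add` stand in for <op-add> and <op-right_add>. All names here are hypothetical.

```python
# Shared default table: no handler installed for any of these operators.
BUILTIN_TABLE = {"add": None, "right_add": None, "inplace_add": None}


def make_trampoline(method_name, owner_is_left):
    """Build the generic handler that looks up and invokes the overloading method."""
    def trampoline(left, right):
        owner, other = (left, right) if owner_is_left else (right, left)
        method = getattr(owner, method_name, None)
        return None if method is None else method(other)
    return trampoline


def table_for_class(cls):
    """Give cls its own table, replacing only the slots it overloads."""
    table = dict(BUILTIN_TABLE)          # copy; the shared default is untouched
    if hasattr(cls, "op_add"):
        table["add"] = make_trampoline("op_add", owner_is_left=True)
    if hasattr(cls, "op_right_add"):
        table["right_add"] = make_trampoline("op_right_add", owner_is_left=False)
    return table


class Vec:
    def __init__(self, x):
        self.x = x

    def op_add(self, other):             # overloads `vec + other`
        return Vec(self.x + other)

    def op_right_add(self, other):       # overloads `other + vec`
        return Vec(other + self.x)


table = table_for_class(Vec)
print(table["add"](Vec(1), 2).x)
print(table["right_add"](2, Vec(1)).x)
```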

In unoptimized pseudocode mostly inspired by looking at Python's source code:
(Does not include locking, temporary value reference counting, or any top-down context information deduced from looking at static typing.)
For FlareCode codon <plus>:
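FOperand* FC_Plus (FOperand* left, FOperand* right)
{
    SInt32 pleft, pright;
    pleft = left->meta->priority;
    pright = right->meta->priority;
    FOperand* result = null;
    FOperatorTable* ops;
    FBinopFunc handler;
    // The higher-priority operand gets first crack; ties go to the left.
    bool leftFirst = (pleft >= pright);
    if (leftFirst) {
        ops = left->meta->operators;
        handler = ops->binop_left_plus;
        if (handler) {
            // In our story, calls FOperator_Float_Plus
            result = (*handler)(OP_PLUS, left, right);
            if (result)
                return result;
        }
    }
    // fall through: give the right-hand operand its chance
    ops = right->meta->operators;
    handler = ops->binop_right_plus;
    if (handler) {
        result = (*handler)(OP_RIGHT_PLUS, left, right);
        if (result)
            return result;
    }
    if (!leftFirst) {
        // The right-hand operand had priority but declined;
        // the lower-priority left-hand operand is the fallback.
        handler = left->meta->operators->binop_left_plus;
        if (handler)
            return (*handler)(OP_PLUS, left, right);
    }
    return null;
}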
FOperand* FOperator_Float_Plus ( FOperatorDesc op,
                                 FOperand* inLeft,
                                 FOperand* inRight )
{
    /* Not necessary - addition is commutative (right?)
    if (op == OP_RIGHT_PLUS) {
        FOperand* swap = inLeft;
        inLeft = inRight;
        inRight = swap;
    }
    */
    FOperand *left, *right;
    // Cache these? Pass an argument that holds the resulting transparency if it's already been done? Cache them on the FOperand?
    left = inLeft->FLTransparent(null);   // No static type to match.
    right = inRight->FLTransparent(null);
    if (!FTestType_Float(left)) {
        FOperand* coerce = left->FLCoerceExact(FType_Float);
        if (!coerce) {
            return null;
        }
        left = coerce;
    }
    if (!FTestType_Float(right)) {
        FOperand* coerce = right->FLCoerceExact(FType_Float);
        if (!coerce) {
            return null; /* probably not this simple anymore, due to reference counting */
        }
        right = coerce;
    }
    // Lock left, right?
    float f_left, f_right;
    static_cast<FOperandFloat*>(left)->LGetFloatValue(&f_left);
    static_cast<FOperandFloat*>(right)->LGetFloatValue(&f_right);
    float f_result = f_left + f_right;
    FOperandFloat* result = FOperandFloat::CreateFromFloat(&f_result);
    return result;
} // end FOperator_Float_Plus

1: Python has a single trinary operator, the "power" function, which has an optional modulus argument. I think that sort of thing ought to be done in a math library, but I guess the moral is to write operand resolution algorithms that generalize easily.