Goals of Flare

©2001 by Singularity Institute for Artificial Intelligence, Inc.; All Rights Reserved.

Foreword:  Earth needs Flare

A new programming language has to be really good to survive.  A new language needs to represent a quantum leap just to be in the game.  Well, we're going to be up-front about this:  Flare is really good.  There are concepts in Flare that have never been seen before.  We expect to be able to solve problems in Flare that cannot realistically be solved in any other language.  We expect that people who learn to read Flare will think about programming differently and solve problems in new ways, even if they never write a single line of Flare.  We think annotative programming is the next step beyond object orientation, just as object orientation was the step beyond procedural programming, and procedural programming was the step beyond assembly language.

Flare was created under the auspices of the Singularity Institute for Artificial Intelligence, an organization created with the mission of building a computer program far before its time - a true Artificial Intelligence.  Flare, the programming language they asked for to help achieve that goal, is not that far out of time, but it's still a special language.  Flare is not particularly an Artificial Intelligence language, although there are certain properties (for example, ease of self-examination and self-modification) that are visibly suited.  What the Singularity Institute asked for was a programming language powerful enough to grow to match the job; a programming language that took a first step beyond the limitations of the present day.  That's Flare.

You can see it in the idea of a FlareSpeak IDE; an IDE that, even if the first versions show FlareCode as plaintext, is not bound to plaintext; where languages like Python and Perl represent the upper limit of what can be done with plain text, a plaintext FlareSpeak IDE is the beginning of what can be done to edit FlareCode.  You can see it in Parallelism By Default, which may have only a few uses on today's single-processor machines, but which can conceivably grow to make full use of massively multiparallel systems.  You can see it in the idea of building a programming language represented as XML, an extensible substrate.

There are features in Flare that are expected to be immediately and powerfully useful for the Singularity Institute and for more mundane programming tasks:  annotation, self-examination, two-way references, invariants, planes, causality, assimilation, and the other abilities built into Flare.  But there are also parts of Flare that are included specifically as the first step into a new design space of programming languages; the characteristics that make Flare a tool that can grow without breaking to meet deeper and deeper problems.  Flare doesn't have as much future shock as, say, the idea of a recursively self-improving Artificial Intelligence, but Flare isn't an entirely mundane language either.  There is a touch of future shock about Flare, and not just because Flare is a large immediate improvement, or because of Flare's intended use, but because of Flare's intended potential.

1: Introspection

Introspection is a higher standard of reflectivity or self-modification.  An introspective program should have as much information as possible about its own functioning, and should ideally be able to modify anything it can see.

This includes:

1.1: Reflection

A program should always be able to discover any piece of information known to the interpreter, and obtain any piece of information that the interpreter could obtain, unless there is a specific security reason otherwise.  Compilers lose information in the transition between source file and machine code, but an interpreter does not; thus, information known to the interpreter should include all the information visible to the programmer.

In C++, for example, and initially in Java, it was impossible to obtain a list of all the instance members of an object.  Java 1.1 later added the java.lang.reflect package to enable dynamic discovery and access of an object's fields (properties).  Interpreted languages, such as Python and Perl, usually get this right from the beginning.  (The best reflective language of all has always been LISP, because in LISP, both programs and program data take the form of LISP lists.)

Most of the reflectivity in Flare is pretty blatant, but the general principle of complete reflection still deserves to be noted as a language goal.  Yes, it should be possible to (a) get a complete list of the locally contained subelements of a Flare element, i.e. the properties of an object; it should also be possible to (b) get a complete list of all the subelements of a Flare element, including those contained on parents; (c) get the list of statically typed instance members; (d) determine whether or not a property is "clear" on a particular object; (e) determine whether something that looks like an object is a locally stored Flare element or a transparent reference to a distant Flare element... and so on.

If the interpreter maintains a list of two-way references, then a program should be able to get a list of all the references to a Flare element.  If the interpreter makes some action transparent for the programmer's convenience, it should still be possible to look past that transparency.  Planar annotations are usually hidden away from the list of object instance members, but it should still be possible to get a list of all Flare subelements within a Flare element, including planar elements.  Getting a list of all instance members, or checking to see whether a property exists on an object or its parents, or automatically following a transparent reference, are idioms of the application logic; getting a list of all Flare subelements, or checking to see whether a property is defined directly by a single object, or determining the direct content of an element without following transparent references, are idioms of reflectivity.

Reflection includes modifying reflective information as well as retrieving it.  Thus, if there's a way to access the list of an element's Flare subelements, there should be a way to manipulate that list - to add, or delete, dynamically determined elements.  The application-logic idiom is "prop = 'arms'; human.(prop) = 2".  The reflective version of that same statement might be, as in Python, "setattr(human, 'arms', 2)" - although "set_property" would be the term more likely used in Flare.  Such a reflective statement might also be carried out as a direct copy, rather than a content change - perhaps blowing away any planar data or annotations within the <arms> element and setting its content directly to the argument provided.  (Obviously this is a dangerous way of doing things, and not necessarily a fast one; destructors on locally stored objects will still need to be called, interceptions may be notified that they are being removed, and so on.)
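Since the paragraph above borrows Python's setattr, here is a minimal Python sketch of the same pair of idioms: retrieving reflective information and modifying it.  This is Python, not Flare; the Human class and its properties are made up for illustration, and "set_property" does not appear because no Flare interpreter is being assumed.

```python
class Human:
    """Plain object used only to demonstrate reflective access."""

    def __init__(self):
        self.legs = 2


human = Human()

# Set a property whose name is only known at runtime.
prop = "arms"
setattr(human, prop, 2)

# Reflective retrieval: the properties stored directly on this object.
local_properties = sorted(vars(human))

# Reflective modification: delete a dynamically determined property.
delattr(human, "legs")
remaining = sorted(vars(human))
```

Note that vars() only shows what is stored locally on the object, which loosely corresponds to case (a) in the list of reflective queries; it says nothing about properties contained on parents.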

Reflection is necessary because programmers very often wish to do something that requires reflection.  Our own ability to reflect causes us to naturally think of algorithms that require reflectivity.  If a programmer thinks "Now I want to do X with each property on this object", he should be able to represent that action directly as code - rather than needing to write a separate line of code for each property, in a separate function on each object, because he's writing in a nonreflective language like C++.

1.2: Self-modification

Where "reflection" traditionally applies to program objects and program data, self-modification extends reflectivity (awareness) into the realm of code, classes, and other program attributes.  LISP, of course, is the traditional king of self-modifying languages, because LISP uses the same representation for program code and program data.  Flare also uses the same representation for code and data, except that the common representation is extensible tree structures (XML) rather than lists.  The difference is a major one; in a list structure, an object's role is determined by where it is.  In Flare, an object's role is determined by its name and its metadata.  Extensible tree structures are thus less breakable; in LISP lists, meaning is often conveyed by position, and inserting new elements can change the position of other elements in a list.  Adding another Flare subelement, by contrast, does not at all change the behavior or appearance of the other elements.  Furthermore, Flare elements have annotatable metadata.  The data that a particular subfunction is an "invariant" is explicitly represented in the program.  The preconditions of a method can be easily found by writing "method.precondition".  In most languages, the meaning and structure of code is stored implicitly, rather than explicitly; procedurally, rather than declaratively; it is knowledge possessed by the programmer.  By having standard names for a number of programming constructs, Flare makes certain common structures of code explicit rather than implicit.

Most of the real information about the meaning of code is still implicit, of course, and will never be truly explicit except to AIs, but again the annotative paradigm still makes it easier to express further information about code; if you really need to attach a planar annotation to a statement or code block, within a program that needs to examine its own code, you can easily do so without disrupting the integrity of the code.  The program itself will find it equally easy to make planar annotations to code and can thus gradually build up knowledge about the functioning of a particular system element.

Self-modification has three uses:  Code that writes code, code that understands code, and code that modifies code.

Code that writes code is traditionally very hard to understand, although it's worth considering that this tradition was formulated back in the days when it referred to self-modifying machine code.  I find that decoding LISP code which writes LISP is a lot easier on my brain than decoding C++ macros, and I have substantial experience in C++ but none whatsoever in LISP.  While I'm sure it is possible to abuse self-modifying code, there are also some programming problems to which code that writes code is the most natural solution, and it will always be easier to understand and maintain that natural solution than to decode and debug a hundred slightly different fragments of non-automatically-generated code.

Code that understands code has not been so widely used, probably due to the lack of Flare.  However, rather than trying to write a regular expression that searches plaintext source (C++, Python, FlareSpeak) for a line with a particular semantic meaning, it is probably better to write a few lines of Flare that examine FlareCode to find, for example, all the cases where someone adds three to a number, multiplies it by nine, and stores it in a property ending in "_account".  I have also found, on occasion, that I want to determine at runtime, e.g., how many arguments a method accepts (looking at data stored in a method is around halfway between reflection and self-modification).
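As a rough illustration of what "code that understands code" looks like when the code is available as a tree rather than as text, here is a sketch using Python's standard ast and inspect modules.  It is only an analogy for examining FlareCode; the sample source, the function names, and the transfer function are all invented for the example, and the matcher only recognizes the exact shape "(something + 3) * 9".

```python
import ast
import inspect

# Made-up source to search; only the first assignment should match.
SOURCE = """
ledger.savings_account = (balance + 3) * 9
ledger.total = (balance + 3) * 9
ledger.checking_account = balance * 9 + 3
"""


def is_const(node, value):
    """True if node is a literal constant equal to value."""
    return isinstance(node, ast.Constant) and node.value == value


def find_matches(source):
    """Return line numbers of `<obj>.<name>_account = (<expr> + 3) * 9`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign) or len(node.targets) != 1:
            continue
        target, value = node.targets[0], node.value
        # The value must be stored in a property ending in "_account".
        if not (isinstance(target, ast.Attribute)
                and target.attr.endswith("_account")):
            continue
        # The outermost operation must be "multiply by nine".
        if not (isinstance(value, ast.BinOp)
                and isinstance(value.op, ast.Mult)
                and is_const(value.right, 9)):
            continue
        # The thing being multiplied must be "something plus three".
        inner = value.left
        if (isinstance(inner, ast.BinOp)
                and isinstance(inner.op, ast.Add)
                and is_const(inner.right, 3)):
            hits.append(node.lineno)
    return hits


def transfer(source, destination, amount):
    """Placeholder method whose argument count is inspected at runtime."""


matching_lines = find_matches(SOURCE)
arg_count = len(inspect.signature(transfer).parameters)
```

A regular expression would have to cope with whitespace, parentheses, and comments; the tree walk ignores all of those because they are not part of the structure.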

Code that modifies code, with the intention of testing random mutations against a metric or making guided improvements, is strictly an AI task.  It is, however, part of what the Singularity Institute is asking for from Flare.  Nine-tenths, maybe ninety-seven hundredths, of self-improving AI is cognitive science and ultimately has very little to do with language issues.  The Flare side of it, however, means that the AI researchers don't have to become compiler and interpreter specialists before getting started - language issues play an especially large role in the beginning of AI, when the AI doesn't yet have that small but substantial degree of intelligence needed to pull stunts like decoding assembly language, mutating the high-level visualization, checking the visualization for safety, and translating the visualization back into assembly language.  It's much easier to get started if you can declare an invariant that a function has no side effects, determine what FlareCode does by examining helpful annotations on the FlareCode elements' metadata, modify the FlareCode directly, catch any invariant-violating exceptions, and benchmark the result against an annotation containing a test suite.

Ideally, Flare programs should be modifiable at runtime, especially for debugging purposes.  In practice, and especially in the initial implementation, there may be cases where the problem of change propagation would defeat the imagination and runtime modification is therefore impossible.  However, it should still be possible, wherever feasible, for Flare programs to modify Flare programs, or for programmers to modify running Flare programs so as to debug them in realtime.

While violating invariants is usually an error rather than application logic, it will probably be very useful for AI work if there is a way to try running a piece of code, seeing if it violates any invariants, making sure there are no side effects outside of what's been declared, and then benchmarking the code against the %test.benchmark definition.

1.3: Self-examination

If reflection is introspection on program objects, and self-modification is introspection on program code, then self-examination is introspection on program state.  In a truly ideal, gemlike Flare implementation, the interpreter's representation of the program state would also be an extensible tree structure representable as XML.  This degree of perfection, however, is not likely to be necessary, even for the Singularity Institute's purposes.

Introspection on program state includes such operations as reflectively determining the current value of the local variables in an invoked method, getting a snapshot of the current stack (well, tree), or even one thread watching another, slower thread as a Flare expression evaluates in slow motion, taking snapshots of intermediate values produced - a privilege usually reserved for debuggers.  It is questionable in the extreme whether one Flare thread should be able to modify another in this way, since I can think of very few legitimate uses for such an action.  However, read-only self-examination of Flare programs has two uses; first, it permits programmers to perform very powerful debugging actions, and second, it again lowers the bar on getting an AI to take the first baby steps toward understanding code.  Total self-examination may not be present in the first interpreter written, but it's something to think about in language design.

Threadwatching means that the ordinary interpreter has the kind of capabilities that are often reserved for debuggers.  With threadwatching it is very easy - just a matter of UI, even - to write a Flare debugger in Flare.

Interceptions are an example of a much more mundane form of self-examination; they permit Flare programs to watch the access or alteration of a specific property (or other Flare element).  Preconditions and postconditions and most other invariants are also, conceptually, introspection of the "self-examination" category.
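The idea of watching the alteration of a specific property can be approximated in Python by overriding __setattr__.  The sketch below is an analogy only; the Watched class and its intercept method are hypothetical names, and nothing here is meant to describe how Flare interceptions are actually declared or dispatched.

```python
class Watched:
    """Object that notifies registered watchers when a property changes."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the bookkeeping dict.
        object.__setattr__(self, "_watchers", {})

    def intercept(self, name, callback):
        """Register callback(name, old, new) for changes to one property."""
        self._watchers.setdefault(name, []).append(callback)

    def __setattr__(self, name, value):
        # old is None if the property did not exist before this assignment.
        old = getattr(self, name, None)
        object.__setattr__(self, name, value)
        for callback in self._watchers.get(name, []):
            callback(name, old, value)


log = []
account = Watched()
account.intercept("balance", lambda name, old, new: log.append((name, old, new)))

account.balance = 10
account.balance = 25
account.owner = "unwatched"  # no interception registered, so nothing is logged
```

The code that assigns to account.balance does not need to know that anyone is watching, which is the point of the idiom.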

1.4: Purpose

Introspection:

2: Naturalness

Programming is the art of translating mental recipes into a form that a computer can understand.  A higher-level language is one in which the written instructions are closer to the original form of the mental recipe.  Object orientation, for example, is an improvement over procedural code because we naturally tend to think in terms of objects which exhibit behaviors.  Object-oriented languages are thus more natural than procedural languages; they correspond more closely to the mental recipes we form when we think about problems.  Of course, not everything in our mental recipes consists of object behaviors; some of the things we think about are procedures that exist apart from any object.  Python, which enables programmers to write procedures outside a class, is thus more natural to code in than Java, which insists that every procedure be placed inside an object or class.  Python also feels distinctly more natural than Java or C++, at least on certain occasions, because in Python it is possible to declare a variable that can be bound to an object, number, string, list, et cetera, which enables programmers to directly write down those mental recipes that require "Something that can be a number or object" to be passed to a function.

There are two ways in which a programming language can increase naturalness.  The first is negative; programming languages often get in the programmer's way by imposing arbitrary rules.  The second is positive; a programming language can add complexity so as to take things off the programmer's mind.  The border between the two is often hazy.  C++ appears to impose the arbitrary requirement that a variable have a single static type, and Python appears to more generously permit variables to be numbers or strings or objects, but this is actually a positive feature; Python adds complexity and loses efficiency in order to provide a feature that is almost never present in compiled languages.

Basic to the philosophy of Flare is a sense of freedom and explicit formalization - that if you have a mental representation of an algorithm, you should ideally be able to write it down directly, as code.  The philosophy of Flare is not that we will prevent you from writing bad code, but rather that you should be able to write absolutely any piece of code your heart desires, and we'll trust to your innate goodness of heart to keep you safe.  If not, then we'll provide invariants to keep the objects safe, and program invariants to keep the code clean, and maybe standard limited subsets of Flare to discourage people from using the more powerful and dangerous features unless there's a good reason.  But the dangerous features will still be there in the full-scale 100%-implementation interpreters.  If Perl's motto is "There's more than one way to do it," Flare's mottoes are "There's at least one way to do it" and "There's a simple way to change it."  (More precisely, "A single thing can be changed with a single action.")

But Flare doesn't have a "goto" statement, because that would be going too far.

The point of Naturalness is that it is a legitimate reason to say that language feature X is included "Because that's the way people think about the problem."  Letting programmers do what they want to is a goal.  Saving programmer keystrokes is a goal.  Having a single action in the programmer's visualization be a single action in the code is a goal.  This is a reason why Flare often uses the idiom of transparency, of things happening automatically - references, for example, being followed automatically in most expressions.

See also "Substrate".

3: Innocence

Innocence is a higher octave of modularity.  Modularity occurs when two pieces of code interact without needing to know about each other's internals.  Innocence occurs when two pieces of code interact without needing to know about each other at all.

Modularity is about controlling dependencies - making sure that one piece of code doesn't access another's internals or become dependent on details of internal operation; it's about defining APIs, a surface membrane through which two pieces of code can interact, so that each can change freely as long as the membrane is not affected.  Innocence is about controlling knowledge - making sure that the programmer doesn't need to know certain things in order to write working code.

The Attachment-Attachable pattern is modular; the Listener-Broadcaster pattern is modular; the Causality pattern is innocent.  The Attachment pattern is modular because Attachables don't need to know who attaches to them, but Attachments need to know what they're attached to; Broadcasters don't need to know who's Listening, but Listeners need to know who's Broadcasting.  Listeners must explicitly attach themselves to a particular Broadcaster.  Attachments must explicitly attach themselves to Attachables; must know who the Attachables are and what they are doing.  The Causality pattern defines a way for two objects to hook up without either one being aware of the other.  Situations promote Innocence by providing a channel of communication that has a standard meaning that works regardless of who's on the other end and does not involve innocent bystanders.  Leaving the bystanders out of it doesn't just mean that the bystanders don't meddle with the internals, but that the programmer who codes the "bystander" doesn't need to know anything about what the other modules are doing.  The function at the top of the stack doesn't know who's looking at the Situation on the bottom; the function on the bottom may not know who modified the Situation at the top; any intervening functions don't need to know anything about the topic or pass any extra arguments along.  In the Causality pattern, the computing function doesn't know how it's computed, and any other functions called don't need to know who called them.
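This document does not spell out how Situations work, so the following is only a loose analogy: Python's contextvars module gives a dynamically scoped value that one function can set and a deeply nested function can read, with nothing passed through the functions in between.  The names current_locale, serve_request, build_page, and format_greeting are all invented for the sketch.

```python
import contextvars

# Hypothetical stand-in for a "Situation": a named, dynamically scoped value.
current_locale = contextvars.ContextVar("current_locale", default="en")

GREETINGS = {"en": "hello", "fr": "bonjour"}


def format_greeting():
    # Innermost function: reads the value without knowing who set it.
    return GREETINGS[current_locale.get()]


def build_page():
    # Innocent bystander: takes no locale argument and passes none along.
    return "<p>" + format_greeting() + "</p>"


def serve_request(locale):
    # Outermost function: sets the value without knowing who will read it.
    token = current_locale.set(locale)
    try:
        return build_page()
    finally:
        # Restore the previous value so later callers are unaffected.
        current_locale.reset(token)


french_page = serve_request("fr")
default_page = build_page()
```

The interesting function is build_page: it could have been written before locales existed, and it would not need to change when they were added.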

Modularity is a truly powerful idea that has literally dozens of benefits, from forcing good code, to improved reusability, to being able to fix something in one place without breaking things elsewhere, and so on.  The idea behind modularity is that a package should contain, in one place, everything which that package needs, and should not unnecessarily expose these internals to the eyes of other packages.  Innocence raises modularity to an even higher pitch by saying that the programmer should never need to think about those things which it is not natural to think about in a particular place.  The general principle of innocence is that if a piece of knowledge does not leap readily and naturally to mind in a particular place, then the programmer should not need to know about it.

This principle is visible in the way that Flare mixes static and dynamic typing.  From Idioms:  "In dynamic typing languages, argument types must be either remembered or recorded in comments, and either way, the interpreter doesn't know about it and won't catch any mistakes until the code runs.  On the other hand, I like the convenience of not needing to declare my variables inside a single procedure, where all the relevant information exists in one place and there is no need to remember the variable's class outside the procedure."

Invariants help to make innocence possible by making innocence safer to attempt; if you unwittingly transgress a prohibition you were a little too innocent of, it will cause an explicit error rather than an invisible cancer.  At a higher level, semantic invariants may be examined by adaptive code to determine what is or isn't safe to attempt - but that would be a very high-level trick.

Planar annotations make sure that the different planes can be innocent of each other.  Where namespace collisions are a possibility, annotations need to be aware of each other; you can't make an arbitrary annotation to an arbitrary object without worrying about whether you'll smash something already present.  Encapsulating the annotations in a plane means that the maker of an annotation and the user of an annotation can communicate even if the maker is innocent of the user, the user is innocent of the maker, the annotated object is entirely innocent of both actors and was written by a programmer who didn't know the plane being used would exist, and the plane was defined by a programmer who knew nothing about the object being annotated.  That is innocence.
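The namespace argument can be made concrete with a small Python sketch in which annotations are stored per plane, so that two annotators using the same key cannot smash each other.  The Element class, its methods, and the plane names are hypothetical; this shows only the collision-avoidance idea, not Flare's actual planar syntax or semantics.

```python
class Element:
    """Object that carries annotations grouped by plane."""

    def __init__(self, name):
        self.name = name
        self._planes = {}  # plane name -> {annotation key -> value}

    def annotate(self, plane, key, value):
        """Store an annotation inside one plane's private namespace."""
        self._planes.setdefault(plane, {})[key] = value

    def annotation(self, plane, key, default=None):
        """Read an annotation from one plane, or default if absent."""
        return self._planes.get(plane, {}).get(key, default)


arms = Element("arms")

# Two independent annotators happen to choose the same key.
arms.annotate("debug", "note", "set by the loader")
arms.annotate("persistence", "note", "do not save")

debug_note = arms.annotation("debug", "note")
persistence_note = arms.annotation("persistence", "note")
missing_note = arms.annotation("ui", "note")
```

Neither annotator had to know the other existed, and the Element class knows about neither plane in advance.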

Most of all, innocence is visible in the principle of annotation combined with the principle of reflectivity and explicit formalization.  Annotations - providing information about a function, object, class, module - can act as a channel of communication that has a standard meaning and that works regardless of who's on the other end.  There are many libraries written by programmers who didn't know the end users, but libraries usually also demand that the user understand the library.  The ideal of Flare is that you can take twenty different modules, written by twenty different programmers who have never heard of each other, drop them into a folder, and that's your application.  Even in Flare this will happen rarely if ever, but in Flare, it is at least possible.  As long as standardized annotations conform to the standard, and nonstandardized annotations have enough points of intersection that linkup can still occur, well-written innocent modules can self-organize.

When code needs to be adapted to the programmer's implicit knowledge about internals, the code becomes specialized, the knowledge becomes procedural rather than declarative, and innocence or even modularity are broken.  When the implicit knowledge is explicitly formalized as an annotation, the code can explicitly match up with that explicit description, and the two actors can become innocent of each other, as long as both use the same annotations.

(Discuss:  Python's doSomeFunction(foo, bar=2, baz=1) as a way of being insensitive to function argument order... raise to still higher levels by providing multiple keywords and other info, then let standard matchmaker utility handle the rest.)
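As a starting point for that discussion, here is a Python sketch of the two levels mentioned: plain keyword arguments, which make a call insensitive to argument order, and a toy "matchmaker" that inspects a function's parameter names and supplies whatever matching values are available.  The helper call_with_matching is a hypothetical name invented here, not an existing utility, and it matches on a single keyword per parameter rather than the richer information the note envisions.

```python
import inspect


def do_some_function(foo, bar=2, baz=1):
    """Example callee; returns its arguments so the binding is visible."""
    return (foo, bar, baz)


def call_with_matching(func, available):
    """Call func with the entries of `available` that it names as parameters."""
    wanted = inspect.signature(func).parameters
    matched = {name: available[name] for name in wanted if name in available}
    return func(**matched)


# Level one: keyword arguments make the call independent of argument order.
same_result = (do_some_function("x", bar=5, baz=7)
               == do_some_function("x", baz=7, bar=5))

# Level two: the caller offers what it has; the matchmaker picks what fits.
offered = {"baz": 7, "foo": "x", "unused": 0}
matched_result = call_with_matching(do_some_function, offered)
```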

The goal of innocence is increased productivity.  Object-oriented languages support modularity.  Annotative languages support innocence.  So much has been written about the benefits of modularity - code reuse, easier programming, less fragile programs, increased productivity, more natural idioms, and so on - that it seems only necessary to note that innocence is modularity tuned to a still higher pitch, and should enable even greater levels of reuse and productivity.

Innocence may also help in subtle ways to make it easier for a relatively unsmart AI to get started on making small improvements to more easily modifiable code, but this is not as certain as it might be.  Mostly it's a "next step beyond object orientation" thing.

4: Scalability

Ah, the good old days!  When we had to hand-weave our ones and zeroes from straw and chip our slide rules out of flint!  We used to send packets by carrier pigeon until the pigeons died of the plague, and then we had to invent fire before we could use smoke signals.  Our whole village only had one bit of RAM and it weighed thirty pounds, and Grandpa died waiting for it to boot up.

Back in the good old days, it may have made sense to write "efficient" programming languages.  This, however, is a new age.  The age of microwave ovens and instant coffee.  The age of six-month-old companies, twenty-two-year-old CEOs and Moore's Law.  The age of fiber optics.  The age of speed.

Serial speed is always expensive.  A CPU with a clock speed twice as high always costs considerably more than twice as much.  So why not buy two CPUs?  It's a rare computational task that requires two billion operations to be performed sequentially, one after the other.  Much more likely is that you'll need to perform a hundred tasks each requiring twenty million operations.  Except that modern handling of parallelism is still in a primitive state, computers with more than a single CPU are still rare and expensive, and so we go on buying faster and faster CPUs, instead of buying more and more CPUs.

"Efficiency" is the property that determines how much hardware you need, and "scalability" is the property that determines whether you can throw more hardware resources at the problem.  In extreme cases, lack of scalability may defeat some problems entirely; for example, any program built around 32-bit pointers may not be able to scale at all past 4GB of memory space.  Such a lack of scalability forces programmer efforts to be spent on efficiency - on doing more and more with the mere 4GB of memory available.  Had the hardware and software been scalable, however, more RAM could have been bought; this is not necessarily cheap but it is usually cheaper than buying another programmer.

Scalability also determines how well a program or a language ages with time.  Imposing a hard limit of 640K on memory or 4GB on disk drives may not seem absurd when the decision is made, but the inexorable progress of Moore's Law and its corollaries inevitably bumps up against such limits.

Flare is a language built around the philosophy that it is acceptable to sacrifice efficiency in favor of scalability.  What is important is not squeezing every last scrap of performance out of current hardware, but rather preserving the ability to throw hardware at the problem.  As long as scalability is preserved, it is also acceptable for Flare to do complex, MIPS-sucking things in order to make things easier for the programmer.  In the dawn days of computing, most computing tasks ran up against the limit of available hardware, and so it was necessary to spend a lot of time on optimizing efficiency just to make computing a bearable experience.  Today, most simple programs will run pretty quickly (instantly, from the user's perspective), whether written in a fast language or a slow language.  If a program is slow, the limiting factor is likely to be memory bandwidth, disk access, or Internet operations, rather than RAM usage or CPU load.

As computing power increases, all constant-factor inefficiencies ("uses twice as much RAM", "takes three times as many RISC operations") tend to be ground under the heel of Moore's Law, leaving superlinear polynomial and exponentially increasing costs as the sole legitimate areas of concern.  Flare, then, is willing to accept any O(C) inefficiency (a single, one-time cost that does not grow with N, conventionally written O(1)), and is willing to accept most O(N) inefficiencies (constant-factor costs on work that already grows with N), because neither of these costs impacts scalability; Flare programs and program spaces can grow without such costs increasing in relative significance.  You can throw hardware at an O(N) problem as N increases; throwing hardware at an O(N**2) problem rapidly becomes prohibitively expensive.
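The arithmetic behind "throwing hardware at the problem" can be sketched in a few lines of Python.  The function name is made up, and the model is deliberately idealized: it assumes the work parallelizes perfectly, so that the hardware needed to hold run time constant is simply proportional to the total cost.

```python
def hardware_multiplier(growth, exponent):
    """Factor by which hardware must grow to keep run time constant.

    growth:   factor by which the problem size N increases
    exponent: k, for a cost that scales as N**k
    Assumes perfectly parallelizable work (an idealization).
    """
    return growth ** exponent


# Problem size grows a thousandfold.
linear = hardware_multiplier(1000, 1)     # cost scales as N
quadratic = hardware_multiplier(1000, 2)  # cost scales as N**2
```

Under these assumptions a thousandfold larger O(N) problem needs a thousand times the hardware, while the O(N**2) version needs a million times as much, which is the sense in which it "rapidly becomes prohibitively expensive".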

Computationally expensive versions of Flare are likely to initially be useful for two sets of problems.  The first set of problems are those where the computations are not all that computationally expensive relative to existing hardware, even though the processes may (or may not) be conceptually tortuous.  (This is a class of problems that expands as time goes on and hardware improves.)  The second class of problems are those too large to be solved on a single serial machine, where Flare's support for parallelism and distribution enables hardware to be thrown at the problem, and the cost of hardware - how much hardware needs to be bought - is less important than having the processes work cleanly.  Obviously, the second class of problem requires a much more polished version of Flare.  On the other hand, today's machines are powerful enough that most problems go by pretty quickly and there is spare computing power to burn.  A production-quality language still needs to be polished but that polish does not need to take the form of speed; speed is rarely the limiting resource.  Small projects usually don't need it, and large projects can usually afford it if it's technically possible to throw parallel power at the problem.

Scalability often comes at a cost in efficiency.  Writing a program that can be parallelized traditionally comes at a cost in memory barrier instructions and acquisition of synchronization locks.  For small N, O(N) or O(N**2) solutions are sometimes faster than the scalable O(C) or O(N) solutions.  A two-way linked list allows for constant-time insertion or deletion, but at a cost in RAM, and at the cost of making the list more awkward (O(N) instead of O(C) or O(log N)) for other operations such as indexed lookup.  Tracking Flare's two-way references through a two-way linked list maintained on the target burns RAM to maintain the scalability of adding or deleting a reference.  Where only ten references exist, an ordinary vector type would be less complicated and just as fast, or faster.  Using a two-way linked list adds complication and takes some additional computing power in the smallest case, and buys back the theoretical capability to scale to thousands or millions of references pointing at a single target... though perhaps for such an extreme case, further complication might be necessary.  (This being Flare, of course, you can write an annotation that tells the interpreter in advance that a referent may have thousands or millions of references, and the interpreter can choose a more efficient representation, in advance or on the fly.  Or not.  Flare is a language that, being extensible, begs to be complicated; but complexity has to be kept unentangled, and optionally invisible if you don't want to see it.)
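Here is a minimal Python sketch of the tradeoff just described: each reference is itself a node in a two-way linked list kept on the target, so adding or removing a reference costs the same whether the target has ten referrers or ten million.  The class and attribute names are illustrative assumptions, not Flare's actual internal representation.

```python
class Target:
    """Object that tracks every reference pointing at it."""

    def __init__(self):
        self.first_ref = None  # head of the two-way linked list

    def referrers(self):
        """Yield the holder of each reference, newest first."""
        node = self.first_ref
        while node is not None:
            yield node.holder
            node = node.next


class Reference:
    """A reference that is also a node in its target's list of referrers."""

    def __init__(self, holder, target):
        self.holder = holder
        self.target = target
        # Constant-time insertion at the head of the target's list.
        self.prev = None
        self.next = target.first_ref
        if target.first_ref is not None:
            target.first_ref.prev = self
        target.first_ref = self

    def detach(self):
        """Unlink in constant time, however many references exist."""
        if self.prev is not None:
            self.prev.next = self.next
        else:
            self.target.first_ref = self.next
        if self.next is not None:
            self.next.prev = self.prev
        self.prev = self.next = None


shared = Target()
a = Reference("a", shared)
b = Reference("b", shared)
c = Reference("c", shared)

b.detach()
after_first_detach = list(shared.referrers())
c.detach()
after_second_detach = list(shared.referrers())
```

The cost is visible too: every reference carries two extra pointers, and finding "the fifth referrer" means walking the list.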

Scalability is thus an important special case of Substrate (below), since it expends computing resources, not just to buy programmer comfort, but to buy the ability to add computing resources.

In addition to Flare features for symmetric multiprocessing, and readiness for 64-bit addressing, we should also consider adding support for automatic distribution over Beowulf clusters and networks of workstations.  The key language feature in this case would not be locking and synchronization, but rather distribution over Nodes, where all communication between Nodes comes with a latency and a cost.  The goal would be to (a) make communication between Nodes transparent, (b) preserve the Flare language logic over distribution, and (c) minimize communication between Nodes.  Automatic or programmer-assisted distribution is something that's still being thought over, though.

The way to preserve scalability is to make sure that all atomic actions have O(C) cost if at all possible, or O(N) cost only if it's an atomic action that does N things.  Sometimes O(N) or O(N**2) operations may be unavoidable even in the best custom-coded actions, but where it is possible to avoid breaking scalability, it is worth expending computational resources or programmer effort to do so.  Classes use "cached" parenting, with associated complexities for change propagation, to avoid imposing an O(N) cost (where N is the depth of the class hierarchy) on the atomic action of looking up a derived method.  Scalability is also preserved by carefully avoiding the imposition of arbitrary limits (only 32-bit addressable RAM, only 65536 classes, and so on).
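One way to picture "cached" parenting is the following sketch, written in Python rather than Flare.  It is an assumption about how such a cache might work, not a description of the Flare interpreter; FlareClass, define, and lookup are hypothetical names.  The first lookup of a method walks up the hierarchy once and caches the result on the class; later lookups are a single dictionary access.  The cost moves to method definition, which must propagate an invalidation to every descendant.

```python
class FlareClass:
    """Hypothetical class object with cached method resolution."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.own_methods = {}   # methods defined directly on this class
        self._cache = {}        # resolved methods, including inherited ones
        if parent is not None:
            parent.children.append(self)

    def define(self, method_name, fn):
        # Defining a method is the rare action, so it carries the
        # change-propagation cost: every descendant drops its cached entry.
        self.own_methods[method_name] = fn
        stack = [self]
        while stack:
            cls = stack.pop()
            cls._cache.pop(method_name, None)
            stack.extend(cls.children)

    def lookup(self, method_name):
        # The common action: one dictionary access once the cache is warm.
        if method_name in self._cache:
            return self._cache[method_name]
        cls = self
        while cls is not None:          # cold path, O(depth), taken once
            if method_name in cls.own_methods:
                fn = cls.own_methods[method_name]
                self._cache[method_name] = fn
                return fn
            cls = cls.parent
        raise AttributeError(method_name)


base = FlareClass("Base")
middle = FlareClass("Middle", base)
leaf = FlareClass("Leaf", middle)

base.define("greet", lambda: "hello from Base")
first = leaf.lookup("greet")()          # walks Leaf -> Middle -> Base once

middle.define("greet", lambda: "hello from Middle")
second = leaf.lookup("greet")()         # stale cache entry was invalidated
```

The sketch invalidates every descendant unconditionally; a more careful version could stop at descendants that override the method themselves.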

The goal of scalability is to make Flare a language that can survive a couple of years into the future, do our part to chip away at the absurdity of still having single-CPU systems, teach programmers to think parallel as part of the coming phase transition to ever-expanding numbers of CPUs as well as ever-shrinking transistors, and ensure that the Singularity Institute can throw hardware (once they need it, and can afford it) at the very computationally intensive problem of Artificial Intelligence.  (Of course, there's a cogsci aspect to distribution too - language support is necessary but not sufficient.)

5: Substrate

Big, fast computers breed fat, lazy programmers.  And this is good!  It's one of the rare forces for goodness in an otherwise hostile universe.  Do you think that being a fat and lazy programmer comes naturally?  Not at all.  It requires discipline, experience, and hard work to stop optimizing your programs.  It requires experience to know what you can buy with vast quantities of computing power, and to know how to buy it.  It takes design effort and coding effort and creativity in the short term to properly expend computing power so as to make programming easier in the long term.  Having a big, fast computer is like moving out of your parents' house, getting a job, and suddenly earning and spending much greater quantities of money - you need to learn how to effectively spend cycles, and the first step is realizing that you're allowed to spend cycles.  You must exercise willpower to prohibit yourself from optimizing something that looks easily optimizable, but causes no discernible speed hit to begin with.  I'm a relatively modern generation of programmer, but I still grew up programming on machines where the first version I wrote would run slowly enough to cause a perceptible delay.  I grew up optimizing my code.  Later in life, when I was programming in C++ on a 120MHz PPC 604, I had to unlearn the reflexes that I'd learned by writing code for overloaded thousand-user MOOs and (even earlier) writing HyperTalk on the family Mac Plus.  I had to learn about code reuse.  I had to learn that it was better to start out using a standard version and specialize it only when necessary.  I had to unlearn the tendency to write my own just-right container classes.  I had to learn, and this was the hardest lesson of all, that sometimes the first version is fast enough even if it's visibly inefficient.  It takes work to use up all the resources on a modern-day machine.

The most complex piece of code I ever wrote in my life, when it was completed, was something I had no excuse to optimize - because, from the user's perspective, it ran instantly.  In fact, most of the C++ code I've written in my life has run instantly.  Now that was not just due to the machine, of course, but due to the fact that I had used the Causality design pattern to tightly manage change propagation and ensure that most actions propagated in real time.  But, having used that correct design, I had no excuse to optimize the code itself.  The code was complex, but it responded instantly from the user's perspective; that was the end of it.  In retrospect, I should have written less efficient code.  Maybe I should even have used a simpler and less efficient architecture, instead of expending complexity to manage change propagation and update in real time.  I think, if I'd realized how much computing power was available, I could have saved myself a couple of months.

It is, of course, still important that you be a renegade perfectionist, a reluctant non-optimizer, rather than having a naturally sloppy and uncaring attitude.  There is a difference between spending computing power and wasting computing power.  If you use a design that is as pure and elegant as sunlight reflecting from a pool of clear water then there is often no real point to optimizing the code.  Modern megaprojects, on the other hand, are famed for producing hundreds of megabytes of code that manage to run slowly even on modern-day machines, which of course remains pure evil.

But let's say that you're a reformed optimizer and that you've learned that it is sometimes acceptable to spend computing resources.  What can you buy with them?

XML itself, of course, is perhaps the best example - not the essential tree structure, but the text-based format, brackets and all.  In the good old days, disk space was a hard constraint that users kept running up against; small files were an important convenience and optimized binary storage was worthwhile.  Time went on, disk sizes grew into the multiple gigabytes, and all files except multimedia (and large applications) turned into unnoticeable costs.  Increases in processing power meant that the one-time cost of reading and parsing a file also began to diminish into insignificance.  A human user can only generate so much data.  I once wrote an object persistence mechanism in C++, before the general dawn of XML; if I'd known about the paradigm of human readability, if I'd known about XML, I think it'd have cut my development time by two thirds, vastly increased reusability, and introduced no user-noticeable delays.  XML buys reusability, standardization, decreased development time through increased human readability of the persistent representation, ability to act and be acted on by a large and ever-increasing set of libraries... and so on.  XML is one of the best possible examples of a good way to spend disk space and a few CPU cycles.
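A small sketch of that purchase, in Python and using only the standard library (the record layout and the function names to_binary, to_xml, and from_xml are made up for illustration): the same record is stored once as packed binary and once as XML.  The XML form is several times larger, which is the disk space being spent; in exchange it can be read by a person, inspected in any text editor, and parsed by any XML library.

```python
import struct
import xml.etree.ElementTree as ET


def to_binary(name, x, y):
    # Compact but opaque: a length-prefixed UTF-8 name and two doubles.
    encoded = name.encode("utf-8")
    return struct.pack("<H", len(encoded)) + encoded + struct.pack("<dd", x, y)


def to_xml(name, x, y):
    # Verbose but self-describing and human-readable.
    elem = ET.Element("point", {"name": name})
    ET.SubElement(elem, "x").text = repr(x)
    ET.SubElement(elem, "y").text = repr(y)
    return ET.tostring(elem, encoding="utf-8")


def from_xml(data):
    elem = ET.fromstring(data)
    return elem.get("name"), float(elem.findtext("x")), float(elem.findtext("y"))


binary_form = to_binary("origin", 1.5, -2.0)
xml_form = to_xml("origin", 1.5, -2.0)

print(len(binary_form), "bytes as packed binary")
print(len(xml_form), "bytes as XML")
print(xml_form.decode("utf-8"))
```

For data that a human user generates by hand, the size difference and the parsing time are both far below anything the user would notice.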

Flare is a substrate language.  That means that it is acceptable to spend computing resources, as long as it's a one-time cost, or a constant-factor increase that doesn't break scalability, if what is being bought makes the programmer's life (or the user's life) a little easier.  Flare is not an optimized language any more than XML is an optimized representation; Flare is a language adapted to the computing resources of today and tomorrow, where the vast majority of tasks either run swiftly on a single machine or require scaling across a Beowulf network.  It may be worth optimizing a given interpreter, since the effort spent in optimization will be paid back to every program that runs on that interpreter, but Flare itself is not an optimized language.  Flare is powerful.  Flare is safe.  Flare manages whole classes of problems for you, as long as they can be managed using once-off O(C) or scalable O(N) processes that don't break the ability of a single action to have a single effect in realtime.  Flare does not try to be locally efficient in its processing of a single atomic operation (although Flare does try to be scalable in its handling of atomic operations); Flare rather tries to be a language whose atomic operations can easily support globally efficient designs and elegant architectures.

Flare is a substrate language that expends modern-day computing resources to transparently manage problems, as long as that doesn't break scalability, distort program architectures, or change the nature of actions.