Toward what end do interfaces evolve? What are the trade-offs in the problem-solving process that drives their evolution? And what can we anticipate in future interfaces based on these patterns of evolutionary exchange?

A famous image from Steve Jobs’ first public presentation of the iPhone, months before its launch:

Because it is an Apple event, Jobs notes only those interfaces which Apple successfully marketed and sold [1]; the list remains chronological, but it omits many other steps and branches of general development. A slightly more complete accounting of interfaces for human-computer interaction (HCI) might include:

  • punchcards
  • screens and command-line interfaces
  • the mouse-driven graphical user interface (GUI)
  • stylus-input interfaces (Newton, Palm, etc.)
  • touch-based interfaces
  • gestural or kinetic interfaces
  • voice interfaces

But that would have made for an ungainly slide.

This list seems partially progressive, and there is a temptation among technologists to assume progress is made towards ends; there is also a mistaken tendency to assume that all progress is pure, that each new interface supersedes its predecessor in every respect. But this is not always so; while one strains to imagine a task for which punchcards would be preferable, the text-only command-line interface has strengths and weaknesses different from the GUI’s, and indeed it lives on in Terminal and many other places where its strengths are useful.

In design as in everything, every solution to a problem introduces new problems [2]. For much of the history of user interfaces, the problems introduced by solutions are familiar; each new interface might bring one or more of the following:

  • lower information density;
  • slower speed of operation, especially for experts;
  • higher computational costs in rendering, effects, structure, abstraction, the masking of the machine beneath the interface (indeed, computational constraints have often gated UI progress); and/or
  • an increase in overall interface ambiguity, which can amplify development costs and design difficulty.

In general, each new interface also brings an extraordinary benefit which outweighs these drawbacks. That benefit tends to be accessibility, and thus scale of use.

It’s easy to say that this is a desirable exchange, but consider how frustrating some tasks are on an iPad relative to a laptop, particularly those involving precision and speed: typing (and thus programming!), various sorts of advanced media editing, and so on. Solutions to these problems introduce their own problems: increased UI chrome and complexity, for example, problematizing the very advantage the iPad had to begin with [3]. Solutions and problems go hand in hand.

Nevertheless, the reduced efficiency and various increased costs (both requiring solutions on the computer’s side of the human-computer interface) have historically been worth it because of the dramatically increased scale of utility thereby provided.

We can thus formulate a clear end pursued over time: user interfaces progress towards more utility for more people, which we can refer to as empowerment.

empowerment = utility × scale

The more people who are provided with utility, the more empowerment has occurred, even if utility diminishes for some individuals. We can visualize a distribution of empowerment across a few interface technologies, approximately and reductively, as follows:

Empowerment increases as tools do more, enable more, but begins to decrease as tools grow complex enough to be inaccessible without significant learning.
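
A minimal sketch of the formula may help; every number below is invented purely for illustration and comes from neither the essay nor any dataset:

    # A toy rendering of empowerment = utility × scale.
    # All figures are invented for illustration only.

    def empowerment(utility_per_user, users):
        """Empowerment as the product of per-user utility and scale of use."""
        return utility_per_user * users

    # Hypothetical interfaces: high per-user utility for few users versus
    # lower per-user utility for vastly more users.
    interfaces = {
        "command line": empowerment(utility_per_user=9.0, users=1e6),
        "GUI": empowerment(utility_per_user=6.0, users=1e9),
        "touch": empowerment(utility_per_user=4.0, users=3e9),
    }

    for name, value in interfaces.items():
        print(f"{name:>12}: {value:.1e}")
    # The GUI and touch rows dwarf the command line despite lower
    # per-user utility: scale dominates the product.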

Again, it’s not as straightforward as this appears. Nor is this progress driven solely (if at all) by idealism. The more people can use a tool, the more will buy it; and sometimes, the more it can be used for, the more they’ll pay (at least at first). And finally: the more people can create uses for it, the more uses for it there will be.

Utility and empowerment

There are many ideas about what constitutes empowerment, from strictly political definitions to broader claims that anything useful is empowering, or even anything which brings happiness. I return, as I often do in thinking about design, to “Applied Discovery,” David Cole’s talk from Build 2013:

…[W]hen I say “empowerment” I mean it in a very broad sense. Whether that’s instantaneous communication with loved ones. Or a new way to stay healthy. Or the ability to travel across the country with unprecedented access. Or really any way to help people be happy and long-lived, closer to the ones they love, with the tools and knowledge they need to be the person they want to be. [4]

If this seems straightforward, one need only ask a pundit what they think of a new product: any gizmo is at first suspected of being a distracting nuisance, a purposeless gadget for a materialist culture; some quite wrongly considered the iPhone to be such a bauble!

There’s no need to be ideological. If we subdivide Cole’s definition into its constituent elements, utility and scale, we can readily admit that there are many forms of utility, and that utility’s value is mostly up to the individual to determine. The utility of the coming wave of sensor-laden devices that will help us more closely track and understand our health could be deeply empowering, especially for people whose energy, happiness, abilities, or even longevity will be enhanced.

But there is a gradient of utility, and therefore of empowerment; some forms are further-reaching than others. Pedometers, for example, are closer to Click Wheel utility than to, say, GUI-level utility. The difference has to do not merely with how much is enabled but with what’s enabled categorically. The Click Wheel enables us to navigate collections of music ably with one hand; it allows us to solve a problem as the designers intended. The Macintosh allows us to do that, and also to experience a wider variety of media, and to make a wide variety of media (including music), but most relevantly to make software: to create tools [5].

Most tool-creation occurs in code, but it’s worth noting that the PC has seen efforts to scale this form of empowerment to more users with programs such as HyperCard. In any event, let’s call the PC + GUI a means of creative empowerment. Creative empowerment allows users to create unpredicted solutions to problems. These problems can be superficial or profound; in that sense, creative empowerment includes Click Wheel-empowerment. Another way to say this: you can more or less create an iPod on a Macintosh; you cannot create a Macintosh on an iPod. [6] Creative empowerment is the highest sort of empowerment we enable.

There is an epistemological principle involved here: platforms which are creatively empowering enable the growth and application of knowledge in domains which the platform creators need not know about. The creators of the Macintosh did not need to know anything about taxes or architecture; instead, they created an accessible system and developer tools that allowed people who did know about such things to create programs that solved relevant problems in those fields. Creatively empowering systems do not anticipate all that can be made through them; they are extensible beyond the intentions, plans, and sometimes even desires of their makers. They have reach.

There has not been a comparably creatively empowering system since the development of the PC. iOS is massively more comprehensible than the PC, and it is impossible not to be moved by the profound, transformative utility it’s brought to users, empowering them in very literal senses that we can assume were not predictable even as recently as 2007.

Nevertheless: these applications were created on a Macintosh, not on iOS. iOS helpfully illustrates the empowerment curve and the trade-offs entailed at different points on it: by integrating so many sensors and such powerful hardware with an interface more usable than that of the Macintosh, it exceeds its own creator in scale of use and in portability. The iPod solved a few problems for many users; iOS solves orders of magnitude more (including those solved by the iPod) for even more users. Users can make art of all sorts, communicate in life- and culture-changing ways, solve many daily, practical problems, and live better, happier lives.

But iOS does not enable people to create software; it merely gives developers a means to deploy software created on the Macintosh into the world, where its value is inestimably greater than when it is confined to a bulky, environmentally-blind machine. Like the GUI —especially in the GUI’s infancy— it is an extension of the power of computation, but it partializes some of that power. And to this day, while nearly anyone can use an iPad, far fewer can master the Macintosh, let alone Xcode. Programming —tool-creating— remains inaccessible to ordinary users.

So:

  • utility can be deeply important without being creatively empowering; this doesn’t diminish it;
  • there is a gradient of utility that runs from (a) tools which solve very specific problems to (b) “multitools” which solve many problems to (c) tools which enable the creation of tools and multitools;
  • creative empowerment allows people to create a wide variety of solutions not imagined by the system’s creators, solutions embodying knowledge the creators do not possess;
  • the distribution of scale along the gradient reflects the fact that many can use simple tools, some can use more-flexible multitools, and fewer can use tool-making tools;
  • as you increase the number of people who can create, you increase the total knowledge reflected in the creations, the embodied hypotheses tested, and the rate of both conjecture and falsification.

Here’s another ludicrously imprecise but perhaps useful graph; exceptions abound, but hopefully the spectrum is clear, especially as it applies to, say, a calculator, a smartphone, and the PC.

The essence of historical progress for interfaces is in how much utility we can democratize. In pursuing this end, we increase the cumulative empowerment we’ve enabled, and therefore the quantity and diversity of knowledge brought to bear on problem-solving.

What makes the PC important is that billions of people can use it, although not all fully or ably, and far fewer can use it to create software. With iOS, more people than ever before can benefit from software and hardware, but it is nevertheless the case that the platform has not matched the Macintosh in creative empowerment, something I regard as somewhat disappointing.

iOS remains a wonderful multitool, but neither iOS nor most apps could be made on iOS.

Trade-offs and what mandates them

Design is often a matter of painful choices. As mentioned: the GUI was a triumph. It brought computation into reach for ordinary people. But there was a price to pay, literally: it required significant hardware power just to run its interface, and that alone brought its cost up. More critically, the Macintosh GUI was unpalatable to large groups of experienced users —”power users”— who preferred the efficiencies gained from use of the command line and who considered the device a toy. It’s never easy to know which portion of a market one can safely ignore, and in Apple’s case, many power users resisted the Mac until OS X brought Unix into the platform’s fold.

Command line users disliked the Mac not only for inertial reasons; as discussed, the GUI had significant drawbacks for them as users and seemed harder to develop for. This last element is common in novel interfaces, of course. Design and use patterns from previous paradigms are sometimes non-transferable. While iOS retains the notions of icons and even windows, it dispenses with the filesystem, multi-window setups, several kinds of palettes, hover-states for cursors (eliminating the cursor eliminates the use of the cursor as an indicator of attention), and much more. Even if experts prefer some features of a new paradigm, developing for it needs to bring sufficient advantage that they’re incentivized to create or learn new design patterns for their software. Some old solutions will not work. Thus there is a fundamental trade-off: new solutions must be invented, and invention is never trivial.

Generally and critically, the GUI introduced more ambiguity into design and development, and ambiguity has costs. With a command line interface, there need be just one way to accomplish a given task (even if there are often more). With a GUI, mouse clicks and keyboard shortcuts call forth the same functions. Windows can be in many positions, sometimes obscuring one another. Scrolling can hide key interface elements, dialogs can halt flows in problematic ways, and in general there are more potentialities and modalities to be addressed. Compared to a command line, it’s quite chaotic; the state of the user’s desktop will vary, and so will how they mouse and click and drag and drop. Such a system must account for much more varied user behavior and perception, so while it’s easier to use, it’s harder to build and develop for.

Such an interface is like a funnel for user behavior and intentions: it accepts a much wider range of inputs than a command line, but feeds to the machine only what the machine can understand. The translation of so many varying inputs into the same old comprehensible machine code requires lots of work and a tolerance for ambiguity, which doesn’t come naturally to computers.
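
A minimal sketch of this funnel in code, purely my own illustration (the event names are invented, not from any real toolkit): several distinct user behaviors collapse into the single operation the machine actually understands.

    # Illustrative only: many GUI inputs funneled into one machine action.

    def save_document():
        """The single, unambiguous operation the machine performs."""
        print("document saved")

    # Distinct user behaviors that must all resolve to the same command.
    input_bindings = {
        "menu:File>Save": save_document,       # mouse, via the menu bar
        "shortcut:Cmd+S": save_document,       # keyboard shortcut
        "toolbar:save-button": save_document,  # mouse, via a toolbar icon
    }

    def handle(event):
        action = input_bindings.get(event)
        if action is None:
            return  # tolerate ambiguous or unrecognized input
        action()

    handle("shortcut:Cmd+S")  # -> document saved
    handle("menu:File>Save")  # -> document saved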

In sum, abstraction has costs. Abstraction enables users to do more with less learning, but requires creators to do more and account for more, accept more and translate more. It also requires more computational power and more development and design labor (at first, before the relevant patterns are standardized), and it has some unavoidable drawbacks relative to less abstract interfaces.

To recapitulate: the design and creation of new interfaces is axiomatically a matter of trade-offs; the point is to empower the most people to do the most possible. It is axiomatic largely because, for systems with reach, what is easier to use is harder to design, build, and power, a fact which reflects the nature of knowledge and constitutes an unavoidable challenge of design. The added complexity must fall either to the user or to the machine.

The purpose of user interface evolution is to push more of the complexity to the machine, despite the painful costs we incur. The reason we want to push more to the machine, again, is epistemological: we want more minds working on more problems so more solutions can be created and tested.

Speculative examples and future directions

We are now able to consider possible futures for interfaces to illustrate some of these ideas. I’ve claimed here that iOS does not in itself constitute a “tool creator.” This doesn’t have to be the case, and friends and I have often discussed how badly we wish a powerful, multi-touch version of HyperCard existed on iOS. If users could physically manipulate interface elements, connect them with actions and data sources (as, for example, IFTTT permits) and export real apps, I think that novel solutions could emerge. (If nothing else, the occasionally dispiriting passivity of using iOS would be addressed, and perhaps more people could be inspired to consider design and development, to take an active role in building the future, so to speak. But those are tangential benefits.)
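
To make the wish slightly more concrete, here is a hypothetical sketch of what wiring such a tool might feel like; the Stack class, its methods, and the data source are entirely my inventions, not a real or proposed API:

    # Hypothetical HyperCard-for-iOS sketch; nothing here is a real API.

    class Stack:
        """A user-assembled stack of elements, actions, and data sources."""

        def __init__(self, name):
            self.name = name
            self.wiring = []  # (trigger, action) pairs the user connects

        def connect(self, trigger, action):
            # Directly-manipulated elements get wired to actions,
            # in the spirit of IFTTT's trigger-action pairs.
            self.wiring.append((trigger, action))

        def export_app(self):
            # The step iOS lacks today: turning the assembly into a real app.
            return f"{self.name}.app ({len(self.wiring)} behavior(s))"

    def todays_weather():
        return "72°F and sunny"  # stand-in for a user-chosen data source

    stack = Stack("MorningCard")
    stack.connect("button:tap", lambda: print(todays_weather()))
    print(stack.export_app())  # -> MorningCard.app (1 behavior(s))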

Leaving iOS for the moment, an indecently speculative proposal might be illustrative of the points I’ve tried to make: consider the possibilities of a real voice interaction interface with advanced natural language processing (NLP).

In most discussions of voice interfaces, things get depressing quickly:

  • we want voice interfaces to exhibit artificial general intelligence, which we are not any closer to than we were in Turing’s time more than half a century ago; our voice “interaction” systems progress towards ever-more-sophisticated chatbot capabilities, and we train ourselves to learn their keywords
  • we fall back to using voice interfaces to replicate current human-computer interactions, but only of a very simple sort: dictation, search, commands which correspond to graphically-displayed elements on screens; the only empowerment I’ve seen from Siri is that some folks who cannot type well for a variety of reasons can dictate messages and stay in touch more than they’d dreamed: not insignificant, but a simpler kind of utility
  • most in the tech industry truly shrink from the difficulties involved in NLP: the machine learning required, the data analysis, the computational costs, the ambiguity of inputs (voices, syntax, diction, etc.), the complexity of flows, patterns, and events to account for, the pain of building generalizable abstractions of sufficient variation to capture most cases, etc.

If this sounds familiar, I suspect it is because where voice interaction is concerned, we are for various reasons closer to punchcard operators being asked to consider the GUI than we are to GUI operators considering multitouch. That is: we aren’t very close, and we are ourselves GUI power users: we think of voice interaction as far harder and more complex than it’s worth, and we think mainly of how inefficient it would be to make and use.

But voice interfaces have an innate advantage: whereas the PC + GUI requires significant learning from users before they can even create objects, and has done rather little to enable the creation of tools by the masses (the utility most important of all), nearly everyone can describe problems and, in conversation, approach solutions.

It’s natural to scoff that, in fact, most people are very clumsy at recognizing and detailing both problems and solutions. But what’s needed for tool-creation isn’t a capacity to write specs or even essays about what one wants; in conversation, especially dialogic conversation, most people can work towards description. Indeed, most of us have had conversations with relatives that demonstrate as much: they can tell us what they want, after all.

The question to ask is simply this: is the PC the last step in the democratization of utility at the level of creative empowerment? If it were, the most important work to be done would largely consist of making it easier to use as a creative device, making what we create on it (tools and multitools) easier to use, and extending the reach of the PC ever further into the world via new tools and multitools. In other words: making iWatches and HyperCards and so on.

But if there is another frontier, it seems to me to involve abstracting the processes of creation even further, perhaps with voice interfaces so that anyone who can use language can explain what they’d like done and, in a natural, conversational way, detail it to the level required for the machine to develop it. Two extremely facile examples —preferable, I think, to too many more pages!— follow:

Make me an app that, when I’m listening to music and open it, keeps the music playing as I record a short movie, try maybe 10 seconds, and fades the music in and out, and loops it all. I’d like to be able to share the resulting clip to my networks.

The resulting clip will be too large to share.

Too large?

Too large in file size. Do you want to know more?

No. Is there anything you can do to make it smaller? I want to share it with people. How does Vine do that?

Vine compresses its movies. I can compress these clips.

Okay, do that.

How much compression?

I don’t really understand what compression is. Can we just try whatever amount works? Maybe show me some example?

Later:

Sweet, this is fun. Send this app to Lana. Maybe I should share it publicly?

Okay. We can’t share apps publicly that use copy-protected media. This may fall under fair use, but I don’t know about that. Would you like me to find out?

Or more personally:

Is there data correlating physical things you can measure with these watch sensors to mood fluctuations?

There is.

Great, can you monitor my mood as best you can with the watch? Also, can you monitor things like the messages I’m sending and my rate of speech? I’m trying to keep the lid on hypomania.

Yes. Hypomania is often seasonal. It is summer. Would you like me to notify you when you’re in the sun a lot?

Yeah, that would be great. Whenever I seem hypomanic, would you please delay all my outbound communications and ask me to confirm them after about ten minutes?

I can do that with messages, but not with voice communication.

Sure, of course. That works.

Etc. These are unimaginable interactions for innumerable reasons today, but conceptually they involve abstractions of concepts that humans already wield via the GUI and code; that is: there are already similar abstractions from the machine, masking complexity, and there is no reason in principle not to mask more. And again: my purpose is only to illustrate how abstraction, utility, scale, and empowerment interrelate.
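
For instance, the first dialogue maps roughly onto abstractions developers already use. A sketch of what the machine might assemble behind that conversation follows; every function here is a hypothetical stand-in of my own, not a real media API:

    # What the voice interface might synthesize from the first dialogue.
    # All of these functions are hypothetical stand-ins for real media APIs.

    def record_clip(seconds=10):      # "try maybe 10 seconds"
        return "clip"

    def fade_music_in_and_out(clip):  # "fades the music in and out"
        return clip

    def loop(clip):                   # "loops it all"
        return clip

    def compress(clip, level):        # negotiated in the conversation
        return clip

    def share(clip, networks):        # "share the resulting clip"
        print(f"shared to {networks}")

    def looping_music_clip_app(networks):
        clip = record_clip(seconds=10)
        clip = fade_music_in_and_out(clip)
        clip = loop(clip)
        clip = compress(clip, level="whatever works")  # user deferred this choice
        share(clip, networks)

    looping_music_clip_app(networks="my networks")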

If one recalls the trade-offs involved in moving from the command line to the GUI described above, one can see that they’re not radically different from those involved in moving from a visual-metaphorical interface to a natural-language interface. They only seem easier because we’re on the other side of the revolution.

Conclusions

Regardless of the plausibility or stupidity of my chosen examples, in future interfaces we should expect the difficult work of abstracting complexity to increase empowerment to continue. The hope is that this work will not merely produce new Vine-clones or more personalized software, although those are both perfectly satisfactory outcomes in their way. What we seek is to enlist more and more people in the problem-solving processes software allows. There are of course many ways to do this, from products like Quora, which allow people to learn anything they want, to political efforts to popularize STEM education and get people coding.

For interface designers, however, the deepest aim seems to be enabling creative empowerment. Making new tools and multitools is often incredibly important work, as is making such tools easier to use, but nothing can have higher impact than increasing by orders of magnitude the number, and the diversity, of people who can join in that process.

And as mentioned, this is fortunately enough not mere idealism: people want not only to have their problems solved but to tailor solutions to their liking. When we devolve creative power, we see market success, so long as we do not also devolve complexity. iOS brought much of the Macintosh’s utility to regular users and has outsold its predecessor-partner by a wide margin. But it’s worth noting that even an outstanding creatively-empowering HyperCard-for-iOS type app would not, in fact, move the sales needle much.

That’s because thinking in visual interfaces, designing screen UIs, considering APIs and data sources, and so on, is simply not abstracted enough yet. That is: it wouldn’t make problem-solving by software-creation easy enough for people to perceive its value. It’s not 1s and 0s, but it’s just not how ordinary people think, nor should it be. To insist that ours is the necessary level of sophistication for creating software is simple parochialism, and an abrogation of the point of creating interfaces; we might as well say that anything higher-order than machine code is too vague.

The dream is very obviously to meet the population of the world where they are, to enable them to describe to computers, as they describe to themselves, what they need, to close the gap in user research and step out of the way as much as possible. It’s not a dream we should expect to see materialize soon, but I think it can be useful to recall, in roughly the way that Plato intended the city of The Republic to be a blueprint not for urban planners but for the individual:

“You mean that he will be a citizen of the ideal city, which has no place upon earth.” But in heaven, I replied, there is a pattern of such a city, and he who wishes may order his life after that image. Whether such a state is or ever will be matters not; he will act according to that pattern and no other… [7]

Notes

  1. In this case, it is especially amusing that the Click Wheel makes the cut; we can perhaps assume it is because the iPod was Apple’s most successfully dominant product to date: extremely polished execution in a mildly consequential domain. The mouse and the GUI enabled the democratization of computing, but Microsoft sullied that achievement (as, perhaps, did the fact that Apple invented neither the mouse nor the GUI). The Click Wheel, then, is Apple executing on innovation, even if in a relatively superficial category. And it made lots and lots of money, something that I suspect Jobs considered validating.
  2. Here and in many other places in this essay, basic epistemological ideas are taken from the physicist David Deutsch; Deutsch himself constructs his framework for understanding knowledge largely from the philosopher Karl Popper. For more, see The Beginning of Infinity: Explanations That Transform the World by David Deutsch, and Conjectures and Refutations: The Growth of Scientific Knowledge by Karl Popper.
  3. For a very contemporary illustration of this phenomenon, see the current rumors and debate over the iPad’s possible inclusion of split-screen functionality; I have no insight into whether this will be introduced, but as I already regard the app-switcher in iOS 7 as too complex, I am pessimistic about any such solution being worth the cognitive trouble for regular users.
  4. Cole believes that designers should pursue empowerment at scale, whereas for my purposes I will include scale in my definition of empowerment; what empowers few is not very empowering in a net sense, a species sense, although I hasten to add that mine is the unusual extension of the formal definition, his the accurate usage.
  5. I am not sure what property, speaking epistemologically, we can ascribe to software to account for the fact that it can create additional creative tools. There is, in other words, a philosophical dimension of this that remains elusive to me, and which seems important. The PC can create tools, multitools, and tool-creation tools. It seems to have something like infinite reach.
  6. You cannot literally build an iPod on a Macintosh, of course. Most software at present, no matter its reach, interfaces rather distantly with creation in the physical world. One can hope that 3D printers and the like will make this easier in time.
  7. From the ever-ignored end of Book IX.

Mills Baker