The Future of Interaction

Shortly after finishing my treatise on Marzipan, I started thinking about what lies beyond. Some of those initial thoughts made it into a thread on Twitter.

This post can be considered an addendum or a hell of a long footnote: in either case, you’ll want to start by reading my thoughts on Marzipan. Because what’s happening this year is just the start of a major shift in how we’re going to build apps.

So while everyone else is making predictions about WWDC 2019, what you’ll find below are the ones I’m making for 2020 and beyond.

Update June 7th, 2019: It turns out I was making predictions for 2019, after all.

What is Apple’s Problem?

Before we get into thinking about the future, let’s look at a problem that Apple has today: too many products on too many platforms.

Historically, a person only had one computer to deal with. For several generations it was a mainframe; more recently it was a PC. One of the disruptive changes that started with the iPhone was the need to juggle two computers. Now we have watches generating and displaying data: another computer. Increasingly, the audio and video devices in our living rooms are added to the mix.

Syncing and cloud services help manage the data, but we all know the challenges of keeping a consistent view on so many machines.

If you’re an iMessage developer, you have to think about a product that works on iOS, macOS, and watchOS. You get a pass on tvOS, but that’s small consolation. The same situation exists in various combinations for all of Apple’s major apps: Music, Calendar, Reminders, Notes, Mail, etc.

It’s likely that all of these apps share a common data model, probably supported by an internal framework that can be shared amongst platforms. That leaves the views and the controllers as an area where code can’t be shared.

Marzipan is About Views

With this insight, it’s easy to see Marzipan as a way towards views that share code. A UIView can be used on your TV, on your desktop, on your wrist, and in your pocket. That’s a big win for developer productivity.

It’s also a win for designer productivity: you can share app design elements. We already see this in Apple’s cross-platform apps when colors in Calendar match, speech bubbles in Messages have the same shape, and Notes shares a special glyph for the “A”.

Everyone’s excited to know what the Dark Mode on iOS is going to look like. My guess is that people who have been running a dark user interface on their Mac have already seen it. It’s hard to find a balance of readability and contrast with dark elements and I don’t see Apple’s designers making any major changes in next week’s announcement. There will certainly be refinements, but it makes no sense to throw out the huge amount of work that’s already been done.

I also see the Mac leading the way with techniques that provide a more vibrant interface and allow a customer to customize their device. The accent color in System Preferences would be a welcome addition in the iOS Settings app.

All of this leads to a common appearance across platforms. In the near future, we’ll be in a nice place where our architecture can be shared in models, and our designs shared in views.

That leaves us with one final problem to solve: how do we share our interactions and controllers?

The Arrival of New Interactions

Apple ties interactions to platforms and their associated hardware. The Mac has interactions that are different from those on iOS. And watchOS has ones that are different from those on iOS. On tvOS, you can be limited to four arrow keys and two buttons.

There are some interactions, such as a swipe, that appear on multiple platforms, but as a whole each platform is different. This approach lets the customer get the most out of the device they purchase.

As we start to think about how interactions are shared amongst platforms, it’s wise to consider that new hardware might be arriving soon.

For the past few years, Apple has been putting a lot of effort into augmented reality (AR). And I have no doubt that this hard work is aimed at something other than our current devices.

AR is a great demo on an iPhone or iPad, but the reality is that you can’t hold a device in front of your face for an extended period of time: your arm gets tired after just a few minutes. I made this argument over a decade ago when everyone was getting excited about multi-touch displays coming to their desktop. It still holds true because it’s a dumb idea, just like AR on a mobile phone.

Apple will solve this problem with new hardware. And these devices will run “headgearOS” with new and completely different interactions. If you’re that Messages developer, it means you’ll be writing new code for yet another platform. Yay.

There are other twists to this story: rumors about iPads with mice and Macs with touch screens. And let’s not forget about interactions with voice commands using Siri technologies.

It all adds up to a situation where the complexity of products is increasing exponentially as new devices and interactions are introduced. There has to be a better way.

How Not to Do It

Cross-platform frameworks have a long history of sucking. If you ever used a Java app during the early days of Mac OS X, you know immediately what I’m talking about: the interactions were from a different universe. The design of the system was a “least common denominator” where only a limited set of capabilities was exposed. It just felt wrong.

More recent attempts have had more success but they still fail to address the problem of ever-expanding interactivity.

Apple’s been down this road before and I don’t see them making the journey again. Instead, I see them taking a new and forward-thinking direction. The kind of bold and pragmatic change that Apple is famous for: they’d be setting themselves up for the next decade of user interaction.

So what could they do that no one else in the industry is doing?

Declarative Interactions

As Matt Gallagher notes, we’ve slowly been heading towards a declarative programming style.

Declarative programming is describing a system using a set of rules and relationships. The rules and relationships cannot be changed during the lifetime of the system (they are invariant), so any dynamic behavior in the system must be part of the description from the beginning.

Syntactically, declarative programming is often about assembling a whole system of rules as either a single expression or domain-specific language whose structure reflects the relationship between the rules in the system.

Layout is an inherently declarative task. Layout is a set of rules (which Auto Layout calls “constraints”) that apply to the contents of a view, ideally for the entire lifetime of the contents. Constraint programming itself is sometimes considered a sub-discipline of declarative programming.

He pulls this together with his own experiences into a prediction about a Swift-only framework for “declarative views”.

Independently, John Gruber has made similar observations.

The general idea is that rather than writing classic procedural code to, say, make a button, then configure the button, then position the button inside a view, you instead declare the button and its attributes using some other form. HTML is probably the most easily understood example. In HTML you don’t procedurally create elements like paragraphs, images, and tables — you declare them with tags and attributes in markup.
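To make that contrast concrete, here is a minimal sketch in TypeScript. Everything in it is hypothetical: `Button`, `ButtonSpec`, and `render` are names I made up for illustration, not part of any real framework. The first function builds a button step by step; the second half describes the same button as data and leaves construction to a renderer.

```typescript
// Hypothetical types for illustration only; not any real UI framework's API.
class Button {
  title = "";
  enabled = true;
}

// Procedural style: make the button, then configure it one step at a time.
function makeButtonProcedurally(): Button {
  const button = new Button();
  button.title = "Send";
  button.enabled = false;
  return button;
}

// Declarative style: the button is described as plain, read-only data...
interface ButtonSpec {
  readonly kind: "button";
  readonly title: string;
  readonly enabled: boolean;
}

const sendButton: ButtonSpec = { kind: "button", title: "Send", enabled: false };

// ...and a renderer (assumed here to be owned by the framework) turns the
// description into a real widget.
function render(spec: ButtonSpec): Button {
  const button = new Button();
  button.title = spec.title;
  button.enabled = spec.enabled;
  return button;
}
```

Both paths end up with the same button. The difference is that the declarative description can be inspected, compared, and reinterpreted by whatever is doing the rendering.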

In my opinion, limiting this thinking to just views and layout is short-sighted.

That’s because it doesn’t address the interaction problem with an ever-increasing set of platforms. But what if this new framework not only let you declare views, but also the behaviors they enable?

The developer would describe the interactions an app supports. There would be relationships between those declared interactions. All this immutable information would then be processed by user interface frameworks. Your app’s behavior would be determined at runtime, not when it was compiled.

Think about how this would work using Auto Layout as a point of comparison. Until that declarative system came along, we all worried about frame placement. Now we just worry about the relationships between those frames and let the system pick what’s best.

With our interactions, we still have this tight coupling between a user action and an app’s behavior. Tapping on a button or a swipe gesture invokes some code directly. A declarative interaction would be a layer of abstraction between what a customer does and how your app reacts.
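Here is one way that layer of abstraction could look, sketched in TypeScript. This is purely my own guess at the shape of the idea, not anything Apple has shown: the `Intent` names, the `bindings` table, and the `resolve` and `dispatch` functions are all hypothetical. The app declares which intents it supports and what to do for each; a framework-owned table decides at runtime which raw input triggers which intent.

```typescript
// All names here are hypothetical; this is not an Apple API.
type Intent = "deleteMessage" | "replyToMessage";
type Platform = "touch" | "pointer" | "remote";

// The app declares, as immutable data, which interactions it supports.
const declaredIntents: readonly Intent[] = ["deleteMessage", "replyToMessage"];

// The app also says what each intent does, with no mention of gestures or keys.
const handlers: Record<Intent, (messageId: number) => string> = {
  deleteMessage: (messageId) => `deleted ${messageId}`,
  replyToMessage: (messageId) => `replying to ${messageId}`,
};

// The framework (assumed) owns the mapping from raw input to intent.
const bindings: Record<Platform, Record<string, Intent>> = {
  touch: { swipeLeft: "deleteMessage", tap: "replyToMessage" },
  pointer: { secondaryClick: "deleteMessage", doubleClick: "replyToMessage" },
  remote: { playPause: "replyToMessage" },
};

// Look up the intent for a raw input, ignoring anything the app didn't declare.
function resolve(platform: Platform, rawInput: string): Intent | undefined {
  const intent = bindings[platform][rawInput];
  return declaredIntents.indexOf(intent) !== -1 ? intent : undefined;
}

// Route a raw input to the app's handler, if the platform binds it to an intent.
function dispatch(
  platform: Platform,
  rawInput: string,
  messageId: number
): string | undefined {
  const intent = resolve(platform, rawInput);
  return intent ? handlers[intent](messageId) : undefined;
}
```

In this sketch a left swipe on a touch screen and a secondary click with a mouse both end up in the same `deleteMessage` handler, and the app never names either input. Supporting a new platform would mean adding a row to `bindings`, not touching the app.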

Again, using Auto Layout to help form our thinking, what if there were “interaction classes” that functioned like size classes? Your iPad would behave as it always has until you plugged in a mouse. At that point, how you interact with the device adapts:

  • Controls could get smaller because of the increased pointing accuracy
  • Views could gain a hover state to display additional information
  • Drag & drop could change because it no longer depends on two fingers
  • A mechanical wheel could replace a finger for scrolling
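As a rough sketch of the list above, an interaction class could be modeled as a lookup from the current input hardware to a set of traits, much as a size class selects a layout. The TypeScript below is an assumption-laden illustration: `InteractionClass`, `InteractionTraits`, and `traitsFor` are invented names, and the 24-point pointer size is a placeholder I picked (44 points is the familiar minimum touch target on iOS). It covers control size, hover, and scrolling; drag & drop is left out.

```typescript
// Hypothetical "interaction class", by analogy with size classes.
type InteractionClass = "touch" | "pointer";

interface InteractionTraits {
  readonly minimumControlSize: number; // in points
  readonly supportsHover: boolean;
  readonly scrollInput: "finger" | "wheel";
}

// One set of traits per class. The pointer values are illustrative guesses.
const traitsByClass: Record<InteractionClass, InteractionTraits> = {
  touch: { minimumControlSize: 44, supportsHover: false, scrollInput: "finger" },
  pointer: { minimumControlSize: 24, supportsHover: true, scrollInput: "wheel" },
};

// The system (assumed) decides the class from the attached hardware.
function currentInteractionClass(mouseAttached: boolean): InteractionClass {
  return mouseAttached ? "pointer" : "touch";
}

// Views would read their traits from the current class instead of
// hard-coding assumptions about fingers or mice.
function traitsFor(mouseAttached: boolean): InteractionTraits {
  return traitsByClass[currentInteractionClass(mouseAttached)];
}
```

Plugging in a mouse flips `traitsFor(false)` to `traitsFor(true)`: controls can shrink, hover becomes available, and scrolling moves to the wheel, all without the app checking for a mouse itself.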

This kind of adaptability would work across platforms – your app would behave differently when it was running in augmented reality or on a TV screen. As a developer you wouldn’t have to worry about what kind of hardware is available, you’d only have to worry about what to do when a customer used it to perform a task.

I’m not going to predict how this would be accomplished. Yes, it could draw inspiration from React or other similar technologies. The only thing I’m confident of at this point is that Apple knows it has a problem and is working actively to solve it in a platform-independent fashion.

Marzipan is our first step. And what I’ve described above is Amber, the next step.