History Repeats

I’ve been developing on Apple products for a long time: typing PEEK and POKE code from magazines into an Apple ][, figuring out how QuickDraw worked using the Inside Macintosh pre-prints, having my mind blown by Mac OS X and every new thing prefixed with “NS”, and then jailbreaking the first iPhone so I could write an app that eventually won an Apple Design Award.

It’s been an exciting adventure. Until now.

The engineering behind Apple products continues to be amazing: Swift and SwiftUI have made it easier than ever to create products. The App Store continues to be the easiest and most vibrant marketplace to sell those products (in spite of the company’s attempt to screw that up). Fricken’ amazing hardware, too.

So what’s wrong?

The problems we’re solving and the apps we’re writing haven’t changed in years. After almost two decades of iOS, everything is iterative. And while maturity is a good thing, it’s not the thing that gets developers excited.

We’re at the point where a big change is putting a new coat of paint on our creations. Sure, it looks nice, and customers will love it. But it’s a lot of work and none of it sparks our imaginations.

But what is exciting these days?

Large Language Models: statistical models built from a huge body of data that can be leveraged to solve problems that have heretofore been intractable. It’s the most exciting technology in decades because it lets our imaginations run wild and create new things.

And that’s a problem for developers in Apple’s ecosystem. Because while the company has done a significant amount of research with these models, and includes one on every iPhone, iPad, and Mac, the core capabilities of the mechanism are out of reach.

It’s like if Apple’s products didn’t provide direct access to the camera. There would be no Instagram, no Zoom, no Halide, just the Camera app. Developers don’t get a shutter button: they can only access photos that have already been taken. Apple knows what’s best for customers, of course.

Developers have been in this situation before: at the introduction of the iPhone. We all saw a wildly innovative piece of hardware that immediately gave thousands of developers a revolutionary idea for a piece of software.

Maybe it was emulating a glass of beer, turning the device into a musical instrument, a game that could only be played by touch, or a way to connect millions of people using photos and filters.

Then Apple told us we couldn’t write native apps and had to make web pages instead. There was no way for developers to do the same things Apple was doing. This was, indeed, a shit sandwich.

Eventually, the company came to its senses and opened up the platform, dropped the ridiculous non-disclosure agreements, and allowed developers to do what they wanted. That led to a period of innovation like I’ve never seen: developers had something revolutionary and magic happened.

Now history is repeating itself. We have a new shit sandwich that’s called Apple Intelligence.

Instead of building our own ideas on top of an LLM, we’re supposed to provide the internal details of our apps to Apple so they can do it on our behalf.

Providing those details is a lot of busy work for developers and not nearly as much fun as a new coat of paint: at least with visuals you can see and feel the results of your efforts. And from a business point of view, managing those internal details is why customers pay us. If Apple starts doing that on our behalf, what perceived value do we provide?

The internal details, called App Intents, are abstract and not something where you can immediately see the results of your efforts. It’s a “trust us, Siri will be great at this” situation. Given the company’s track record in this area, there are few developers who think this will be successful. Worse, the improvements will be tied to lengthy release cycles: other companies drop language models with the frequency of new Emoji, not WWDC keynotes.
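
For those who haven’t built one, an App Intent is a small declaration like the sketch below. The intent and its BookmarkStore are hypothetical, but the shape is accurate: you describe an action and its parameters, and the system decides when and how to invoke it.

```swift
import AppIntents
import Foundation

// Hypothetical storage layer, stubbed out for the sketch.
actor BookmarkStore {
    static let shared = BookmarkStore()
    func add(_ url: URL) async throws { /* persist the bookmark */ }
}

// A minimal App Intent: the app describes an action it can perform so that
// Siri, Shortcuts, and Apple Intelligence can invoke it on the app's behalf.
struct AddBookmarkIntent: AppIntent {
    static var title: LocalizedStringResource = "Add Bookmark"

    @Parameter(title: "URL")
    var url: URL

    func perform() async throws -> some IntentResult {
        try await BookmarkStore.shared.add(url)
        return .result()
    }
}
```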

(I would not be surprised to learn that this whole situation is based on a fever dream of charging monthly service fees to use Siri and Apple Intelligence. These folks are seriously underestimating the reputational damage that Siri has incurred in the past decade.)

Some developers are working around this problem by shipping their own models. This is unsatisfactory because it wastes device resources: each app’s downloaded model duplicates a tremendous amount of memory and storage. In many cases, products rely on cloud-based LLMs instead and lose all the privacy and security benefits of on-device processing.

All of this feels like Safari and mobile web apps in 2008: valiant attempts that everyone knows are wrong. Doing the best you can with a shit sandwich.

There are so many transformative ideas forming in developers’ minds right now that will never see the light of day. In our case, Tapestry has megabytes of textual information that describes a person’s interests and social connections. There’s no way for us to explore mining that data in a way that benefits the customer and respects their privacy.

(The developers who are making the greatest strides in this area are all doing it on the Mac. Ideas like Sky can thrive in a more open environment. Those of us in the jailbreak scene all saw how iOS borrowed heavily from its desktop sibling. Time will tell if that can happen again, but I suspect it will not given the locked down nature of mobile.)

So where does this lack of developer creativity lead?

It feels like developers are now part of the supply chain and being optimized accordingly. We are expected to refine and improve Apple’s ideas year-over-year. Our own needs and desires aren’t even secondary (where customers sit) or tertiary (our normal place in the hierarchy). We are just expected to deliver the products when Apple needs them.

I fear that this will lead to history repeating itself again, in a much more drastic way.

I remember how Microsoft’s response to the mobile revolution was to protect their existing desktop products. That looks a lot like Apple with its iOS franchise now. Instead of setting developers free, letting us experiment, and reaping the benefits, accountants and lawyers are fighting to keep us in line. We are all tired of the bullshit and many will happily move on to something better when it comes along.

Apple has been the lucky recipient of developer attention for a long time and they act like it will last forever.

It won’t.

The Next 40

Last week’s 40th anniversary of the Mac got me thinking. I’ve also been contemplating this week’s release of Apple Vision Pro.

It feels like we’re at a crossroads for platforms, but one that’s impossible to get past.

I was one of the folks who bought a Mac in 1984. At the time I was a member of a team building a Unix workstation from the ground up. We had bigger displays, better networking, faster processors, more memory, and larger disks.

But we were all jealous of what the team at Apple had done. That first Mac and its system software was brimming with new user interface ideas and techniques. Better ways of doing everything we had done.

And you can say the same thing about Apple Vision Pro and visionOS.

Except there is a problem.

Processes

If you’re a software developer, the Apple Vision Pro can’t be used standalone for your work. You’ll be able to use it for development about as much as you can an iPad: you can experiment in Playgrounds and build some simple apps, but you’ll quickly hit a wall.

That’s because developers use a lot of processes. And these processes talk to each other in very creative ways. Maybe it’s as simple as creating child processes to handle work. Maybe it’s a more complicated arrangement, like a Docker container running a web server that talks to a database process via a Ruby on Rails process. There are processes everywhere you look.

And in an Apple sandbox, you get one process. You can’t fork and exec a child. And if you query the Mach kernel for information about another process, you get back KERN_FAILURE.
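
Here’s a minimal sketch of what’s routine on a Mac and off-limits in a sandbox. The example just spawns /bin/ls, but it stands in for all those creative child processes:

```swift
import Foundation

// On a Mac, spawning a child process is routine. In an iOS or visionOS
// sandbox, Process (NSTask) isn't available at all, and the lower-level
// fork()/exec() route is blocked by the sandbox.
let child = Process()
child.executableURL = URL(fileURLWithPath: "/bin/ls")
child.arguments = ["-l", "/tmp"]

do {
    try child.run()
    child.waitUntilExit()
    print("child exited with status \(child.terminationStatus)")
} catch {
    print("couldn't spawn a child process: \(error)")
}
```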

(To get a very good idea of what’s possible at the fringes of a sandbox, take a look at a-Shell on your mobile device. It does an amazing number of things, but you’ll quickly feel frustrated that ps, kill, top, and anything else that deals with processes is missing.)

There is a good reason for apps only having visibility of their own state. Imagine the kind of fingerprinting that Google and Facebook could do by seeing what apps you’re using. We’ve already seen apps trying to do the same thing using URL schemes.
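
For context, that URL scheme trick looks roughly like this sketch; the list of schemes is illustrative:

```swift
import UIKit

// Fingerprinting by URL scheme: checking which custom schemes can be opened
// reveals which apps are installed. Since iOS 9, every queried scheme has to
// be declared in LSApplicationQueriesSchemes, precisely to limit this probing.
let schemesToProbe = ["fb", "instagram", "twitter", "spotify"]

let likelyInstalled = schemesToProbe.filter { scheme in
    guard let url = URL(string: "\(scheme)://") else { return false }
    return UIApplication.shared.canOpenURL(url)
}

print("Apps that appear to be installed: \(likelyInstalled)")
```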

When my pal John Gruber talks about Macs doing the heavy lifting, it’s not just about complex and resource-intensive tasks. It’s also about the security exposure: the Mac is the only “dangerous” Apple platform.

Windows

There is a thing that developers love almost as much as processes: windows. We have so damn many. Hundreds on a good day. Thousands on a really good day.

And this is why I get frustrated every time I see a demo of Apple’s headset. I can easily imagine fitting my work into a space with an infinitely large interaction surface.

As it is now, you get to see a screen or two streamed from your Mac. That will surely improve; probably to the point where you have individual windows in your spatial environment.

But you’ll still be carrying the Mac around to get any work done. Somewhat ironically, the Apple Vision Pro is not doing the heavy lifting, but it will be the thing that’s cumbersome in your daily life.

Here’s a comparison of the headset’s carrying case and a MacBook Air:

11.69″ × 8.78″ × 6.5″ vs. 11.97″ × 8.46″ × 0.44″

The Apple Vision Pro’s carrying case is almost 15 times taller than the MacBook Air. Even worse, I can’t even close my backpack, much less fit a laptop inside:

“You’re going to need a bigger boat.”

After the Mac was introduced, you didn’t have to carry around an Apple ][ or Lisa to do software development.

Yet here we are because the Apple Vision Pro is locked down. It’s been relegated to being a fancy display for software developers. That’s not necessarily a bad thing, and there’s no extra cost for a display stand.

But…

This isn’t a sustainable situation for the next 40 years. Without some low-level structural changes in visionOS, it will never thrive as a developer platform. Just as the iPad has not.

It also doesn’t bode well for the Mac. I’m sure Apple can continue to add incremental changes to satisfy developers, but there won’t be anything revolutionary with how we work. There is also little incentive for Apple to change here: you are buying an Apple Vision Pro along with a MacBook, after all.

One of the extraordinary things that happened back in 1984 was the ability to have more than one terminal window. Even though my Mac had to be connected to a VAX 11/780 over a serial cable (sound familiar?), this was a completely new way of working. We were suddenly free from the confines of a single 24×80 character display.

Once we broke free of those limitations, things like visual development environments took hold. I’m pretty sure Apple understands the productivity benefits that came along with these changes.

And here’s the thing: developers don’t come up with these ideas unless they have a place to experiment. Seeing multiple windows that contained code, debuggers, and other tools led some folks to start thinking about integrating this environment using the new interaction mechanisms.

Those same kinds of folks may find inspiration in spatial computing, but they will ultimately be thwarted by the restrictions of a single process. An architecture developed for mobile devices with only one app on the screen is now being used for apps on an infinitely large screen.

Apple Vision Pro is a technical marvel, but it ultimately falls short of satisfying the natural curiosity of developers.

That’s a shame. I just hope some smart folks at Apple feel the same frustration I do, because we need a future beyond the Mac.

Lame, Until it Isn’t

Where there’s smoke, there’s fire. And as we approach WWDC 2022, there’s a lot of smoke around AR and VR. In some ways, this is going to be a huge inflection point; in other ways, it’s probably going to be a letdown.

Remember when the iPod was announced? Some folks called it lame because it didn’t meet their expectations.

The same thing will be true of anything Apple wants us to put on our face. It’s going to be less impressive technically than any of the currently shipping products. And that’s good, because you don’t make fundamental changes by tweaking existing technologies. A Nomad audio player was a tweak. An Oculus headset is a tweak.

Everything we’ve seen to date with VR has been an attempt to bring information to a 3D world. Headsets are just a means to project that 3D environment so our eyes can see it.

I think Apple’s approach with AR will be completely different: they will bring 3D to an information world.

We all have the greatest source of information humankind has ever known in our pocket or purse. Much of that information relates to the world around us: weather, transportation, shopping, dining, etc. Relating that data to our physical space will be a powerful tool.

All the AR examples we’ve seen on Apple’s devices hint at this direction. They take the information on our phone and place it at derived 3D coordinates. Where people get tripped up in these demos is where the results are shown: on a standard screen.
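
In ARKit terms, the pattern looks roughly like this sketch. The anchor’s name and position are made up, but the idea of pinning a bit of phone data to derived 3D coordinates is the same:

```swift
import ARKit

// Derive a world position and attach a piece of information to it.
let session = ARSession()
session.run(ARWorldTrackingConfiguration())

// Place an anchor roughly two meters in front of where the session started.
var transform = matrix_identity_float4x4
transform.columns.3.z = -2.0

// The name stands in for whatever data the app wants to show at that spot.
let anchor = ARAnchor(name: "coffee-shop-rating", transform: transform)
session.add(anchor: anchor)
```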

I don’t think that’s Apple’s final goal, because any current screen technology will block your view of the real world. It’s also why I think 3D headsets will remain a niche technology: people have an innate need to see what’s going on around them.

Our current screens also use a lot of power. And that means batteries. And that means weight. Not what I want on my face, for sure.

Apple knows this and that’s why I think a new display system is the thing they’re taking time to get right. We may or may not see this new display at WWDC. I can remember a time when all we had for an iPad was a simulator.

The changes caused by a new display will be incremental. There will certainly be technical limitations in the product that are imposed by size and weight: Apple will improve on those things as components allow.

Other changes will happen because no one, including Apple, really knows how this display will be used by normal folks (we, I should note, are not normal folks). The first Apple Watch tried to do a lot of things: iteration got rid of things no one used, and improved the things everyone wanted.

It’s likely that a first iteration will also be a “satellite device” where the iPhone does the heavy lifting. Much like the original iPod relied on a Mac. The realityOS could be nothing more than widgets for a new display on your face.

That will feel lame until you realize something else: after two decades, the basic form factor and functionality of that first iPod is now an essential part of our lives and we call it an iPhone. Don’t underestimate Apple’s ability to iterate.

The Future of Interaction

Shortly after finishing my treatise on Marzipan, I started thinking about what lies beyond. Some of those initial thoughts made it into a thread on Twitter.

This post can be considered an addendum or a hell of a long footnote: in either case, you’ll want to start by reading my thoughts on Marzipan. Because what’s happening this year is just the start of a major shift in how we’re going to build apps.

So while everyone else is making predictions about WWDC 2019, what you’ll find below are the ones I’m making for 2020 and beyond.

Update June 7th, 2019: It turns out I was making predictions for 2019, after all.

What is Apple’s Problem?

Before we get into thinking about the future, let’s look at a problem that Apple has today: too many products on too many platforms.

Historically, a person only had one computer to deal with. For several generations it was a mainframe; more recently it was a PC. One of the disruptive changes that started with the iPhone was the need to juggle two computers. Now we have watches generating and displaying data: another computer. Increasingly, the audio and video devices in our living rooms are added to the mix.

Syncing and cloud services help manage the data, but we all know the challenges of keeping a consistent view on so many machines.

If you’re an iMessage developer, you have to think about a product that works on iOS, macOS, and watchOS. You get a pass on tvOS, but that’s small consolation. The same situation exists in various combinations for all of Apple’s major apps: Music, Calendar, Reminders, Notes, Mail, etc.

It’s likely that all of these apps share a common data model, probably supported by an internal framework that can be shared amongst platforms. That leaves the views and the controllers as an area where code can’t be shared.
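
To make that split concrete, here’s a hypothetical sketch: the model type compiles everywhere unchanged, while the view layer forks by platform.

```swift
import Foundation

// The model layer is the easy part to share: a type like this compiles
// unchanged on every Apple platform. (Hypothetical example, not Apple's code.)
struct Reminder: Codable {
    let id: UUID
    var title: String
    var dueDate: Date?
    var isCompleted: Bool
}

// The view layer is where the sharing stops today:
#if canImport(UIKit)
import UIKit
typealias PlatformView = UIView   // iOS, tvOS, and friends
#elseif canImport(AppKit)
import AppKit
typealias PlatformView = NSView   // macOS
#endif
```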

Marzipan is About Views

With this insight, it’s easy to see Marzipan as a way towards views that share code. A UIView can be used on your TV, on your desktop, on your wrist, and in your pocket. That’s a big win for developer productivity.

It’s also a win for designer productivity: you can share app design elements. We already see this in Apple’s cross-platform apps: colors in Calendar match, speech bubbles in Messages have the same shape, and Notes shares a special glyph for the “A”.

Everyone’s excited to know what the Dark Mode on iOS is going to look like. My guess is that people who have been running a dark user interface on their Mac have already seen it. It’s hard to find a balance of readability and contrast with dark elements and I don’t see Apple’s designers making any major changes in next week’s announcement. There will certainly be refinements, but it makes no sense to throw out the huge amount of work that’s already been done.

I also see the Mac leading the way with techniques that provide a more vibrant interface and allow a customer to customize their device. The accent color in System Preferences would be a welcome addition in the iOS Settings app.

All of this leads to a common appearance across platforms. In the near future, we’ll be in a nice place where our architecture can be shared in models, and our designs shared in views.

That leaves us with one final problem to solve: how do we share our interactions and controllers?

The Arrival of New Interactions

Apple ties interactions to platforms and their associated hardware. The Mac has interactions that are different than iOS. And watchOS has ones that are different than iOS. On tvOS, you can be limited to four arrow keys and two buttons.

There are some interactions, such as a swipe, that appear on multiple platforms, but as a whole each platform is different. This approach lets the customer get the most out of the device they purchase.

As we start to think about how interactions are shared amongst platforms, it’s wise to consider that new hardware might be arriving soon.

For the past few years, Apple has been putting a lot of effort into augmented reality (AR). And I have no doubt that this hard work is not for our current devices.

AR is a great demo on an iPhone or iPad, but the reality is that you can’t hold a device in front of your face for an extended period of time: your arm gets tired after just a few minutes. I made this argument over a decade ago when everyone was getting excited about multi-touch displays coming to their desktop. It still holds true: it was a dumb idea then, just like AR on a mobile phone is now.

Apple will solve this problem with new hardware. And these devices will run “headgearOS” with new and completely different interactions. If you’re that Messages developer, it means you’ll be writing new code for yet another platform. Yay.

There are other twists to this story: rumors about iPads with mice and Macs with touch screens. And let’s not forget about interactions with voice commands using Siri technologies.

It all adds up to a situation where the complexity of products is increasing exponentially as new devices and interactions are introduced. There has to be a better way.

How Not to Do It

Cross-platform frameworks have a long history of sucking. If you ever used a Java app during the early days of Mac OS X, you know immediately what I’m talking about: the interactions were from a different universe. The design of the system was a “least common denominator” where only a limited set of capabilities was exposed. It just felt wrong.

More recent attempts have had more success but they still fail to address the problem of ever-expanding interactivity.

Apple’s been down this road before and I don’t see them making the journey again. Instead, I see them taking a new and forward-thinking direction. A bold and pragmatic change that Apple is famous for: they’d be setting themselves up for the next decade of user interaction.

So what could they do that no one else in the industry is doing?

Declarative Interactions

As Matt Gallagher notes, we’ve slowly been heading towards a declarative programming style.

Declarative programming is describing a system using set of rules and relationships. The rules and relationships cannot be changed during the lifetime of the system (they are invariant), so any dynamic behavior in the system must be part of the description from the beginning.

Syntactically, declarative programming is often about assembling a whole system of rules as either a single expression or domain-specific language whose structure reflects the relationship between the rules in the system.

Layout is an inherently declarative task. Layout is a set of rules (which Auto Layout calls “constraints”) that apply to the contents of a view, ideally for the entire lifetime of the contents. Constraint programming itself is sometimes considered a sub-discipline of declarative programming.

He pulls this together with his own experiences into a prediction about a Swift-only framework for “declarative views”.

Independently, John Gruber has made similar observations.

The general idea is that rather than writing classic procedural code to, say, make a button, then configure the button, then position the button inside a view, you instead declare the button and its attributes using some other form. HTML is probably the most easily understood example. In HTML you don’t procedurally create elements like paragraphs, images, and tables — you declare them with tags and attributes in markup.

In my opinion, limiting this thinking to just views and layout is short-sighted.

That’s because it doesn’t address the interaction problem with an ever-increasing set of platforms. But what if this new framework not only let you declare views, but also the behaviors they enable?

The developer would describe the interactions an app supports. There would be relationships between those declared interactions. All this immutable information would then be processed by user interface frameworks. Your app’s behavior would be determined at runtime, not when it was compiled.

Think about how this would work using auto layout as a point of comparison. Until that declarative system came along, we all worried about frame placement. Now we just worry about the relationships between those frames and let the system pick what’s best.
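
A quick reminder of what that shift looked like in code. The button and container are arbitrary, but the constraint API is the real thing:

```swift
import UIKit

let container = UIView(frame: CGRect(x: 0, y: 0, width: 320, height: 480))
let button = UIButton(type: .system)
button.setTitle("Done", for: .normal)
container.addSubview(button)

// The old, imperative way: compute an exact frame and hope it survives
// rotation, resizing, and new screen sizes.
// button.frame = CGRect(x: 110, y: 420, width: 100, height: 44)

// The declarative way: describe relationships and let Auto Layout
// pick the frames.
button.translatesAutoresizingMaskIntoConstraints = false
NSLayoutConstraint.activate([
    button.centerXAnchor.constraint(equalTo: container.centerXAnchor),
    button.bottomAnchor.constraint(equalTo: container.bottomAnchor, constant: -20),
    button.heightAnchor.constraint(equalToConstant: 44)
])
```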

With our interactions, we still have this tight coupling between a user action and an app’s behavior. Tapping on a button or a swipe gesture invokes some code directly. A declarative interaction would be a layer of abstraction between what a customer does and how your app reacts.

Again, using auto layout to help form our thinking, what if there were “interaction classes” that functioned like size classes? Your iPad would behave as it always has until you plugged in a mouse. At that point, how you interact with the device adapts:

  • Controls could get smaller because of the increased pointing accuracy
  • Views could gain a hover state to display additional information
  • Drag & drop could change because it no longer depends on two fingers
  • A mechanical wheel could replace a finger for scrolling

This kind of adaptability would work across platforms – your app would behave differently when it was running in augmented reality or on a TV screen. As a developer, you wouldn’t have to worry about what kind of hardware is available; you’d only have to worry about what to do when a customer used it to perform a task.
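
To make the idea a little more concrete, here’s a purely hypothetical sketch. None of these types exist in Apple’s SDKs; they just borrow the shape of size classes:

```swift
import CoreGraphics

// Purely hypothetical "interaction classes", modeled on UIKit's size classes.
enum InteractionClass {
    case touch      // fingers on glass
    case pointer    // mouse or trackpad attached
    case focus      // remote control or arrow keys
    case spatial    // hands and eyes in AR
}

struct ScrollBehavior {
    var usesHover: Bool
    var hitTargetScale: CGFloat

    // Declared once; the system decides at runtime whether a finger,
    // a wheel, or a gaze-and-pinch gesture drives the scrolling.
    static func preferred(for interaction: InteractionClass) -> ScrollBehavior {
        switch interaction {
        case .pointer: return ScrollBehavior(usesHover: true,  hitTargetScale: 0.8)
        case .spatial: return ScrollBehavior(usesHover: true,  hitTargetScale: 1.2)
        default:       return ScrollBehavior(usesHover: false, hitTargetScale: 1.0)
        }
    }
}
```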

I’m not going to predict how this would be accomplished. Yes, it could draw inspiration from React or other similar technologies. The only thing I’m confident of at this point is that Apple knows it has a problem and is working actively to solve it in a platform-independent fashion.

Marzipan is our first step. And what I’ve described above is Amber, the next step.

Benchmarking in your pants – 10th Anniversary Edition™

One of my favorite posts is one that’s over ten years old: Benchmarking in your pants.

In that essay, I compared the original iPhone to my iMac, both with native and web apps. One of the reviewers of my treatise on the iPhone SDK thought it would be fun to see how those numbers stack up to an iPhone X.

The code still runs, so why not?

Test                         Original iPhone    iPhone X          Faster by
100,000 iterations           0.015 secs.        0.000408 secs.    36x
10,000 divisions             0.004 secs.        0.000043 secs.    93x
10,000 sin(x) calls          0.105 secs.        0.000107 secs.    981x
10,000 string allocations    0.085 secs.        0.000367 secs.    230x
10,000 function calls        0.004 secs.        0.000040 secs.    100x

These numbers should be considered very approximate. I only used three digits of precision in the original measurements; this time, more than five were needed. Also, there was no attempt to use more than one core.
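
For the curious, a timing like “10,000 sin(x) calls” can be taken with something as simple as the sketch below. It’s not the original benchmark code from 2007:

```swift
import Foundation

// A minimal timing harness: wall-clock time around a block of work.
func measure(_ label: String, _ block: () -> Void) {
    let start = Date()
    block()
    let elapsed = Date().timeIntervalSince(start)
    print("\(label): \(String(format: "%.6f", elapsed)) secs.")
}

measure("10,000 sin(x) calls") {
    var total = 0.0
    for i in 0..<10_000 {
        total += sin(Double(i))
    }
    _ = total   // keep the result so the loop isn't optimized away
}
```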

Still, it’s easy to see why today’s apps are much more sophisticated. They run code hundreds of times faster.

They also have screens that are a bit larger than 320 × 480 :-)