User interface

Last week the BBC reported on a new addition to Google Earth: the ability to explore ancient Rome.

The 3D models and virtual tours of the Rome of 2,000 years ago were created by Past Perfect Productions. I hope they move on to creating more environments for people to explore. Once there is a market for this, and standard ways of modeling and showing the past, things will get very interesting.

It might be that we will be able to explore any place on Earth at any point in history.

My home page from 10 years ago stored at

Today Google software continually explores the web and catalogues text and image content for search purposes. The search page lets millions of people all over the world explore the current representation of the world on the web.

One day similar software will use XML-tagged information to build a model of the Earth at any point in the past. Combining all pictures, words, models, sounds used to represent the past will mean that we’ll be able to search based on the world as it was 2,000 years, 100 years or ten weeks ago.

Imagine looking at the world the day you were born. You could look at your favourite websites filled with the news as it was back then. Once architects’ and municipal plans are combined in the model with photos taken in the weeks before your birthday, you’ll be able to walk down the street where you were born. Or the street where you live today as it was back then.

Witness the model of the world as it was thirty years ago become clearer as governments and organisations release documents kept secret up until now. Once you combine accounts, memos and all the documents kept in world archives, the model will become more accurate.

If you could go to any place and time, where would you go first?

Engadget pointed to a video on Vimeo that shows ‘the first major step in computer interface since 1984’:

They’re referring to the introduction of the Mac user interface (almost 25 years ago). That UI was a revision of the Lisa user interface for home users. The elements that made this work were the mouse, icons and overlapping windows. They were around for many years before 1984.

The stuff in this video is the equivalent of the generic concept of a pointing device. A 3-D mouse.

There is no next-generation representational abstraction, i.e. a replacement for icons. The 2.5D interface (the 0.5D being the layers of windows on screen) is now a 3D interface.

There’s no point having a multi-touch 3D mouse unless you have better ideas for what you’ll be manipulating with it. They even had to fake automatic keying of a truck and a man from a couple of shots that were then combined in a third. Anyone who has done that kind of keying and composition knows that you need to do a lot more than point at what you want to get things done. Just because you are compositing some 2D footage in a shallow-depth 3D-space doesn’t make the job of compositing that much more intuitive.

They didn’t even use eye-parallax – if you need to collaborate with others, you still need cursors. How twentieth-century of them…

My new input device

Following up Google’s voice-operated iPhone search application, maybe it’s time we started to think about non-visual interfaces for our technology. We’ve seen them depicted in Star Trek and sci-fi stories for decades. They show heroes of the future engaged in conversations with technology.

I think that children born ten years from now will find our obsession with visual interfaces quaint. UIs are still centered around ‘the document’ – the system used by corporations in the 18th and 19th centuries to organise colonial empires, and by educational institutions to formalise schooling.

It may be that technology will eventually help us come up with a new technique to pass on and store knowledge. Do you conceive of what you know in terms of words and pictures written on documents? That’s not the form I use to maintain my model of the way my world works. Documents (such as this blog) are a transmission method. We may be able to come up with something more effective in the coming decades.

The late 19th and early 20th centuries introduced electricity-powered motors to middle-class people’s lives. Clothes are washed and dried using spinning motors. Refrigeration works using heat pumps. The reason why alternating current was chosen as the method for delivering electricity to people’s homes was that motorised devices need AC to work. As the decades went by, electric motors became hidden, less noticeable in everyday use. Technological methods recede into the background as the services they deliver evolve into utilities. Few families have their own electricity generator, water pump and sewage treatment works any more.

In the same way, computers will eventually fade from view. Our connection to the rest of the world will be through a voice whispered in our ears, and our instructions will be whispered so no-one else can hear. Nearby surfaces will be used as displays for images and video, but probably won’t be the primary method for technology interaction.

There are a few trends that may lead us in this direction.

The idea behind ‘cloud computing’ is partly about getting people and organisations to let go of having a specific place for a document or unit of computing power. We pay for a service that makes sure that the documents we have are safely backed up and instantly available wherever we are in the world. The cloud also provides computing power; when an online service starts getting bogged down with consumer requests, it can call on Google’s cloud of computing power to help out for a few hours. We don’t need to know which power station produced the electricity that is keeping our lights on at night, as long as the power is there when we want it. In the same way, eventually we’ll trust that the cloud holds all the information we’d like to have access to anywhere. It might be easier for us to tell our technology to do what is needed to get us through the day: “Tell this new bank what it needs to know for me to open the new account.”

The natural language interfaces that have been evolving for the last ten years will eventually become the ‘personal digital assistants’ that will spend their time looking after us. For example, I Want Sandy currently uses email to communicate; I assume they’re working on a voice-operated version for mobile technology.

Think about how important needing to find or create ‘the right document’ is for us all today. Eventually something will come along to replace this need. Such is the nature of technology: in the long run it makes every generation feel out of date.

It is time to turn to the educationalists and see if they can come up with something better…

Apple have had their success with iTunes partially because the pricing model is so simple: 79p per track, £7.99 per album. They delayed launching video because they wanted something as simple for movies and TV shows.

People don’t want to have to remember more than one price for a TV show or a movie. When they are about to choose which to buy, they want to be sure how much they’ll be paying.

Those owning the films and programmes want to charge more if they think they’ll get people to pay. Recent releases are worth more than catalogue titles. Recent releases need to be paid for too.

However I think there will be a market for pricing based on the size of the potential audience of the video. A video kept for reference and watched every once in a while by an individual could be priced lower than one shown to over 200 people at a private club.

If that is so, why not charge based on screen size instead of resolution? Imagine paying less for a video that can only be shown on an iPod Touch or iPhone than for one that those devices can output to a TV.

The tradeoff between the content owners and consumers could be based on the implied audience size associated with a screen size. It would be uncomfortable for many more than one person at a time to watch an iPod movie. Not more than 30 would want to watch a consumer-based HD display at the same time…

I vote for cheap movies for people with no friends: they deserve something to make up for the loneliness!

In a recent article at InfoWorld, Neil McAllister reports that Microsoft have released a software development kit that shows how future applications can use a webcam input to replace a mouse or pen input. It works by recognising an object in your hand and tracking it as you move it across the screen.

There are upsides and downsides to not having a surface that you are touching when interacting with a user interface. The software will have great problems determining the equivalent of pressure. Mice have two levels of pressure: button pressed or button not pressed. Pen and finger-based devices can discriminate between many levels of pressure. The iPhone can tell how hard you are pressing its screen. That gives more options when it comes to interpreting what you want to achieve.

Alternatively, the advantage of an ‘air-based’ input technique is that you can deal with different scales of input. This is done simply with a mouse: moving 3 mm using a mouse can move a cursor many pixels. If you run out of mouse mat, all you need do is pick up the mouse and move it to the middle of the mat again – as far as the computer is concerned, you haven’t moved the mouse at all. With pen- and finger-based interfaces, your gestures are always at a ratio of 1 to 1: you need enough space to move your pen or finger that matches your screen size.
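The difference between the two kinds of input can be sketched in a few lines of Python. This is only an illustration: the gain of 10 pixels per millimetre, the 4 pixels per millimetre screen density and the names RelativePointer and absolute_pointer are all assumptions made up for the example, not anything taken from a real driver.

```python
from dataclasses import dataclass


@dataclass
class RelativePointer:
    """Mouse-style input: only movement made while on the mat counts."""
    gain: float = 10.0      # assumed cursor pixels per millimetre of travel
    cursor_px: float = 0.0  # current cursor position along one axis

    def move(self, delta_mm: float, on_mat: bool = True) -> float:
        # Picking the mouse up and carrying it back moves nothing on screen.
        if on_mat:
            self.cursor_px += delta_mm * self.gain
        return self.cursor_px


def absolute_pointer(position_mm: float, px_per_mm: float = 4.0) -> float:
    """Pen or finger input: the cursor is wherever the pen is (1 to 1)."""
    return position_mm * px_per_mm


mouse = RelativePointer()
mouse.move(3.0)                 # 3 mm on the mat moves the cursor 30 px
mouse.move(-3.0, on_mat=False)  # lifted back to the middle: cursor stays put
mouse.move(3.0)                 # another 3 mm, 60 px in total
print(mouse.cursor_px)          # 60.0
print(absolute_pointer(3.0))    # 12.0
```

With the mouse, the same 3 mm patch of mat can be reused as often as you like. With the pen, covering more screen always means covering more desk.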

A limitation of Microsoft’s ‘Touchless’ software is that it doesn’t track the operator’s eye. That means it must position a cursor showing you where your finger is. The advantage of eye tracking is shown here:

To prevent arm ache, moving objects across multiple large screens is a matter of moving your fingers closer to your eye. For more precise control, you can move your fingers closer to the screen. In the picture, the index fingers of the user’s hands are the same distance apart in each case, but define very different-sized areas on the screens shown. This fixes the problem of multi-touch scale.
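The geometry here is just similar triangles with the eye at the apex. A rough sketch, assuming a flat screen directly in front of the viewer and measuring everything from the eye (the function name and the distances are invented for illustration):

```python
def span_on_screen(finger_gap_cm: float,
                   eye_to_finger_cm: float,
                   eye_to_screen_cm: float) -> float:
    """Width of screen covered by two fingers, as seen from the eye.

    Lines from the eye through each fingertip spread out as they travel,
    so the covered width grows with the ratio of the two distances.
    """
    if not 0 < eye_to_finger_cm <= eye_to_screen_cm:
        raise ValueError("fingers must sit between the eye and the screen")
    return finger_gap_cm * eye_to_screen_cm / eye_to_finger_cm


# Fingers 10 cm apart, screen 200 cm from the eye.
print(span_on_screen(10, 20, 200))   # held near the eye: 100.0 cm of screen
print(span_on_screen(10, 100, 200))  # held at arm's length: 20.0 cm of screen
```

The same 10 cm gap between the fingers covers five times as much screen when the hands are held close to the face.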

To fix multi-touch pressure, there will have to be some sort of gesture that defines where in 3D space the virtual screen is. When needing to make big gestures like the upper picture above, you’ll need to define the screen as being close to your eye. When performing precise operations, you’ll need to push the virtual screen further away. The ‘pressure’ will be calculated by the position of your fingers relative to the virtual screen.
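One way this could work, sketched in Python. The idea that pressure grows linearly with how far a finger is pushed past the virtual screen, and the 4 cm of travel for full pressure, are assumptions made for the example rather than anything from Microsoft’s software.

```python
def virtual_pressure(eye_to_finger_cm: float,
                     eye_to_virtual_screen_cm: float,
                     full_press_depth_cm: float = 4.0) -> float:
    """Map how far a finger has pushed 'through' the virtual screen to 0..1.

    A finger in front of the virtual screen is not pressing at all; a finger
    pushed full_press_depth_cm or more beyond it presses as hard as possible.
    """
    depth = eye_to_finger_cm - eye_to_virtual_screen_cm
    return max(0.0, min(1.0, depth / full_press_depth_cm))


# Virtual screen defined 50 cm from the eye.
print(virtual_pressure(45, 50))  # hovering in front of it: 0.0
print(virtual_pressure(52, 50))  # pushed 2 cm through: 0.5
print(virtual_pressure(60, 50))  # pushed well past the limit: 1.0
```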

The pressure problem will start to go away when we modify our user interfaces so that we are manipulating ideas more like clay than sheets of paper.

Here is a clip showing how realistic 3D rendering can be when the computer knows where your eyes are:

The catch is that the 3D effect doesn’t work for anyone else looking at the same screen. A 3D monitor will be needed for each viewer.

Over the last decade Google has been top of the heap through two technologies: its page-ranking search technology and the ability to place relevant advertising right next to the content on pages all over the web. The patent awarded yesterday might hand a similar technology to Apple: the ability to insert relevant advertising into all other forms of media at the point of playback.

If Apple or someone else comes up with a better search algorithm to select the adverts that appear during podcasts, movies and radio shows, Google might face some serious competition.

In fact many people might welcome the intrusion of advertising into digital media – if it means that they get that media for free.

We have to pay for our media one way or another: movie tickets, DVDs, official downloads, TV licenses and Pay TV are obvious payment points. Advertising, PR and sponsorship are less obvious.

One day we’ll be able to live by our preferences – we’ll be able to pay for our media directly and avoid messages from corporations, governments and individuals. On the other hand, we might want to have other people pay for our media:

An imaginary ‘media payment preferences’ control.

It seems as if Apple have been granted a patent that will bring this customisation a little closer. It is in the nature of patents that they are framed to cover as many future inventions as possible. They sometimes need to hide their true nature:

1. A method for presenting media by a media playback device, the method comprising: receiving a playback request to play a media group, the media group including a plurality of media items; determining whether auxiliary media is also to be played back; playing back media items from the media group; and playing the auxiliary media if the determining data determined that the auxiliary media is also to be played back.

It may be that patent 612029 granted to Apple today patents the ability to incorporate advertising into media content on playback. This means that every time you listen to a piece of music or a podcast, or watch a TV show or movie, a different advertisement appears:

In one implementation, presentation of a media group can involve not only presentation of media items of the media group but also presentation of auxiliary media. Another aspect pertains to how and when auxiliary media data is to be presented (e.g., played) by an electronic device. Another aspect pertains to updating or refreshing auxiliary media data. Still another aspect pertains to restricting presentation of primary media by an electronic device unless auxiliary data is also presented.

(my emphasis)

The patent gives examples of ‘a media group’ as any of the content that can be played on an iPod. It implies that a variable amount of content is automatically stored on a device, along with a method for choosing which media is played as ‘auxiliary media’ before, during or after playing a media group:

the method further comprises: storing a plurality of auxiliary media items on the media playback device; and determining one or more of the auxiliary media items that are to be played by the playing of the auxiliary media.

Here is where advertising is mentioned:

20. A method as recited in claim 1, wherein the media items are selected from the group consisting of: songs, audiobooks, podcasts, and videos.

21. A method as recited in claim 1, wherein the auxiliary media is advertising content.

In one example, the auxiliary media data can pertain to advertising. Advertising information can pertain to specific products, services, shows or events. When advertising is able to be refreshed or updated, improved advertising results can be achieved.

A functional flow diagram from the patent showing a ‘Present Playback Denied’ message if the secondary media (advertising) playback is disabled.
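My reading of that flow, as a rough Python sketch. It is not code from the patent: the names play_media_group and PlaybackDenied are made up, and playing the auxiliary item before the media group is an assumption (the patent also allows during or after).

```python
from typing import List, Optional


class PlaybackDenied(Exception):
    """Raised when the device refuses to play without auxiliary media."""


def play_media_group(media_items: List[str],
                     auxiliary_item: Optional[str],
                     auxiliary_required: bool,
                     auxiliary_enabled: bool) -> List[str]:
    """Return the order in which things would be played."""
    # The restriction step: no advert, no playback.
    if auxiliary_required and not auxiliary_enabled:
        raise PlaybackDenied("Playback denied: auxiliary media is disabled")

    playlist: List[str] = []
    # Assumed placement: the auxiliary item goes in front of the group.
    if auxiliary_enabled and auxiliary_item is not None:
        playlist.append(auxiliary_item)
    playlist.extend(media_items)
    return playlist


print(play_media_group(["song 1", "song 2"], "advert", True, True))
# ['advert', 'song 1', 'song 2']

try:
    play_media_group(["song 1", "song 2"], "advert", True, False)
except PlaybackDenied as reason:
    print(reason)  # Playback denied: auxiliary media is disabled
```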

In one embodiment, since presentation of auxiliary data can be ensured, the cost to the user for an electronic device can be lowered. For example, the ability for advertisements or news to generate revenue can be used to offset the cost for the electronic device. For example, the presentation of auxiliary data can be used to subsidize the cost for the electronic device.

Different aspects, embodiments or implementations of the invention may yield one or more of the following advantages. One advantage is that a media playback device can present not only media items but also auxiliary media. The auxiliary data can be automatically provided and integrated (e.g., interspersed) with playback of media items. The auxiliary media can be media such as advertisements or news. For example, advertisements can be audio or video (i.e., multimedia) commercials or promotional segments, and news can pertain to national news headlines, sports highlights, international news, local news, etc. Another advantage is that auxiliary data can be automatically delivered to a media playback device so as to remain current and effective. Still another advantage is that the manner by which auxiliary media is interjected in playback of media can be controllable, such as by: user selections, user preferences, user actions, media item content providers, auxiliary media content providers, online media store, or media playback device manufacturers. Yet still another advantage is that a media playback device can require playback of auxiliary media in order to playback media items.

In a patent of over 10,000 words, the letters ‘advert’ are only used 14 times, but I think this is a major part of this patent.

This means that advertising-supported media will have the option to incorporate different advertising each time that it is played on a device (iPod, iPhone, TV, etc.). The advertising will be streamed automatically if a wireless connection is available. Previous advertising will be stored on the device so that it can be played if there is no connection to the internet.

Keeping the ads fresh
A functional flow diagram from the patent showing how secondary media data can be updated automatically.
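The update logic might look something like the following Python sketch. The function choose_advert, the list used as an on-device store and the fetch_fresh callback are all hypothetical, standing in for whatever the device would really do.

```python
from typing import Callable, List, Optional


def choose_advert(stored_adverts: List[str],
                  is_online: bool,
                  fetch_fresh: Callable[[], str]) -> Optional[str]:
    """Pick the advert to play alongside the next piece of media."""
    if is_online:
        # Connected: get a current advert and keep a copy for later.
        advert = fetch_fresh()
        stored_adverts.append(advert)
        return advert
    # Not connected: fall back to the most recently stored advert, if any.
    return stored_adverts[-1] if stored_adverts else None


stored = ["last week's advert"]
print(choose_advert(stored, True, lambda: "today's advert"))   # today's advert
print(choose_advert(stored, False, lambda: "never fetched"))   # today's advert
print(choose_advert([], False, lambda: "never fetched"))       # None
```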

The example I usually give is when we use media content to support our day-to-day conversations. If I mention a piece of music, it is very handy to have my iPod with me – I can use it to play back the specific track I’m talking about. I’d like to do that with TV shows and movies.

Imagine if someone I’m chatting with refers to a specific scene in an episode of Friends. It would be great to support that part of the conversation by viewing that scene. Using a 1TB iPod, I could hold almost any piece of music I could think of. However it will take a while before every movie or TV show I can think of will be storable on a single device.

Soon all media will be available to us wirelessly. We’ll be able to pick any nearby surface and stream any media to it. How will we pay for this? It depends on the media. If you are a fan, you could buy the right to watch or listen to the media for the rest of your life (the equivalent of buying a DVD). On the other hand, if you want to see that media a single time, you probably won’t mind a short advertising message playing for a few seconds before or during the film or TV show (the equivalent of commercial TV).

Maybe advertising will be a lot less irritating when we have the option to pay for it not to be shown to us. I wonder if this sort of thing should be patented. It seems a little obvious…

Just when the world’s advertising and media companies think they’ve got a handle on using the internet to build and maintain relationships with millions of people, a new disruption might be on the way. Everyone might have to get into the software business.

An example: Soon Major League Baseball fans will be able to download an application to their phones that will keep them up to date with games as they happen. Using pictures as well as text. And video clips showing replays of the action moments after it happens. This is an application that will be available on the iPhone within the next few months.

Instead of going all over the web to buy music, you can visit one of the very few digital online music retailers – such as iTunes. The more these services act like software applications, the more successful they are. iTunes and similar software may go to the internet to find songs and videos, but the places they go don’t matter to those searching for music.

These are the sort of services people will start to expect on their phones. If Apple’s iPhone, Google’s Android (an operating system for telephones – coming soon) and Microsoft’s (they won’t be able to stay out of this) products take off, people won’t expect to jump from page to page any more. They’ll expect government, corporate, social and individual entities to provide services that represent the relationships the entities want to have with their audience.

Governments will provide a service to manage citizens’ relationship with their society. How much tax you pay, how many expenses you can claim, which state you spend the night in, what benefits you deserve, what rights you have.

Corporations would have to express themselves by what they can usefully do for their audiences – if anything.

I imagine that some blogs will evolve into avatars. The things we write, the pictures we like, the way we turn a phrase might one day be converted into a digital representation of ourselves. We’ll have the option for our blogs to speak for us if we are too busy to get involved in a conversation.

Will the web of linked pages that most people identify as the internet still be around in five years’ time? If not, will applications replace it?

The BBC currently provide a commentary track for the current edition of Doctor Who. Imagine if these were available for new movies.

A recent patent filing by Apple proposes public very local networks for iPhones and iPod Touches.

The idea is that shops, public buildings and restaurants set up location-specific applications that appear on iPhones when customers wander into the area of their wireless networks. Restaurants could provide custom menus (for those with food intolerances or on restricted diets), museums and art galleries could provide extra information. Shops could provide customer-specific offers.

If this works, why not provide commentary tracks for moviegoers? If you’ve seen the film before, yet want to go and see it with friends, you can turn up and download an alternate soundtrack. You are more likely to see the film a second time in a short period of time. The cinemas sell more tickets.

You could even have alternate language soundtracks or an audio description track for the visually impaired. Imagine if the audio playing software on your iPhone could be triggered by a wireless signal to sync with the film when it starts (or even whenever the person arrives to watch the film).

The software could also ask permission to deny incoming texts and telephone calls until the film is over!
