Today Apple was awarded an interesting patent. It was applied for in 2004, so developments over the intervening years may have superseded some of its concepts. Even so, it shows how gestures can be used in situations where multitouch isn't available.

Here’s a long edited excerpt from Apple’s patent:

The first part explains why using gestures is a good idea:

The use of gestures to control a multimedia editing application provides a more efficient and easier to use interface paradigm over conventional keyboard and iconic interfaces.

First, since the gestures can replace individual icons, less screen space is required for displaying icons, and thereby more screen space is available to display the multimedia object itself. Indeed, the entire screen can be devoted to displaying the multimedia object (e.g., a full screen video), and yet the user can still control the application through the gestures.

Second, because the user effects the gestures with the existing pointing device, there is no need for the user to move one of his hands back and forth between the pointing device and keyboard as may be required with keystroke combinations. Rather, the user can fluidly input gestures with the pointing device in coordination with directly manipulating elements of the multimedia object by clicking and dragging. Nor is the user required to move the cursor to a particular portion of the screen in order to input the gestures, as is required with iconic input.

Third, the gestures provide a more intuitive connection between the form of the gesture and the associated function, as the shape of the gesture may be related to the meaning of the function.

Fourth, whereas there is a relatively limited number of available keystroke combinations (since many keystroke combinations may already be assigned to the operating system, for example), there is a much larger set of available gestures that can be defined, and thus the user can control more of the application through gestures than through keystrokes.
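To make the idea concrete, here is a minimal sketch (mine, not taken from the patent) of how a stroke drawn with an ordinary pointing device might be turned into a command. It assumes the stroke arrives as a list of (x, y) screen points and classifies it only by overall direction. The function names and the command table are hypothetical, and the patent's gestures are presumably richer than four directions.

```python
from math import atan2, degrees, hypot

# Hypothetical gesture-to-command table; the patent excerpt does not list these.
COMMANDS = {
    "right": "next_frame",
    "left": "previous_frame",
    "up": "zoom_in",
    "down": "zoom_out",
}


def classify_stroke(points, min_length=20.0):
    """Classify a stroke by the direction from its first to its last point.

    points: list of (x, y) screen coordinates, with y growing downward.
    Returns "right", "left", "up", "down", or None if the stroke is too
    short to count as a deliberate gesture (min_length is an assumed cutoff).
    """
    if len(points) < 2:
        return None
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    if hypot(dx, dy) < min_length:
        return None
    # Negate dy so that moving up the screen gives a positive angle.
    angle = degrees(atan2(-dy, dx)) % 360
    if angle < 45 or angle >= 315:
        return "right"
    if angle < 135:
        return "up"
    if angle < 225:
        return "left"
    return "down"


def command_for(points):
    """Look up the command for a stroke, or None if nothing matches."""
    return COMMANDS.get(classify_stroke(points))
```

Because the stroke can start anywhere on screen, the user never has to travel to an icon first, which is the second advantage the patent describes.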


FIG. 1: The user interface includes three primary regions: the canvas (102), the timing panel (106), and the file browser (110). The canvas is used to display the objects as they are being manipulated and created by the user, and may generally be characterized as a graphics window. In the example of FIG. 1, there are shown three multimedia objects (104), Square A, Circle C, and Star B, which will be referred to as such throughout this disclosure when necessary to reference a particular object. In the preferred embodiment, the multimedia application is a non-linear video editing application, and allows the creation of multimedia presentations, including videos. Accordingly, the objects displayed in the canvas at any given time represent the “current” time of the multimedia presentation.
