In a recent article at InfoWorld, Neil McAllister reports that Microsoft have released a software development kit that shows how future applications can use a webcam input to replace a mouse or pen input. It works by recognising an object in your hand and tracking it as you move it across the screen.
There are upsides and downsides to not having a surface that you are touching when interacting with a user interface. The software will have great problems determining the equivalent of pressure. Mice have two levels of pressure: button pressed or button not pressed. Pen and finger-based devices can discriminate between many levels of pressure. The iPhone can tell how hard you are pressing its screen. That gives more options when it comes to interpreting what you want to achieve.
On the other hand, the advantage of an ‘air-based’ input technique is that you can deal with different scales of input. This is done simply with a mouse: moving a mouse 3 mm can move the cursor many pixels. If you run out of mouse mat, all you need do is pick up the mouse and move it to the middle of the mat again; as far as the computer is concerned, you haven’t moved the mouse at all. With pen- and finger-based interfaces, your gestures are always at a ratio of 1 to 1: you need enough space to move your pen or finger that matches your screen size.
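To make the mouse case concrete, here is a minimal sketch of relative pointing. The class name RelativePointer, the gain of 20 pixels per millimetre and the 1920-pixel screen width are all illustrative assumptions, not values taken from any real mouse driver.

```python
class RelativePointer:
    """Hypothetical one-axis model of a mouse-style relative pointer."""

    def __init__(self, gain_px_per_mm=20.0, screen_width_px=1920):
        self.gain_px_per_mm = gain_px_per_mm    # assumed scale factor
        self.screen_width_px = screen_width_px  # assumed screen width
        self.x_px = 0.0                         # cursor starts at the left edge

    def move(self, delta_mm, lifted=False):
        """Apply a hand movement in millimetres and return the cursor position.

        Movement made while the mouse is lifted off the mat is ignored,
        which is what lets you re-centre the mouse without moving the cursor.
        """
        if not lifted:
            new_x = self.x_px + delta_mm * self.gain_px_per_mm
            # Keep the cursor on the screen.
            self.x_px = min(max(new_x, 0.0), self.screen_width_px - 1)
        return self.x_px


pointer = RelativePointer()
pointer.move(3.0)                 # a 3 mm movement on the mat
pointer.move(-50.0, lifted=True)  # pick the mouse up and re-centre it
pointer.move(3.0)                 # carry on moving in the same direction
```

A pen or finger on a touch screen is the special case where the gain is fixed at 1 to 1 and there is no equivalent of lifting the mouse to re-centre.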
A limitation of Microsoft’s ‘Touchless’ software is that it doesn’t track the operator’s eye. That means it must position a cursor showing you where your finger is. The advantage of eye tracking is shown here:
To prevent arm ache, moving objects across multiple large screens is a matter of moving your fingers closer to your eye. For more precise control, you can move your fingers closer to the screen. In the picture, the index fingers of the user’s hands are the same distance apart in each case, but define very different-sized areas on the screens shown. This fixes the problem of multi-touch scale.
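The geometry behind this can be sketched with similar triangles. The sketch below assumes the eye is a single point, the screen is flat and square-on to the viewer, and the fingers are held between the two. The function name and the distances in the example are my own illustrative choices, not part of Microsoft’s software.

```python
def spanned_width_cm(finger_gap_cm, finger_distance_cm, screen_distance_cm):
    """Width on the screen covered by a gap between two fingers.

    Lines drawn from the eye through each finger are extended until they
    hit the screen. By similar triangles, the gap is scaled by the ratio
    of the screen's distance to the fingers' distance from the eye.
    """
    if not 0 < finger_distance_cm <= screen_distance_cm:
        raise ValueError("fingers must be between the eye and the screen")
    return finger_gap_cm * screen_distance_cm / finger_distance_cm


# The same 10 cm gap between the index fingers, screen 100 cm from the eye:
near_eye = spanned_width_cm(10.0, 25.0, 100.0)     # fingers 25 cm from the eye
near_screen = spanned_width_cm(10.0, 50.0, 100.0)  # fingers 50 cm from the eye
```

With these assumed numbers the gap covers 40 cm of screen when the fingers are close to the eye and 20 cm when they are twice as far away, which is the coarse-versus-precise trade-off described above.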
To fix multi-touch pressure, there will have to be some sort of gesture that defines where in 3D space the virtual screen is. When you need to make big gestures, as in the upper picture above, you’ll need to define the screen as being close to your eye. When performing precise operations, you’ll need to push the virtual screen further away. The ‘pressure’ will be calculated from the position of your fingers relative to the virtual screen.
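One way this calculation might look is sketched below. It assumes the system already knows how far the fingertip and the virtual screen are from the eye, and that pushing 5 cm ‘through’ the virtual screen counts as full pressure. The function name and that 5 cm figure are hypothetical.

```python
def virtual_pressure(finger_distance_cm, plane_distance_cm, full_press_depth_cm=5.0):
    """Map how far a fingertip has pushed past the virtual screen to 0.0-1.0.

    Distances are measured from the eye. A finger in front of the virtual
    screen (closer to the eye) gives 0.0; a finger pushed full_press_depth_cm
    or more beyond it gives 1.0; anything in between scales linearly.
    """
    depth_cm = finger_distance_cm - plane_distance_cm
    return min(max(depth_cm / full_press_depth_cm, 0.0), 1.0)


# Virtual screen defined 30 cm from the eye:
hovering = virtual_pressure(25.0, 30.0)    # finger has not reached the screen
light_press = virtual_pressure(32.5, 30.0) # finger halfway to a full press
hard_press = virtual_pressure(40.0, 30.0)  # finger well past a full press
```

Moving the virtual screen nearer or further away only changes plane_distance_cm; the pressure calculation itself stays the same.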
The pressure problem will start to go away when we modify our user interfaces so that we are manipulating ideas more like clay than sheets of paper.
Here is a clip showing how realistic 3D rendering can be when the computer knows where your eyes are:
The catch is that the 3D effect doesn’t work for anyone else looking at the same screen. A 3D monitor will be needed for each viewer.