The Task-Oriented Revolution

TheBlackCat posted this on an earlier post on this blog, and I thought it was worth sharing more prominently:

What I think will be a key revolution KDE will bring about is the task-oriented desktop. Plasma, Akonadi, Nepomuk, these are all parts of that. It will make computers smarter. Up until now computers did not care what you were doing; they cared about what you were using to do it. They organized themselves around what program you were using, not what you were doing with that program.

But people have different programs to accomplish the same task, and tasks often involve multiple programs. A computer that knows what you are doing and reorganizes itself to make that task easier is a huge leap forward in the way we work with computers. The Office 2007 ribbon interface is another example of that, but it is still embedded in the application-oriented desktop paradigm we have had up until now. It can even be taken a step further, allowing a computer to learn how you like to do certain tasks and organize itself appropriately. For instance, such a computer could realize that when you have been chatting with your IT guy for a certain amount of time you generally pull up certain configuration programs, send him an email with an attachment, and check certain system monitoring applets. It could then decide: “Let me get that all ready for you and put it on a virtual desktop so you can get to it easily.” KDE 4 provides the potential for a computer that automatically adapts itself to your work flow instead of you having to adapt yourself to its work flow. Imagine the benefit to businesses if you don’t have to train users to work with the system because the system will train itself to work with the users.

Everybody has different ways they like to do different things, but up until now the best they could do was try to set up their desktop as best they could to make their most common tasks as easy as possible within the limits imposed by the system. Most people do not even bother to take advantage of the limited abilities their system provides; they simply use the default configuration. They never learn how they can modify and improve their computer experience, their efficiency, and their enjoyment. But a system that knows what users are trying to do, how they like to do it, and how to take advantage of its own abilities to make those tasks easier would not need to rely on users spending the time and effort to learn the intricacies of the system. It would simply provide what they need when they need it.

Such a system requires four parts, I feel. First, it needs a flexible and easily adaptable desktop. Plasma provides that. Second, it needs something to track the relationships between data. Nepomuk and Akonadi provide that, or will soon. Third, it needs programs that are able to understand how you perceive their relationship with the data and with each other. As I understand it, this is a major goal of KDE 4 over the long run.

Finally it needs to understand how you like to physically interact with the computer’s hardware. This, I feel, is still where KDE has serious limitations, and I think it is holding back the flexibility found in the rest of the desktop. The ability to configure the UI is amazing, but the ability to configure how the computer’s hardware interacts with and impacts the UI is very limited. Essentially we have keyboard shortcuts, that is it. Little else can be configured by the user.

The way we interact using the mouse is not flexible at all: we have three buttons and a scroll wheel. Modern mice generally have at least 5 buttons and a tilt wheel. The ability to dictate how the mouse interacts with the computer is pretty much limited to touching screen edges to activate a couple of effects, dragging windows across virtual desktops, and a few button presses on window titles. Shortcuts involving mouse buttons are essentially unsupported. The ability to dictate how certain modes of interaction using the mouse affect the desktop environment is limited; in the relatively few cases where mouse interaction is configurable at all, it has at most a couple of options.

Compiz has fairly extensive mouse interaction configuration, allowing pretty much any mouse button to be combined with a modifier key to control most aspects of the window manager. Windows 7 has some interesting ideas about moving windows, like shaking a window to minimize the others and dragging windows to screen edges to maximize them across half the screen. Of course certain people may not like these specific interactions, and in Windows 7 they do not appear very flexible. Even Compiz does not really support combining keyboard and mouse button presses beyond the use of modifier keys. But in KDE 4 the ability to dictate what effect a certain mouse interaction with a window or with a desktop will have is practically non-existent if you compare it to those examples, and the gap is even more striking next to the extreme flexibility of the rest of the KDE 4 experience. So I think it is important to be able to tell the system things like “shaking a window will have this effect”, “moving it to a screen edge will have this effect”, “tilting the scroll wheel left on the desktop will have this effect”, “meta+C+mouse button 5 on an applet will have this effect”.
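To make the idea of user-defined interaction rules concrete, here is a minimal sketch in Python of a binding table that maps "this interaction on this kind of target" to "this effect". Everything in it (`Trigger`, `BindingTable`, the gesture names, the action strings) is hypothetical and invented for illustration; it is not an existing KDE, KWin, or Compiz API, only one way such rules could be represented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Trigger:
    """One physical interaction (hypothetical model, not a KDE type)."""
    target: str            # what was acted on: "window", "desktop", "applet"
    keys: FrozenSet[str]   # keys held down; not limited to modifier keys
    gesture: str           # e.g. "shake", "drag-to-edge", "tilt-left", "button5"


class BindingTable:
    """Maps triggers to user-chosen actions."""

    def __init__(self) -> None:
        self._bindings: Dict[Trigger, Callable[[], str]] = {}

    def bind(self, target: str, keys: Iterable[str], gesture: str,
             action: Callable[[], str]) -> None:
        # frozenset makes the key combination order-independent and hashable
        self._bindings[Trigger(target, frozenset(keys), gesture)] = action

    def dispatch(self, target: str, keys: Iterable[str],
                 gesture: str) -> Optional[str]:
        # Returns the action's result, or None when nothing is bound
        action = self._bindings.get(Trigger(target, frozenset(keys), gesture))
        return action() if action else None


# The four rules from the paragraph above; the effects are placeholders.
table = BindingTable()
table.bind("window", [], "shake", lambda: "minimize other windows")
table.bind("window", [], "drag-to-edge", lambda: "tile to half screen")
table.bind("desktop", [], "tilt-left", lambda: "previous virtual desktop")
table.bind("applet", ["meta", "c"], "button5", lambda: "configure applet")

print(table.dispatch("applet", ["c", "meta"], "button5"))
```

The point of the sketch is that the held keys are an arbitrary set rather than a fixed list of modifiers, so a combination like meta+C+mouse button 5 is no harder to express than a plain button press.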

This is even more limited when it comes to other types of devices. For instance, there is no way at all to dictate what pushing a button on a joystick or a Bluetooth device will do. They are simply not integrated into the KDE desktop interaction framework at all.

Another biggie that KDE, and Linux in general, essentially does not have at all is voice interaction. But I think this is an extremely natural way for people to interact; giving vocal commands is something people learn from a very early age. It is something that Microsoft has been working hard on supporting, and even most modern cell phones have it, but Linux in general and KDE in particular do not. Things like launching programs, switching desktops, and organizing windows seem particularly suited to voice commands since they are fairly simple and generally do not do anything terrible if there is a mistake.

The output side of hardware interaction is important as well. An example is having the computer know which printer you like to use when doing certain tasks (for instance a black and white office copier when printing PDFs, a color inkjet printer when printing photos). Or knowing that when you go into full screen when viewing a photo you want it to go full screen on your monitor, while if you go into full screen mode with a video you want it to go full screen on your TV. Phonon and Solid seem to be trying to provide this on the audio side of things, but it has applications for just about any output device.

Once you have the framework for a flexible method of interaction between input devices, output devices, programs, the desktop, and windows, it should become much easier for the computer to learn how you like to interact with it and adapt appropriately. For instance it could learn that when you are working with a text document and push the “play” button on your LIRC remote you want Amarok to open and start playing music, but if you stick a DVD in the drive and immediately afterwards push the same button you want to open Dragon Player and play the DVD. It is extending the task-oriented desktop to the hardware side of things, to learn not only what you do and the process you use to do it but also what devices you use in that process and how you use them.
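As a rough illustration of what "learning" could mean in the remote-control example, the Python sketch below simply counts which action the user took after pressing a button in a given context and later suggests the most frequent one. The class name `ContextualButton`, the context labels, and the action strings are all assumptions made up for this example; a real system would need a far richer notion of context than a single string.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Optional, Tuple

Context = Tuple[str, str]  # (button pressed, what the user was doing)


class ContextualButton:
    """Frequency-based sketch: remembers what followed a button press."""

    def __init__(self) -> None:
        self._history: DefaultDict[Context, Counter] = defaultdict(Counter)

    def observe(self, button: str, situation: str, action: str) -> None:
        # Record that, in this situation, the user followed the button
        # press with this action.
        self._history[(button, situation)][action] += 1

    def suggest(self, button: str, situation: str) -> Optional[str]:
        # Most frequently observed action for this context, or None if
        # the context has never been seen.
        seen = self._history.get((button, situation))
        if not seen:
            return None
        return seen.most_common(1)[0][0]


remote = ContextualButton()
for _ in range(3):
    remote.observe("play", "editing text document", "Amarok: play music")
remote.observe("play", "editing text document", "Dragon Player: play DVD")
for _ in range(2):
    remote.observe("play", "DVD just inserted", "Dragon Player: play DVD")

print(remote.suggest("play", "editing text document"))
print(remote.suggest("play", "DVD just inserted"))
```

Counting is about the simplest possible approach and is used here only to show the shape of the problem: the same button resolves to different actions because the lookup key includes what the user was doing, not just which button was pressed.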


13 Responses to “The Task-Oriented Revolution”

  1. Lincoln Says:

    Very insightful, thanks.

  2. GeniusDex Says:

    I really do not agree on using voice commands for computers. Given that people use computers virtually everywhere and usually with multiple computers near each other, working with voice commands would in my opinion result in a lot of audible garbage. Multiple people talking to their computers in one room is not something I want to sit in.

    I could make a much neater and nicer story of this, but I lack inspiration ;)

  3. Tom Chance Says:

    Interesting, nice to see you moving these ideas forward! Some thoughts from my office desk…

    There’s a big chunk of work to do on making applications/desktop components more aware of each other and of your state to eliminate repetitive interactions. For example, when you launch a presentation or video the screensaver turns off and notifications except very urgent ones are silent. Or getting more complex, when you’re working your personal email notifications stop and work ones switch on.

    Voice interaction is going to be embarrassing in a public space :-) People are also very used to the humble mouse, and Apple’s direction was to make it *more* simple! I’m not sure that we need to complicate our interaction; we need applications to be intelligent enough that they need less so we can get on with typing etc.

    Computers trying to anticipate what you’re up to sounds really, really hard unless you use your computer in very formulaic ways. You get into the realms of AI research and realise just how complicated every day decisions we make are.

    Your examples are quite application-specific still. Given that I just do work on this computer (well, mostly!) my tasks are things like:

    – work on email
    – check/update TODO list
    – work on x (requires 3 docs, a spreadsheet and some emails open)
    – research y (a few web sites, a PDF and a mind map)

    What I really need is a quick way to dive in and out of tasks. Maybe it’s by saving a “session” so that with a quick menu launcher I can get all my “x work” windows back open. Maybe a KDE developer can think of a better way.

    I wrote about the idea of “sessions” way back in 2005, and hold out hope that KDE can find a clever way to help me with this:

    http://tom.acrewoods.net/blog/2005/oct/life-wasted-my-computer

  4. T0m Says:

    Very interesting.

    It would be a big paradigm shift and it will take a long time to figure out how to configure desktops like that (or make them learn.)

    So you better start in 4.3 ;)

  5. Andreas Says:

    Really, this is a *big* blob of text, not very well suited to the typical blog layout made for somewhat short entries. The first paragraph didn’t grab my attention despite my interest in the topic. For those reasons…
    tl/dr http://en.wiktionary.org/wiki/TLDR
    (I’m waiting for comments about interesting points of the text though :) )

  6. doopa Says:

    Good article. It’s nice to see people thinking about where KDE 4 is going and explaining what’s possible with KDE 4. I haven’t really seen this sort of article elsewhere, despite paying attention to all the KDE blogs.

  7. TheBlackCat Says:

    I really do not agree on using voice commands for computers. Given that people use computers virtually everywhere and usually with multiple computers near each other, working with voice commands would in my opinion result in a lot of audible garbage. Multiple people talking to their computers in one room is not something I want to sit in.

    A lot of people use computers in their homes where there aren’t a lot of people talking.

    People are also very used to the humble mouse, and Apple’s direction was to make it *more* simple! I’m not sure that we need to complicate our interaction, we need applications to be intelligent enough that they need less so we can get on with typing etc.

    And look where that led them: they ended up being forced to replace it with their Mighty Mouse, which is much more like the mice their competitors were making all along. On their laptops they ended up replicating the function of a multi-button mouse with their multi-touch system. Even Apple was forced to acknowledge that people need more options when interacting with a computer.

    Pretty much every mouse has at least back and forward buttons, but we can’t even use those in KDE. Including some sort of modifier button for the scroll wheel (like to make it a zoom or an application switcher) and a tilt wheel seems to be becoming the new norm for mice. And there is a huge market for even more advanced mice for more advanced users, but these are totally useless with KDE. And there is a huge body of hardware out there that simply does not integrate with KDE at all. Even setting up multimedia keyboards is difficult. The list of keyboards available is extremely limited compared to the full range available even from major manufacturers like Logitech and Microsoft.

    If you look at the flexibility and customization options for things like kwin, the Plasma desktop, colors, and other aspects of kde and then compare that to the ability to customize hardware, the discrepancy is pretty striking in my opinion.

    Computers trying to anticipate what you’re up to sounds really, really hard unless you use your computer in very formulaic ways. You get into the realms of AI research and realise just how complicated every day decisions we make are.

    I realize it is extremely difficult, but I think KDE 4 provides a framework that could make it possible. I am not aware of another desktop environment that provides the underlying capabilities to even be thinking about this sort of thing while KDE 4 does.

  8. randomguy3 Says:

    Voice commands are nice for TV, and may even be useful for simple commands like “stop playing music”, providing we can get computers to distinguish between commands and general conversation.

    But for everyday work? Computers may be able to just about work out what is being said, providing they are trained enough in a particular person’s voice, but they’re a long way off being able to interpret that (except in a very deterministic way) or even being able to tell whether the voice is directed at them.

    And I leave you with news that Apple are introducing yet another revolutionary new interface: http://www.theonion.com/content/video/apple_introduces_revolutionary

    Not that I necessarily agree with this view on Apple’s interface design, but it’s amusing and relevant to the discussion.

  9. TheBlackCat Says:

    Yes, as I pointed out, voice commands are best suited for simple tasks with a low cost of error. You wouldn’t want to try reformatting your hard drive using voice commands, but things like launching programs and organizing windows (like “tile windows”, “launch amarok”, or “maximize openoffice”) are short, one-time commands that don’t carry much danger if the computer misunderstands you. Further, doing things like this without voice requires you to stop your existing work flow, open an application launcher, then start again. Using voice commands allows you to accomplish them without interfering with your current task.

  10. Michael "Touch" Howell Says:

    Personally, I like touch-screens better than voice. Interacting with the hands is about as common, and learned as early, as the voice. Also, voices are generally directed at people, whereas it is more natural to interact with things using the hands.

  11. Tom Chance Says:

    If you look at the flexibility and customization options for things like kwin, the Plasma desktop, colors, and other aspects of kde and then compare that to the ability to customize hardware, the discrepancy is pretty striking in my opinion.

    Striking but maybe reasonable! Or maybe not. There’s no prima facie reason why hardware should be as flexible as software.

    I guess an important point here is that this flexibility is great for a lot of people, but it shouldn’t be required. I know almost nobody who goes further than changing their background. Even at KDE conferences you see most KDE developers going with something close to the default KDE settings.

    KDE 4 has started to get this right: get the defaults right, make basic/important configuration simple, allow power users to screw around if they really want to.

    The same goes for interaction methods. They need to be easily discoverable, and fit into the overall desktop metaphor that Plasma is aiming for. You also need to be able to work efficiently without knowing about them, because a lot of people will just miss them. They’re much less interested in computers than you or I!

    KDE also needs to solve pressing problems, not create more bells and whistles for KDE geeks. For example: needing to fire up 8 programs to get stuck into your work; or trying to dig through endless windows and folders to find the data you’re looking for; or wasting time firing up calculators and web browsers and mail clients to do tasks that the Plasma dashboard can do in seconds; or having to lean over to your computer to turn off the screensaver every 15 mins when watching a movie then needing to lean over again to pause it for a moment, etc.

    This is, for me, the crux of a task-oriented desktop.

  12. TheBlackCat Says:

    @ Tom Chance: This is not a matter of bells and whistles, this is a matter of basic interaction. You cannot use the back and forward buttons that are found on pretty much every modern mouse. When there is no support for basic features found on pretty much every modern version of a fundamental piece of hardware, then I think that should tell us something.

  13. Ronald Dehuysser Says:

    Perhaps a good place to start with this is mylyn.

    “Mylyn is a task-focused interface for Eclipse that reduces information overload and makes multi-tasking easy. It does this by making tasks a first class part of Eclipse, and integrating rich and offline editing for repositories such as Bugzilla, Trac, and JIRA. Once your tasks are integrated, Mylyn monitors your work activity to identify information relevant to the task-at-hand, and uses this task context to focus the Eclipse UI on the interesting information, hide the uninteresting, and automatically find what’s related. This puts the information you need to get work done at your fingertips and improves productivity by reducing searching, scrolling, and navigation. By making task context explicit Mylyn also facilitates multitasking, planning, reusing past efforts, and sharing expertise.”

    For more info, see: http://tasktop.com/videos/mylyn/webcast-mylyn-3.0.html
