Computational Symbiosis: Methods That Meld Mind and Machine

Slides [0/4]

Introduction   B_note

Hello, everyone!

My name is Mike Gerwitz. I am a free software hacker and activist with a focus on user privacy and security. I'm also a GNU Maintainer and software evaluator, and hold various other administrative duties within GNU. I have about twenty years of programming experience, half of that professionally, and I've been a computer user for longer.

So I've been around long enough to see a decent evolution in how we interact with computers. I've gotten a sense of what feels right and wrong as both a user and a hacker. And interestingly, what I've settled on for my computing is really a toolset that was devised decades before I was even born, with some light modernization.

And those tools don't work for everyone. But I think a subset of them can.

So I'm here today to try to explore a healthy balance, and walk you through what I see as an efficient means of computing, based on the problems that I've encountered, and the problems I've seen others encounter over the years.

Choreographed Workflows   B_fullframe

Choreographed Workflows

Practical Freedom   B_fullframe

Practical Freedom

Practical Example: Web Browser [0/9]

Example: Web Browser   B_frame

/mikegerwitz/cs4m/src/commit/be44088c95b8f64817ea61c83a8521102d885af4/images/web-browser.png

Notes   B_noteNH

One of the only GUIs I use on a day-to-day basis is my web browser. In this case, GNU Icecat, which is a Firefox derivative. This is a screenshot of an admittedly staged session, and contains a number of addons. Perhaps most prominent is Tree-Style Tab, which displays tabs in a hierarchy off to the left rather than flatly at the top. I find this visual enhancement very useful when I have dozens or hundreds of tabs open.

There are text-mode web browsers, and I do occasionally use them. But they provide a very different experience. I don't want to spend too much time on my rationale, which, as a professional web developer and activist focused on privacy and security, runs quite deep. For the sake of this talk, let's just recognize that most people who browse the internet use a graphical web browser; that's a simple fact.

I chose a web browser as an example because I feel that it's something that most everyone can relate to using, and most everyone can recognize the utility in—most people using Internet-connected devices use one at least a few times a week, if not every day.

Finding Text (Mouse-Driven GUI Interaction)   B_frame

Images   B_columns
Left   B_column

/mikegerwitz/cs4m/src/commit/be44088c95b8f64817ea61c83a8521102d885af4/images/ff-find-menu.png

Right   B_column

/mikegerwitz/cs4m/src/commit/be44088c95b8f64817ea61c83a8521102d885af4/images/ff-find-matches.png

Notes   B_noteNH

The Web is used for many different things today, but its original purpose was to render documents. Take Wikipedia for instance. Or the LibrePlanet conference website. If you are looking for something specific on a page, a common operation is to search for a word or phrase, as shown here.

Now, how exactly to do this with a mouse varies depending on what program you're using, but here I highlighted the steps in a modern Icecat or Firefox. You start by clicking on the little hamburger, hotdog, or whatever-you-want-to-call-it menu in the upper-right, and then click on "Find in This Page" within the popup. This then opens a bar at the bottom of the page with an area to type the word or phrase you're searching for. It highlights and scrolls to the first match as you type, and has a button to highlight all results. It also shows the number of results off to the right. It's a simple, yet powerful mechanism that is pretty easy to use.

So does a GUI provide the right tool for the job? If you're just searching for a name or concept, sure, that seems to be true. A GUI is useful here.

But notice how I had to convey these steps to you. I had to take screenshots and highlight where to click with the mouse. Since a GUI is inherently very visual, so are the instructions on how to use it. There is no canonical representation for these instructions, because it involves clicking on elements that have no clear name to the user. Unless you're in software or UX development, you may not know what to call that menu in the upper-right. Further, what do you call the bar at the bottom of the page? You have to describe it in a way that reproduces what the user sees.

GUIs Change Over Time   B_frame

Images   B_columns
Left   B_column

/mikegerwitz/cs4m/src/commit/be44088c95b8f64817ea61c83a8521102d885af4/images/ff-find-menu.png

Right   B_column

/mikegerwitz/cs4m/src/commit/be44088c95b8f64817ea61c83a8521102d885af4/images/ff-edit-find.png

Ctrl+F

Notes   B_noteNH

Another difficult thing is: GUIs change over time. I'm sure there are people here who remember earlier versions of Firefox that didn't have the hamburger menu, where the Find menu option was in the Edit menu. By the way, those old menus do still exist if you hit Alt. I miss the old menus, personally, because they did make it easier to convey actions in text. Saying "Go to Edit - Find" is pretty clear, and those menu positions were always in the same place across the entire desktop environment. Now individual programs may vary in their user experience.

But do you notice something in common between these two screenshots? There's something that hasn't changed over time—something that has been the same for decades! Ctrl+F.

Ctrl+F—Just Works   B_frame

  • Most GUI programs that offer search
  • Context-sensitive—Do The Right Thing
Notes   B_noteNH

When you type Ctrl+F, it immediately opens that search bar and gives focus to the textbox, so you can just start typing. Further, it works in any browser. Not only that, but Ctrl+F is so universal that it works in nearly every GUI program that offers some type of search! And it's context-sensitive! The program will just Do The Right Thing depending on where you are or what you're doing.

Muscle Memory   B_fullframe

Muscle Memory

Visual ⇒ Tactile

Notes   B_noteNH

But there's something more profound that has happened here, that many users don't even think about. We have switched our mode of interaction.

With a mouse and a GUI, interaction is driven by visual indicators. The position of your hand isn't meaningful, because your mouse cursor could be anywhere on the screen at any given time; your eyes provide the context. It's hard to use a GUI with your eyes closed.

But by hitting Ctrl+F, we've completely changed how we interact with the system. It's now tactile. You associate a finger placement; a motion; and the feeling of the keys being pressed beneath your fingers; with an action—finding something. You develop muscle memory. You can trigger this feature with your eyes closed.

(Repeatedly make motion with hand and fingers like a madman during the above paragraph.)

But that's a pretty trivial example.

A Research Task   B_fullframe

Research Task:

Given a list of webpage URLs

find all that do not contain ``free software''

Notes   B_noteNH

Let's explore a fairly simple research task together. Let's say I email you a handful of URLs (say, maybe 5 or 10 of them) that are articles about software or technology. And I want you to come up with a list of the webpages that do not contain the phrase ``free software'' so that I can get a better idea of ones to focus my activism on.

How might we approach this problem as an average user?

Executing the Research Task   B_frame

Approaches   B_columns
Mouse   B_column

Mouse

  1. Click `+' for each new tab, enter URL
  2. Menu → Find in This Page
  3. Type ``free software''
  4. If found, go to #9
  5. If not found, highlight URL, right-click, copy
  6. Click on text editor
  7. Right-click, paste URL, hit RET for newline
  8. Click on web browser
  9. Click `X' on tab, go to #2
Notes   B_noteNH

(Perhaps I should demonstrate this right away rather than reading through the list first, to save time?)

Let's first use the mouse as many users probably would. To set up, let's open each URL in a new tab. We click on the little `+' icon for a new tab and then enter the URL, once for each webpage, perhaps copying the URL from the email message. Once we're all set up, we don't care about the email anymore, but we need a place to store our results, so we open a text editor to paste URLs into.

Now, for each tab, we click on the little hamburger menu, click on ``Find in This Page'', and then type ``free software''. If we do not see a result, we move our mouse to the location bar, click on it to highlight the URL, right-click on it to copy it to our clipboard, click on the text editor to give it focus, right-click on the editor and click on ``Paste'' to insert the URL, and then hit the return key to move to the next line. We then go back to the web browser. If we do see a result, we skip copying over the URL. Then we close the tab by clicking on the `X'.

We repeat this for each tab, until they have all been closed. When we're done, whatever is in our text editor is the list of URLs of webpages that do not reference ``free software''.

Simple enough, right? But it's a bit of a pain in the ass. All this clicking around doesn't really feel like we're melding mind and machine, does it?

What if we used our Ctrl+F trick? That saves us a couple clicks. But what if we could save even more clicks?

Keyboard   B_column

Keyboard

  1. Ctrl+T for each new tab, enter URL
  2. Ctrl+F to find
  3. Type ``free software''
  4. If found, go to #9
  5. If not found, Ctrl+L Ctrl+C to copy URL
  6. Alt+Tab to text editor
  7. Ctrl+V RET to paste URL and add newline
  8. Alt+Tab to web browser
  9. Ctrl+W to close tab, go to #2
Notes   B_noteNH

Fortunately we have many more keybindings at our disposal!

We'll start with opening each new tab with Ctrl+T instead of clicking on `+' with the mouse. (Maybe show copying the URL from the email without the mouse?)

To open our text editor, we'll use Alt+F2, which is a common keybinding in many window managers and desktop environments to open a dialog to enter a program to run.

Once we're all set up, we start with the first tab and use Ctrl+F as we've seen before, and then type ``free software''. If we do not find a match, we're ready to copy the URL. Hitting Ctrl+L will take us to the location bar and highlight the URL. We can then hit Ctrl+C to copy the URL to the clipboard. Alt+Tab is supported by a wide variety of window managers on a variety of operating systems to switch between windows of running programs, usually in the order of most recently focused. So hitting it once should take us to our text editor. We then paste with Ctrl+V and hit return to insert a newline. We can then go back to the web browser by hitting Alt+Tab again. Once again, if there was a match, we skip that copy step. We then close the tab with Ctrl+W.

Repeat, and we're done all the same as before. As a bonus, save with Ctrl+S.

What's interesting about this approach is that we didn't have to use the mouse at all, unless maybe you used it to highlight the URL in the email. You could get into quite the rhythm with this approach, and your hands never have to leave the keyboard. This is a bit of a faster, more efficient way to convey our thoughts to the machine, right? We don't have to seek out our actions each time in the GUI—the operations are always at our fingertips, literally.

GUIs of a Feather   B_fullframe

Same Keybindings Across (Most) GUIs!

Browser, Editor, Window Manager, OS, \ldots

Notes   B_noteNH

Another powerful benefit of this approach is—these same exact keybindings work across most GUIs! If we switch out Icecat here with nearly any other web browser, and switch out gedit with many other text editors or even word processors, this will work all the same! There are some notable text editors for which these keybindings won't work, for those of you screaming in your head. We'll get to that.

If you use Windows instead of GNU/Linux—which I discourage, but if you do—then it'll work the same.

This may not seem like a huge deal, but it has liberating consequences—users don't have to learn how to use specific programs to do the job. I can sit down at a completely different system and let that muscle memory take over and wind up with the same thing.

It's liberating. We have started to break free from those choreographed workflows.

Let's look at those keybindings a bit more concisely, since that last slide was a mess, to put it nicely.

Macro-Like Keyboard Instructions   B_fullframe

Macro-Like

Ctrl+T ``https://...'' <N times>

Ctrl+F ``free software''
[ Ctrl+L Ctrl+C Alt+Tab Ctrl+V RET Alt+Tab ]
Ctrl+W
<N times>
  • <2> Requires visual inspection for conditional
  • <2> Still manual and tedious—what if there were 1000 URLs?
Notes   B_noteNH

If we type out the workflow keybindings like this, in an isolated format, it looks a bit more like instructions for the machine, doesn't it? Some of you may be familiar with macros—with the ability to record keypresses and play them back later. If we could do that, then we could fully automate this task away!
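To make that idea of a recorded macro a bit more concrete, here is a minimal Python sketch that writes the workflow out as a flat list of keypresses. It doesn't drive a real browser or send any keys. The names `expand_macro`, `FIND`, `COPY_URL`, and `CLOSE_TAB` are hypothetical, and the order in which tabs get visited is an assumption. Note the `found` argument: we have to supply by hand whether each page matched, which is exactly the part a plain recorded macro has no way of knowing.

```python
# Hypothetical sketch: the workflow from the slide written out as data.
# Nothing here is sent to any program; it only builds the key sequence.

FIND = ["Ctrl+F", "free software"]
COPY_URL = ["Ctrl+L", "Ctrl+C", "Alt+Tab", "Ctrl+V", "RET", "Alt+Tab"]
CLOSE_TAB = ["Ctrl+W"]


def expand_macro(urls, found):
    """Return the flat keypress sequence for the whole task.

    `found` maps each URL to True if the phrase matched on that page.
    We pass it in by hand to show where the human decision sits.
    """
    keys = []

    # Setup: one new tab per URL.
    for url in urls:
        keys += ["Ctrl+T", url]

    # Main loop: search, conditionally copy the URL, close the tab.
    # Assumption: tabs are visited in the same order as `urls`.
    for url in urls:
        keys += FIND
        if not found[url]:  # the bracketed part that needs our eyes
            keys += COPY_URL
        keys += CLOSE_TAB

    return keys
```

For two URLs where only the second lacks the phrase, `expand_macro` emits the bracketed copy sequence once, and `Ctrl+W` twice.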

Unfortunately, we can't. At least, not with the tools we're using right now. Why is that?

Well, for one, it requires visual inspection to determine whether or not a match has occurred. That drives conditional logic—that bracketed part there. We also need to know how many times to repeat, which requires that we either count or watch the progress. We also need to be able to inspect the email for URLs and copy them into the web browser.
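Each of those three obstacles is something a program can do for us once the data is text rather than pixels. What follows is only a rough sketch under stated assumptions, not the method this talk builds toward: the function names are hypothetical, URLs in the email are assumed to be plain http(s) links separated by whitespace, and the page text is assumed to be available already as a string.

```python
import re

PHRASE = "free software"

# Assumption: each URL is an http(s) link delimited by whitespace.
# Trailing punctuation glued to a URL would be captured along with it.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(email_body):
    """Replace reading the email: return every URL, in order of appearance."""
    return URL_PATTERN.findall(email_body)


def mentions_phrase(page_text, phrase=PHRASE):
    """Replace the visual check: is the phrase anywhere in the text?"""
    # Ignore case, and collapse runs of whitespace so that a line break
    # in the middle of the phrase does not hide a match.
    normalized = " ".join(page_text.split()).lower()
    return phrase.lower() in normalized


def urls_missing_phrase(pages):
    """Replace counting and repeating.

    `pages` maps URL -> page text; the loop runs once per entry, so
    nobody has to watch the progress or know N ahead of time.
    """
    return [url for url, text in pages.items() if not mentions_phrase(text)]
```

The conditional that we had to evaluate with our eyes becomes the `if not mentions_phrase(text)` test, and the repeat count falls out of the length of the input.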

This also scales poorly. While using the keyboard is certainly faster than using the mouse, we're only dealing with a small set of URLs here. What if I gave you 100 of them? 1000? More? Suddenly this doesn't feel like a very efficient way to convey our intent to the machine. I don't wish that suffering upon anyone.
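To get a feel for what 1000 URLs looks like when a program does the repetition, here is one possible illustration in Python using only the standard library. Treat it as a sketch rather than a recommendation: `find_urls_lacking_phrase`, `lacks_phrase`, and the `fetcher` parameter are hypothetical names, the worker count and timeout are arbitrary, and it searches the raw page source rather than the rendered text that Ctrl+F would search, so the results could differ from what we'd see in the browser.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

PHRASE = "free software"


def fetch(url):
    """Download a page and decode it, replacing undecodable bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def lacks_phrase(url, fetcher=fetch):
    """True if the fetched source does not contain the phrase (any case)."""
    return PHRASE not in fetcher(url).lower()


def find_urls_lacking_phrase(urls, fetcher=fetch, workers=8):
    """Check a list of URLs concurrently; results keep the input order.

    `fetcher` can be swapped for a stand-in so the logic can be tried
    without touching the network. Errors from fetching are not handled.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(lambda u: lacks_phrase(u, fetcher), urls))
    return [url for url, missing in zip(urls, flags) if missing]
```

Whether the list holds 10 URLs or 1000, the call is the same; the only thing that changes is how long we wait.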

And to get around that, we need to change how we think about our computing a bit. And that's why I've dragged you through this drawn-out example—to make sure you understand the significance of these progressive enhancements to our workflow.

Thank You   B_fullframe

Thank you.

Mike Gerwitz

mtg@gnu.org

\bigskip

Slides and Source Code Available Online

<https://mikegerwitz.com/talks/cs4m>

\bigskip

\vfill

Licensed under the Creative Commons Attribution-ShareAlike 4.0 International License