Program Composition: Time estimates
parent
f372be6d29
commit
596fff5295
265
slides.org
|
@ -123,63 +123,63 @@
|
|||
|
||||
** Summary :noexport:
|
||||
#+BEGIN: columnview :hlines 2 :maxlevel 3 :indent t :id slides
| ITEM                                                  | DURATION | TODO    | ENVIRONMENT |
|-------------------------------------------------------+----------+---------+-------------|
| Slides                                                | 0:23:54  | DEVOID  |             |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Summary                                           |          |         |             |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Introduction                                      |          | RAW     | note        |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Choreographed Workflows                           |          | DEVOID  | fullframe   |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Practical Freedom                                 |          | DEVOID  | fullframe   |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Practical Example: Web Browser                    | 0:09:32  | DRAFT   |             |
| \_    Browser Topics                                  |          |         |             |
| \_    Example: Web Browser                            | 0:00:40  | DRAFT   | frame       |
| \_    Finding Text (Mouse-Driven GUI Interaction)     | 0:01:39  | DRAFT   | frame       |
| \_    GUIs Change Over Time                           | 0:00:45  | DRAFT   | frame       |
| \_    Ctrl+F---Just Works                             | 0:00:25  | DRAFT   | frame       |
| \_    Muscle Memory                                   | 0:00:40  | DRAFT   | fullframe   |
| \_    A Research Task                                 | 0:00:25  | DRAFT   | fullframe   |
| \_    Executing the Research Task                     | 0:03:00  | DRAFT   | frame       |
| \_    GUIs of a Feather                               | 0:00:40  | DRAFT   | fullframe   |
| \_    Macro-Like Keyboard Instructions                | 0:01:19  | DRAFT   | fullframe   |
|-------------------------------------------------------+----------+---------+-------------|
| \_  A New Perspective                                 | 0:14:22  | DRAFT   |             |
| \_    Perspective Topics                              |          |         |             |
| \_    Secrets?                                        | 0:01:19  | DRAFT   | fullframe   |
| \_    Lifting the Curtain                             | 0:01:00  | DRAFT   | frame       |
| \_    Web Page Source Code                            | 0:00:35  | DRAFT   | block       |
| \_    Text                                            | 0:00:35  | DRAFT   | fullframe   |
| \_    Text is a Universal Interface                   | 0:01:19  | DRAFT   | fullframe   |
| \_    The Shell Command Prompt                        | 0:00:45  | DRAFT   | frame       |
| \_    Eliminating the Web Browser                     | 0:01:00  | DRAFT   | frame       |
| \_    Browser vs. =wget= Comparison                   | 0:00:40  | DRAFT   | frame       |
| \_    Finding Text on the Command Line                | 0:01:00  | DRAFT   | frame       |
| \_    A More Gentle Reply                             | 0:01:00  | DRAFT   | frame       |
| \_    Writing to Files (Redirection)                  | 0:00:55  | DRAFT   | frame       |
| \_    Starting Our List                               | 0:01:10  | DRAFT   | fullframe   |
| \_    Command Refactoring                             | 0:02:00  | DRAFT   | fullframe   |
| \_    Again: Text is a Universal Interface            | 0:00:20  | DRAFT   | againframe  |
| \_    Pipelines                                       | 0:00:15  | DRAFT   | fullframe   |
| \_    Summary of the Unix Philosophy                  | 0:00:30  | DRAFT   | fullframe   |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Program Composition                               |          | LACKING |             |
| \_    Composition Topics                              |          |         |             |
| \_    Clarifying Pipelines                            |          | RAW     | fullframe   |
| \_    Tor                                             |          | RAW     | fullframe   |
| \_    LP Sessions                                     |          | RAW     | fullframe   |
| \_    Interactive, Incremental, Iterative Development |          | RAW     | fullframe   |
| \_    Discovering URLs                                |          | RAW     | fullframe   |
| \_    Go Grab a Coffee                                |          | RAW     | fullframe   |
| \_    Async Processes                                 |          | RAW     | fullframe   |
| \_    Executable Shell Script and Concurrency         |          | RAW     | fullframe   |
| \_    Again: A Research Task                          |          | RAW     | againframe  |
| \_    A Quick-n-Dirty Solution                        |          | LACKING | frame       |
|-------------------------------------------------------+----------+---------+-------------|
| \_  Thank You                                         | 00:00:01 |         | fullframe   |
| ITEM                                                  | DURATION | TODO   | ENVIRONMENT |
|-------------------------------------------------------+----------+--------+-------------|
| Slides                                                | 0:36:19  | DEVOID |             |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Summary                                           |          |        |             |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Introduction                                      |          | RAW    | note        |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Choreographed Workflows                           |          | DEVOID | fullframe   |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Practical Freedom                                 |          | DEVOID | fullframe   |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Practical Example: Web Browser                    | 0:09:32  | DRAFT  |             |
| \_    Browser Topics                                  |          |        |             |
| \_    Example: Web Browser                            | 0:00:40  | DRAFT  | frame       |
| \_    Finding Text (Mouse-Driven GUI Interaction)     | 0:01:39  | DRAFT  | frame       |
| \_    GUIs Change Over Time                           | 0:00:45  | DRAFT  | frame       |
| \_    Ctrl+F---Just Works                             | 0:00:25  | DRAFT  | frame       |
| \_    Muscle Memory                                   | 0:00:40  | DRAFT  | fullframe   |
| \_    A Research Task                                 | 0:00:25  | DRAFT  | fullframe   |
| \_    Executing the Research Task                     | 0:03:00  | DRAFT  | frame       |
| \_    GUIs of a Feather                               | 0:00:40  | DRAFT  | fullframe   |
| \_    Macro-Like Keyboard Instructions                | 0:01:19  | DRAFT  | fullframe   |
|-------------------------------------------------------+----------+--------+-------------|
| \_  A New Perspective                                 | 0:14:22  | DRAFT  |             |
| \_    Perspective Topics                              |          |        |             |
| \_    Secrets?                                        | 0:01:19  | DRAFT  | fullframe   |
| \_    Lifting the Curtain                             | 0:01:00  | DRAFT  | frame       |
| \_    Web Page Source Code                            | 0:00:35  | DRAFT  | block       |
| \_    Text                                            | 0:00:35  | DRAFT  | fullframe   |
| \_    Text is a Universal Interface                   | 0:01:19  | DRAFT  | fullframe   |
| \_    The Shell Command Prompt                        | 0:00:45  | DRAFT  | frame       |
| \_    Eliminating the Web Browser                     | 0:01:00  | DRAFT  | frame       |
| \_    Browser vs. =wget= Comparison                   | 0:00:40  | DRAFT  | frame       |
| \_    Finding Text on the Command Line                | 0:01:00  | DRAFT  | frame       |
| \_    A More Gentle Reply                             | 0:01:00  | DRAFT  | frame       |
| \_    Writing to Files (Redirection)                  | 0:00:55  | DRAFT  | frame       |
| \_    Starting Our List                               | 0:01:10  | DRAFT  | fullframe   |
| \_    Command Refactoring                             | 0:02:00  | DRAFT  | fullframe   |
| \_    Again: Text is a Universal Interface            | 0:00:20  | DRAFT  | againframe  |
| \_    Pipelines                                       | 0:00:15  | DRAFT  | fullframe   |
| \_    Summary of the Unix Philosophy                  | 0:00:30  | DRAFT  | fullframe   |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Program Composition                               | 0:12:25  | DRAFT  |             |
| \_    Composition Topics                              |          |        |             |
| \_    Clarifying Pipelines                            | 0:00:45  | DRAFT  | fullframe   |
| \_    Tor                                             | 0:00:20  | DRAFT  | fullframe   |
| \_    LP Sessions                                     | 0:02:50  | DRAFT  | fullframe   |
| \_    Interactive, Incremental, Iterative Development | 0:01:10  | DRAFT  | fullframe   |
| \_    Discovering URLs                                | 0:02:50  | DRAFT  | fullframe   |
| \_    Go Grab a Coffee                                | 0:00:15  | DRAFT  | fullframe   |
| \_    Async Processes                                 | 0:01:00  | DRAFT  | fullframe   |
| \_    Executable Shell Script and Concurrency         | 0:01:50  | DRAFT  | fullframe   |
| \_    Again: A Research Task                          | 0:00:15  | DRAFT  | againframe  |
| \_    A Quick-n-Dirty Solution                        | 0:01:10  | DRAFT  | frame       |
|-------------------------------------------------------+----------+--------+-------------|
| \_  Thank You                                         | 00:00:01 |        | fullframe   |
#+END:
|
||||
|
||||
** RAW Introduction :B_note:
|
||||
|
@ -1698,7 +1698,7 @@ We start to think of how to decompose problems into small operations that
|
|||
We think of how to chain small, specialized programs together,
|
||||
transforming text at each step to make it more suitable for the next.
|
||||
|
||||
** RAW Program Composition [0/10]
|
||||
** DRAFT Program Composition [0/10]
|
||||
*** Composition Topics [6/6] :noexport:
|
||||
- [X] Clarify how pipelines work with existing =wget | grep=.
|
||||
- [X] More involved pipeline with more than two programs.
|
||||
|
@ -1719,7 +1719,7 @@ We think of how to chain small, specialized programs together,
|
|||
- [X] Extract =url-grep= into script.
|
||||
- [X] Demonstrate running jobs in parallel with =xargs=.
|
||||
|
||||
*** RAW Clarifying Pipelines :B_fullframe:
|
||||
*** DRAFT Clarifying Pipelines :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -1761,12 +1761,10 @@ Expat</a>. The JavaScript is free software with
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:00:45
|
||||
:END:
|
||||
|
||||
Let's observe the profound consequences of these design decisions.
|
||||
First,
|
||||
let's make sure you understand what outputting to standard out is doing
|
||||
with our existing =wget= command.
|
||||
Remember that standard out is displayed to us on the terminal by default.
|
||||
If we were to just run that =wget= command and nothing else,
|
||||
we'd be spammed with output.
|
||||
|
@ -1780,11 +1778,11 @@ We can pipe it to =wc= instead,
|
|||
and tell it to count the number of newlines with =-l=.
|
||||
|
||||
What about the number of lines that contain the string ``free software''?
|
||||
|
||||
Or how about the last such line?
|
||||
It's all a simple matter of composing existing programs.
|
||||
|
||||
*** RAW Tor :B_fullframe:
|
||||
It's all a simple matter of composing existing programs with pipes.
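
Those three questions can be sketched offline like this.
It is only an illustration:
the =page= function and its three lines are made up here to stand in for the real =wget= output.

```shell
# Hypothetical stand-in for `wget -qO- URL`: prints a small page on standard out.
page() {
  printf '%s\n' \
    'Welcome to the free software page.' \
    'This line is unrelated.' \
    'We build free software together.'
}

# Count the number of newlines in the output.
line_count=$(page | wc -l)

# Count the lines that contain the string "free software".
match_count=$(page | grep -c 'free software')

# Show only the last such line.
last_match=$(page | grep 'free software' | tail -n1)

echo "lines: $line_count, matching lines: $match_count"
echo "last match: $last_match"
```

Each answer comes from swapping the program on the right-hand side of the pipe;
the program producing the page never changes.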
|
||||
|
||||
*** DRAFT Tor :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -1796,6 +1794,7 @@ $ alias fetch-url='torify wget -qO-'
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:00:20
|
||||
:END:
|
||||
|
||||
By the way,
|
||||
|
@ -1807,7 +1806,7 @@ You can easily send all these requests through Tor,
|
|||
Since we abstracted our fetching away into the =fetch-url= alias,
|
||||
our previous examples continue to work as-is.
|
||||
|
||||
*** RAW LP Sessions :B_fullframe:
|
||||
*** DRAFT LP Sessions :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -1943,10 +1942,11 @@ $ fetch-url https://libreplanet.org/2019/speakers/ \
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:02:50
|
||||
:END:
|
||||
|
||||
How about something more involved?
|
||||
I noticed that some talks had multiple speakers,
|
||||
I noticed that some LibrePlanet sessions had multiple speakers,
|
||||
and I wanted to know which ones had the /most/ speakers.
|
||||
|
||||
The HTML of the speakers page includes a header for each speaker.
|
||||
|
@ -1958,8 +1958,8 @@ Let's get just the talk titles that those speakers are associated with.
|
|||
Looking at this output,
|
||||
we see that the talk titles have an =em= tag,
|
||||
so let's just go with that.
|
||||
Pipe to =grep= instead of =head=.
|
||||
|
||||
Uh oh.
|
||||
It looks like at least one of those results has /multiple/ talks.
|
||||
But note that each is enclosed in its own set of =em= tags.
|
||||
If we add =-o= to =grep=,
|
||||
|
@ -1973,19 +1973,30 @@ That's exactly what we want!
|
|||
But we have to modify our regex a little bit to prevent it from grabbing
|
||||
everything between the first and /last/ =em= tag,
|
||||
by prohibiting it from matching on a less-than character in the title.
|
||||
Don't worry if you don't understand the regular expression;
|
||||
they take time to learn and tend to be easier to write than they are to
|
||||
read.
|
||||
This one just says ``match one or more non-less-than characters between =em=
|
||||
tags''.
|
||||
=grep= actually supports three flavors of regular expressions;
|
||||
if you used Perl's with =-P=,
|
||||
it'd be even simpler to write,
|
||||
but I show the POSIX regex here for portability since Perl regexes
|
||||
aren't available on all systems.
|
||||
|
||||
Now assuming that the talk titles are consistent,
|
||||
we can get a count.
|
||||
=uniq= has the ability to collapse consecutive lines that are identical,
as well as output a count of how many there were.
|
||||
We also use =-d= to tell it to only output duplicates.
|
||||
But =uniq= requires sorted input,
|
||||
We also use =-d= to tell it to only output duplicate lines.
|
||||
But =uniq= doesn't sort lines before processing,
|
||||
so we first pipe it to =sort=.
|
||||
That gives us a count of each talk!
|
||||
|
||||
But I want to know the talks with the most speakers,
|
||||
But I want to know the talks with the /most/ speakers,
|
||||
so let's sort it /again/,
|
||||
this time numerically and in reverse order.
|
||||
this time numerically and in reverse order,
|
||||
and take the top five.
|
||||
|
||||
And we have our answer!
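
Here is a rough, offline sketch of the whole counting pipeline.
The =speakers= function and its HTML are invented for illustration and are not the real LibrePlanet page,
and the regex is my assumption of what ``one or more non-less-than characters between =em= tags'' looks like.
Note that =\+= is a GNU extension to basic regular expressions.

```shell
# Hypothetical stand-in for the fetched speakers page: one line per speaker.
speakers() {
  printf '%s\n' \
    '<h2>Alice</h2> <em>Talk A</em>' \
    '<h2>Bob</h2> <em>Talk A</em> and <em>Talk B</em>' \
    '<h2>Carol</h2> <em>Talk B</em>' \
    '<h2>Dave</h2> <em>Talk A</em>' \
    '<h2>Eve</h2> <em>Talk C</em>'
}

# -o prints each match on its own line; [^<] stops at the closing tag.
# sort groups identical titles so that uniq can count them (-c)
# and keep only titles that occur more than once (-d).
# The second sort orders by count, numerically and in reverse.
top=$(speakers \
  | grep -o '<em>[^<]\+</em>' \
  | sort \
  | uniq -cd \
  | sort -nr \
  | head -n5)

echo "$top"
```

With this sample data,
``Talk A'' comes out on top with three speakers and ``Talk C'' is dropped because it only appears once.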
|
||||
|
||||
|
@ -1997,6 +2008,11 @@ Using =sed=,
|
|||
replacement.
|
||||
So we can reformat the =uniq= output into an English sentence,
|
||||
like so.
|
||||
=sed= is actually a Turing-complete programming language,
|
||||
but it is often used in pipelines with inline scripts like this.
|
||||
I chose the pound characters to delimit the match from the replacement.
|
||||
The numbers in the replacement reference the parenthesized groups in the
|
||||
match.
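
As a small sketch of that idea,
assuming a single line shaped like =uniq -c= output:
the input line and the wording of the sentence are made up here,
and the actual script on the slide may differ.

```shell
# One hypothetical line of `uniq -c` output: padding, a count, then the title.
sentence=$(printf '      3 <em>Talk A</em>\n' \
  | sed 's#^ *\([0-9][0-9]*\) <em>\([^<]*\)</em>$#\2 has \1 speakers.#')

# \1 is the first parenthesized group (the count),
# \2 is the second (the title between the em tags).
echo "$sentence"    # Talk A has 3 speakers.
```

Using =#= as the delimiter means the slash in =</em>= doesn't need to be escaped.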
|
||||
|
||||
And then we're going to pipe it to the program =espeak=,
|
||||
which is a text-to-speech synthesizer.
|
||||
|
@ -2004,7 +2020,7 @@ Your computer will speak the top five talks by presenter count to you.
|
|||
Listening to computers speak is all the rage right now,
|
||||
right?
|
||||
|
||||
*** RAW Interactive, Incremental, Iterative Development :B_fullframe:
|
||||
*** DRAFT Interactive, Incremental, Iterative Development :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -2018,6 +2034,7 @@ Interactive REPL, Iterative Decomposition
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:01:10
|
||||
:END:
|
||||
|
||||
Notice how we approached that problem.
|
||||
|
@ -2025,6 +2042,7 @@ I presented it here just as I developed it.
|
|||
I didn't open my web browser and inspect the HTML;
|
||||
I just looked at the =wget= output and then started to manipulate it in
|
||||
useful ways working toward my final goal.
|
||||
This is just /one/ of the many ways to write it.
|
||||
And this is part of what makes working in a shell so powerful.
|
||||
|
||||
In software development,
|
||||
|
@ -2047,10 +2065,11 @@ They aren't experts in these commands;
|
|||
And the shell is perfect for this discovery.
|
||||
If something doesn't work,
|
||||
just keep trying different things and get immediate feedback!
|
||||
This is also really helpful when you're trying to craft a suitable regular
|
||||
expression.
|
||||
|
||||
*** RAW Discovering URLs :B_fullframe:
|
||||
And because we're working with text as data,
|
||||
a human can replace any part of this process!
|
||||
|
||||
*** DRAFT Discovering URLs :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -2058,7 +2077,7 @@ This is also really helpful when you're trying to craft a suitable regular
|
|||
#+BEAMER: \begin{onlyenv}<+>
|
||||
#+BEGIN_SRC sh
|
||||
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.msg
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.txt
|
||||
https://en.wikipedia.org/wiki/Free_software
|
||||
https://en.wikipedia.org/wiki/Open_source
|
||||
https://en.wikipedia.org/wiki/Microsoft
|
||||
|
@ -2073,7 +2092,7 @@ https://opensource.org/about
|
|||
#+BEAMER: \begin{onlyenv}<+>
|
||||
#+BEGIN_SRC sh
|
||||
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.txt \
|
||||
| while read URL; do
|
||||
echo "URL is $URL"
|
||||
done
|
||||
|
@ -2088,7 +2107,7 @@ URL is https://opensource.org/about
|
|||
#+BEAMER: \begin{onlyenv}<+>
|
||||
#+BEGIN_SRC sh
|
||||
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.txt \
|
||||
| while read URL; do
|
||||
fetch-url "$URL" | grep -q 'free software' \
|
||||
|| echo "$URL" >> results.txt
|
||||
|
@ -2103,7 +2122,7 @@ $ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
|||
#+BEAMER: \begin{onlyenv}<+>
|
||||
#+BEGIN_SRC sh
|
||||
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.txt \
|
||||
| while read URL; do
|
||||
fetch-url "$URL" | grep -q 'free software' \
|
||||
|| echo "$URL" | tee -a results.txt
|
||||
|
@ -2118,7 +2137,7 @@ https://opensource.org/about
|
|||
#+BEAMER: \begin{onlyenv}<+>
|
||||
#+BEGIN_SRC sh
|
||||
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.txt \
|
||||
| while read URL; do
|
||||
fetch-url "$URL" | grep -q 'free software' \
|
||||
|| echo "$URL" | tee -a results.txt
|
||||
|
@ -2133,7 +2152,7 @@ $ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
|||
#+BEAMER: \begin{onlyenv}<+>
|
||||
#+BEGIN_SRC sh
|
||||
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.msg \
|
||||
$ grep -o 'https\?://[^ ]\+' email-of-links.txt \
|
||||
| while read URL; do
|
||||
fetch-url "$URL" | grep -q 'free software' \
|
||||
|| echo "$URL" | tee -a results.txt
|
||||
|
@ -2193,15 +2212,20 @@ $ xclip -i -selection clipboard < results.txt
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:02:50
|
||||
:END:
|
||||
|
||||
Okay, back to searching webpages at URLs.
|
||||
Now that we have a means of creating the list,
|
||||
how do we feed the URLs to our pipeline?
|
||||
Okay, back to searching webpages.
|
||||
Now that we have a means of creating the list of results,
|
||||
how do we feed the URLs into our pipeline?
|
||||
Why not pull them right out of the email with =grep=?
|
||||
|
||||
Let's say you saved the email in =email-of-links.msg=.
|
||||
This simple regex should grab most URLs for both HTTP and HTTPS protocols.
|
||||
Let's say you saved the email in =email-of-links.txt=.
|
||||
This simple regex should grab most URLs for both HTTP and HTTPS protocols,
|
||||
but it's far from perfect.
|
||||
For example,
|
||||
it'd grab punctuation at the end of a sentence.
|
||||
But we're assuming a list of URLs.
|
||||
Here's some example output with a few URLs.
|
||||
|
||||
For each of these,
|
||||
|
@ -2211,7 +2235,7 @@ It's time to introduce =while= and =read=.
|
|||
=read= will read line-by-line into one or more variables,
|
||||
and will fail when there are no more lines to read.
|
||||
|
||||
So if insert our =fetch-url= pipeline into the body,
|
||||
So if we insert our =fetch-url= pipeline into the body,
|
||||
we get this.
|
||||
But if we just redirect output into =results.txt=,
|
||||
we can't see the output unless we inspect the file.
|
||||
|
@ -2221,12 +2245,12 @@ For convenience,
|
|||
it'll send output through the pipeline while also writing the same
|
||||
output to a given file.
|
||||
The =-a= flag tells it to /append/ rather than overwrite.
|
||||
So now we can both observe the results and have them written to a file!
|
||||
So now we can both observe the results /and/ have them written to a file!
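
To try this loop without touching the network,
here is a sketch where a shell function stands in for the =fetch-url= alias.
The =fetch_url= function, the example.org URLs, and the temporary directory are all assumptions made for this illustration.

```shell
workdir=$(mktemp -d)

# A hypothetical saved email containing two links.
cat > "$workdir/email-of-links.txt" <<'EOF'
Please look at https://example.org/a and
http://example.org/b when you can.
EOF

# Offline stand-in for fetch-url: only the first page mentions free software.
fetch_url() {
  case "$1" in
    */a) echo 'We love free software.' ;;
    *)   echo 'Nothing relevant here.' ;;
  esac
}

# As on the slide, a URL is recorded when the search string is NOT found.
grep -o 'https\?://[^ ]\+' "$workdir/email-of-links.txt" \
  | while read -r URL; do
      fetch_url "$URL" | grep -q 'free software' \
        || echo "$URL" | tee -a "$workdir/results.txt"
    done
```

The second URL is both printed to the terminal and appended to =results.txt=.
I added =-r= to =read= so that backslashes in the input are left alone.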
|
||||
|
||||
But we were just going to reply to an email with those results.
|
||||
Let's assume we're still using a GUI email client.
|
||||
Wouldn't it be convenient if those results were on the clipboard for us
|
||||
already so we can just paste it into the message?
|
||||
Wouldn't it be convenient if those results were already on the clipboard for
|
||||
us so we can just paste them into the message?
|
||||
We can accomplish that by piping to =xclip= as shown here.
|
||||
There's also the program =xsel=,
|
||||
which I typically use because its arguments are far more concise,
|
||||
|
@ -2246,10 +2270,12 @@ Well,
|
|||
Instead of saving our mail to a file,
|
||||
we can just copy the relevant portion and have that piped directly to
|
||||
=grep=!
|
||||
If you have a list of URLs and you just copy that portion,
|
||||
then you can just get rid of =grep= entirely.
|
||||
|
||||
Because we're writing to =results.txt=,
|
||||
another option is to just let it run and copy to the clipboard at a later
|
||||
time.
|
||||
another option is to just let this run and copy to the clipboard at a
|
||||
later time.
|
||||
We can do that by reading =results.txt= in place of standard input to
|
||||
=xclip=,
|
||||
as shown here.
|
||||
|
@ -2258,10 +2284,11 @@ And while we're at it,
|
|||
here's a special notation to get rid of =echo= for the =tee= in the body
|
||||
of =while=:
|
||||
three less-than symbols provide the given string on standard in.
|
||||
This is not a POSIX feature, but bash and several other shells support it.
|
||||
|
||||
Phew!
|
||||
|
||||
*** RAW Go Grab a Coffee :B_fullframe:
|
||||
*** DRAFT Go Grab a Coffee :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -2272,6 +2299,7 @@ Go Grab a Coffee
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:00:15
|
||||
:END:
|
||||
|
||||
Remember when I said I could go grab a coffee and play with the kids while
|
||||
|
@ -2283,7 +2311,7 @@ The Internet is fast nowadays;
|
|||
ideally, we wouldn't have to wait long.
|
||||
Can we do better?
|
||||
|
||||
*** RAW Async Processes :B_fullframe:
|
||||
*** DRAFT Async Processes :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -2308,6 +2336,7 @@ $ while read URL; do
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:01:00
|
||||
:END:
|
||||
|
||||
Indeed we can.
|
||||
|
@ -2322,7 +2351,8 @@ Shells have built-in support for backgrounding tasks so that they can run
|
|||
So in this example,
|
||||
we sleep for one second and then echo ``done''.
|
||||
But that sleep and subsequent echo is put into the background,
|
||||
and the shell proceeds to first execute =echo start=.
|
||||
and the shell proceeds to execute =echo start= while =sleep= is running in
|
||||
the background.
|
||||
One second later,
|
||||
it outputs ``done''.
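
Here's a minimal sketch of that ordering.
It writes to a temporary log file,
which is my addition,
so that the order can be inspected afterward.

```shell
log=$(mktemp)

# The sleep and the echo that follows it run in the background...
{ sleep 1 && echo done; } >> "$log" &

# ...so the shell moves on to this line immediately.
echo start >> "$log"

# Block until the background job has finished, then show what was written.
wait
cat "$log"
```

The log contains ``start'' first and ``done'' second,
even though the background job was launched first.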
|
||||
|
||||
|
@ -2337,9 +2367,10 @@ Sure,
|
|||
But what if we have 1000?
|
||||
Do we really want to spawn 1000 processes and make 1000 network requests
|
||||
at once?
|
||||
That isn't efficient.
|
||||
That isn't efficient,
|
||||
and it's a bit rude to DoS servers.
|
||||
|
||||
*** RAW Executable Shell Script and Concurrency :B_fullframe:
|
||||
*** DRAFT Executable Shell Script and Concurrency :B_fullframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: fullframe
|
||||
:END:
|
||||
|
@ -2408,40 +2439,41 @@ $ xargs -n1 -P5 ./url-grep 'free software' > results.txt
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:01:50
|
||||
:END:
|
||||
|
||||
Before we continue,
|
||||
we're going to have to write our pipeline in a way that other programs can
|
||||
run it.
|
||||
Up to this point,
|
||||
the program has just been embedded within the shell.
|
||||
But one of the nice things about shell is that you can take what you entered
|
||||
into the command line and paste it directly in a file and,
|
||||
the program has just been embedded within an interactive shell session.
|
||||
One of the nice things about shell is that you can take what you entered
|
||||
onto the command line and paste it directly into a file and,
|
||||
with some minor exceptions,
|
||||
it'll work all the same.
|
||||
|
||||
Let's take our pipeline and name it =url-grep=.
|
||||
Aliases only work in interactive sessions,
|
||||
Aliases only work in interactive sessions by default,
|
||||
so we're going to just type =wget= directly here.
|
||||
Alternatively,
|
||||
you can define a function.
|
||||
We use the positional parameters =1= and =2= here to represent the
|
||||
respective arguments to the =url-grep= command.
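
If you'd rather go the function route,
a sketch might look like this.
The name =fetch_url= is my choice;
I use an underscore because hyphens in function names aren't portable across shells.

```shell
# Function counterpart of: alias fetch-url='wget -qO-'
# "$@" passes every argument through to wget unchanged.
fetch_url() {
  wget -qO- "$@"
}
```

Unlike an alias,
a function defined at the top of a script works there without any extra shell options.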
|
||||
|
||||
The comment at the top of the file is called a ``she-bang'' and contains the
|
||||
path to the executable that will be used to interpret this script.
|
||||
This is used by the kernel so that it knows how to run our program.
|
||||
The comment at the top of the file is called a ``shebang''.
|
||||
This is used by the kernel so that it knows what interpreter to use to run
|
||||
our program.
|
||||
|
||||
To make it executable,
|
||||
we use =chmod= to set the executable bit on the file.
|
||||
we use =chmod= to set the executable bits on the file.
|
||||
We can then invoke it as if it were an executable.
|
||||
If it were in our =PATH=,
|
||||
which isn't something I'm going to get into here,
|
||||
you'd be able to run it like any other command without having to prefix it
|
||||
with =./=.
|
||||
|
||||
We can also do a primitive form of error handling by modifying our
|
||||
positional parameters like so,
|
||||
We can also do a primitive form of error handling and documentation by
|
||||
modifying our positional parameters like so,
|
||||
which will show an error message if we don't specify one of them.
|
||||
|
||||
Now we replace the =while= loop with =xargs=.
|
||||
|
@ -2457,7 +2489,7 @@ Here we specify =5=,
|
|||
meaning =xargs= will run five processes at a time.
|
||||
You can change that to whatever number makes sense for you.
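
To see the pieces fit together without a network connection,
here's a sketch using a hypothetical offline cousin of =url-grep= that I'm calling =page-grep=.
It searches local files rather than fetching URLs,
and the file names and contents are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write the script; the quoted 'EOF' keeps $1 and $2 from expanding now.
cat > page-grep <<'EOF'
#!/bin/sh
# Print the page's name when it does NOT contain the search string.
# ${N?message} aborts with that message if argument N is missing.
search=${1?Missing search string}
page=${2?Missing page path}
grep -q "$search" "$page" || echo "$page"
EOF
chmod +x page-grep

# Three hypothetical pages; only the first mentions free software.
echo 'all about free software' > a.html
echo 'nothing to see'          > b.html
echo 'also nothing'            > c.html

# -n1 passes one page per invocation; -P5 runs up to five at a time.
# Parallel output order isn't guaranteed, so sort before saving.
printf '%s\n' a.html b.html c.html \
  | xargs -n1 -P5 ./page-grep 'free software' \
  | sort > results.txt

cat results.txt
```

Running =./page-grep= with no arguments prints the ``Missing search string'' message and exits with a failure status.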
|
||||
|
||||
*** RAW Again: A Research Task :B_againframe:
|
||||
*** DRAFT Again: A Research Task :B_againframe:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: againframe
|
||||
:BEAMER_ref: *A Research Task
|
||||
|
@ -2467,6 +2499,7 @@ You can change that to whatever number makes sense for you.
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:00:15
|
||||
:END:
|
||||
|
||||
So this was the research task that we started with.
|
||||
|
@ -2477,7 +2510,7 @@ If I were to approach this problem myself,
|
|||
So,
|
||||
let's combine everything we've seen so far:
|
||||
|
||||
*** RAW A Quick-n-Dirty Solution :B_frame:
|
||||
*** DRAFT A Quick-n-Dirty Solution :B_frame:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: frame
|
||||
:END:
|
||||
|
@ -2503,6 +2536,7 @@ sys 0m4.877s
|
|||
**** Notes :B_noteNH:
|
||||
:PROPERTIES:
|
||||
:BEAMER_env: noteNH
|
||||
:DURATION: 00:01:10
|
||||
:END:
|
||||
|
||||
I'd first echo the pipeline into =url-grep=.
|
||||
|
@ -2516,15 +2550,17 @@ And then to top it all off,
|
|||
we can just pipe the output to the =mail= command to send that URL list
|
||||
directly to me.
|
||||
|
||||
It only takes a minute or so to come up with this script.
|
||||
It only takes a minute or two to come up with this script.
|
||||
But how long does it take to run?
|
||||
|
||||
I took a few URLs for the FSF, Wikipedia, and Google and just repeated them
|
||||
in a file so that I had 1000 of them.
|
||||
I took a few URLs and just repeated them in a file so that I had 1000 of
|
||||
them.
|
||||
Running the =xargs= command,
|
||||
it finishes in under 18 seconds on my system at home.
|
||||
Obviously YMMV,
|
||||
and certain sites may be slower to respond than others.
|
||||
|
||||
So in well under two minutes,
|
||||
So in only a couple of minutes,
|
||||
the task has been automated away and completed,
|
||||
all by gluing together existing programs.
|
||||
You don't need to be a programmer to know how to do this;
|
||||
|
@ -2532,6 +2568,7 @@ You don't need to be a programmer to know how to do this;
|
|||
which comes with a little bit of practice.
|
||||
|
||||
This is certainly an efficient means of communicating with the machine.
|
||||
We've come a long way from using the web browser and a mouse.
|
||||
|
||||
|
||||
** Thank You :B_fullframe:
|
||||
|
|