Nixers Book Club - Book #1: The UNIX Programming Environment

venam
Administrators
Chapter 4 end (awk section)

It gives a good overview of AWK as a "pattern scanning and processing language". They introduce it like a sed with a more C-like syntax.

However, in my opinion, it brings back a sort of record-like view of the file instead of a byte-like view. That's the impression I got; the way it's explained really invites the association with record-based systems.
It makes a lot of sense when you think about it like that, especially at the conclusion of the chapter when they talk about FFG, a flat-file database system. Here's the paper about it and a sneak peek:
Quote:It consists of a single, unformatted text file in which each line corresponds to a record. K-1 occurrences of a separator character divide each record into k variable length fields;
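To make the record/field view concrete (my own tiny illustration, not from the paper): each line of /etc/passwd is a record and ':' divides it into fields.
Code:
# print the login name (field 1) and shell (last field) of every record
awk -F: '{ print $1, $NF }' /etc/passwd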

The chapter is filled with examples that slowly introduce the AWK inner workings.

The first example sells it and shows its main behavior: you can make an egrep clone:
Code:
awk '/regex/ { print }' filenames
awk -F: '$2 == ""' /etc/passwd          # password field is empty
awk -F: '$2 ~ /^$/' /etc/passwd         # matches the empty regex
awk -F: '$2 !~ /./' /etc/passwd         # doesn't match any character
awk -F: 'length($2) == 0' /etc/passwd   # field has length zero
So you get the same pattern and action as sed but in a different syntax, and similarly you can write scripts and load them with -f.
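For instance, the same empty-password check can live in a file and be loaded with -f (check.awk is just an example name):
Code:
# check.awk: print login names whose password field is empty,
# run as: awk -F: -f check.awk /etc/passwd
$2 == "" { print $1 }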

The big difference is in how it splits the file into records and those records into fields, so this is introduced next. You can specify the separator character (-F), refer to the fields themselves, use special variables tied to this view of the file, and hook into pre- and post-processing of records with BEGIN and END.

Quote:$1, $2, ..., $NF (only field variables begin with $; prefix a variable with another $ when you explicitly want the field it names)
NF: number of fields, and also $NF, the last field on the line
NR: current record/line number
'BEGIN { FS = ":" }' # set the field separator
'END { print NR }' # runs after the last record
'#' starts a comment

Another advantage of AWK is that it has C-like functions, such as printf, substr, etc. (table 4.5), variables, conditional statements, loops, and associative arrays. Thus, it's a language on its own.

Code:
awk '{ printf "%4d %s\n", NR, $0 }'    # number each input line
{ s = s + $1 } END { print s }         # awk program body: sum the first column
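As a sketch of the associative arrays (my own example, in the same spirit as the book's word-counting program):
Code:
# count how often each word appears, using an associative array
awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
     END { for (w in count) print count[w], w }' file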

Here I wasn't aware of the fold(1) command to wrap lines, normally I use fmt(1). So that's a new finding.

The chapter culminates in a final calendar example, which isn't particularly clean in my opinion, and, as phillbush said, today we don't need the solution they gave.


Chapter 5

This chapter is about shell programming, with a learn-through-examples approach. It emphasizes that the shell should be used for quickly writing solutions, for being productive, mostly for personal use and for customizing your environment, by making programs cooperate instead of rewriting things from scratch.

It starts by making the point that the shell is a programming language like any other, and not only an interactive prompt. However, it's not in denial that the design of this language is clunky, mostly shaped by history.

The selling point of the shell is that it gives direct feedback, so it allows interactive experimentation. Probably something that was uncommon at the time.

The chapter then dives into multiple examples that show the syntax of the shell.

Some weird features show up, like the shell built-in variables in table 5.1.
Code:
$#, $*, $@, $-, $?, $$, $!   # arg count, all args, quoted args, shell flags, exit status, shell pid, last background pid
People don't like Perl because of these, but it was inspired by the shell to begin with.

It warns that the pattern matching in the shell isn't the same as in sed and awk, so beware when matching in case statements (table 5.2).

Code:
case word in
pattern) commands ;;
pattern) commands ;;
...
esac
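A concrete instance (my own example, not the book's); remember the patterns are shell globs, not regular expressions:
Code:
case "$1" in
-n)  nflag=yes ;;                       # exact match
-*)  echo "unknown option $1" 1>&2 ;;   # any other dash-option
*)   file=$1 ;;                         # everything else is a file name
esac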

Interestingly, there's a lot of discussion about the efficiency of different ways to do something, especially in conditions.
They advise using the ":" built-in instead of calling true as an external command, and relying on case matches instead of external calls, especially within loops. Calling a program was something you had to think about back then.
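For a sense of what that buys you (my own sketch; $answer is just an illustrative variable): ":" is built into the shell and always succeeds, and a case match costs no process, unlike forking true(1) or test(1) on every pass.
Code:
# ':' is a shell built-in, so this loop forks no extra process per iteration
while :
do
    date
    sleep 60
done

# a case match replaces an external call to test(1) for simple string checks
case "$answer" in
y*) echo yes ;;
*)  echo no ;;
esac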

The test command is introduced.
Code:
test -w file    # file exists and is writable
test -f file    # file exists and is not a directory
etc.

And exit statuses are introduced too.

There's a whole section about how to install scripts in your user PATH or globally, and how to know "which" version of the command you are using.
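The book builds its own version of which; roughly, the lookup boils down to walking $PATH (a simplified sketch of mine, not their exact script):
Code:
# naive which: print the first directory in $PATH containing $1
for dir in `echo $PATH | sed 's/:/ /g'`
do
    if test -f "$dir/$1"
    then
        echo "$dir/$1"
        exit 0
    fi
done
exit 1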

Another section is about variable syntax and extra possibilities you can have by using special characters inside it (table 5.3).
Code:
${var?error!} print "error!" and exit if var isn't defined
${var-thing} evaluates to thing if var isn't defined
${var=thing} same, but also sets var to thing
${var+thing} thing if var is defined, otherwise nothing
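These are handy for defaults; for example (TMPDIR and EDITOR here are just illustrative):
Code:
# fall back to /tmp when TMPDIR isn't set, without modifying TMPDIR
echo "temp files go to ${TMPDIR-/tmp}"
# set EDITOR to ed if it wasn't set already, then use it
: ${EDITOR=ed}
$EDITOR notes.txt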

The next section is about trapping signals and handling them to cleanly exit, especially in long running programs.
It mentions these popular signals:
Code:
0 shell exit
1 hangup
3 quit
9 kill
15 terminate - default by kill(1)
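The typical use is cleaning up temporary files on the way out; a sketch in the spirit of the book's examples (the file name is made up):
Code:
new=/tmp/prog.$$
trap 'rm -f $new; exit 1' 1 2 15    # on hangup, interrupt, or terminate: clean up and exit
# ... do the real work that writes $new ...
rm -f $new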

An equivalent of nohup would be:
Code:
(trap '' 1; command) &

It shows a fun example, creating a "zap" command that uses pick(1) to interactively kill processes based on their name.

One thing that caught my attention, which I didn't know about: you don't have to give a for loop a list of values; by default it loops over `$*`.

There's a section about $IFS, the shell's field separator, and how it can be overridden to read files in different ways, which recalls AWK's view of files as records.
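Something along these lines (works in modern shells; the loop body is my own):
Code:
# split each /etc/passwd record on ':' instead of blanks
IFS=:
while read name passwd uid gid gcos home shell
do
    echo "$name logs in with $shell"
done </etc/passwd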

One point about the read shell command: apparently at the time it only read from standard input and couldn't even be redirected. So the following wouldn't work, but it does today:
Code:
read greeting </etc/passwd

One of the nicest examples is the pick command:
Code:
# pick: echo each argument the user answers 'y' to
for i
do
    echo -n "$i? " > /dev/tty
    read response
    case $response in
    y*)  echo $i ;;
    q*)  break
    esac
done </dev/tty
Again they use echo -n, the non-portable way, but the book advocates it. It prints the prompt on /dev/tty because standard output is almost certainly not the terminal; it's most probably being captured into a shell variable (via backquotes).

In the news command example you see the well-known clunky trick of adding a character on the left of both operands so a comparison never sees an empty string (shown below).
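The idiom looks roughly like this ($arg is just a stand-in): both sides get a leading character so test(1) never sees an empty operand or one starting with a dash.
Code:
# safe even when $arg is empty or begins with '-'
if test "x$arg" = "x-n"
then
    echo "got -n"
fi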

The final example is quite interesting: a small version-control system with "get" and "put" commands built on diff(1). It takes us back to how annoying keeping track of changes must have been at the time. I get the same impression as phillbush on this.

