Shell tricks - Programming On Unix

Long time nixers
There are Rich's sh (POSIX shell) tricks. And now we have Nixers' sh tricks.

Replacing A with B in text

As simple as it may sound, there are not many ways to do this, as nearly every tool is oriented toward regexes. There is no -F flag in sed!

The only two ways I know are parameter expansion and sed:

- Parameter expansion

text='original [text] [text]'
change='[text]'
into='text'

# do the changes one by one in a loop.  yeah, that's slow
# (and it loops forever if $into itself contains $change)
while [ -z "${text##*"$change"*}" ]
do text=${text%%"$change"*}$into${text#*"$change"}
done

printf '%s\n' "$text"

Quite complex! As far as I am aware, quotes are interpreted in 2 contexts: while executing commands (so they are required in printf '%s\n' "$text") and in parameter expansions (so they are required in the ugly ${text%%"$change"*}).
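Here is a rough sketch of the same idea wrapped in a function. The name replace_all and its argument order are made up for this example. It consumes the text from left to right instead of rescanning it, so it does not loop forever when the replacement contains the pattern. Note that plain sh has no local variables, so text, change, into and out are globals.

```shell
# replace_all TEXT CHANGE INTO (hypothetical helper, not a standard command)
# Prints TEXT with every literal occurrence of CHANGE replaced by INTO.
replace_all() {
    text=$1 change=$2 into=$3 out=
    # An empty pattern would match everywhere: print the text unchanged.
    [ -n "$change" ] || { printf '%s\n' "$text"; return; }
    while :; do
        case $text in
            *"$change"*)
                # Move everything up to the first match into $out,
                # with the match itself replaced by $into.
                out=$out${text%%"$change"*}$into
                text=${text#*"$change"}
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$text"
}

replace_all 'original [text] [text]' '[text]' 'text'
```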

- Sed <3

I can't remember where I saw this (a link to the source would be appreciated), but it is a lot simpler than the alternatives I had seen.

text='original [text] [text]'
change=$(printf '[text]' | sed 's/./[&]/g')
into=$(printf 'text' | sed 's/[&/\]/\\&/g')
printf '%s\n' "$text" | sed "s/$change/$into/g"

This is about as lengthy, but less black magic is involved: put every character of the pattern in its own character class:

change became [[][t][e][x][t][]]
Therefore, even the '[' and ']' are interpreted as plain characters and not as part of a regex. (One exception: a ^ in the pattern would become [^], which is not a valid character class, so it needs separate handling.)
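The sed approach can be sketched as a function too. The name sed_replace is invented for this example, and it assumes that neither the pattern nor the replacement contains a newline. It also covers the one character that the character-class trick cannot handle: a ^ is escaped with a backslash instead of being wrapped in brackets.

```shell
# sed_replace TEXT CHANGE INTO (hypothetical helper, not a standard command)
# Assumes CHANGE and INTO contain no newline.
sed_replace() {
    # Wrap every character except ^ in its own bracket expression,
    # then escape each ^ with a backslash.
    pattern=$(printf '%s' "$2" | sed -e 's/[^^]/[&]/g' -e 's/\^/\\^/g')
    # In the replacement, only &, / and \ are special.
    replacement=$(printf '%s' "$3" | sed 's/[&/\]/\\&/g')
    printf '%s\n' "$1" | sed "s/$pattern/$replacement/g"
}

sed_replace 'original [text] [text]' '[text]' 'text'
```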
Long time nixers
Inserting raw data in a script

How to embed a text object in a shell script?

You can use quotes:

printf 'Oh noes
this "\t" won't look like a backslash and a "t"!'

Two problems:

- The ' often used in various languages clashes with the quotes.

Plan 9's rc shell uses '' to mean ' within the quoted string: 'It''s great and easy on the eyes'

- printf(1) (like echo, sometimes...) interprets the \t.

printf '\\t\n'     # not convenient
printf '%s\n' '\t' # a little bit better but still...
printf '%s\n' "'single quote' but what about $this"
cat << EOF         # har, har...
$(rm -rf /home)
EOF
cat << 'EOF'       # wait what? /home is still here and I see $(rm -rf /home)
$(rm -rf /home)
EOF

The last one is our solution: you can quote 'EOF', and this stops all expansion on the following text. So you can have any text including mixed ' and " quotes along with $, \ and whatnot.

And yes, that is POSIX.
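A harmless way to try it, as a small sketch (show_raw is just a name picked for the example): the text goes through cat untouched, quotes and dollar signs included. The heredoc body and the closing EOF have to start at the beginning of the line.

```shell
# show_raw: print a block of raw text, with no expansion at all
show_raw() {
    cat << 'EOF'
'single' and "double" quotes, $HOME, $(date) and \t stay as they are
EOF
}

show_raw
```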
(13-11-2017, 05:39 AM)josuah Wrote:
cat << 'EOF'       # wait what? /home is still here and I see $(rm -rf /home)
$(rm -rf /home)
I didn't know about that one.
Those are nifty tricks, thanks for sharing.
Long time nixers
How to quote

(skip this if you are used to ' and ")

POSIX sh and its derivatives (about all of them (but hey! rc's great!)) may be tricky with quoting.

The double quotes (") have the most complex syntax:
- \$, \`, \" and \\ get converted to $, `, " and \
- a \ before any other character is kept as is (so \n stays \n)
- $(command) and `command` get converted to the output of command
- $var gets converted to the value of var

So if we want exactly "$( to be printed, with double quotes, the whole thing would be:

printf '%s\n' "\"\$("

This is cumbersome in some cases, and very useful in other cases.

The single quotes (') are the strongest: nothing can escape the quotes!

$ printf '%s\n' 'this\n is " str\\ong!'
this\n is " str\\ong!

These are convenient for printf format strings (as above), sed or awk scripts... in which we want the \n to reach printf, awk... untouched by the shell (without any quotes, \n would be transformed into n).
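To see the three quoting styles side by side, here is a small sketch (quote_demo is a made-up name). Each argument is printed with %s, so printf itself does not interpret anything: what you see is what the shell passed along.

```shell
# quote_demo: show what the shell hands to printf for each quoting style
quote_demo() {
    # unquoted \n, double-quoted \n, single-quoted \n,
    # then an escaped backslash and two ways to get a literal $HOME
    printf '%s\n' \n "\n" '\n' "\\" "\$HOME" '$HOME'
}

quote_demo
```

Only the unquoted \n loses its backslash; inside double quotes the backslash is only special before $, `, ", \ and a newline.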
Long time nixers
when to quote

More interesting question: when are quotes necessary?

At first I quoted everything, and a few days ago I discovered that quotes are not necessary in many places.

Answer: in command arguments, and in the word list of "for var in >>here<<".

Command arguments

If the variable a has the content '1 2 3':
$ printf '%s\n' $a
1
2
3

As you may know, the shell split the argument at the spaces in the string. Still the convenient/cumbersome balance of the shell.

$ printf '%s\n' "$a"
1 2 3
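A quick way to check how many arguments a command really receives, as a sketch (count_args is a name invented for the example):

```shell
# count_args: print the number of arguments received
count_args() { printf '%s\n' "$#"; }

a='1 2 3'
count_args $a      # unquoted: split at the spaces, three arguments
count_args "$a"    # quoted: a single argument
```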


Variable assignments do not need quoting, though it does not hurt to add quotes every time.

$ b=$a/*
$ printf '%s\n' "$b"
1 2 3/*
$ b="$a /*"
$ printf '%s\n' "$b"
1 2 3 /*
$ b=$a /*
/bin/sh: /bin: cannot execute - Is a directory

Yes, we need quotes when the value contains a literal space, because the syntax "MANPAGER=ul man test" means: call the man command with the environment variable MANPAGER set to ul.
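The "VAR=value command" form can be tried without a man page at hand. In this sketch, GREETING is an arbitrary variable name chosen for the demo; the assignment is only visible to the command it prefixes.

```shell
unset GREETING

# The child sh sees GREETING in its environment...
GREETING=hello sh -c 'printf "%s\n" "$GREETING"'

# ...but the variable is still unset in the current shell.
printf '%s\n' "${GREETING-unset}"
```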

The case statements do not need quoting either, so you can safely do:

$ a='1 2 3'
$ case $a in
\ 1) echo 1 ;;
\ 2) echo 2 ;;
\ *) echo '*' ;;
\ esac
*
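That is true for the word after "case". The patterns are another story: if a pattern comes from a variable, leaving it unquoted makes it a glob, and quoting it makes it literal. A small sketch (the variable names are just for the example):

```shell
a='1 2 3'
pat='1*'

# Unquoted pattern: 1* is a glob, and it matches '1 2 3'.
case $a in $pat)   glob=yes ;;    *) glob=no ;;    esac

# Quoted pattern: only the literal string '1*' would match.
case $a in "$pat") literal=yes ;; *) literal=no ;; esac

printf 'glob=%s literal=%s\n' "$glob" "$literal"
```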

The for statements are special: they need quoting if you want a variable to be considered as a single item:

$ a='1 2 3'
$ for i in "$a"; do printf '%s\n' "$i"; done
1 2 3
$ for i in  $a;  do printf '%s\n' "$i"; done
1
2
3

The if and while statements

The tests do need quoting. The [ after the if is an ordinary command (the /bin/[ file is often a symlink to /bin/test), so its arguments get split like those of any other command:

$ if [ -z "$a" ]; then echo true; else echo false; fi
$ if grep "$a" < $HOME/.profile; then echo true; else echo false; fi
$ while [ -z "$a" ]; do echo niet; done
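The classic way to get bitten is an empty variable, as in this sketch: without quotes the variable vanishes, [ is left with the single argument -n, and a single non-empty argument counts as true.

```shell
empty=''

# Expands to [ -n ], which is true: the test is silently wrong.
if [ -n $empty ];   then unquoted=true; else unquoted=false; fi

# Expands to [ -n '' ], which is false, as intended.
if [ -n "$empty" ]; then quoted=true;   else quoted=false;   fi

printf 'unquoted=%s quoted=%s\n' "$unquoted" "$quoted"
```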

Shell redirection

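In POSIX sh, the word after the <, >, >> operators is not split at spaces, so it does not need quoting. (bash, outside of its POSIX mode, does split it and complains about an "ambiguous redirect", so quoting remains the portable habit.)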

$ a='1 2 3'
$ echo content > $a
$ cat < *
/bin/sh: cannot open *: No such file or directory
$ cat < $a

Parameter expansion

In the ${var#pattern}, ${var##pattern}, ${var%pattern}, ${var%%pattern} syntax, the pattern can contain globs or plain text to remove from var. The pattern is not split at spaces, just as if it were within double quotes ("); glob characters coming from a variable stay active though, unless that variable is itself quoted:

$ var='1 2 3'
$ pattern='1 2 '
$ printf '%s\n' "${var#$pattern}"
3
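When the pattern itself holds a glob character, quoting inside the braces decides whether it is taken as a glob or as plain text. A small sketch with made-up values:

```shell
var='a*b*c'
pattern='a*'

glob_short=${var#$pattern}     # a* as a glob, shortest match: removes 'a'
glob_long=${var##$pattern}     # a* as a glob, longest match: removes everything
literal=${var#"$pattern"}      # a* as plain text: removes 'a*'

printf '[%s]\n' "$glob_short" "$glob_long" "$literal"
```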
Long time nixers
Alignment with sed

Ever tried to align the output of a command? column(1) is the tool you need, but it is not in POSIX, it only takes single-character separators, and you might want to align on the first separator only.

A `while read' with printf is a bit cumbersome and slow, let's use the almighty stream editor with two expressions:

s/SEPARATOR/            /         # replace the SEPARATOR (any sed regex) with enough spaces
s/(.{12}[^ ]*) */\1SEPARATOR/    # cut the excess spaces after the 12th column, adding the separator back

To be used this way (but use `column -t -s : /etc/passwd' instead in this case):

$ sed -r -e 's/:/            /' -e 's/(.{12}[^ ]*) */\1:/' /etc/passwd
_slaacd     :*:115:115:SLAAC Daemon:/var/empty:/sbin/nologin
nobody      :*:32767:32767:Unprivileged user:/nonexistent:/sbin/nologin
josuah      :*:1000:1000:Josuah:/home:/bin/sh
_rsync      :*:669:669:rsync Daemon:/var/empty:/sbin/nologin
_gitdaemon  :*:778:778:GIT Daemon:/nonexistent:/sbin/nologin

There is a caveat: if a long field contains spaces before the separator, the separator is put back at the wrong place:

$ printf 'w w w w w w w w w w w w:<- separator\n' | sed -r -e 's/:/            /' -e 's/(.{12}[^ ]*) */\1:/'
w w w w w w w:w w w w w            <- separator

But for known inputs, it works ok.
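For reference, the same two expressions can be written with basic regular expressions, which avoids depending on the -r flag. This is only a sketch: align_first is a made-up name, and the separator (:) and the width (12) are hardcoded as in the example above.

```shell
# align_first: pad the text before the first ':' to 12 columns (stdin to stdout)
align_first() {
    sed -e 's/:/            /' \
        -e 's/\(.\{12\}[^ ]*\) */\1:/'
}

printf '%s\n' 'root:x:0:0' 'nobody:x:65534:65534' | align_first
```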

Grey Hair Nixers
It's a nice quick and dirty trick to align known data, but it's a bit unfair to compare it with column(1)... In this case you have to assume your first column to be shorter than 12 chars. It is totally input-dependent, as opposed to column(1), which aligns up to the longest string of each column.

You'd have to soak up your whole input before rendering anything, and measure your longest input in some way (which I don't think sed can do).

I know you manipulate awk(1) quite well, and I think it would be better suited for this task!

Thanks for sharing though! Still interesting to see that kind of solution!
Long time nixers

You are right, column(1) does a better job for single character delimiters like with csv/tsv, /etc/passwd... (use column!).

Oh yes sure, one could even implement column in plain awk...
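Not a full column(1), but a rough sketch of what the awk route could look like for the first column only. The name align_awk is invented, and the ':' separator is an assumption carried over from the /etc/passwd example. It soaks up the whole input, measures the longest first field, then prints everything padded to that width.

```shell
# align_awk: pad the first ':'-separated field to the longest one (stdin to stdout)
align_awk() {
    awk -F: '
        {
            key[NR]  = $1                           # first field
            rest[NR] = substr($0, length($1) + 1)   # separator and the rest
            if (length($1) > max) max = length($1)
        }
        END {
            for (i = 1; i <= NR; i++)
                printf("%-" max "s%s\n", key[i], rest[i])
        }
    '
}

printf '%s\n' 'ab:1' 'abcd:2' | align_awk
```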

P.S. to all: Take note that I did not come up with every trick on my own, sometimes I found the idea on StackOverflow or in existing scripts.
Long time nixers
playing with "$@"

As you may know, in sh, $0, $1, $2, $3... $9, $* and $@ are special variables holding the command line arguments.

$0 is the name of the program and $1 ... $9 the program parameters.

$# is a number: the count of all arguments (starting from $1, unlike argc in C).

$* is a concatenation of all the arguments with spaces (when quoted as "$*", they are joined with the first character of IFS, a space by default).

$@ is an array, not just one string, but a list of strings:

$ func() { printf '"%s"\n' "$@"; }
$ func '1 2' 3 4
"1 2"
"3"
"4"

Even though `$@' was quoted, printf received multiple arguments.
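The difference with $* shows up when counting the arguments, as in this sketch (count and demo are names picked for the example):

```shell
# count: print the number of arguments received
count() { printf '%s\n' "$#"; }

demo() {
    count "$@"    # every argument kept intact
    count "$*"    # everything joined into one string
    count $*      # joined, then split again at every space
}

demo '1 2' 3 4
```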

You can edit $1 ... $9 with the `set' built-in command (writing `set --' is safer, in case the first argument starts with a dash):
  • `set "$@"' does not change anything,
  • `set new "$@"' inserts `new' before all the arguments,
  • `set "$1" "$2"' keeps only $1 and $2.

And with the `shift' built-in command:
  • `shift' removes the first argument from the argument list (or fails if there is none),
  • `shift 2' removes the first 2 arguments from the argument list.
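Both built-ins combined, as a small sketch (set_shift_demo is a made-up name). The `--' is there so that an argument starting with a dash is not taken for a shell option.

```shell
set_shift_demo() {
    set -- new "$@"    # the arguments are now: new a b c
    shift 2            # the arguments are now: b c
    printf '%s\n' "$#" "$1"
}

set_shift_demo a b c
```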

The first argument is $1, but what about the last?

for last in "$@"; do continue; done

The variable last takes each value of "$@" in turn, and keeps the last one after the loop.
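Wrapped in a function, it could look like this sketch (last_arg is a hypothetical name). The variable is emptied first, so that a call without arguments does not print a stale value.

```shell
# last_arg: print the last argument received (an empty line if there is none)
last_arg() {
    last=
    for last in "$@"; do :; done
    printf '%s\n' "$last"
}

last_arg a 'b c'
```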

If you are using these a lot, you probably should use a programming language. ;)
Long time nixers
fallbacks in pipes

One frustrating thing with busybox ps is that it fails when you add the usual `-ax' flags to it. This might happen in other situations too.

The sh `{ ...; }` syntax permits you to have fallbacks:

{ ps -ax || ps; } | less

If `ps -ax' fails, ps is executed, and the output of either is piped into less.
Different commands may behave differently though, so be cautious with this.
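One thing to be cautious about: if the first command prints something before failing, that partial output goes down the pipe too, followed by the fallback's output. A sketch with a stand-in command (flaky is invented for the demo), and a variant that buffers the output in a variable instead:

```shell
# flaky: stand-in for a command that prints something, then fails
flaky() { echo partial; return 1; }

# Both the partial output and the fallback end up in the pipe.
{ flaky || echo fallback; } | tr a-z A-Z

# Buffered variant: the partial output is dropped when the command fails.
out=$(flaky) || out=$(echo fallback)
printf '%s\n' "$out"
```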

You may also want to check if a command exists before running it:

for browser in mothra netscape midori surf jumanji uzbl x-www-browser w3m links elinks lynx retawk curl
do command -v "$browser" >/dev/null && break
done
"$browser" "$url"

$browser will take the value of each one in the list and `command -v' will check if it exists and break the loop if so. At the end $browser will hopefully be an existing browser command.
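To avoid the "hopefully", the loop can be turned into a function that reports a failure when nothing from the list is installed. This is a sketch, and first_available is a name made up for it:

```shell
# first_available CMD...: print the first command that exists, or return 1
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

if browser=$(first_available w3m links lynx curl); then
    printf 'would run: %s\n' "$browser"
else
    printf 'no browser found\n' >&2
fi
```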

P.S.: If you use ps in scripts, you might want to consider pgrep(1) and pkill(1) (also in busybox) to look for a process, rather than `ps + grep "ugly stuff"'.
Long time nixers
Ok, I know a few tricks about the shell syntax. That does not mean I'm a great hacker: I know very little about how systems and networking work. :P Networking tricks anyone?

[EDIT] Sure! Go get some...