Making the best CLI programs

Making the best CLI programs - Programming On Unix

Users browsing this thread:

	venam Offline \| 23-05-2016, 02:13 AM \| #1

(This is part of the podcast discussion extension)

Making the best CLI programs

And here are the episode links

Link of the recording [ http://podcast.nixers.net/feed/download....11-251.mp3 ]

Unix is known for its set of small command line utilities. What are your ideas and features for the best CLI programs. What makes a complete utility.

Show notes
---
https://en.wikipedia.org/wiki/Command-line_interface
http://eng.localytics.com/exploring-cli-best-practices/
http://harmful.cat-v.org/cat-v/unix_prog_design.pdf
http://programmers.stackexchange.com/que...-arguments
http://www.cs.pomona.edu/classes/cs181f/supp/cli.html
http://julio.meroh.net/2013/09/cli-desig...sages.html
http://julio.meroh.net/2013/09/cli-desig...ap-up.html
http://www.catb.org/~esr/writings/taoup/...11s07.html
http://www.catb.org/~esr/writings/taoup/...11s06.html
http://www.catb.org/~esr/writings/taoup/...10s01.html
http://www.catb.org/~esr/writings/taoup/...10s05.html
http://www.catb.org/~esr/writings/taoup/...11s04.html
Writing manpages
Perl6 Parsing command line args
Python Unix CLI template
Docopt
Getopt history

Music: https://dexterbritain.bandcamp.com/album...s-volume-1

	jkl Offline \| 23-05-2016, 04:49 AM \| #2

A complete utility is one that does exactly what it says with no extra whistles and bells. (Unlike e.g. most GNU tools.)

--
<mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen

	ninjacharlie Offline \| 23-05-2016, 10:08 AM \| #3

I kind of wish coreutils had more standardized flags (i.e. -v will always be version, -V would always be verbose, -s would be silent or quiet so there would be no -q flag). Apart from that, the important thing is that the output is easy to parse with grep or cut, there is no extraneous debug information, errors go to stderr, and overall it just works nicely in a pipeline.

Also, I agree with jkl's point that the GNU tools have way too much extra garbage built in. `ls` (for instance), does not need a sort flag. That's what `sort` is for.

	z3bra Offline \| 23-05-2016, 10:20 AM \| #4

BEWARE: big post ahead.

TL;DR: act as a filter, easy to parse > easy to read

Command line interfaces (for now) only have to ways for interfacing with the user: stdin and stdout. Both are immutable, as in, what's written to it can't be erased.
To me, a good CLI program should act as a data filter. Taking a continuous stream of data on it's input, and output the same data, "filtered". All kinds of metadata and "behavior tweaking" methods (such as flag) should be given as an argument to modify the default behavior of the tool.

When you write a tool, you are supposedly aware of what the filter should be, so the hardest part is to figure out what the data should be. Here are a few hints I've gathered,
either by taking advices from other people or failing miserably (sometimes both!):

input:
0. Do not mess "data" and "metadata"

Let's take "grep" as an example. Its default behavior is to "filter out" all lines where a PATTERN doesn't appear within a data STREAM.
Here are a few "misimplementations" of how "grep" could have been written:

Code:
$ echo PATTERN | grep FILE0 FILE1

$ echo FILE0 FILE1 | grep PATTERN

$ printf "%s\n%s\n" "PATTERN" "FILE0" | grep

The last one is over-exagerated, but the first two could have been valid implementations. They both suffer the same issue though: they don't make the tool act like a filter (or to reformulate it: they can't operate on STREAMs).

1. Limit the number of flags

I often hear people complaining about one-letter flags limiting you to ONLY 26 flags. But I personally think that 26 flags is already too much!
If your tool can handle this much behavior changes, it is likely too complex, and certainly should be splitted in multiple different tools.
Take "ls" as an example. POSIX specifies 23 flags for it. GNU ls has 40 flags! (not even counting --long-opt only flags):

Code:
$ ls --help | grep -o "\s-[a-zA-Z0-9]" | sort | uniq | wc -l                   

40

That is way too much. For example, the flag -r could be dropped:

Code:
$ ls -1 -r

$ ls -1 | tac

The flags -g and -o don't make much sense either, as they filter the output internally (removing either the OWNER or GROUP column).

These flags were indeed added to help the user, but IMO they get in the way as they make the tool more complex to use (more docs and more code can't be a way to go for "simpler" tools)

2. Assume data will be messed up

As much as possible, avoid rigid protocols and format for your input data, and try to simplify your input as much as possible.
If you write, for example, and MPD client that would read commands on stdin, prefer simple inputs:

Code:
$ echo "volume 50" | ./pgm

$ ./pgm <<COMMAND

{

    "command": {

        "type":  "set",

        "key":   "volume",

        "value": "50"

    }

}

COMMAND

One might argue that JSON is easy to parse, well defined and such. That doesn't make it a nice input format, really.

output:

0. Prefer "parseable" to "readable"

Make your output easy to parse, rather than easy to read. Most unix tools work with lines, and play with separators (the famous $IFS variable). Try to format your output as such (tabulations make a great separator!).
Prefer good documentation to verbosity.
Consider a program that would print out the weather in different city. It could have the two following outputs (reading city names on stdin):

Code:
$ printf '%s\n%s\n' "Mexico" "Oslo" | weather

City: Mexico

Wind speed: 10mph

Humidity: 17%

Raining probability: 2%

Temperature: 35°C

City: Oslo

Wind speed: 22mph

Humidity: 44%

Raining probability: 37%

Temperature: 8°C

$ printf '%s\n%s\n' "Mexico" "Oslo" | weather

Mexico    10    17    2    35

Oslo    22    44    37    8

The second output is pretty "hard" to read from a user POV. Now try to write a quick tool that would display only the city name and raining probability...
As I'm a cool guy, here is the one for the second output:

Code:
$ printf '%s\n%s\n' "Mexico" "Oslo" | weather | cut -f1,4 

Mexico 2

Oslo 37

It's easier to format RAW data than strip down a formatted one.

1. Your output is an API

Think well about how you format your output data, because some people will rely on it.
If you take the example above, some people will have a hard time if you suddenly decide to swap the temperature with the humidity value.
Be consistent, document your tools and remain backward compatible as much as possible.

2. Be stupid

Don't try to look cool by checking wether the user is writting to a terminal or not, and change to output accordingly.
This will only confuse the users that will not understand why their parsing tools can't parse what they usually get.

The simpler, the better.

	jkl Offline \| 23-05-2016, 10:51 AM \| #5

(23-05-2016, 10:08 AM)ninjacharlie Wrote: Also, I agree with jkl's point that the GNU tools have way too much extra garbage built in. `ls` (for instance), does not need a sort flag. That's what `sort` is for.

My favorite example is GNU's "true" which can - if compiled correctly - return "false". Including --help and, most important, --version. Because, you know, a different version of "true" can lead to different results.

--
<mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen

	pranomostro Offline \| 28-05-2016, 10:50 AM \| #6

@ninjacharlie:

You are kind of wrong.

If your programs have similar flags, you should consider making it an own utility.

One example for this is the option -h: it is used in du, ls and df, but can be abstracted into one
utility (see git.z3bra.org/human).

Of course, this is not always possible. Your examples with -v (for version), -V (for verbose) -h (help, -s (silent) make sense.
But I think that duplication of flags always carries the danger of replicating features.

I heard an interesting idea a while ago, namely that 3 standard files are not enough.
stdin, stdout and stderr all have distinct purposes.

But very often stderr is misused for status information which doesn't have anything to do with errors.

The author gave a proposal of two other standard file descriptors: usrin and usrout, which can be used for user in- and output
while the program is running. If somebody here knows pick(1), he/she knows the truth (pick lets you select the lines that get sent
down the pipeline):

Some ugly hacking with the tty has to be done.

Here is the medium rant (only read the first half, the second one is about images in the terminal): https://medium.com/@zirakertan/rant-the-...45bb29dac8

I liked the first half. The second is totally infected by "Integrate everything."

Edit: also, z3bra is right with everything. As always (sigh).

	z3bra Offline \| 29-05-2016, 10:52 AM \| #7

(28-05-2016, 10:50 AM)pranomostro Wrote: Edit: also, z3bra is right with everything. As always (sigh).

Couldn't agree more with this line!

	josuah Offline \| 01-06-2016, 10:40 AM \| #8

(23-05-2016, 10:20 AM)z3bra Wrote:
Code:
$ printf '%s\n%s\n' "Mexico" "Oslo" | weather Mexico 10 17 2 35 Oslo 22 44 37 8

This would be just as easy (or even easier) by adding a header and with alignment, like for <code>ps</code>:

Code:
City      Wind  Humidity Rain Temperature

Mexico    10    17       2    35

Oslo      22    44       37   8

I liked these long posts!

	z3bra Offline \| 01-06-2016, 01:15 PM \| #9

(01-06-2016, 10:40 AM)sshbio Wrote: This would be just as easy (or even easier) by adding a header and with alignment, like for ps:

I'm not quite sure. To be honest, I feel like headers/alignment makes it harder to parse the output. For example, in order to get all the temperature, you'll need:

Code:
$ weather | sed 1d | tr -s ' ' | cut -d\  -f5

35

8

You first need to filter out the header, then, get rid of the alignment, and finally output the 5 field, using a custom separator.
With my solution, all you need is

Code:
$ weather | cut -f5

35

8

Way easier to parse actually, and thus, it should be easy to make a "readable output" right?

Code:
$ weather | sed '1iCity\tWind\tHumidity\tRain\tTemperature' | column -t

	henriqueleng Offline \| 01-06-2016, 09:43 PM \| #10

Quote:I'm not quite sure. To be honest, I feel like headers/alignment makes it harder to parse the output. For example, in order to get all the temperature, you'll need:

Code:
$ weather | sed 1d | tr -s ' ' | cut -d\ -f5 35 8

But if you're intending to use awk, the first input shouldn't be a problem. It could be like:

Code:
weather | sed 1d | awk '{print $5}'

awk easily formats columns.

obs.: There must be some bug in the javascript because I can't quote z3bra's last post automatically with the quote button. Trying to quote other posts works fine.

	z3bra Offline \| 02-06-2016, 03:27 AM \| #11

If you intend to use awk, you shouldn't use any other tool, as it is a full blow programming language:

Code:
$ weather | awk '{if (NR!=1) {print $5}}'

The point here was to discuss CLI programs, while awk is a language interpreter. That would be like using python or perl to filter the output.

	jaagr Offline \| 04-06-2016, 03:10 AM \| #12

(02-06-2016, 03:27 AM)z3bra Wrote: If you intend to use awk, you shouldn't use any other tool, as it is a full blow programming language

This way of thinking is what's wrong with big parts of development communities.

	z3bra Offline \| 04-06-2016, 05:01 AM \| #13

Why so? Because you shouldn't "force yourself" and simply use what's available? It makes sense to do so. My point was more about the fact that using awk only to print specific columns is not efficient at all. We came to a point where people use awk ONLY for this specific purpose. That's what is bugging me there.

	venam Offline \| 14-09-2016, 01:58 PM \| #14

I've found a nice addition to this topic:
A series of posts about CLI design.

	jkl Offline \| 14-09-2016, 02:03 PM \| #15

Edit. :)

--
<mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen

	venam Offline \| 14-09-2016, 02:05 PM \| #16

(14-09-2016, 02:03 PM)jkl Wrote: Well done after I have posted it in another topic.

I crawl the web faster than spiders.
☆*･゜ﾟ･*\(^O^)/*･゜ﾟ･*☆

EDIT: It deals with the programming aspect of the CLI and other stuffs I've completely missed in the talk but that were mentioned in the discussion here.

Also, this topic is always recurrent, the usual "no news is good news".
Pranomostro mentioned it:

(28-05-2016, 10:50 AM)pranomostro Wrote: I heard an interesting idea a while ago, namely that 3 standard files are not enough.
stdin, stdout and stderr all have distinct purposes.

But very often stderr is misused for status information which doesn't have anything to do with errors.

The author gave a proposal of two other standard file descriptors: usrin and usrout, which can be used for user in- and output
while the program is running. If somebody here knows pick(1), he/she knows the truth (pick lets you select the lines that get sent
down the pipeline):

Some ugly hacking with the tty has to be done.

Here is the medium rant (only read the first half, the second one is about images in the terminal): https://medium.com/@zirakertan/rant-the-...45bb29dac8

I liked the first half. The second is totally infected by "Integrate everything."

And the post author in this section.
Obviously, this goes along with the textuality of Unix-like systems, you want to be able to join together the programs and not mess up the output, make it as simple as possible and easy to parse.

Here's more:
A Guidelines for Command Line Interface Design (or https://clig.dev/ )which discusses in depth the generic view of CLI, the what they are and so what they should do to make it easier, directly leaking from the definition.
A typical answer to what a Unix CLI should adhere to. This is the usual boring stuffs but if you haven't gone through it check it out.
The unfamous Program design in the UNIX environment which discusses the philosophy and generic design of Unix programs. This is more of a styling guide, an analysis of trends that should be avoided or favored.

	venam Offline \| 12-10-2016, 12:52 AM \| #17

This article was on HN.

	apk Offline \| 12-10-2016, 12:34 PM \| #18

check out venams hit new single

"i crawl the web faster than spiders"

get it on itunes and the play store today k

emerging underground artist guys

	acg Offline \| 12-10-2016, 09:57 PM \| #19

(12-10-2016, 12:52 AM)venam Wrote: This article was on HN.

That's a good post, it's been some time since I stopped checking HN daily. Wonder how you found something relevant.

argonaut · musician · developer · writer · https://www.betoissues.com

	pranomostro Offline \| 13-10-2016, 05:18 AM \| #20

(04-06-2016, 05:01 AM)z3bra Wrote: Why so? Because you shouldn't "force yourself" and simply use what's available? It makes sense to do so. My point was more about the fact that using awk only to print specific columns is not efficient at all. We came to a point where people use awk ONLY for this specific purpose. That's what is bugging me there.

To come back to this: I think there is a reason why we have small programming languages as filters, sed, awk, heck, even regular expressions are a small language. Yes, using awk just to print columns means not using it's full capabilities, but I think that is okay. In the book 'The Awk programming language' the authors stated that awk was mostly used for quick one-liners, and then continued to say that it was also possible to write full-blown programs in it. So first purpose was really a tool for flexible one-liners.

And if you mean efficient in the sense of 'awk is not fast', I want to discuss that with you. Because I had a really hard time beating
a awk 4-liner with a optimised and specialized C program. awk is fast. Give it a try yourself and try to beat awk with C. I found it very hard.

awk is useful for simple tasks like printing columns, validating data and so on, for creating filters on the fly. In my opinion it plays well together with the rest of the environment.

I took a look at my history to show that awk can be really useful for that:

Code:
$ ./leyland | awk '{ print($2"^"$3"+"$3"^"$2"=="$1) }' | bc | g -v '^1$'

# this is a good example for awk's usage: rearranging fields and verifying a commands output

$ zcat data/test06.gz | ./genin | ./rcn | awk '$3<0.5'

# again, we use awk as a filter. sed would be too complicated here, and bc just doesn't cut it (mainly because of the lack of -e)

$ ./ffstats bzip2 test/*.png | awk '$1>1 { print($1*10) }'  | sed 's/\..*//' | stag

# filtering and transforming the input would have been quite tedious in C, at least more than here

	z3bra Offline \| 13-10-2016, 08:53 AM \| #21

I'm definitely not saying that awk is neither fast nor efficient. It is indeed!
In my case, the only thing I can do with awk is printing a column. Looking at your history, I must admit I have no idea what these one-liners are doing. So in my case, I feel guilty for using awk because I really have no *real* reason to use it. It is, IMO the same thing as using sed to replace all occurences of 'a' by 'A' (assuming you have no other use for sed on your system). `tr` would be better suited because it does only this.

I do agree, however, that nowadays memory/diskspace is cheap. So installing sed/awk instead of tr/cut is irrelevant. It's just that, I don't know, if I was offerend a magic cooking robot with hundred feature, but that I would only use it to mix eggs, I'd feel like I'd waste it. And I don't like wasting things.

(Ok, the idea of "wasting" awk is stupid, but really, that's close to what I feel!)

	pranomostro Offline \| 13-10-2016, 02:12 PM \| #22

(13-10-2016, 08:53 AM)z3bra Wrote: Looking at your history, I must admit I have no idea what these one-liners are doing.

They were quite specific temporary filters. And awk came in quite handy.

(13-10-2016, 08:53 AM)z3bra Wrote: So in my case, I feel guilty for using awk because I really have no *real* reason to use it. It is, IMO the same thing as using sed to replace all occurences of 'a' by 'A' (assuming you have no other use for sed on your system).

Ah, okay. You have no use for it, so there is no reason for you to use it most of the time.

	josuah Offline \| 17-10-2016, 05:16 PM \| #23

(13-10-2016, 08:53 AM)z3bra Wrote: (Ok, the idea of "wasting" awk is stupid, but really, that's close to what I feel!)

That is also how I feel sometimes: "I can't really call awk for this. It would be an overkill".

I also feel dumb to use scripts that are 80% awk. Why not a script fully in awk with a <code>#!/usr/bin/awk -f</code> shebang...

I also like ip-style scripts: it has a top-level command (ip) that has subcommands (address, link...), that also have subcommands...

It is very easy to implement it in shell script: https://raw.githubusercontent.com/josuah...bin/config

I do a top-level config command, and in the same directory, config-* commands that can be called from the config command.

Each config-* command has a first line with a comment with description for the command.

Without argument, the config command lists the subcommands with these descriptions, with arguments, it selects the scripts (partial name allowed: 'config-git' == 'config g') and run it with the aditionnal arguments.

So I can type:

Code:
$ config build install tmux

$ config b install tmux

	venam Offline \| 25-11-2016, 04:07 PM \| #24

I've recorded a new updated version of this episode, as this was lacking quality.

Here's the transcript:

--(Show Notes)--
Show notes
---
https://en.wikipedia.org/wiki/Command-line_interface
http://eng.localytics.com/exploring-cli-best-practices/
http://harmful.cat-v.org/cat-v/unix_prog_design.pdf
http://programmers.stackexchange.com/que...-arguments
http://www.cs.pomona.edu/classes/cs181f/supp/cli.html
http://julio.meroh.net/2013/09/cli-desig...sages.html
http://julio.meroh.net/2013/09/cli-desig...ap-up.html
http://www.catb.org/~esr/writings/taoup/...11s07.html
http://www.catb.org/~esr/writings/taoup/...11s06.html
http://www.catb.org/~esr/writings/taoup/...10s01.html
http://www.catb.org/~esr/writings/taoup/...10s05.html
http://www.catb.org/~esr/writings/taoup/...11s04.html
Writing manpages
Perl6 Parsing command line args
Python Unix CLI template
Docopt
Getopt history
Command Line Interface Guidelines

And here are the episode links
http://podcast.nixers.net/feed/download....11-251.mp3

Music: https://dexterbritain.bandcamp.com/album...s-volume-1

	freem Offline \| 04-12-2020, 11:27 AM \| #25

(28-05-2016, 10:50 AM)pranomostro Wrote: The author gave a proposal of two other standard file descriptors: usrin and usrout, which can be used for user in- and output
while the program is running. If somebody here knows pick(1), he/she knows the truth (pick lets you select the lines that get sent
down the pipeline):

Some ugly hacking with the tty has to be done.

Interestingly enough, I have a set of POCs in my ~/devel/snippet directory which does just that. In the end, it's not that ugly, really. The solution is simply to open your controlling terminal an use it for user interaction instead of the naive way of reading/writing on stdin/stdout.
I hope that's what pick(1) does, and I wish that's what dialog did.

Note that the thread is quite long, so I have not read carefully the whole thing (linear forums make it hard to follow interesting parts of a thread).

	venam Offline \| 21-06-2021, 02:32 AM \| #26

Thread Promotion

This is an ever evolving topic, there's a lot to cover that hasn't been already. While we're reading taoup for the bookclub I realized that there's a lot more to add.
For example, in the type of interaction, I've listed above: input, output, sink, pipe. But I'd like to add phillbush's idea of "interactive filters" instead of non-interactive ones.
And also emphasize too the grouping/bundling of related commands to reduce the cognitive load on users. What some call the subcommand interface.

What do you think? What other criteria or ideas that should be emphasized to make a good CLI?

	seninha Offline \| 21-06-2021, 06:23 AM \| #27

One tool to make the best CLI programs is to abstract out common usage patterns from different programs. For example, some utilities produce file sizes, either to be read by another program or by a human. Instead of a -h option for different programs to produce human readable values, use a utility, such as z3bra's human, that converts values from its input into human-readable sizes.

Another practice to make good CLI programs is to use idiomatic functions:

A double-hyphen (--) on the argument list stops option parsing. Rather than implementing that, use getopt(3), that handles it and much more.
When a program in a script writes to stdout, it is hard to guess which program failed when it does not identify itself. Using the err(1) family of functions writes "progname: errno string: comment", a idiomatic error string that identifies the program by its progname (argv[0]).

I made this table for a post on the book club thread.
It lists and describes the types of CLI that can be used in a script. roguelike interface is ignored, because it does not scriptable.

Code:
┌──────────────────────────────────────────────────────────────────────────┐

│ Interface   Read input   Write output   Change environment   Examples    │

├──────────────────────────────────────────────────────────────────────────┤

│ Filter      ✓            ✓              ✗                    cat, grep   │

│ Cantrip     ✗            ✗              ✓                    rm, touch   │

│ Source      ✗            ✓              ✗                    ls, ps      │

│ Sink        ✓            ✗              ✓                    lpr, mail   │

│ Compiler    From file    To file        ✓                    tar, cc,    │

│ ed          From user    To user        ✓                    sh, ed, gdb │

└──────────────────────────────────────────────────────────────────────────┘

I think there are more (sub)categories of command-line interfaces that can be used in a pipeline.

The pretty-printer.
Pretty-printers are a subcategory of filters or sources whose output is not parseable, instead, the output is meant to be read by the user, not by another program. Pretty printers often use the number of $COLUMNS in a terminal to tabulate the output. ls(1) is a pretty-printer: you should not parse the output of ls(1).

The interactive filter.
Interactive filters are a subcategory of filters whose parsing of input and generation of output is done by the user, not programmatically by the utility. Examples are pick, smenu and fzf.

The wrapper.
Wrappers are utilities whose sole purpose is to call another utility. They programmatically set the environment (either environment variables or command-line arguments) for another utility and run them. An example is xargs(1).

The test.
Testers are a subcategory of cantrips or sinks that returns a result as exit status, not on standard output. The script checks the exit status ($?) of a test and proceeds according to its value.

	venam Offline \| 21-06-2021, 07:41 AM \| #28

Quote:Same goes for program names, use mnemonic, names that are obvious and
relate to what your program does.
You should attempt to make the command name easy to remember.

I wrote that previously, but I think it definitely needs to be said again as I keep finding new tools that choose the most unmemorizable names.
I'm sure you all have examples of this phenomenon.

	z3bra Offline \| 24-06-2021, 04:59 AM \| #29

(21-06-2021, 06:23 AM)phillbush Wrote: Instead of a -h option for different programs to produce human readable values, use a utility, such as z3bra's human, that converts values from its input into human-readable sizes.

Thanks my dude 🖤 😄 Link for unaware people: human(1)

	jkl Offline \| 12-10-2021, 05:59 AM \| #30

Heh:

Quote:The fundamental problem with vi is that it doesn’t have a mouse and therefore you’ve got all these commands.

Bill Joy, 1984.

https://web.archive.org/web/200607010830...joy84.html

--
<mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen

1 2 Next »

View a Printable Version