Making the best CLI programs - Programming On Unix
A complete utility is one that does exactly what it says with no extra whistles and bells. (Unlike e.g. most GNU tools.)
-- <mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen |
|||
|
|||
I kind of wish coreutils had more standardized flags (e.g. -v would always be version, -V would always be verbose, -s would be silent or quiet so there would be no -q flag). Apart from that, the important thing is that the output is easy to parse with grep or cut, there is no extraneous debug information, errors go to stderr, and overall it just works nicely in a pipeline.
Also, I agree with jkl's point that the GNU tools have way too much extra garbage built in. `ls`, for instance, does not need a sort flag. That's what `sort` is for.
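A minimal sketch of what "errors go to stderr, output is easy to cut" could look like in practice. The helper name `filesize` and its output layout (size, a tab, then the path) are assumptions made up for illustration, not something from this thread.

```shell
#!/bin/sh
# Hypothetical helper: read file names on stdin, one per line.
# Results go to stdout as "size<TAB>name"; complaints go to stderr only.
filesize() {
    status=0
    while IFS= read -r path; do
        if [ -f "$path" ]; then
            # wc -c reads from a redirect, so it prints only the number;
            # tr strips the padding some implementations add.
            size=$(wc -c < "$path" | tr -d ' ')
            printf '%s\t%s\n' "$size" "$path"
        else
            printf 'filesize: %s: not a regular file\n' "$path" >&2
            status=1
        fi
    done
    # Non-zero if at least one name could not be handled.
    return "$status"
}
```

Used as `ls | filesize | cut -f1`, the pipeline only ever sees data lines, while the error messages still reach the terminal.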
|||
|
|||
BEWARE: big post ahead.
TL;DR: act as a filter, easy to parse > easy to read

Command line interfaces (for now) only have two ways of interfacing with the user: stdin and stdout. Both are immutable, as in, what's written to them can't be erased. To me, a good CLI program should act as a data filter: it takes a continuous stream of data on its input and outputs the same data, "filtered". All kinds of metadata and "behavior tweaking" methods (such as flags) should be given as arguments to modify the default behavior of the tool. When you write a tool, you are supposedly aware of what the filter should be, so the hardest part is to figure out what the data should be. Here are a few hints I've gathered, either by taking advice from other people or failing miserably (sometimes both!):

input:

0. Do not mix up "data" and "metadata"
Let's take "grep" as an example. Its default behavior is to "filter out" all lines where a PATTERN doesn't appear within a data STREAM. Here are a few "misimplementations" of how "grep" could have been written:
Code: $ echo PATTERN | grep FILE0 FILE1
The last one is exaggerated, but the first two could have been valid implementations. They both suffer from the same issue though: they don't make the tool act like a filter (or to reformulate it: they can't operate on STREAMs).

1. Limit the number of flags
I often hear people complaining about one-letter flags limiting you to ONLY 26 flags. But I personally think that 26 flags is already too many! If your tool can handle this many behavior changes, it is likely too complex, and certainly should be split into multiple different tools. Take "ls" as an example. POSIX specifies 23 flags for it. GNU ls has 40 flags! (not even counting --long-opt only flags):
Code: $ ls --help | grep -o "\s-[a-zA-Z0-9]" | sort | uniq | wc -l
That is way too many. For example, the flag -r could be dropped:
Code: $ ls -1 -r
The flags -g and -o don't make much sense either, as they filter the output internally (removing either the OWNER or GROUP column). These flags were indeed added to help the user, but IMO they get in the way as they make the tool more complex to use (more docs and more code can't be the way to go for "simpler" tools).

2. Assume data will be messed up
As much as possible, avoid rigid protocols and formats for your input data, and try to simplify your input as much as possible. If you write, for example, an MPD client that reads commands on stdin, prefer simple inputs:
Code: $ echo "volume 50" | ./pgm

output:

0. Prefer "parseable" to "readable"
Make your output easy to parse, rather than easy to read. Most unix tools work with lines, and play with separators (the famous $IFS variable). Try to format your output as such (tabulations make a great separator!). Prefer good documentation to verbosity. Consider a program that prints out the weather in different cities. It could have the two following outputs (reading city names on stdin):
Code: $ printf '%s\n%s\n' "Mexico" "Oslo" | weather
The second output is pretty "hard" to read from a user POV. Now try to write a quick tool that would display only the city name and rain probability... As I'm a cool guy, here is the one for the second output:
Code: $ printf '%s\n%s\n' "Mexico" "Oslo" | weather | cut -f1,4
It's easier to format RAW data than to strip down formatted data.

1. Your output is an API
Think well about how you format your output data, because some people will rely on it. If you take the example above, some people will have a hard time if you suddenly decide to swap the temperature with the humidity value. Be consistent, document your tools and remain backward compatible as much as possible.

2. Be stupid
Don't try to look cool by checking whether the user is writing to a terminal or not, and changing the output accordingly. This will only confuse users, who will not understand why their parsing tools can't parse what they usually get. The simpler, the better.
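To illustrate "it's easier to format raw data than to strip down formatted data", here is a small sketch of the opposite direction: turning tab-separated records into a readable report. The field order (city, wind, humidity, rain, temperature) follows the header quoted later in the thread; the function name `readable` and the wording of the report are invented for this example.

```shell
#!/bin/sh
# Read "city<TAB>wind<TAB>humidity<TAB>rain<TAB>temperature" records on stdin
# and print one human-readable line per record.
readable() {
    tab=$(printf '\t')
    # Caveat: tab counts as IFS whitespace, so empty fields collapse and
    # the remaining fields shift left. Fine for a sketch, not for real data.
    while IFS="$tab" read -r city wind humidity rain temperature; do
        printf '%s: wind %s, humidity %s, rain %s, temperature %s\n' \
            "$city" "$wind" "$humidity" "$rain" "$temperature"
    done
}
```

With a made-up record, `printf 'Oslo\t12\t80\t40\t7\n' | readable` produces a sentence for humans, while `cut -f1,4` on the same record still gives scripts exactly the city and rain fields.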
|||
|
|||
(23-05-2016, 10:08 AM)ninjacharlie Wrote: Also, I agree with jkl's point that the GNU tools have way too much extra garbage built in. `ls` (for instance), does not need a sort flag. That's what `sort` is for.
My favorite example is GNU's "true" which can - if compiled correctly - return "false". Including --help and, most important, --version. Because, you know, a different version of "true" can lead to different results.
|||
|
|||
@ninjacharlie:
You are kind of wrong. If several of your programs share a flag, you should consider turning that flag into its own utility. One example of this is the option -h: it is used in du, ls and df, but can be abstracted into one utility (see git.z3bra.org/human). Of course, this is not always possible. Your examples with -v (for version), -V (for verbose), -h (help) and -s (silent) make sense. But I think that duplication of flags always carries the danger of replicating features.

I heard an interesting idea a while ago, namely that 3 standard files are not enough. stdin, stdout and stderr all have distinct purposes. But very often stderr is misused for status information which doesn't have anything to do with errors. The author proposed two other standard file descriptors: usrin and usrout, which can be used for user input and output while the program is running. If somebody here knows pick(1), he/she knows the truth (pick lets you select the lines that get sent down the pipeline): some ugly hacking with the tty has to be done.

Here is the medium rant (only read the first half, the second one is about images in the terminal): https://medium.com/@zirakertan/rant-the-...45bb29dac8
I liked the first half. The second is totally infected by "Integrate everything."

Edit: also, z3bra is right with everything. As always (sigh).
|||
|
|||
(23-05-2016, 10:20 AM)z3bra Wrote:
This would be just as easy (or even easier) by adding a header and with alignment, like for `ps`:
Code: City Wind Humidity Rain Temperature
I liked these long posts!
|||
|
|||
(01-06-2016, 10:40 AM)sshbio Wrote: This would be just as easy (or even easier) by adding a header and with alignment, like for ps:
I'm not quite sure. To be honest, I feel like headers/alignment make it harder to parse the output. For example, in order to get all the temperatures, you'll need:
Code: $ weather | sed 1d | tr -s ' ' | cut -d' ' -f5
You first need to filter out the header, then get rid of the alignment, and finally output the fifth field, using a custom separator. With my solution, all you need is
Code: $ weather | cut -f5
Way easier to parse actually, and thus, it should be easy to make a "readable output", right?
Code: $ weather | sed '1iCity\tWind\tHumidity\tRain\tTemperature' | column -t
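The one-line `1i` form with `\t` escapes relies on GNU sed behavior, as far as I know. A portable way to bolt a header onto raw output could look like the sketch below; `with_header` is a made-up name, and it assumes the data on stdin is already tab-separated.

```shell
#!/bin/sh
# Print the arguments as one tab-separated header line,
# then copy stdin through untouched.
with_header() {
    first=1
    for name in "$@"; do
        if [ "$first" -eq 1 ]; then
            printf '%s' "$name"
            first=0
        else
            printf '\t%s' "$name"
        fi
    done
    printf '\n'
    cat
}
```

Usage would be `weather | with_header City Wind Humidity Rain Temperature | column -t`. Note that `column` is not part of POSIX, so the alignment step stays an optional extra on top of output that remains parseable without it.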
|||
|
|||
Quote: I'm not quite sure. To be honest, I feel like headers/alignment makes it harder to parse the output. For example, in order to get all the temperature, you'll need:
But if you're intending to use awk, the first input shouldn't be a problem. It could be like:
Code: weather | sed 1d | awk '{print $5}'
awk easily formats columns.
obs.: There must be some bug in the javascript because I can't quote z3bra's last post automatically with the quote button. Trying to quote other posts works fine.
|||
|
|||
If you intend to use awk, you shouldn't use any other tool, as it is a full-blown programming language:
Code: $ weather | awk '{if (NR!=1) {print $5}}'
The point here was to discuss CLI programs, while awk is a language interpreter. That would be like using python or perl to filter the output.
|||
|
|||
Why so? Because you shouldn't "force yourself" and simply use what's available? It makes sense to do so. My point was more about the fact that using awk only to print specific columns is not efficient at all. We came to a point where people use awk ONLY for this specific purpose. That's what is bugging me there.
|
|||
|
|||
I've found a nice addition to this topic:
A series of posts about CLI design. |
|||
|
|||
Edit. :)
|||
|
|||
(14-09-2016, 02:03 PM)jkl Wrote: Well done after I have posted it in another topic.
I crawl the web faster than spiders. ☆*・゜゚・*\(^O^)/*・゜゚・*☆
EDIT: It deals with the programming aspect of the CLI and other things I completely missed in the talk but that were mentioned in the discussion here. Also, this topic is always recurrent, the usual "no news is good news". Pranomostro mentioned it:
(28-05-2016, 10:50 AM)pranomostro Wrote: I heard an interesting idea a while ago, namely that 3 standard files are not enough.
And so did the post author in this section. Obviously, this goes along with the textuality of Unix-like systems: you want to be able to join programs together without messing up the output, so make it as simple as possible and easy to parse.

Here's more:
A Guidelines for Command Line Interface Design (or https://clig.dev/ ), which discusses in depth the generic view of CLIs: what they are, and so what they should do to make things easier, following directly from the definition. A typical answer to what a Unix CLI should adhere to. This is the usual boring stuff, but if you haven't gone through it, check it out.
The infamous Program design in the UNIX environment, which discusses the philosophy and generic design of Unix programs. This is more of a style guide, an analysis of trends that should be avoided or favored.
|||
|
|||
This article was on HN.
|
|||
|
|||
check out venams hit new single
"i crawl the web faster than spiders" get it on itunes and the play store today k emerging underground artist guys |
|||
|
|||
(12-10-2016, 12:52 AM)venam Wrote: This article was on HN.
That's a good post, it's been some time since I stopped checking HN daily. Wonder how you found something relevant.
argonaut · musician · developer · writer · https://www.betoissues.com
|
|||
|
|||
(04-06-2016, 05:01 AM)z3bra Wrote: Why so? Because you shouldn't "force yourself" and simply use what's available? It makes sense to do so. My point was more about the fact that using awk only to print specific columns is not efficient at all. We came to a point where people use awk ONLY for this specific purpose. That's what is bugging me there.
To come back to this: I think there is a reason why we have small programming languages as filters: sed, awk, heck, even regular expressions are a small language. Yes, using awk just to print columns means not using its full capabilities, but I think that is okay. In the book 'The AWK Programming Language' the authors stated that awk was mostly used for quick one-liners, and then continued to say that it was also possible to write full-blown programs in it. So its first purpose was really to be a tool for flexible one-liners.

And if you mean efficient in the sense of 'awk is not fast', I want to discuss that with you. Because I had a really hard time beating an awk 4-liner with an optimised and specialized C program. awk is fast. Give it a try yourself and try to beat awk with C. I found it very hard.

awk is useful for simple tasks like printing columns, validating data and so on, for creating filters on the fly. In my opinion it plays well together with the rest of the environment. I took a look at my history to show that awk can be really useful for that:
Code: $ ./leyland | awk '{ print($2"^"$3"+"$3"^"$2"=="$1) }' | bc | g -v '^1$'
|||
|
|||
I'm definitely not saying that awk is slow or inefficient. It is indeed fast and efficient!
In my case, the only thing I can do with awk is printing a column. Looking at your history, I must admit I have no idea what these one-liners are doing. So in my case, I feel guilty for using awk because I really have no *real* reason to use it. It is, IMO, the same thing as using sed to replace all occurrences of 'a' by 'A' (assuming you have no other use for sed on your system). `tr` would be better suited because it does only this.

I do agree, however, that nowadays memory/diskspace is cheap. So installing sed/awk instead of tr/cut is irrelevant. It's just that, I don't know, if I was offered a magic cooking robot with a hundred features, but I only used it to mix eggs, I'd feel like I was wasting it. And I don't like wasting things. (Ok, the idea of "wasting" awk is stupid, but really, that's close to what I feel!)
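The 'a' to 'A' case from the post above, written out both ways. The function names are made up for illustration; both are plain stdin-to-stdout filters.

```shell
#!/bin/sh
# Same result, two tools: tr only maps characters,
# sed brings a whole editing language for the same job.
upper_a_tr()  { tr a A; }
upper_a_sed() { sed 's/a/A/g'; }
```

Either one can be dropped into a pipeline, e.g. `echo banana | upper_a_tr`; the difference is only in how much machinery is started to do it.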
|||
|
|||
(13-10-2016, 08:53 AM)z3bra Wrote: Looking at your history, I must admit I have no idea what these one-liners are doing.
They were quite specific temporary filters. And awk came in quite handy.
(13-10-2016, 08:53 AM)z3bra Wrote: So in my case, I feel guilty for using awk because I really have no *real* reason to use it. It is, IMO the same thing as using sed to replace all occurences of 'a' by 'A' (assuming you have no other use for sed on your system).
Ah, okay. You have no use for it, so there is no reason for you to use it most of the time.
|||
|
|||
(13-10-2016, 08:53 AM)z3bra Wrote: (Ok, the idea of "wasting" awk is stupid, but really, that's close to what I feel!)
That is also how I feel sometimes: "I can't really call awk for this. It would be overkill". I also feel dumb using scripts that are 80% awk. Why not a script fully in awk with a `#!/usr/bin/awk -f` shebang...

I also like ip-style scripts: a top-level command (ip) has subcommands (address, link...), which also have subcommands... It is very easy to implement in shell script: https://raw.githubusercontent.com/josuah...bin/config
I have a top-level config command and, in the same directory, config-* commands that can be called from the config command. Each config-* command has a first line with a comment describing the command. Without arguments, the config command lists the subcommands with these descriptions; with arguments, it selects the script (partial names allowed: 'config-git' == 'config g') and runs it with the additional arguments. So I can type:
Code: $ config build install tmux
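A rough sketch of such a dispatcher, written from the description above rather than from the linked script. The function name `dispatch`, its argument order, and the choice to read the description from line 2 (right after the shebang) are all assumptions.

```shell
#!/bin/sh
# dispatch DIR PREFIX [SUBCOMMAND [ARGS...]]
# Looks for executables named PREFIX-* inside DIR.
dispatch() {
    dir=$1 prefix=$2
    shift 2
    if [ "$#" -eq 0 ]; then
        # No subcommand given: list each one with its description,
        # assumed to be a "# ..." comment on the second line.
        for script in "$dir/$prefix"-*; do
            [ -f "$script" ] || continue
            name=${script##*/}
            name=${name#"$prefix"-}
            desc=$(sed -n '2s/^# *//p' "$script")
            printf '%s\t%s\n' "$name" "$desc"
        done
        return 0
    fi
    sub=$1
    shift
    # Partial names are allowed: run the first executable whose
    # name starts with the given subcommand.
    for script in "$dir/$prefix-$sub"*; do
        if [ -x "$script" ]; then
            "$script" "$@"
            return
        fi
    done
    printf '%s: unknown subcommand: %s\n' "$prefix" "$sub" >&2
    return 1
}
```

With scripts named `config-build` and `config-git` in some directory, `dispatch "$dir" config` lists both, and `dispatch "$dir" config g status` runs `config-git status`. Taking the first match is a simplification: an ambiguous prefix silently picks whichever name sorts first.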
|||
|
|||
I've recorded a new updated version of this episode, as this was lacking quality.
Here's the transcript:

Show notes
---
https://en.wikipedia.org/wiki/Command-line_interface
http://eng.localytics.com/exploring-cli-best-practices/
http://harmful.cat-v.org/cat-v/unix_prog_design.pdf
http://programmers.stackexchange.com/que...-arguments
http://www.cs.pomona.edu/classes/cs181f/supp/cli.html
http://julio.meroh.net/2013/09/cli-desig...sages.html
http://julio.meroh.net/2013/09/cli-desig...ap-up.html
http://www.catb.org/~esr/writings/taoup/...11s07.html
http://www.catb.org/~esr/writings/taoup/...11s06.html
http://www.catb.org/~esr/writings/taoup/...10s01.html
http://www.catb.org/~esr/writings/taoup/...10s05.html
http://www.catb.org/~esr/writings/taoup/...11s04.html
Writing manpages Perl6 Parsing command line args Python Unix CLI template Docopt Getopt history Command Line Interface Guidelines

And here are the episode links
http://podcast.nixers.net/feed/download....11-251.mp3
Music: https://dexterbritain.bandcamp.com/album...s-volume-1
|||
|
|||
(28-05-2016, 10:50 AM)pranomostro Wrote: The author gave a proposal of two other standard file descriptors: usrin and usrout, which can be used for user in- and output
Interestingly enough, I have a set of POCs in my ~/devel/snippet directory which do just that. In the end, it's not that ugly, really. The solution is simply to open your controlling terminal and use it for user interaction instead of the naive way of reading/writing on stdin/stdout. I hope that's what pick(1) does, and I wish that's what dialog did.
Note that the thread is quite long, so I have not read the whole thing carefully (linear forums make it hard to follow interesting parts of a thread).
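A sketch of the "open the controlling terminal" idea, not taken from those POCs and making no claim about how pick(1) works. The function name `confirm_lines` is made up, and the two optional path arguments exist only so the sketch can be exercised without a terminal; by default both point at /dev/tty.

```shell
#!/bin/sh
# Keep only the stdin lines the user approves.
# Prompts and answers travel over the terminal (fd 4 and fd 3),
# so stdin and stdout stay free for pipeline data.
confirm_lines() {
    ask_in=${1:-/dev/tty}
    ask_out=${2:-/dev/tty}
    while IFS= read -r line; do
        printf 'keep "%s"? [y/N] ' "$line" >&4
        # No answer available (EOF) counts as "no".
        IFS= read -r answer <&3 || answer=
        case $answer in
            [Yy]*) printf '%s\n' "$line" ;;
        esac
    done 3< "$ask_in" 4> "$ask_out"
}
```

In a pipeline such as `ls | confirm_lines | xargs rm`, the questions appear on the terminal even though both ends of the filter are connected to pipes.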
|||
|
|||
Thread Promotion
This is an ever-evolving topic, and there's a lot to cover that hasn't been covered already. While we're reading taoup for the bookclub I realized that there's a lot more to add. For example, for the types of interaction, I've listed above: input, output, sink, pipe. But I'd like to add phillbush's idea of "interactive filters" as opposed to non-interactive ones. I'd also like to emphasize the grouping/bundling of related commands to reduce the cognitive load on users, what some call the subcommand interface. What do you think? What other criteria or ideas should be emphasized to make a good CLI?
|||
|
|||
One tool to make the best CLI programs is to abstract out common usage patterns from different programs. For example, some utilities produce file sizes, either to be read by another program or by a human. Instead of a -h option for different programs to produce human readable values, use a utility, such as z3bra's human, that converts values from its input into human-readable sizes.
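To make the idea concrete, here is a toy filter in that spirit. It is not how human(1) is implemented and does not copy its interface; the name `humanize`, the one-number-per-line input, and the truncating integer division are all assumptions of this sketch.

```shell
#!/bin/sh
# Read byte counts on stdin, one per line, and print them with a unit.
# Integer division only, so 1536 becomes 1K rather than 1.5K.
humanize() {
    while IFS= read -r bytes; do
        case $bytes in
            # Pass anything that is not a plain number through unchanged.
            ''|*[!0-9]*) printf '%s\n' "$bytes"; continue ;;
        esac
        unit=B
        for next in K M G T; do
            [ "$bytes" -ge 1024 ] || break
            bytes=$((bytes / 1024))
            unit=$next
        done
        printf '%s%s\n' "$bytes" "$unit"
    done
}
```

Any program that prints raw byte counts can then be piped through `humanize` when a person is reading, and left alone when another program is.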
Another practice to make good CLI programs is to use idiomatic functions:
I made this table for a post on the book club thread. It lists and describes the types of CLI that can be used in a script. The roguelike interface is ignored, because it is not scriptable.
Code: ┌──────────────────────────────────────────────────────────────────────────┐
I think there are more (sub)categories of command-line interfaces that can be used in a pipeline.

The pretty-printer. Pretty-printers are a subcategory of filters or sources whose output is not parseable; instead, the output is meant to be read by the user, not by another program. Pretty-printers often use the number of $COLUMNS in a terminal to tabulate the output. ls(1) is a pretty-printer: you should not parse the output of ls(1).

The interactive filter. Interactive filters are a subcategory of filters whose parsing of input and generation of output is done by the user, not programmatically by the utility. Examples are pick, smenu and fzf.

The wrapper. Wrappers are utilities whose sole purpose is to call another utility. They programmatically set the environment (either environment variables or command-line arguments) for another utility and run it. An example is xargs(1).

The test. Tests are a subcategory of cantrips or sinks that return a result as exit status, not on standard output. The script checks the exit status ($?) of a test and proceeds according to its value.
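The "test" category is probably the easiest one to sketch. The function below is a made-up example (`is_number` is not an existing utility): it prints nothing and answers only through its exit status.

```shell
#!/bin/sh
# A "test"-style command: silent on stdout,
# exit status 0 if the argument is a non-empty string of digits, 1 otherwise.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A script would use it as `if is_number "$1"; then ...; fi`, or check `$?` right after the call, exactly as it would with test(1).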
|||
|
|||
Quote: Same goes for program names, use mnemonic, names that are obvious and
I wrote that previously, but I think it definitely needs to be said again as I keep finding new tools that choose the most unmemorizable names. I'm sure you all have examples of this phenomenon.
|||
|
|||
(21-06-2021, 06:23 AM)phillbush Wrote: Instead of a -h option for different programs to produce human readable values, use a utility, such as z3bra's human, that converts values from its input into human-readable sizes.
Thanks my dude 🖤 😄
Link for unaware people: human(1)
|||
|
|||
Heh:
Quote: The fundamental problem with vi is that it doesn’t have a mouse and therefore you’ve got all these commands.
Bill Joy, 1984. https://web.archive.org/web/200607010830...joy84.html
|||