Nixers Book Club - Book #4: The Art of UNIX Programming - Printable Version


Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 26-04-2021

As proposed here, the next book of the Nixers Book Club is gonna be «The Art of UNIX Programming».

[Image: The_Art_of_Unix_Programming.jpg]

Quoting from Wikipedia:
Quote:The Art of Unix Programming by Eric S. Raymond is a book about the history and culture of Unix programming from its earliest days in 1969 to 2003 when it was published, covering both genetic derivations such as BSD and conceptual ones such as Linux.

The author utilizes a comparative approach to explaining Unix by contrasting it to other operating systems including desktop-oriented ones such as Microsoft Windows and the classic Mac OS to ones with research roots such as EROS and Plan 9 from Bell Labs. The book was published by Addison-Wesley, September 17, 2003, ISBN 0-13-142901-9 and is also available online, under a Creative Commons license with additional clauses.

The book is online for free here (just follow the links).

Some chapters are really long, others are not. I think we can do 1 to 2 chapters per week.
As usual, our sessions will take place on Saturdays.
Our first session will be on May 8 (in two weeks), when we will discuss the first two chapters.
See you then!


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 01-05-2021

Chapters 1 and 2 are about the underlying philosophy and culture of the
Unix movement.

In my opinion, these two lines from the reading sum it up really well:

"Mechanism, not policy"
"Unix + Fun hacker culture + open source movement, community-building devices"

I really like the way this is written, a bit distanced from the topic
to be able to look at it from all sides. A middle-way.
No heavy partisanship or tribal writing, and seeing the positive and
negative aspects of the different ideas and ways of perceiving things.

The "mechanism, not policy" idea reminds me of the last book we read, the
Wayland book, which took this to the next level by defining only the
protocol.
The same goes for PipeWire, for those who have gotten into it.


Quote:And a vicious circle operates; the competition thinks it has to compete with chrome by adding more chrome.

I've shared that one on IRC, because when taken out of context today
that sentence is super funny.

Even McIlroy was onto it.

Quote:The original HTML documents recommended “be generous in what you accept”, and it has bedeviled us ever since because each browser accepts a different superset of the specifications. It is the specifications that should be generous, not their interpretation.

-- Doug McIlroy

McIlroy adjures us to design for generosity rather than compensating for inadequate standards with permissive implementations. Otherwise, as he rightly points out, it's all too easy to end up in tag soup.

Another sentence that caught my attention and which we should reconsider
today with all the tech-paparazzi, drama, absolutism, and purism
happening:

Quote:If someone has already solved a problem once, don't let pride or politics suck you into solving it a second time rather than re-using. And never work harder than you have to; work smarter instead, and save the extra effort for when you need it. Lean on your tools and automate everything you can.

Torvalds's cheerful pragmatism and adept but low-key style catalyzed an astonishing string of victories for the hacker culture in the years 1993–1997,
Against purists and absolutists.

The history chapters are written sort of like a sci-fi, à la Star Wars trilogy.
That resonates well in creating a kind of folkloric aspect to the topic.

The writing was advanced for its time, predicting that open source could
be used as a marketing tool, a brand of differentiation.


Quote:The other (and more important) intention behind “open source” was to present the hacker community's methods to the rest of the world (especially the business mainstream) in a more market-friendly, less confrontational way.

The open-source movement is winning by commoditizing software. To prosper, Unix needs to maintain the knack of co-opting the cheap plastic solution rather than trying to fight it.

This reminds me of all the cheap boards that are getting sold today,
these could be the new "personal devices" of tomorrow. Or to stay closer
to reality: mobile phones.

Overall, nice setup and writing style, with a good tone.
Lots of cultural anecdotes and historical trivia in the first chapters,
but it's good to review them in retrospect. Personally, I like that way
of describing things.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - ckester - 01-05-2021

(26-04-2021, 06:22 PM)phillbush Wrote: Our first session will be May, 8 (in two weeks), when we will discuss the first two chapters.
See you there!

Where is "there"? Is the discussion on the IRC channel?


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 01-05-2021

(01-05-2021, 04:53 PM)ckester Wrote: Where is "there"? Is the discussion on the IRC channel?
Sorry, I meant "see you then"/at that time.
The discussion will be here, on this thread.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - ckester - 01-05-2021

OK, got it. Pulling my copy off the shelf for some after-dinner reading.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 08-05-2021

Chapter 1 is about the philosophy of the UNIX culture.
Chapter 2 is about the history of the UNIX culture.

The philosophy, or unwritten tradition, of the UNIX culture is described in the first chapter.

The chapters summarize how the UNIX culture, built around free and open development, was vital to the durability of UNIX (and its descendants), even after the civil wars of AT&T vs Berkeley and the war among different UNIX vendors. As the book says:

Quote:Today the UNIX community itself has taken control of the technology and marketing, and is rapidly and visibly solving UNIX's problems.

UNIX, either by its design or by the culture built around it, is responsible for several good things: the open-source and free software movements (which will be described in the second chapter); portable, composable and flexible software; the hacker culture (UNIX is fun to hack); and the Internet.

In the section “What Unix Gets Wrong” we see the old topics already covered in The UNIX HATERS Handbook: “Unix files have no structure above byte level; file deletion is irrevocable; job control is botched; etc.” And also, the X Window System. As the book says, most of those flaws reflect UNIX's heritage as an operating system designed primarily for technical users.

The book then tries to describe the UNIX philosophy. First, by the words of Doug McIlroy, Rob Pike and Ken Thompson. Then, the author summarizes the philosophy in seventeen rules, which are explained in detail in the following sections.

(01-05-2021, 11:11 AM)venam Wrote: "Mechanism, not policy"

This is also what I get from the first chapter. The rules guide how to design software, and the “mechanism, not policy” (or “separate interfaces from engines”) maxim is present in most of them, along with simplicity of interface and the preservation of programmers' time and effort (both for the first programmer and for future programmers).

Quote:Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming
...
Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.

That's a rule I should have known before. Some of my algorithms are unnecessarily complex because of unorganized data structures.
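
A minimal sketch of the Rule of Representation (an illustration, not taken from the book): the knowledge about Roman numerals sits in a table, and the loop that walks it stays dumb. The names ROMAN and to_roman are made up for this example.

```python
# All the domain knowledge lives in this table (value, symbol), largest first.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Convert a positive integer to a Roman numeral by walking the table."""
    parts = []
    for value, symbol in ROMAN:
        # How many times does this symbol fit? The remainder carries on.
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)
```

Supporting a different notation would mean editing the table, not the loop, which is the point of the rule.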

The second chapter is about the history of the UNIX culture, the Internet culture, the hacker movement and the open-source movement (with all of its factions).

(01-05-2021, 11:11 AM)venam Wrote: This reminds me of all the cheap boards that are getting sold today, these could be the new "personal devices" of tomorrow. Or to stay closer to reality: mobile phones.
I have never thought of that. I hope that turns out to be true in the future.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - ckester - 09-05-2021

(01-05-2021, 11:11 AM)venam Wrote: Another sentence that caught my attention and which we should reconsider
today with all the tech-paparazzi, drama, absolutism, and purism
happening:

Quote:If someone has already solved a problem once, don't let pride or politics suck you into solving it a second time rather than re-using. And never work harder than you have to; work smarter instead, and save the extra effort for when you need it. Lean on your tools and automate everything you can.

Torvalds's cheerful pragmatism and adept but low-key style catalyzed an astonishing string of victories for the hacker culture in the years 1993–1997,
Against purists and absolutists.

How many reimplementations in Rust or Go of common utilities are there? Hacker culture does seem to have a roll-your-own philosophy, despite this rule, if only because breaking it can provide a learning opportunity.

venam Wrote:This reminds me of all the cheap boards that are getting sold today,
these could be the new "personal devices" of tomorrow. Or to stay closer
to reality: mobile phones.

Raspberry Pis, Odroids, etc.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 15-05-2021

It's Saturday, let's bump this thread! 😁

(09-05-2021, 01:17 PM)ckester Wrote: How many reimplementations in Rust or Go of common utilities are there? Hacker culture does seem to have a roll-your-own philosophy, despite this rule, if only because breaking it can provide a learning opportunity.
I personally think re-implementations are good too in a sense. They allow us to revisit old problems. But reimplementation just for its own sake, with nothing added, or for political/ethical/moral reasons is not worth it.

Anyway, into this week's chapters, 3 and 4.



Chapter 3 is the last chapter in the philosophy section, and it is about contrasts.

I like the way esr describes stuff, it's always like a sort of
questioning, a Socratic-like method, the kind of writing that makes you
think about other ways to see things.

In that sense, there's a lot of comparison of different OSs, rotating
around the topic to try to get a clearer idea, describing Unix using
anti-Unix terms, telling what it is by what it is not.

The CLI facilities section is definitely a cool one, it resonates a lot.

Quote:If the CLI facilities of an operating system are weak or nonexistent,
you'll also see the following consequences:

Programs will not be designed to cooperate with each other in unexpected
ways — because they can't be. Outputs aren't usable as inputs.

Remote system administration will be sparsely supported, more difficult
to use, and more network-intensive.

Even simple non-interactive programs will incur the overhead of a GUI or
elaborate scripting interface.

Servers, daemons, and background processes will probably be impossible
or at least rather difficult, to program in any graceful way.

To design the perfect anti-Unix, have no CLI and no capability to script
programs — or, important facilities that the CLI cannot drive.

Apart from this, and even through the previous and next chapter, there's
the overall theme of having users in the driving seat, a user-centric
system that also has a social aspect.

Accordingly, the intended audience of the OS should be
considered. Unix-wise, the barrier to development should especially be
decreased, here again the social aspect: cost and time.

The visit of the different "classic" OS puts a lot of things in
perspective. I'd advise anyone who hasn't read it to give it a go.


NB: On the NT webserver in kernel space: We discussed that on IRC in the
past. We had a fun discussion about the pros and cons of things like
kHTTPd, TUX, http.sys, etc.
These days, it makes no sense to have these.

BeOS thinking sort of reminds me of the whole snapshot fs we have today,
like ZFS and others. It's still alive in Haiku as far as I know.

Quote:Indeed, a substantial fraction of the Linux user community is understood
to be wringing usefulness out of hardware as technically obsolete today
as Ken Thompson's PDP-7 was in 1969. As a consequence, Linux applications
are under pressure to stay lean and mean that their counterparts under
proprietary Unix do not experience.

Times have changed. Here again, we have to join tech with the people
making it. Survivability can be untied from the hardware, but it is
still tied to social groups, devs, and companies.

Then comes chapter 4 about modularity, the first chapter in the design part.

I really didn't know the Unix early devs were the first to apply
modularity in software, or I had forgotten.

Quote:Dennis Ritchie encouraged modularity by telling all and sundry that
function calls were really, really cheap in C. Everybody started writing
small functions and modularizing. Years later we found out that function
calls were still expensive on the PDP-11, and VAX code was often spending
50% of its time in the CALLS instruction. Dennis had lied to us! But it
was too late; we were all hooked...

-- Steve Johnson

He did well to lie. Wouldn't that be like electronjs devs today saying
we have enough RAM though?

Even in modularity, we have to think of the barrier of entry, the social
aspect, and the human aspect.
The book dives into the idea of creating human-first APIs, described in
everyday language first, and then thinking about how to apply them.
Similarly, the code-size should appropriately fit the human's cognitive
constraints: not too small and fragmented, and not too big.

Quote:Compactness is the property that a design can fit inside a human being's
head. A good practical test for compactness is this: Does an experienced
user normally need a manual? If not, then the design (or at least the
subset of it that covers normal use) is compact.

This reminds me of The Design of Everyday Things by Donald Norman,
which has a focus on how instinctive and intuitive interfaces are.

Even the pragmatic approach is very human.

Quote:Often, top-down and bottom-up code will be part of the same
project. That's where ‘glue’ enters the picture.

Quote:The thin-glue principle can be viewed as a refinement of the Rule of
Separation. Policy (the application logic) should be cleanly separated
from mechanism (the domain primitives), but if there is a lot of code
that is neither policy nor mechanism, chances are that it is accomplishing
very little besides adding global complexity to the system.

One thing this also reminds me of is Dijkstra's 1972 ACM Turing Award
lecture, where he describes approaching the task of software development
as humble programmers.

Quote:The competent programmer is fully aware of the strictly limited size
of his own skull; therefore he approaches the programming task in full
humility, and among other things he avoids clever tricks like the plague.

I'm not sure I really buy the OO talk because in a lot of cases it does
exactly what is described above and reduces cognitive loads.

The end of chapter 4 reminds me of all the pseudo-discussions people
have about optimizations, and code metrics, static or not.
A new metric that I mentioned recently on IRC and
that is more human is a social coding hotspot metric such as
biomarkers.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 15-05-2021

Chapter 3 compares the UNIX philosophy and the design of UNIX with the philosophy and design of other operating systems.

The first section describes the fundamental design ideas of the UNIX operating system, and develops the anti-UNIX, a system built on the opposite of those ideas.
The anti-UNIX has no unifying idea (“just an incoherent pile of ad-hoc features”), has no internal boundaries to protect the user and the processes, has record structure and attributes for files and relies on binary file formats, has deficient scripting capabilities, claims to know better than its users what is good for them, and is non-hackable.
In the end, I got the intuition that the author was describing MS-DOS as the anti-UNIX.

The chapter then describes other operating systems.
I know nothing or little about some of them.
That's the first time I heard about VMS, VM/CMS and MVS.
BeOS and OS/2 I know only a little about.

Reading this section I feel bad about BeOS; and pissed off about Windows NT.
BeOS seemed to be a good operating system, with the best of the UNIX and classic MacOS worlds.

The next section describes how UNIX made the transition from a server operating system to a client operating system, from big machines to PCs (and how evolution in the opposite direction is more difficult).

Chapter 4 is the first about the design of UNIX programs.

The chapter advocates designing encapsulated, discrete modules with orthogonal APIs.
The modules should be neither small and numerous nor large and few. The first section talks about module density and how bug density varies with module size.
The chapter argues that modules and their interfaces should be intuitive to the human mind.
It also argues that each module should be compact and have a “strong single center” (a single algorithm or general approach that defines the module).

I really like how the book cites other works through the text and the footnotes. I am collecting all the citations to check them later.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 22-05-2021

Chapters 5 and 6 are about textuality and transparency, topics that are
inherently related to one another.

Chapter 5 - Textuality

I like the framing of problems that esr uses.
For protocols, he's asking it from the point of view of
communication/transmission of data between computers or its
storage. Marshalling and unmarshalling it.

The requirements/attributes of interest for that are laid down:

- Interoperability
- Transparency
- Extensibility
- Storage or Transaction economy

As a recurrent theme, we go back to the human and social aspects. The
premise here is that text is human, has a low cognitive load, is future
proof, and encourages interoperability.
Parsed by eyeball == Many good things happen.
But still, we can see its limitations in particular cases.
Text is easier on us humans, but might not be on the machine.

The learn-by-example method is great, again I advise anyone to give it
a look. There's a lot of good textual format examples.

There are a lot of good RFCs linked, and this got me thinking
that I should read more of them. I'll probably put some time aside to
go over the classic ones as they are really great in protocol design,
and personally, I like going over such docs.


Chapter 6. Transparency

As in the previous chapter, we start with the idea that textuality is more
human and thus promotes both transparency and discoverability.

Transparency = Comprehend the whole, no distance between the whole design and us.
Discoverability = Comprehend the part, be able to introspect it.

Quote:The lesson here is clear. Dumbing down your UI is only the half-smart
thing to do. The really smart thing is to find a way to leave the details
accessible, but make them unobtrusive.

I really like that. I wish it was like this more often.


I didn't know about SNG, this would have been way easier when I was playing with PNG format.

Quote:The gains here go beyond the time not spent writing special-purpose code
for manipulating binary PNGs
You don't say 😂.

Also on the image format, corkami has good resources: https://github.com/corkami/formats/tree/master/image

I love the idea of textualizers, binary as readable format. I'm sure
many projects would benefit from these. Actually, in my day to day work,
I use these a lot.

Well, we did say it depended on the case:
Quote:The design superficially contradicts the advice we gave in Chapter 5
against binary caches, but this is actually the extreme case in which
that's a good tactic. Edits to the text masters are very rare — in fact,
Unixes normally ship with the terminfo database precompiled and the text
master serving primarily as documentation. Thus, the synchronization
and inconsistency problems that would normally militate against this
approach almost never arise.

Discoverability is a cool topic that I'm guilty of forgetting by not
adding a verbose flag for introspection.
"The ways in which your code is a communication to other human beings"
Just recently, I took a detour to learn GStreamer, and it has some of the
best documentation and introspection I've seen in any project.
It even has command line tools to get information about the possible elements.
Code:
$ gst-inspect-1.0 uridecodebin


Quote:Software is maintainable to the extent that people who are not its author
can successfully understand and modify it. Maintainability demands more
than code that works; it demands code that follows the Rule of Clarity
and communicates successfully to human beings as well as the computer.

That's probably one of the most important sentences I've read.
Great chapter!


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 22-05-2021

Chapter 5 is about text files for data storage and text protocols.
In this chapter, as in the next ones, the author describes several case studies.
He also describes the metaformats for file formats, that is, their general syntax.
There are several metaformats, such as delimiter-separated values (as /etc/passwd), RFC 822, cookie jar, XML, etc.
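
As a rough illustration of the delimiter-separated metaformat, here is a sketch that parses one passwd(5)-style line. The sample line in the usage note and the PasswdEntry / parse_passwd_line names are invented for the example; it assumes the usual seven colon-separated fields and does no error handling.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    """One record of a passwd(5)-style file: seven colon-separated fields."""
    name: str
    password: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split a single line on ':' and convert the numeric fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)
```

For example, parse_passwd_line("alice:x:1000:1000:Alice Example:/home/alice:/bin/sh") gives an entry whose uid is the integer 1000 and whose shell is "/bin/sh". The whole parser is one split(), which is a large part of why this metaformat is so popular.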

He also describes some general conventions for the metaformats.

Quote:Do not allow the distinction between tab and whitespace to be significant.
This is a recipe for serious headaches when the tab settings on your users' editors are different; more generally, it's confusing to the eye. Using tab alone as a field separator is especially likely to cause problems; allowing any run of tabs and spaces to be a field separator, on the other hand, works well.

Well, I'm doing it wrong then.
Both xprompt and xmenu, as well as other utilities of mine, use tab as both delimiter and indentation character. I thought it was a good idea. Other people also had that idea. Tab as delimiter might be a good option for data serialized by a program for the user to read, but not for data written by the user for a program to parse. It's too late now, my programs already have this flaw. I will follow the other general conventions: support backslash escape sequences for nonprintable characters, use # for comments, etc.
As the book says, tab as a delimiter dates from the beginning of the UNIX tradition (and survives in utilities like cut(1) and paste(1)).
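
To illustrate the book's advice, here is a small sketch of a line parser that treats any run of tabs and spaces as a field separator and accepts # comments. The function name parse_record is hypothetical, and the comment handling is deliberately naive (no backslash escapes, so a # inside a field is not supported).

```python
from typing import List, Optional


def parse_record(line: str) -> Optional[List[str]]:
    """Split one line into fields on any run of spaces and tabs.

    Returns None for blank lines and for lines holding only a '#' comment.
    """
    # Drop everything after the first '#', then surrounding whitespace.
    text = line.split("#", 1)[0].strip()
    if not text:
        return None
    # str.split() with no argument splits on any run of whitespace.
    return text.split()
```

With this, "a\tb" and "a   \t b" parse to the same two fields. A tab-only split behaves differently: "a\t\tb".split("\t") yields an empty middle field, which is the kind of surprise the book warns about. The price is that a field can no longer contain a space.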

Chapter 6 is about transparency, discoverability and maintainability in the design of programs.
The chapter has some case studies, which include KDE GUI utilities and CLI utilities.

Quote:Discoverability is about reducing barriers to entry; transparency is about reducing the cost of living in the code.

The chapter explains about transparency and discoverability both for user interfaces and for code.
A transparent and discoverable UI is intuitive.
A transparent and discoverable code is maintainable.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - ckester - 22-05-2021

(22-05-2021, 09:24 AM)phillbush Wrote: Well, I'm doing it wrong then.

Heh. I'm getting the same feeling a lot while reading this book!


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 29-05-2021

Alright, whoever is reading, it's time for sharing what we've learned or have to comment and discuss on the last 2 chapters.

Here's my review and comments.

Chapter 7: Multiprogramming

This chapter is really about IPC, how to communicate between programs,
or rely on other programs to achieve your goal.

It's no surprise that it focuses on separation of concern, and to do
things as simple as possible. Self-evident? Maybe not.
"Programming in the real world is about managing complexity."

So the whole chapter builds upon going through examples of different
types of IPC and we're told to always choose the simplest one
that can handle our scenario. Also, self-evident?

Some of the IPC we learn about: pipes/redirection, tempfiles, signals,
wrappers, slave processes, sockets, shared memory, etc.

Quote:Despite occasional exceptions such as NFS (Network File System) and
the GNOME project, attempts to import CORBA, ASN.1, and other forms of
remote-procedure-call interface have largely failed — these technologies
have not been naturalized into the Unix culture.

Say hello to d-bus!

Quote:One is that RPC interfaces are not readily discoverable

D-bus solved that through introspectable interfaces. But it's hated for
the same reasons as other RPC methods.

NB: I had no clue the X server was not threaded.

Chapter 8. Minilanguages

So after having visited how to interact and rely on the power of other
tools, we're diving into minilanguages, which are very popular on Unix.

They're presented by viewing them as types of DSL, each with their
own complexity and categorization.

Quote:The second right way to get to a minilanguage design is to notice
that one of your specification file formats is looking more and more
like a minilanguage

If your config is a whole language, it starts this cycle of complexity:
https://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html

I like the idea that these DSL could be "accidentally Turing-complete"
and incidentally, in the future, some crazy person is going to abuse
this to create issues.

On glade: I never really enjoyed it, it was too cumbersome
when I tried it. I did have a go with a similar language called
kivy.

Quote:Case Study: JavaScript
I've heard of that minilanguage before, but where.

Again the rule here is to keep things as simple as possible, not to argue
that everyone should have a minilanguage, but on the contrary to see in
which cases they are valid.

I think this is really classic stuff that most people should read
at least once. It's good for "software architects" too.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 29-05-2021

Chapter 7 is about multiprocess programming (or, as the book calls it, multiprogramming).

Quote:UNIX encourages us to break our programs into simpler subprocesses, and to concentrate on the interfaces between the subprocesses.

It does that in three ways: by making process-spawning cheap, by providing IPC methods, and by encouraging the use of simple text protocols.

The IPC methods are the following:
  • Shelling out (call a program, with system(3) for example).
  • Bolting on (call a program with popen(3)).
  • Pipelining (run filter programs concurrently on a pipeline).
  • Wrapping (create a new interface for a called program).
  • Tempfiles (communication via a temporary file).
  • Signals (signals were originally designed into UNIX as a way for the operating system to notify programs of certain errors and critical events, not as an IPC facility).
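
Bolting on and pipelining can be sketched with Python's subprocess module. This is only an illustration under assumptions: the two child "filters" are tiny Python one-liners (UPPER and COUNT, names made up here) so that the example does not depend on which utilities are installed, and the inputs are small enough that pipe buffers never fill up.

```python
import subprocess
import sys

# Two tiny filter programs, run as child Python processes.
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
COUNT = [sys.executable, "-c",
         "import sys; print(len(sys.stdin.read().split()))"]


def bolt_on(text: str) -> str:
    """Bolting on: feed input to one child and collect its output."""
    done = subprocess.run(UPPER, input=text, capture_output=True,
                          text=True, check=True)
    return done.stdout


def pipeline(text: str) -> str:
    """Pipelining: two children run concurrently, joined by a pipe."""
    first = subprocess.Popen(UPPER, stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, text=True)
    second = subprocess.Popen(COUNT, stdin=first.stdout,
                              stdout=subprocess.PIPE, text=True)
    first.stdout.close()   # the parent no longer needs its copy of the pipe
    first.stdin.write(text)
    first.stdin.close()    # EOF lets the first filter finish
    out, _ = second.communicate()
    first.wait()
    return out
```

bolt_on("hello") returns "HELLO", and pipeline("a b c") returns the word count produced by the second filter. Neither child knows anything about the other; the only interface is a text stream, which is what the chapter is arguing for.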

Quote:GNOME's main competitor, KDE, started with CORBA but abandoned it in their 2.0 release. They have been on a quest for lighter-weight IPC methods ever since.
And thus D-Bus was born.

Quote:The combination of threads, remote-procedure-call interfaces and heavyweight object-oriented design is especially dangerous. [...] if you are ever invited onto a project that is supposed to feature all three, fleeing in terror might well be an appropriate reaction.

Chapter 8 is about domain-specific languages.

Quote:more expressive languages mean shorter programs and fewer bugs

I have been using a little language, awk, for much of the stuff I would otherwise do in C. In particular, after playing with coding styles, I wanted to write a small indent(1) program. At first I thought of forking OpenBSD's indent(1), then I thought of using lex(1) for the lexical analyzer. Then I thought, why not awk? It would be hacky, but way more feasible.

The book says there are two right ways to develop a domain-specific language (design the language from the beginning, or notice that a specification file format looks more and more like a DSL) and a wrong way (extend the program towards a DSL).
We see several examples of DSL: make, awk, m4 (which I have never used), glade, troff, yacc, lex, etc.

Quote:There are a few embarrassing exceptions; notably, tbl(1) defaults to using a tab as a field separator between table columns, replicating an infamous botch in the design of make(1) and causing annoying bugs when editors or other tools invisibly change the composition of whitespace.
Chatting with freem, he convinced me that tab should be used for TABles.
I do not consider tbl(1), make(1) and, in my case, xprompt(1) and xmenu(1) to have a botched input format anymore.

The book gives a pic(1) example. I remember the first time I had to use pic(1) and troff(1), in an attempt to get rid of LaTeX.

From the examples in the book, I use awk, troff and pic. m4 and PostScript are the ones I want to learn.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 05-06-2021

It's Saturday folks!

Chapter 9: Generation

Still in line with the theme of the book, the chapter is about taking
a data-driven approach to programming, focusing on data representation
that can easily be visualized and understood instead of the algorithm.

The tree-like structure of the config reminds me of ALSA
configuration. However, ALSA conf is very opaque, and is "incidentally"
Turing complete.

The ascii table reminds me of the talk by mort in the nixers conf 2020 about
including binary data in C executables.

Chapter 10: Configuration

I like how configurations are described as the "startup environment",
it makes it way more explicit what this is about and how we should think
about it.
Unless the configuration is continuously read, but then the question is:
Would that be a good use of configuration?

The big question of the chapter is: What should be configurable? And the
Unix instant reply would be "everything", but that is too overwhelming
for novices.
So the author asks the reverse question instead, "what things should not
be configurable?"

Then, again with the theme, we take an approach of least surprise,
no burden, and human-first.

"Run-control files" is a much cooler term for startup config files
and dotfiles.

Environment variables are also well described as "global context, what
is available, and should be available, everywhere".

On command-line options, there's another category that wasn't considered:

The "sub tools category", the programs that start with a name such
as openssl and then followed by a sub command. The sub command itself
takes flags.
I can think of git, openssl, the ImageMagick fork called GraphicsMagick ("gm"),
busybox (maybe?), and others.
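
A sketch of that sub-command style using Python's argparse: a top-level program with a global flag, and sub-commands that each take their own flags. The program name "tool" and its sub-commands are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical 'tool' with two sub-commands."""
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="global flag, given before the sub-command")
    sub = parser.add_subparsers(dest="command", required=True)

    add = sub.add_parser("add", help="add an item")
    add.add_argument("name")
    add.add_argument("-f", "--force", action="store_true")

    ls = sub.add_parser("list", help="list items")
    ls.add_argument("-l", "--long", action="store_true")
    return parser
```

Parsing ["-v", "add", "-f", "x"] sets verbose and force and selects the "add" command, while "-f" given to "list" would be rejected. Each sub-command gets its own flag namespace, which keeps the single-letter flags short without making them clash.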

There's also the "split programs framework" approach: when a piece of
software installs a bunch of other sub-tools as a framework.
Either all the tools in the "framework" start with the same
prefix, fc- for fontconfig for example, or they come
under a single tool to avoid confusion.
Sometimes it's annoying when software installs subtools in that
"framework" but they all have very different names: it makes it hard
to find related functionality, and discoverability is very low.
Again, I'd say this is an observation in tune with the book: ease of
use for human brains.

The list of single char flags with their expected meaning should
definitely be consulted when writing CLI software.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 05-06-2021

Chapter 9 is a short chapter about the rule of representation
(fold knowledge into data, so program logic can be stupid and robust).

The chapter can be summarized by its final paragraph:

Quote:Do as little work as possible. Let the data shape the code. Lean on your tools. Separate mechanism from policy. Expert Unix programmers tend to see possibilities like these quickly and automatically. Constructive laziness is one of the cardinal virtues of the master programmer.

Chapter 10 is about configuration, in particular startup-environment queries, that is, configuration read from the environment at program initialization.

Some of my programs are configurable, so I will share my experience and thoughts on them on this topic.

The chapter begins stating what should (not) be configurable.
• Do not provide configuration for what can be detected at runtime.
• Do not provide configuration for optimization.
• Do not provide configuration for what can be done with other programs (via a pipeline, for example).
• Consider not providing configuration for cosmetic interface features.
• Consider not providing configuration that can be integrated in the program's normal behavior in an innocuous way that would make the option unnecessary.
• Consider not providing configuration.

On the first and second items, xmenu provides an optimization switch (-i) that disables image cache initialization when the user is not using icons, so that it starts up faster.
In fact, I could detect whether an icon is specified in the input: if one is, initialize the image cache; if none is, skip the initialization and start faster.
I'll remove the -i option. It is an optimization option for something that can be detected at runtime.
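
A rough sketch of that kind of runtime detection, in Python rather than xmenu's actual C code. It assumes, for illustration only, that menu entries are tab-separated lines and that an icon field is marked with an "IMG:" prefix; the function name is hypothetical.

```python
def needs_image_cache(lines, icon_prefix="IMG:"):
    """Return True if any menu entry carries an icon field.

    Assumption: entries are tab-separated and icon fields start with
    icon_prefix. Adjust both to the real input format.
    """
    for line in lines:
        for field in line.rstrip("\n").split("\t"):
            if field.startswith(icon_prefix):
                return True
    return False
```

With a check like this the program decides by itself whether to pay the initialization cost, and the user has one option less to learn.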

Since xmenu got popular, I have been including several features users asked for. Not implementing features is hard.

Also, proliferating options make test coverage more complex.
Quote:Unless it is done very carefully, the addition of an on/off configuration option can lead to a need to double the amount of testing. Since in practice one never does double the amount of testing, the practical effect is to reduce the amount of testing that any given configuration receives. Ten options leads to 1024 times as much testing, and pretty soon you are talking real reliability problems.
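
The arithmetic behind that quote can be checked with a small sketch (Python, for illustration; the function names are made up): n independent on/off options give 2^n distinct configurations to test.

```python
from itertools import product


def configurations(option_names):
    """Yield every on/off combination of the given options as a dict."""
    for values in product((False, True), repeat=len(option_names)):
        yield dict(zip(option_names, values))


def count_configurations(n_options):
    # Each independent on/off option doubles the number of configurations.
    return 2 ** n_options
```

For ten options count_configurations(10) gives the 1024 of the quote.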

The chapter then explains where configuration should be read from.
• Run-control files (in /etc and dotfiles in ~/).
• Environment variables.
• Argument options.

xmenu also reads from X-specific environment variables, the X resources.

Quote:When thinking about which mechanism to use to pass configuration data to a program, bear in mind that good Unix practice demands using whichever one most closely matches the expected lifetime of the preference. Thus: for preferences which are very likely to change between invocations, use command-line switches. For preferences which change seldom, but that should be under individual user control, use a run-control file in the user's home directory.

I had that in mind when writing xprompt. Font, color and interface options do not change between invocations, so I read them only via X resources. The prompt modes (argument mode (-a) and password mode (-p)) are more likely to change between invocations, so I read them only via command line options.

In particular, I tend to avoid the first configuration source (run-control files). They require implementing a parser and are only necessary for complex option systems; if the program's option system needs only a couple of on/off switches and a few values, environment variables (or X resources) should be used.
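
A minimal sketch of matching the configuration source to the lifetime of the preference, written in Python for illustration. It is not xprompt's code: the program name "myprompt", the MYPROMPT_* environment variables and the default values are all hypothetical, and environment variables stand in for X resources.

```python
import argparse

# Built-in defaults for preferences that seldom change.
DEFAULTS = {"font": "fixed", "color": "white"}


def load_config(argv, environ):
    config = dict(DEFAULTS)

    # Seldom-changing preferences come from the environment
    # (hypothetical MYPROMPT_FONT and MYPROMPT_COLOR variables).
    for key in DEFAULTS:
        value = environ.get("MYPROMPT_" + key.upper())
        if value:
            config[key] = value

    # Preferences likely to change between invocations are flags.
    parser = argparse.ArgumentParser(prog="myprompt")
    parser.add_argument("-p", dest="password", action="store_true")
    args = parser.parse_args(argv)
    config["password"] = args.password
    return config
```

In a real program this would be called as load_config(sys.argv[1:], os.environ); no run-control file and no parser of its own are needed.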


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 05-06-2021

(05-06-2021, 03:25 AM)venam Wrote: The "sub tools category": programs that start with a name such
as openssl, followed by a sub command. The sub command itself
takes flags.
I can think of git, openssl, the ImageMagick fork GraphicsMagick ("gm"),
busybox? and others.

There's also the "split programs framework" approach: when a piece of
software installs a bunch of other sub-tools as a framework.
Either all the tools in the "framework" start with the same
prefix, fc- for fontconfig for example, or they come
under a single tool to avoid confusion.
Sometimes it's annoying when software installs subtools in that
"framework" but they all have very different names; it makes it hard
to find related functionality, and discoverability is very low.
Again, I'd say this is an observation in tune with the book: ease of
use for human brains.

plan9 does subtools in a very elegant way: as binaries in sub-directories in your $path (which, in plan9, is merged into /bin/).
Rather than a master command git(1) having subcommands clone, log, etc., they are all grouped in a subdirectory of /bin (/bin/git). The system looks in subdirectories of /bin/ and calls the binary accordingly. Here's an example from the 9front port of git:

Code:
git/clone git://git.eigenstate.org/ori/mc.git
git/log
cd subdir/name
git/add foo.c
diff bar.c $repo/.git/fs/HEAD/
git/commit foo.c
git/push



RE: Nixers Book Club - Book #4: The Art of UNIX Programming - ckester - 05-06-2021

(05-06-2021, 09:49 AM)phillbush Wrote: The chapter then explains where configuration should be read from.
• Run-control files (in /etc and dotfiles in ~/).
• Environment variables.
• Argument options.

This is probably an appropriate place to ask: What do you guys think of suckless-style config.h files and compile-time-only configuration?

It's an approach that will frustrate users who aren't coders (admittedly not the target audience) but I kinda like it from a minimalist perspective.

They're an alternative to run-control or dotfiles for preferences which aren't likely to change between invocations and have the advantage of not requiring parsing routines in the code.

The main disadvantage I can see is, even if all the users are coders capable of making the desired changes themselves, they will each have their own copy of the program. This might not be a problem, however, given the storage space available on most systems today.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 05-06-2021

(05-06-2021, 02:12 PM)ckester Wrote:
(05-06-2021, 09:49 AM)phillbush Wrote: The chapter then explains where configuration should be read from.
• Run-control files (in /etc and dotfiles in ~/).
• Environment variables.
• Argument options.

This is probably an appropriate place to ask: What do you guys think of suckless-style config.h files and compile-time-only configuration?

It's an approach that will frustrate users who aren't coders (admittedly not the target audience) but I kinda like it from a minimalist perspective.

They're an alternative to run-control or dotfiles for preferences which aren't likely to change between invocations and have the advantage of not requiring parsing routines in the code.

The main disadvantage I can see is, even if all the users are coders capable of making the desired changes themselves, they will each have their own copy of the program. This might not be a problem, however, given the storage space available on most systems today.

Parsing configuration files requires more code. I can see the advantage of using a config.h for a project that privileges simple code. However, the system provides configuration methods that do not require parsing: command-line arguments, environment variables and X resources, solutions which are not explored by suckless tools (except by patches). In my programs, I explore those methods for configuration.

Also, as movq said
Quote:I think config.h only makes sense for programs with very few external libraries.
Which is the case for suckless tools.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 12-06-2021

It's this time of the week again!


Chapter 11: Interfaces


I like how interfaces are defined as "the ways programs communicate with
humans and machines".
As with anything, we go with the rule of least surprise: “Do the least
surprising thing” because it's easier for humans. It always bears
repeating that it's for us.
The recommended book "The Humane Interface" sounds like a nice book to
read. I found some notes about it here: http://cs.brown.edu/courses/cs092/2005/bell.raskin.pdf

I like that the author still mentions the following, because it's important:
Quote:The Rule of Least Surprise should not be interpreted as a call for
mechanical conservatism in design.

The basic metrics to categorize interface styles are: concision,
expressiveness, ease, transparency, and scriptability.
There's a lot of discussion on the upsides and downsides of focusing on
each metric; it's a good review.
Section 6 of chapter 11 especially is a must-read classic that lists all
the Unix CLI patterns.

The source, sink, cantrip, etc. patterns remind me of something in the media
world, the audio API of PipeWire: a chain of audio/media processing,
literally a pipeline.

About the Spooler/Daemon Pair: I'm probably the only person that uses
atd as a notification/alarm clock.

On the silence thing, I'm reminded of the concept of application posture.
We don't want parasitic applications.


Chapter 12: Optimization

I guess the overall take-away would be: "Do nothing" (if you can afford to do/not-do so).

On profiling, I'm reminded of POGO.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 12-06-2021

Chapter 11 touches again on the topic of interfaces.

The chapter compares CLI with GUI. I do not like this comparison; I would rather ask whether the program is inherently interactive, and thus cannot be scripted, or not. Games, editors and browsers are inherently interactive, and for them a GUI (or TUI/curses) interface is better than a CLI.

The chapter then lists unix interface design patterns for scriptable programs.

• The first pattern is the filter. The author declares that filters are non-interactive, but nowadays there are interactive filters, like fzf(1) and dmenu(1). He then enumerates the rules for filters. Cat-like filters are filters that can get input from stdin or from a list of files passed as arguments (which can include stdin itself, if the file name is “-”).
• Then there is the “cantrip” interface pattern, used for programs that read no input and produce no output, but have a side effect (like rm(1), touch(1), etc).
• The source interface pattern gets data from the environment, not from its input, and generates output. Sinks are programs that read input and generate no output, but change the environment. Examples are lpr(1) and mail(1). The author mentions “sponge” as a program that reads all its input before processing it; there is also a program called sponge that takes its input and saves it to a file. Very useful.
• The compiler interface, like the cantrip, changes the environment, but it actually reads input from a file and writes output to a file. Examples are tar(1) and cc(1).
• Programs with the ed interface get their input from the user, so they are interactive. Examples are ed(1) and gdb(1). Those programs are barely scriptable.
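
A minimal sketch of the cat-like filter rule from the first item, in Python for illustration. The function name and the transform parameter are hypothetical; the point is only the input handling: no arguments means stdin, and the file name “-” also means stdin.

```python
import sys


def cat_like(names, transform=lambda line: line, stdin=None, stdout=None):
    """Filter lines from the named files (or stdin) to stdout."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout

    # With no file arguments, behave as if "-" (stdin) had been given.
    for name in names or ["-"]:
        if name == "-":
            for line in stdin:
                stdout.write(transform(line))
        else:
            with open(name) as f:
                for line in f:
                    stdout.write(transform(line))
```

Called as cat_like(sys.argv[1:], str.upper) it would behave like an upper-casing cat(1): it works at the end of a pipeline and on a list of files alike.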

I made this small table

Code:
UNIX interface design patterns for scriptable programs:
┌────────────────────────────────────────────────────────────────────────────┐
│ Interface    Read input    Write output    Change environment    Examples  │
├────────────────────────────────────────────────────────────────────────────┤
│ Filter       ✓             ✓               ✗                     cat, grep │
│ Cantrip      ✗             ✗               ✓                     rm, touch │
│ Source       ✗             ✓               ✗                     ls, ps    │
│ Sink         ✓             ✗               ✓                     lpr, mail │
│ Compiler     From file     To file         ✓                     tar, cc   │
│ ed           From user     To user         ✓                     ed, gdb   │
└────────────────────────────────────────────────────────────────────────────┘

The roguelike interface pattern is not scriptable. Such programs rely on complex keyboard commands that appeal to touch-typists who don't like to move their hands from keyboard to mouse. Examples are nethack(1), vi(1) and emacs(1).

Then the author lists some design patterns that separate engine from interface (or mechanism from policy).

Chapter 12 details the rule of optimization (Prototype before polishing. Get it working before you optimize it).

The lessons in the chapter can be summarized in a Donald Knuth quote:

Quote:Premature optimization is the root of all evil



RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 19-06-2021

Chapter 13 is about software complexity.

The chapter breaks up complexity by its kinds and sources and analyses each combination.

In my case, I tend to err on the side of interface simplicity, even when that means a complex implementation.
I think that a simple interface helps tools to be composable.

The chapter then analyses five different text editors and their complexity.

The Rule of Minimality says that a program should be simple within the boundaries of the framework and environment it lives in.
A GUI program can be simple in the boundaries of a desktop environment.

Quote:Choose the shared context you want to manage, and build your programs as small as those boundaries will allow.

Chapter 14 analyses different languages and the choice to use one of them. It also analyses programs written in each one of them.

Of the languages and toolkits covered by the chapter, I'm interested in Tcl/Tk. I have always heard about them and their uses.
The chapter also lists learning resources for each language.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 19-06-2021

Chapter 13: Complexity

This chapter reiterates some concepts that are intuitive to most people
in the field. Things related to how we define complexity, from the
viewer's eyes. Be it programmers, end-users, or via static analysis.
We also see the usual categorization of types of complexity: essential,
optional, accidental.

We're then visiting the classic editor war and bringing the question of
which feature editors should have (or not).

Chapter 14. Languages

Personally, this chapter wasn't something new to me, but I'd definitely
recommend it to people that have stiff ideologies about their programming
language usage.

Quote:Warning: Choice of application language is one of the archetypal religious
issues in the Internet/Unix world.

Quote:To apply the Unix philosophy effectively, you'll need to have more than
just C in your toolkit.

> Why not C?

The answer, as with a lot of things in the book isn't a clear yes or no
but a case by case, a more in-depth discussion than thought-terminating
clichés.

Here the author emphasizes, yet again, the concept of a human-first approach
and whether C fits it or not.
I also realized I've used all the languages listed in the
evaluation... that's interesting. But I think today there would need to
be an update that includes languages such as Go, Rust, NodeJS, etc..


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 26-06-2021

Alright, time for the book club update - this time from the TTY!

Chapters 15 and 16 were interesting, especially since these chapters start
tackling day to day use and "methodology".


Chapter 15: Tools

In this chapter we put ourselves in the shoes of new users and devs
selecting their toolset to do common tasks on Unix.


Quote:Finding and assembling them into a kit that suits your needs has
traditionally taken considerable effort.

This reminds me of tejr's classic series of posts: https://blog.sanctum.geek.nz/series/unix-as-ide/

Quote:yacc has a rather ugly interface, through exported global variables
with the name prefix yy_. This is because it predates structs in C;
in fact, yacc predates C itself; the first implementation was written
in C's predecessor B.

I didn't know that.

Quote:In the mid-1980s it was fairly common for large Unix program distributions
to include elaborate custom shellscripts that would probe their
environment and use the information they gathered to construct custom
makefiles. These custom configurators reached absurd sizes.

We still got one of these build.sh for some projects at work.

This makes me realize I've never actually learned any of these
Makefile generators, I probably should one day. Does anyone have a
recommendation? It seems meson is the way to go today.

On the VCS, that always reminds me of this thread:
https://nixers.net/Thread-Comparing-the-single-file-efficiency-of-version-control-systems
...And there's no mention of git either.

Chapter 16: Reuse

Philosophy of the rule of least action: don't redo something if someone
has already done it. Unless you want to learn how it was done or if you have
time to waste.

Quote:Newbie is growing horribly frustrated. He had heard in college that
in industry, a hundred lines of finished code a week is considered
good performance. He had laughed then, because he was many times more
productive than that on his class projects and the code he wrote for
fun. Now it's not funny any more. He is wrestling not merely with
his own inexperience but with a cascade of problems created by the
carelessness or incompetence of others -- problems he can't fix, but
can only work around.

Newbie is learning a lesson; the less he relies on other peoples' code,
the more lines of code he can get written. This lesson feeds his ego. Like
all young programmers, deep down he thinks he is smarter than anyone
else. His experience seems, superficially, to be confirming this. He
begins building his own personal toolkit, one better fitted to his hand.

They have the drives and needs of artists, including the desire to have
an audience.

So true, I find the NIH complex description to be on point. There's a
lot of that, always has been a lot of that.

Quote:People from outside the Unix world (especially non-technical people)
are prone to think open-source (or free) software is necessarily
inferior to the commercial kind, that it's shoddily made and unreliable
and will cause more headaches than it saves.

I think that this mentality is slowly changing in many parts of the world.

Quote:They may lack polish and have documentation that assumes much, but the
vital parts will usually work quite well.

It's also a good omen when the software has its own Web page, on-line
FAQ (Frequently Asked Questions) list, and an associated mailing list
or Usenet newsgroup. These are all signs that a live and substantial
community of interest has grown up around the software. On Web pages,
recent updates and an extensive mirror list are reliable signs of a
project with a vigorous user community. Packages that are duds just don't
get this kind of continuing investment, because they can't reward it.

Documentation is often a more serious issue. Many high-quality open-source
packages are less useful than they technically ought to be because they
are poorly documented.

We're here to fill the docs!
I agree that the community around a piece of software or a framework/library,
users and devs alike, is what shows it's alive.

On the "where to look": these days there's github, gitlab and blogs,
and things are easily found on search engines, it's booming.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 26-06-2021

Chapters 15 and 16 are the last ones of the Implementation part of the book.

Chapter 15 introduces the tools of the UNIX programming environment.
It touches again on the topic of editors, this time especially on Vi(m) and Emacs.

Then it introduces the code generators lex and yacc (and their relatives for other languages).
I didn't know that yacc predates structs in C, nor that structs were added later to the language.

I have no knowledge of makefile generators. I use plain makefiles while trying to make them portable, if necessary adding a config.mk file to tune the rules. Four makefile generators are listed: X11 makedepend and imake, and GNU autoconf and automake. I have used none of them.

On the version control topic, it lists the classic UNIX VCSs: SCCS, RCS and CVS. Neither Git nor Mercurial, the most common VCSs in use today, is listed. They were both released in 2005, and the book was published in 2003.

Chapter 16 is about code reuse and the benefits of open source software.
It covers topics such as licensing, documentation, and where to look for others' code (of the places listed, only sourceforge still exists, and it is losing ground to github and other git services).


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 03-07-2021

Time for the update!
I haven't taken many notes on these ones but still found them somewhat useful.
Next week will be the last 2 chapters of the book, so we might want to start brainstorming ideas for the next one.


Chapter 17: Portability

This chapter talks about Unix and the portability of its tooling and languages.
It goes through the history of C, from its inception to the creation of its
standards and beyond. I'm not so interested in this but it's still nice to read.
It goes through a similar discussion about the story of standards.

I didn't know about gettext; weirdly or coincidentally, in a lot of
software the translation layer has the same name.

Chapter 18: Documentation

Quote:I've never met a human being who would want to read 17,000 pages of
documentation, and if there was, I'd kill him to get him out of the
gene pool. -- Joseph Costello

Welcome to nixers. (kidding)

The survey of docs formats, or as the author calls it, the zoo of docs
formats, shows how much of a mess it is. However, in my opinion, it also
shows that there's flexibility and options.

For me, any documentation is good documentation, as long as it's
understandable, and present.


Two quotes especially caught my attention.

Quote:Most software documentation is written by technical writers for the
least-common-denominator ignorant — the knowledgeable writing for
the knowledgeless. The documentation that ships with Unix systems has
traditionally been written by programmers for their peers. Even when it
is not peer-to-peer documentation, it tends to be influenced in style
and format by the enormous mass of programmer-to-programmer documentation
that ships with Unix systems.

Quote:The advice we gave earlier in the chapter about reading Unix documentation
can be turned around. When you write documentation for people within
the Unix culture, don't dumb it down. If you write as if for idiots,
you will be written off as an idiot yourself. Dumbing documentation
down is very different from making it accessible; the former is lazy and
omits important things, whereas the latter requires careful thought and
ruthless editing.

These are perfectly said. It's very hard to explain something in an
approachable way, while still going into the right technicalities.

I think the author predicted this well, HTML/XML won, and local docs are
dwindling. Many tools don't even come with a manpage anymore.
We're often left with autogenerated pages based on the code.

And on the docbook dizzying conversion stuff, we also got pandoc, which
is pretty good.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - seninha - 03-07-2021

This has been a busy week, I had no time to read with attention and write notes as I read. I'm compiling what I remember now.

Chapter 17 is about standards.

It begins with the history of C standards, from early C to K&R C, to C89 and C99.
Then comes the history of UNIX standards, from the early UNIX wars to POSIX, XPG and SUS.
The chapter continues with the standards-as-DNA attitude and how UNIX programs are as durable through time as they are portable across hardware.
It ends with the portability of each language.

Quote:Linux changes so fast that any given release of a distribution would probably be obsolete by the time it could get certified.
And that's probably one reason why so few Linuxes are certified as UNIX.

Chapter 18 is about documentation.

The author enumerates the available zoo of documentation formats and the styles of documentation.
WYSIWYG and hyperlinking are explored. Traditional UNIX documentation lacks both, and also lacks online/on-program documentation, except for some programs such as vim.
WYSIWYG is something I do not care about when writing documentation, but TROFF syntax does not resemble at all what you get in the end.

Hyperlinking is good, but can be abused. When the user comes across a lot of hyperlinks, both within the document itself and to other documents, it may give the false impression that all of them are important. I call this “hyperlink hell”. It can be solved with a good index and a SEE ALSO/REFERENCES section.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - jkl - 03-07-2021

I wonder what C standards will be named in 60 years from now. C11 is - in theory - before C89.


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - ckester - 04-07-2021

(03-07-2021, 12:14 PM)jkl Wrote: I wonder what C standards will be named in 60 years from now. C11 is - in theory - before C89.

Any C programmer worth his salt should have no problem with a circular list, modulo 100 and sparsely implemented. If there's ever a naming conflict, just hash it.

;-)


RE: Nixers Book Club - Book #4: The Art of UNIX Programming - venam - 10-07-2021

Chapter 19: Open Source

This chapter emphasizes what it means to develop in the "open" and some
of the practices that developed around this.

Quote:Release early, release often. A rapid release tempo means quick and
effective feedback. When each incremental release is small, changing
course in response to real-world feedback is easier.

Some people have taken this religiously today. A lot of the new languages release upgrades often, every month, sometimes breaking backward compatibility. Mort shared this zero ver on IRC, which is relevant.

Quote:Reward contribution with praise. If you can't give your co-developers
material rewards, give psychological ones. Even if you can, remember that
people will often work harder for reputation than they would for gold.

That one has weirdly gone in the reverse direction...

The patch section is great, but a lot of the issues are now fixed with git.

And there's the autotools tutorial I was looking for in the previous sections.
On that, I recently started learning the meson build system.

That whole page is definitely a must read if you want to start a serious
open source project. It covers a lot of aspects that aren't obvious.

Chapter 20: Future


The future of Unix, I think it's something we've discussed multiple
times on the forums, from the interface perspective, to new tools and
other new software.

We can't miss the chapter on Plan9. It discusses some features that
inspired other OS, such as union filesystems in this case.

Quote:The most dangerous enemy of a better solution is an existing codebase
that is just good enough.

Quote:These are economic problems. We have other problems of a more political
nature, because success makes enemies.

This is so true.

On the future cultural aspect of Unix, I think we're seeing the change
happening in real time today. New thinking/software always grates on people.

Quote:The raucous energy, naïvete and gleeful zealotry of the Linux kids
sometimes grates on elders who have been around since the 1970s and
(often rightly) consider themselves wiser. It's only exacerbated by the
fact that the kids are succeeding where the elders failed.

And that concludes this book.
It was a pretty good one, I had read it in the past but a lot of ideas
still hold and are nice to revisit.