UNIX Diary - Psychology, Philosophy, and Licenses

Users browsing this thread: 5 Guest(s)
fre d die
Members
Dear Unix diary,
last thursday i installed gentoo sucsessfully for the first time. last time i tried was probably 2017 and always ran into problems at with grub. i recently decided its time to get rid of my mac and get a thinkpad. writing this from my thinkpad with gentoo! loving it so far. less problems than i expected... so far...
seninha
Long time nixers
Dear Unix diary,
That's the first time I am logged in in X with the window manager I wrote.
Until now I have only used it with Xephyr (a utility that simulates a X session in a window).

It was a hard task to write a window manager from scratch (although I stole some routines from dwm). I had to rewrite it twice, and it still has old code from the first versions (I still have to look through the code for them and eliminate them).

I decided to write a wm because I couldn't find one that fulfilled my needs.
For now, it is EWMH compliant (only the best parts) and supports multi-monitor (although it segfaults when I disconnect a monitor, but I'm working on that).
It was fun to write it and I passed the last four weeks in this project.

It is not usable for others though, I still have to document it and work more on some edge cases, but I think that it is going to replace cwm as my daily driver.

There are some features I need to add to it, such as using mouse to move/resize windows (for now I'm using wmutils' xmmv and xmrs for this). But it is totally usable (for me, at least).
movq
Long time nixers
Dear Unix diary,

I thought about adding a "bookmark" feature to my local diary tool. You know, a feature that allows you to jump to a particular entry in your diary.

Then I remember an old trick at the shell:

Code:
$ gitary edit 2020-09-04/14-35-12  # track project
$ gitary edit 2020-09-09/13-51-06  # letter sarah

Just add a comment at the end of the line. Now I can hit Ctrl-R, type "track project" or "letter sarah", and thus jump directly to one of the entries.

Is this better? Is this "the spirit"? At least it's less code for me to write.
z3bra
Grey Hair Nixers
Be it the spirit or not, that's clever !
jkl
Long time nixers
"The spirit" would have more pipes... ;-)

--
<mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen
movq
Long time nixers
Dear UNIX diary,

today, I had fun with file descriptor flags at work.

I had a Python script that failed with `EAGAIN` while doing a `print()` on `stdout`. Say what?

Looking at `write(2)`, we can see this:

Code:
EAGAIN The  file  descriptor  fd  refers to a file other than a
       socket and has been marked nonblocking (O_NONBLOCK), and
       the  write would block.  See open(2) for further details
       on the O_NONBLOCK flag.

EAGAIN or EWOULDBLOCK
       The file descriptor fd refers to a socket and  has  been
       marked  nonblocking  (O_NONBLOCK),  and  the write would
       block. [...]

You can trigger the error using the following snippet of C code:

Code:
#include <fcntl.h>
#include <stdio.h>

int
main()
{
    int i;

    if (fcntl(1, F_SETFL, O_NONBLOCK) == -1)
    {
        perror("fcntl");
        return 1;
    }

    for (i = 0; i < 4 * 8192; i++)
        printf("hello world");  /* no trailing \n */

    return 0;
}

Just run it in a terminal, maybe under `strace` to see what’s going on. It goes something like this:

Code:
17:53:50.053752 write(1, "ello worldhello worldhello world"..., 1024) = 1024
17:53:50.053779 write(1, "llo worldhello worldhello worldh"..., 1024) = 1024
17:53:50.053949 write(1, "rldhello worldhello worldhello w"..., 1024) = -1 EAGAIN (Resource temporarily unavailable)
17:53:50.053975 write(1, "hello worldhello worldhello worl"..., 1024) = -1 EAGAIN (Resource temporarily unavailable)
17:53:50.054436 write(1, "hello worldhello worldhello worl"..., 1024) = 1024
17:53:50.054569 write(1, " worldhello worldhello worldhell"..., 1024) = 1024
17:53:50.054596 write(1, "worldhello worldhello worldhello"..., 1024) = -1 EAGAIN (Resource temporarily unavailable)
17:53:50.054620 write(1, "hello worldhello worldhello worl"..., 1024) = -1 EAGAIN (Resource temporarily unavailable)
17:53:50.054645 write(1, "hello worldhello worldhello worl"..., 1024) = -1 EAGAIN (Resource temporarily unavailable)

libc does some buffering and then eventually tries to write chunks of 1024 bytes. Since we set the file descriptor to nonblocking, the syscall immediately returns and doesn’t wait for the data to be actually written to its final destination. So, I guess, there’s another kernel-internal buffer. When that one is full, we get EAGAIN.

My original Python script was indeed writing lots of data. So, somehow, my `stdout` must have been set to nonblocking mode. I checked the code, but I couldn’t find anything that did this. But okay, maybe Python does this internally for some weird reason? I tried to reproduce this in isolated test cases, but nope, I couldn’t see Python doing an `fcntl()`.

Alright, what else does my script do? It forks and runs ssh. Mhm. But I used Python’s `check_output()`, so ssh’s `stdout` was a newly created pipe. At first, I couldn’t see how ssh could possibly botch my `stdout`?!

Then another idea: This script was being run as a systemd service. Meaning, the `stdout` we’re talking about is actually a socket which is connected to journald. So maybe systemd screws up and sets this socket to nonblocking for whatever reason … ? But I couldn’t verify that theory, either.

And then it dawned on me: systemd connects the same socket to both `stdout` and `stderr`. So, if ssh did something to its `stderr`, that could affect my Python script’s `stdout`.

Let’s give this a shot:

Code:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main()
{
    int ret, i, fd;

    if (fork() == 0)
    {
        fd = open("/dev/null", O_WRONLY);
        dup2(fd, 1);
        close(fd);

        execlp("strace", "strace", "-o", "ssh.log",
               "ssh", "myserver.example.com", "date",
               (char *)0);
        return 1;
    }

    for (i = 0; i < 10; i++)
    {
        if ((ret = fcntl(1, F_GETFL)) < 0)
        {
            perror("fcntl");
            return 1;
        }
        if (ret & O_NONBLOCK)
            puts("Is NONblocking");
        else
            puts("Is blocking");

        sleep(1);
    }

    return 0;
}

And here’s the glorious output, `stdout` and `stderr` in the parent being the same pipe:
Code:
$ ./borked 2>&1 | cat
Is blocking
Is NONblocking
Is NONblocking
Is NONblocking
Is NONblocking
Is NONblocking
Is NONblocking
Is NONblocking
Is NONblocking
Is NONblocking

The child process did close its original `stdout` (got reopened from /dev/null), but `stdout` in the parent process still got changed.

Now, the original environment uses ssh multiplexing, so I used that in my now isolated test case as well. I reproduced it without multiplexing as well, I put that case at the end of the posting.

The ssh process I’ve spawned transfers the three standard file descriptors to the main multiplexing process:

Code:
19:11:32.186688 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[0]}], msg_controllen=24, msg_flags=0}, 0) = 1
19:11:32.186753 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[1]}], msg_controllen=24, msg_flags=0}, 0) = 1
19:11:32.186795 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[2]}], msg_controllen=24, msg_flags=0}, 0) = 1

And that process, in turn, does this:

Code:
19:11:32.186811 recvmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[6]}], msg_controllen=24, msg_flags=0}, 0) = 1
19:11:32.186853 recvmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[7]}], msg_controllen=24, msg_flags=0}, 0) = 1
19:11:32.186903 recvmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[8]}], msg_controllen=24, msg_flags=0}, 0) = 1
19:11:32.186941 ioctl(6, TCGETS, {B38400 opost isig icanon echo ...}) = 0
19:11:32.186976 ioctl(7, TCGETS, 0x7ffdce246db0) = -1 ENOTTY (Inappropriate ioctl for device)
19:11:32.187004 fcntl(7, F_GETFL)       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
19:11:32.187030 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
19:11:32.187054 ioctl(8, TCGETS, 0x7ffdce246db0) = -1 ENOTTY (Inappropriate ioctl for device)
19:11:32.187079 fcntl(8, F_GETFL)       = 0x1 (flags O_WRONLY)
19:11:32.187104 fcntl(8, F_SETFL, O_WRONLY|O_NONBLOCK) = 0    <-------- TADA

There you have it. It sets fd 8, which is the `stderr` it received, to nonblocking. Since that is the same pipe as the parent process’s `stdout`, both are changed.

Duplicated file descriptors share some things, but not everything. `dup(2)` says:

Code:
After a successful return, the old and new file descriptors may
be used interchangeably.  They refer to the same open file  de‐
scription  (see  open(2))  and  thus share file offset and file
status flags; [...]

The two file descriptors do not  share  file  descriptor  flags [...]

And “file status flags” are those set by `fcntl(..., F_SETFL, ...)`, but “file descriptor flags” are something else:

Code:
F_SETFL       Set the file status flags, defined in  <fcntl.h>, [...]

F_SETFD       Set  the  file  descriptor  flags defined in [...]

Now the funny thing is, you might say: “Well, just don’t spawn processes and let them inherit your file descriptors. Of course they can do nasty things.” Yeah, but this was actually done intentionally: My Python script spawned ssh and left its `stderr` untouched in an effort to automatically have ssh’s error messages end up at the same destination (i.e., systemd journal) as my Python logs. I now have to manually collect and handle ssh’s `stderr`, which is more code and more error prone.

Alright, so let’s quickly look at the case where we don’t use ssh multiplexing. At first, I thought this didn’t trigger the “bug”, but it indeed does, it’s just a little harder to spot. Here’s a partial strace log:

Code:
20:05:14.074435 dup(0)                  = 4
20:05:14.074505 dup(1)                  = 5
20:05:14.074532 dup(2)                  = 6
...
20:05:14.074674 fcntl(6, F_GETFL)       = 0x1 (flags O_WRONLY)
20:05:14.074691 fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0

... then at the very end:
20:05:14.167997 close(6)                = 0
20:05:14.168232 fcntl(2, F_GETFL)       = 0x801 (flags O_WRONLY|O_NONBLOCK)
20:05:14.168249 fcntl(2, F_SETFL, O_WRONLY) = 0
20:05:14.168430 exit_group(0)           = ?

In other words, ssh resets the flags in this case. My ssh command finished very quickly, so I didn’t see a “NONblocking” line at first.

All this is heavy usage of strace. I love this tool. I haven’t read ssh’s source code, though, that might be interesting as well. But that’s something for another day.
movq
Long time nixers
Dear UNIX diary,

today I’m angry with you.

I have a cronjob that runs `lsblk -S` and pipes it to `awk`. `awk` then matches some of the fields: It only prints those lines where the `TYPE` column is `disk` and the `TRAN` column is `sata`. For matching lines, it shows the `MODEL` and `SERIAL` column. All this is an effort to find all hard disks and SSDs in the system, and generate their names in `/dev/disk/by-id`. So, for example, for my main SSD, the expected result is `/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_XXX123SERIAL`.

This has been working fine for a long time.

Two months ago (!), it silently started to fail. Why? Because `lsblk` started to include spaces the `MODEL` columns. `Samsung_SSD_840_PRO_Series` became `Samsung SSD 840 PRO Series`. Of course, `awk` no longer matched these lines, because it was checking `$8 == "sata"`, but `$8` is now `PRO`.

So my cronjob didn’t operate on some of the disks anymore and it didn’t produce an error, either. It just didn’t do anything. Bah!
venam
Administrators
Dear UNIX diary,

Today I learned, because it's good to be humble, that IPv6 addresses can be built automatically from the mac address (something called EUI-64). I'm wondering if this isn't too much information disclosure though, because the mac address can inform you about the manufacturer.

Another thing I've sort of learned, or more got a revelation about, is that you can run multiple services on the same port on the same machine... they just need to bind to different addresses. What clicked was that these addresses can even be incidentally local/loopback (ex: 127.0.0.1 and 127.0.0.2). I just never realized, even though it's kind of obvious, because we always say "check if something is running on that port", not "check if something is running on the combination of address and port".
This can be really useful when running multiple virtual machines and NATing to external IPs.

EDIT: I'm also getting on learning K8s, it's not bad thus far!

What did you learn today?
z3bra
Grey Hair Nixers
(17-12-2021, 03:49 PM)venam Wrote: I'm wondering if this isn't too much information disclosure though, because the mac address can inform you about the manufacturer.

This is a legitimate concern, and for that reason, most OSes switched to randomized identifiers rather than EUI-64 by default. There is a "risk" of collision (2^64 in your typical /64 ipv6 subnet), and that's why you can still see DHCP servers, even in full ipv6 networks. Using EUI-64 guarantees that you won't get any collisions though, so here you gotta pick the less-evil for your SLAAC method.

(17-12-2021, 03:49 PM)venam Wrote: you can run multiple services on the same port on the same machine

And this is pretty awesome ! I use that "feature" to run 2 different DNS software on my servers. The internet facing one runs nsd (fully-authoritative for my personnal zones), while unbound (recursive/caching dns server) listens on the private side (wireguard + yggdrasil), so I can use them as my day-to-day DNS.
This feature can also be used when you got a service that can bind on different "socket" (the actual term for address + port) for user-facing and administration interfaces.