UNIX Diary - Psychology, Philosophy, and Licenses
|
|||
Dear Unix diary,
last thursday i installed gentoo successfully for the first time. the last time i tried was probably 2017, and i always ran into problems with grub. i recently decided it's time to get rid of my mac and get a thinkpad. writing this from my thinkpad with gentoo! loving it so far. fewer problems than i expected... so far... |
|||
|
|||
Dear Unix diary,
This is the first time I'm logged into X with the window manager I wrote. Until now I had only used it with Xephyr (a utility that simulates an X session in a window). It was a hard task to write a window manager from scratch (although I stole some routines from dwm). I had to rewrite it twice, and it still has old code from the first versions (I still have to go through the code to find and eliminate it). I decided to write a wm because I couldn't find one that fulfilled my needs. For now, it is EWMH compliant (only the best parts) and supports multi-monitor (although it segfaults when I disconnect a monitor, but I'm working on that). It was fun to write and I spent the last four weeks on this project. It is not usable for others yet, though: I still have to document it and work more on some edge cases, but I think it is going to replace cwm as my daily driver. There are some features I need to add, such as using the mouse to move/resize windows (for now I'm using wmutils' xmmv and xmrs for this). But it is totally usable (for me, at least). |
|||
|
|||
Dear Unix diary,
I thought about adding a "bookmark" feature to my local diary tool. You know, a feature that allows you to jump to a particular entry in your diary. Then I remembered an old shell trick:

Code:
$ gitary edit 2020-09-04/14-35-12 # track project

Just add a comment at the end of the line. Now I can hit Ctrl-R, type "track project" or "letter sarah", and thus jump directly to one of the entries. Is this better? Is this "the spirit"? At least it's less code for me to write. |
|||
|
|||
Be it the spirit or not, that's clever !
|
|||
|
|||
"The spirit" would have more pipes... ;-)
-- <mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen |
|||
|
|||
Dear UNIX diary,
today, I had fun with file descriptor flags at work. I had a Python script that failed with `EAGAIN` while doing a `print()` on `stdout`. Say what? Looking at `write(2)`, we can see this:

Code:
EAGAIN The file descriptor fd refers to a file other than a

You can trigger the error using the following snippet of C code:

Code:
#include <fcntl.h>

Just run it in a terminal, maybe under `strace` to see what’s going on. It goes something like this:

Code:
17:53:50.053752 write(1, "ello worldhello worldhello world"..., 1024) = 1024

libc does some buffering and then eventually tries to write chunks of 1024 bytes. Since we set the file descriptor to nonblocking, the syscall immediately returns and doesn’t wait for the data to be actually written to its final destination. So, I guess, there’s another kernel-internal buffer. When that one is full, we get EAGAIN.

My original Python script was indeed writing lots of data. So, somehow, my `stdout` must have been set to nonblocking mode. I checked the code, but I couldn’t find anything that did this. But okay, maybe Python does this internally for some weird reason? I tried to reproduce this in isolated test cases, but nope, I couldn’t see Python doing an `fcntl()`.

Alright, what else does my script do? It forks and runs ssh. Mhm. But I used Python’s `check_output()`, so ssh’s `stdout` was a newly created pipe. At first, I couldn’t see how ssh could possibly botch my `stdout`?!

Then another idea: This script was being run as a systemd service. Meaning, the `stdout` we’re talking about is actually a socket which is connected to journald. So maybe systemd screws up and sets this socket to nonblocking for whatever reason … ? But I couldn’t verify that theory, either.

And then it dawned on me: systemd connects the same socket to both `stdout` and `stderr`. So, if ssh did something to its `stderr`, that could affect my Python script’s `stdout`.
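For readers who want to see the `EAGAIN` part in isolation, here is a minimal Python sketch. It is not the C snippet from the post (that one is cut off above); it assumes a POSIX system and uses an unread pipe as a stand-in for the journald socket. The function name `fill_nonblocking_pipe` is made up for this illustration.

```python
import errno
import os


def fill_nonblocking_pipe():
    """Write to a nonblocking pipe that nobody reads until the kernel refuses.

    Returns (bytes_accepted, errno_of_the_failed_write).
    """
    read_fd, write_fd = os.pipe()
    try:
        # Sets O_NONBLOCK on the write end, like fcntl(fd, F_SETFL, ...).
        os.set_blocking(write_fd, False)
        chunk = b"hello world" * 100  # small enough to be written atomically
        accepted = 0
        while True:
            try:
                accepted += os.write(write_fd, chunk)
            except BlockingIOError as exc:
                # The kernel-side pipe buffer is full: write() fails
                # instead of waiting for a reader.
                return accepted, exc.errno
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    accepted, err = fill_nonblocking_pipe()
    print(f"buffered {accepted} bytes, then write() failed with {errno.errorcode[err]}")
```

With a blocking descriptor the same loop would simply hang once the buffer is full, which is why the problem only shows up when something has flipped the flag.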
Let’s give this a shot:

Code:
#include <fcntl.h>

And here’s the glorious output, `stdout` and `stderr` in the parent being the same pipe:

Code:
$ ./borked 2>&1 | cat

The child process did close its original `stdout` (got reopened from /dev/null), but `stdout` in the parent process still got changed.

Now, the original environment uses ssh multiplexing, so I used that in my now isolated test case as well. I also reproduced it without multiplexing; that case is at the end of this post. The ssh process I’ve spawned transfers the three standard file descriptors to the main multiplexing process:

Code:
19:11:32.186688 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[0]}], msg_controllen=24, msg_flags=0}, 0) = 1

And that process, in turn, does this:

Code:
19:11:32.186811 recvmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[6]}], msg_controllen=24, msg_flags=0}, 0) = 1

There you have it. It sets fd 8, which is the `stderr` it received, to nonblocking. Since that is the same pipe as the parent process’s `stdout`, both are changed.

Duplicated file descriptors share some things, but not everything. `dup(2)` says:

Code:
After a successful return, the old and new file descriptors may

And “file status flags” are those set by `fcntl(..., F_SETFL, ...)`, but “file descriptor flags” are something else:

Code:
F_SETFL Set the file status flags, defined in <fcntl.h>, [...]

Now the funny thing is, you might say: “Well, just don’t spawn processes and let them inherit your file descriptors. Of course they can do nasty things.” Yeah, but this was actually done intentionally: My Python script spawned ssh and left its `stderr` untouched in an effort to automatically have ssh’s error messages end up at the same destination (i.e., systemd journal) as my Python logs. I now have to manually collect and handle ssh’s `stderr`, which is more code and more error prone.

Alright, so let’s quickly look at the case where we don’t use ssh multiplexing. At first, I thought this didn’t trigger the “bug”, but it indeed does. It’s just a little harder to spot. Here’s a partial strace log:

Code:
20:05:14.074435 dup(0) = 4

In other words, ssh resets the flags in this case. My ssh command finished very quickly, so I didn’t see a “NONblocking” line at first.

All this is heavy usage of strace. I love this tool. I haven’t read ssh’s source code, though. That might be interesting as well, but that’s something for another day. |
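The distinction between "file status flags" (shared) and "file descriptor flags" (per descriptor) can be checked without ssh or systemd. The following is a small Python sketch, assuming a POSIX system; `os.dup()` stands in for the inheritance across fork/exec and `SCM_RIGHTS` described above, and the function name is invented for this example.

```python
import os


def demo_shared_status_flags():
    """Show which flags two descriptors of the same pipe end share.

    Returns (blocking_before, blocking_after,
             original_inheritable, alias_inheritable).
    """
    read_fd, write_fd = os.pipe()
    alias_fd = os.dup(write_fd)  # second descriptor, same open file description
    try:
        blocking_before = os.get_blocking(write_fd)

        # File status flag (O_NONBLOCK): changed through the alias only,
        # but it lives in the shared open file description.
        os.set_blocking(alias_fd, False)
        blocking_after = os.get_blocking(write_fd)

        # File descriptor flag (FD_CLOEXEC): changed on the alias only,
        # and it really is per descriptor.
        os.set_inheritable(alias_fd, True)
        return (
            blocking_before,
            blocking_after,
            os.get_inheritable(write_fd),
            os.get_inheritable(alias_fd),
        )
    finally:
        for fd in (read_fd, write_fd, alias_fd):
            os.close(fd)


if __name__ == "__main__":
    print(demo_shared_status_flags())
```

The original descriptor ends up nonblocking even though only the alias was touched, while its close-on-exec setting stays as it was. That is the same mechanism that let ssh's change to its `stderr` leak into the parent's `stdout`.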
|||
|
|||
Dear UNIX diary,
today I’m angry with you. I have a cronjob that runs `lsblk -S` and pipes it to `awk`. `awk` then matches some of the fields: It only prints those lines where the `TYPE` column is `disk` and the `TRAN` column is `sata`. For matching lines, it shows the `MODEL` and `SERIAL` columns. All this is an effort to find all hard disks and SSDs in the system, and generate their names in `/dev/disk/by-id`. So, for example, for my main SSD, the expected result is `/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_XXX123SERIAL`. This has been working fine for a long time. Two months ago (!), it silently started to fail. Why? Because `lsblk` started to include spaces in the `MODEL` column. `Samsung_SSD_840_PRO_Series` became `Samsung SSD 840 PRO Series`. Of course, `awk` no longer matched these lines, because it was checking `$8 == "sata"`, but `$8` is now `PRO`. So my cronjob didn’t operate on some of the disks anymore and it didn’t produce an error, either. It just didn’t do anything. Bah! |
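One way to sidestep whitespace splitting is to ask `lsblk` for structured output (`-J` / `--json`) and select fields by name. The Python sketch below is only an illustration, not the cronjob from the post: the sample JSON is made up, its shape (a `blockdevices` list with lowercase column names as keys) is an assumption about what `lsblk -S -J -o MODEL,SERIAL,TYPE,TRAN` prints, and replacing spaces with underscores merely follows the by-id name quoted above.

```python
import json

# Illustrative sample only; real output would come from something like
#   lsblk -S -J -o MODEL,SERIAL,TYPE,TRAN
SAMPLE_JSON = """
{"blockdevices": [
  {"model": "Samsung SSD 840 PRO Series", "serial": "XXX123SERIAL",
   "type": "disk", "tran": "sata"},
  {"model": "DVD-RW Drive", "serial": "YYY456",
   "type": "rom", "tran": "sata"}
]}
"""


def sata_disk_ids(lsblk_json):
    """Return /dev/disk/by-id paths for SATA disks found in lsblk JSON text."""
    paths = []
    for dev in json.loads(lsblk_json)["blockdevices"]:
        # Fields are looked up by name, so spaces inside MODEL cannot
        # shift the other columns around.
        if dev.get("type") == "disk" and dev.get("tran") == "sata":
            model = dev["model"].strip().replace(" ", "_")
            paths.append(f"/dev/disk/by-id/ata-{model}_{dev['serial']}")
    return paths


if __name__ == "__main__":
    found = sata_disk_ids(SAMPLE_JSON)
    if not found:
        # Fail loudly instead of silently doing nothing.
        raise SystemExit("no SATA disks matched, refusing to continue")
    print("\n".join(found))
```

The explicit "nothing matched" check addresses the second half of the complaint: a job that finds zero disks should probably say so.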
|||
|
|||
Dear UNIX diary,
Today I learned, because it's good to be humble, that IPv6 addresses can be built automatically from the MAC address (something called EUI-64). I'm wondering if this isn't too much information disclosure though, because the MAC address can inform you about the manufacturer. Another thing I've sort of learned, or rather had a revelation about, is that you can run multiple services on the same port on the same machine... they just need to bind to different addresses. What clicked was that these addresses can even be local/loopback ones (ex: 127.0.0.1 and 127.0.0.2). I just never realized, even though it's kind of obvious, because we always say "check if something is running on that port", not "check if something is running on the combination of address and port". This can be really useful when running multiple virtual machines and NATing to external IPs. EDIT: I'm also getting into learning K8s, it's not bad thus far! What did you learn today? |
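To make the EUI-64 construction concrete, here is a short Python sketch of the usual "modified EUI-64" recipe: split the MAC in the middle, insert `ff:fe`, and flip the universal/local bit of the first octet. The `fe80::` link-local prefix and the example MAC are chosen purely for illustration, and the function name is invented here.

```python
import ipaddress


def eui64_link_local(mac):
    """Build a link-local IPv6 address from a MAC using modified EUI-64."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC like 00:11:22:33:44:55")
    # Flip the universal/local bit (0x02) and insert ff:fe in the middle.
    interface_id = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | int.from_bytes(interface_id, "big"))


if __name__ == "__main__":
    print(eui64_link_local("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```

The first three octets of the MAC (the vendor prefix) survive almost unchanged in the address, which is exactly the disclosure concern raised above.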
|||
|
|||
(17-12-2021, 03:49 PM)venam Wrote: I'm wondering if this isn't too much information disclosure though, because the mac address can inform you about the manufacturer.

This is a legitimate concern, and for that reason, most OSes switched to randomized identifiers rather than EUI-64 by default. There is a "risk" of collision (about 1 in 2^64 for any two hosts in your typical /64 IPv6 subnet), and that's why you can still see DHCP servers, even in full IPv6 networks. Using EUI-64 guarantees that you won't get any collisions though (as long as MAC addresses are unique), so here you gotta pick the lesser evil for your SLAAC method.

(17-12-2021, 03:49 PM)venam Wrote: you can run multiple services on the same port on the same machine

And this is pretty awesome! I use that "feature" to run 2 different DNS servers on my machines. The internet-facing one runs nsd (fully authoritative for my personal zones), while unbound (recursive/caching DNS server) listens on the private side (wireguard + yggdrasil), so I can use them as my day-to-day DNS. This feature can also be used when you've got a service that can bind to different "sockets" (the usual term for an address + port pair) for user-facing and administration interfaces. |
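The "address + port" point is easy to try out. The Python sketch below binds the same TCP port on 127.0.0.1 and 127.0.0.2, then shows that a second bind on the identical pair is refused. It assumes a Linux-like system where the whole 127.0.0.0/8 range is local (other systems may need 127.0.0.2 configured first); the function name is made up for this example.

```python
import errno
import socket


def bind_same_port_on_two_addresses():
    """Bind one port on two loopback addresses, then try a real clash.

    Returns (port, second_socket_address, errno_of_clash_or_None).
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    third = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        port = first.getsockname()[1]

        # Same port, different address: a different (address, port) pair.
        second.bind(("127.0.0.2", port))

        # Same port AND same address as `first`: this one must fail.
        try:
            third.bind(("127.0.0.1", port))
            clash = None
        except OSError as exc:
            clash = exc.errno
        return port, second.getsockname(), clash
    finally:
        for sock in (first, second, third):
            sock.close()


if __name__ == "__main__":
    port, addr, clash = bind_same_port_on_two_addresses()
    print(f"port {port} bound on 127.0.0.1 and on {addr[0]}")
    print("duplicate bind failed with", errno.errorcode.get(clash, clash))
```

A service bound to the wildcard address (0.0.0.0) would claim the port on every address at once, which is probably why "is something running on that port?" is the question people usually ask.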
|||