`time(1)` and children - GNU/Linux

Users browsing this thread: 2 Guest(s)
movq
Long time nixers
Recently dug a little into how time(1) works. Put the results in a blog post:

https://www.uninformativ.de/blog/posting...NG-en.html

I copied the blog post below in full, so you don't have to visit my page.

Let me know if this model (blog post + discussion over here) is a good idea or not, whether or not the post should be copied, and so on.



`time(1)` and children

I was running a shell script on a server. It forked off several other processes and then waited for them to quit. I was making some optimizations to that script and wanted to see if they were actually making it run faster. So, I did the obvious thing:

Code:
$ time my_script

The numbers looked reasonable. Then I wondered: Hold on, the script itself isn’t doing anything. It just launches a bunch of other processes and then just sits there and waits. Why does `time` show such high numbers for *my script*?

Turns out, the output of `time` is a little bit misleading. At least to naïve readers like me.

It so happened that the real `/usr/bin/time` was used on my server, so I’m not looking at a potential shell builtin for `time`, but that probably works the same way.


The syscall `times(2)`

Looking at the implementation of GNU `time`, you can see that there are (of course) multiple `ifdef`s. One of them assumes that the operating system only provides a simple `wait()`: It waits until some child quits. In this case, this is of course the command spawned earlier, so it’s `my_script`.

`wait()` only tells you about the exit status of the child. So how do we get its execution times? Also keep in mind that `wait()` reaps that process from its zombie state, so we can’t issue any other calls for the child PID.

We ask the operating system for the execution times of the *current process*, that is `time` itself. You can do that using `times()`:

Code:
times() stores the current process times in the struct tms that
    buf points to.  The struct tms is as defined in <sys/times.h>:

    struct tms {
        clock_t tms_utime;  /* user time */
        clock_t tms_stime;  /* system time */
        clock_t tms_cutime; /* user time of children */
        clock_t tms_cstime; /* system time of children */
    };

There you go, `tms_cutime` and `tms_cstime` tell us about the execution times of `time`’s children.

Still doesn’t answer my original question, but it’s explained just a few lines below:

Code:
The  tms_cutime  field  contains  the  sum of the tms_utime and
    tms_cutime values for all waited-for terminated children.   The
    tms_cstime   field  contains  the  sum  of  the  tms_stime  and
    tms_cstime values for all waited-for terminated children.

So, `tms_cutime` is not only the child’s “user time”, but the “user time” plus “user time of its children”. As this is presumably tansitive, you get the times of the entire process tree.

In other words, `time` doesn’t even have a way in this case to tell the execution time of `my_script` apart from the execution time of the processes it launches.


`wait3()` and `wait4()`

The GNU implementation also has another code path where it uses one of those syscalls (actually, only `wait3()`, but that’s implemented using `wait4()` on my system). It does what I originally expected would happen:

Code:
pid_t wait3(int *wstatus, int options, struct rusage *rusage);

    If  rusage  is  not  NULL, the struct rusage to which it points
    will be filled with accounting  information  about  the  child.
    See getrusage(2) for details.

I guess it’s safe to assume that this version also operates transitively / recursively, so you get the times of the entire process tree.

This is much more flexible than `times()` in cases where you have to deal with multiple children.


Initial confusion

My first (naïve) attempt to figure out what’s going on looked like this:

Code:
#!/bin/bash

    for (( i = 0; i < 10000000; i++ ))
    do
        true
    done &
    killpid=$!

    sleep 1
    kill $killpid

And the result:

Code:
$ /usr/bin/time ./a.sh
    0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 3852maxresident)k
    0inputs+0outputs (0major+371minor)pagefaults 0swaps

Whoops, all zero.

Well, with the knowledge above, it’s trivial. We don’t `wait`.

The whole `wait` vs. no `wait` issue took me a second to notice. I mean, in theory, the operating system *could* always include the execution times of children in the times reported for their parent, even without a `wait`. In my mind, `wait` was simply a way to tell the OS to wait for end of that process – I wasn’t aware that it also had implications on resource usage statistics.

And, well, I’m not sure if it’s entirely obvious that execution times of children should be included in the first place. What if I wanted to measure `my_script`, really, just my script? To find out if it adds any significant overhead? It appears that’s not possible from “the outside” and must be done in `my_script` directly, but I’d be happy to learn otherwise.
venam
Administrators
I love those deep dives driven by questioning about how things actually works internally.

time(1) is misleading because its goal isn't to output something for humans, its goal is to give resource usage, so if a process is composed of multiple threads or sub-processes the time is going to be cummulative. Though, I had no clue it was actually implemented through wait(3) (or other implementations of wait), but this makes total sense for accounting and traceability.

I'll link your article in this section of mine, helping those pesky search engines know where to find the good stuff.
movq
Long time nixers
Quote:if a process is composed of multiple threads or sub-processes the time is going to be cummulative
That's probably the correct perspective. I'm just not used to the concept of "sub-processes". When one process forks and execs, it's an entirely new "thing" in my mind. (Maybe that's wrong anyway, since a lot of stuff gets inherited. Hmm.)

btw, I found a related command: "times". It's even in POSIX. It appears to be equivalent to the syscall `times(2)` and shows usage times of both the current process and the sum of its children (it's a two-line output). That helped me in my particular use case.

Quote:I'll link your article in this section of mine, helping those pesky search engines know where to find the good stuff.

Thanks!