My new distro "Glacies" - GNU/Linux

eadwardus
Members
I'm building an operating system, or at least that is the intention. I started with the userland (which should run on any POSIX system), as it was the part that I was most dissatisfied with, especially on Linux. The result is the distro I am currently building, "Glacies": the "Eltanin" userland with Linux as the kernel.

[Image: preview]

Everything is still at a pretty early stage, but I have done quite a few things:
* tertium (own non-posix libc)
* ecore (unix utilities)
* ccore (complementary core, mostly heirloom-toolchest)
* ports (source-based package manager)
* simia (posix compatibility layer upon tertium, supposed to replace musl later)
* venus (binary package manager)

The system is built with simplicity in mind, so the tools that compose the base system are selected with this in mind.

A few uncommon things worth mentioning:
* the entire system is statically linked
* the filesystem hierarchy is the standard one, but with /usr as a link to /
* manpages are compressed with zlib (.zz)
* sinit and perp compose the init system
* rc is the default shell (dash is the posix one)

I'm now deciding how to provide a more complete desktop. I have tried to add Xorg to ports, but it has been painful to make it work statically (a LOT of dependencies, cyclic dependencies with drivers, ignored flags, etc.).

Links:
https://eltan.in.net
https://github.com/eltanin-os
https://gitlab.com/eltanin-os
https://git.tuxfamily.org/eltaninos

Please leave your opinions and feel free to ask anything.
bouncepaw
Members
Really cool! Have you tried Wayland? I heard it's simpler, maybe you can link it successfully.
z3bra
Grey Hair Nixers
That is really cool. I mean, it is a lot of work, and it seems you took it pretty far already!

I got one question, about the userland. As far as I can tell, you're recreating a POSIX compliant userland, while in another post on the forum you were complaining about its sometimes bad interfaces.
Are you trying to fix it and come up with new idioms later on? Or do you plan to only use the standard as a "guideline" for your project?
eadwardus
Members
(09-12-2019, 01:54 PM)bouncepaw Wrote: Really cool! Have you tried Wayland? I heard it's simpler, maybe you can link it successfully.
It is on my list of alternatives to consider, but I am a little reluctant to choose it because of its instability (by that I mean it is not clearly defined, with good standards). I am also unsure how well it can run without X (Xwayland).

(09-12-2019, 08:08 PM)z3bra Wrote: I got one question, about the userland. As far as I can tell, you're recreating a POSIX compliant userland, while in another post on the forum you were complaining about its sometimes bad interfaces.
Are you trying to fix it and come up with new idioms later on? Or do you plan to only use the standard as a "guideline" for your project?
The intention is to follow POSIX strictly where portability matters: in practice, the unix utilities (ecore) and the "libc" (simia). Any extension or alternative that may appear will not collide with the POSIX namespace. This way the environment will be friendlier to development and will result in fewer surprises.
z3bra
Grey Hair Nixers
Why did you choose to reimplement it all (libc + coreutils) over using something like musl + sbase from suckless ?
eadwardus
Members
(10-12-2019, 04:25 AM)z3bra Wrote: Why did you choose to reimplement it all (libc + coreutils) over using something like musl + sbase from suckless ?
The libc is simply because I am already building a non-POSIX one, and the POSIX one will act as a translation layer; this way I will have one less component to watch in the base system and a libc that is easier to port. The core utilities are mostly for code consistency, having them built with tertium like all the other tools. I also wanted strict POSIX compliance and a little more care with resources (suckless abuses the stack (at least as a single binary), and sometimes does slightly weird things, like "ls" with hardcoded history and a heap growth ratio of 1)
jkl
Long time nixers
How do the rc shell and POSIX relate?
josuah
Long time nixers
Hello, Dan Bernstein-style libc, happy to see you around! scanf()? No thank you!

I have never seen Dan Bernstein-style code crafting an Operating System, and I am eager to look at it.

Wow, these people write actual operating systems!

It is close to the dilemma:

* I want to write fresh clean code
* I want to use existing interfaces and integrate my project in the greatest context.

I am always going from the one to the other.

Thank you for giving us code to read, and maybe to execute!
josuah
Long time nixers
(10-12-2019, 07:40 PM)jkl Wrote: How do the rc shell and POSIX relate?

rc is not POSIX. rc(1) is UNIX-like, more so than the ALGOL-like sh(1) [1]. It is used on Plan 9, but predates it and was first written for UNIX.

[1]: https://en.wikipedia.org/wiki/Bourne_she...al_version
josuah
Long time nixers
To eadwardus:

If you do not mind, would you happen to have a few tips for where to get started?

https://wiki.osdev.org/ ? XV6 ? Linux kernel source ? All of em ?

I can't say I will ever get started myself, but it's always fun to read on that topic.
eadwardus
Members
(10-12-2019, 08:48 PM)josuah Wrote: Hello, Dan Bernstein-style libc, happy to see you around! scanf()? No thank you!
I have never seen Dan Bernstein-style code crafting an Operating System, and I am eager to look at it.
Actually, I am a heretic to his ways: I am using a formatted-string routine for output, although I don't plan to provide a similar one for input (skalibs is a similar lib that follows djb style more strictly, although it relies on the standard libc)

(10-12-2019, 08:52 PM)josuah Wrote: If you do not mind, would you happen to have a few tips for where to get started?
OSDev is a pretty good place to gather information about how to start, and is very likely the best one, because it has plenty of information for each step and cites a lot of other helpful sources. Following the implementation process of a simple system may be the best way to get a grasp of how things work, allowing you to experiment further (OSDev has an "OS Implementations" section in its books page)


(10-12-2019, 08:52 PM)josuah Wrote: https://wiki.osdev.org/ ? XV6 ? Linux kernel source ? All of em ?
Reading the code of "real" operating systems may turn out to be unproductive, because you will get too many details (Linux code is mostly drivers), and I think it is better to first understand what must be done in its most basic form; only then do those details start to have value. So smaller systems are better in this regard, for example: the already cited xv6 (this one is good, small and has a book), TempleOS (simpler design, all ring 0), MINIX (has Andrew Tanenbaum's books covering it), etc.
eadwardus
Members
Update: I wrote a more detailed description of the project[0], with a few examples and videos (qemu+ncurses recorded with asciinema). Later I will make a more detailed post about Glacies itself, and then a few benchmarks and comparisons of the tools with others of similar purpose (coreutils, libc, package manager, etc.).
[0]: http://eadwardus.site/?blog/2019-12-31
josuah
Long time nixers
I like the hierarchical naming.

C somehow lacks namespaces, but the '_' char does it well!

It is an idiom well-adopted by the community for library namespaces, so that they integrate well with the end project.

Some projects end up using namespaces for splitting different sections of their library.

If you combine the two (let's say, it is a general purpose programming library with multiple sections), you get a 2-fold library namespace.
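
As a minimal sketch of that two-level naming, assuming a hypothetical library prefix `xl` with two made-up sections `str` and `num` (none of these names come from tertium or any real library):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical library "xl", section "str": string helpers. */
size_t xl_str_len(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical library "xl", section "num": number helpers. */
int xl_num_max(int a, int b)
{
    return a > b ? a : b;
}
```

The first prefix keeps the library out of the end project's way, and the second one groups related functions inside the library.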
josuah
Long time nixers
Stralloc from DJB : https://cr.yp.to/lib/stralloc.html
Advised: reading the malloc(3) and realloc(3) man pages and the link above. That's it...

The libstralloc is a very tiny (~60 lines?) library to build strings onto a simple struct with a pointer to a malloc()ed buffer (->s), the available size of that buffer (->a), and the used size (->n).

You end up using stralloc_cats(sa, "text") without extra checking for "is there enough room in the string sa?":
if there is not enough room, stralloc_cats will realloc sa->s and record the new available size in sa->a.

Why have both ->a and ->n? Because while adding more memory, stralloc_cats() will do a little planning and ask for more than just the new length of the string to build, so that it does not have to call realloc too often...
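
A minimal sketch of the mechanism just described, not DJB's actual code: `struct sbuf`, `sbuf_ready` and `sbuf_cats` are hypothetical names, the field names follow this post (->s, ->a, ->n), and the "a bit more than needed" ratio is only an illustrative assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a stralloc-like struct (not DJB's code). */
struct sbuf {
    char *s;   /* malloc()ed buffer, not NUL-terminated */
    size_t a;  /* allocated (available) size of s */
    size_t n;  /* bytes of s currently in use */
};

/* Make sure at least `need` bytes are allocated; 1 on success, 0 on failure. */
int sbuf_ready(struct sbuf *sb, size_t need)
{
    char *p;
    size_t want;

    assert(sb != NULL);
    if (need <= sb->a)
        return 1;               /* enough room already: no realloc */
    want = need + need / 8 + 8; /* ask for a bit more (illustrative ratio) */
    if (want < need)
        return 0;               /* size_t overflow */
    p = realloc(sb->s, want);
    if (p == NULL)
        return 0;               /* the old buffer is still valid */
    sb->s = p;
    sb->a = want;
    return 1;
}

/* Append the C string `text`; 1 on success, 0 on failure. */
int sbuf_cats(struct sbuf *sb, const char *text)
{
    size_t len = strlen(text);

    if (len == 0)
        return 1;               /* nothing to copy */
    if (len > (size_t)-1 - sb->n)
        return 0;               /* sb->n + len would overflow */
    if (!sbuf_ready(sb, sb->n + len))
        return 0;
    memcpy(sb->s + sb->n, text, len);
    sb->n += len;
    return 1;
}
```

Starting from a zeroed struct, the caller just chains `sbuf_cats()` calls and checks the return value; the room check lives in one place.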

BUT the libc already does this! :)

Check by reallocating a string and printing its address:

----in----
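#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *p = NULL;

    /* print the address after each (re)allocation; error checks omitted */
    printf("%p\n", (void *)(p = realloc(p, 10)));
    printf("%p\n", (void *)(p = realloc(p, 11)));
    printf("%p\n", (void *)(p = realloc(p, 13)));
    printf("%p\n", (void *)(p = realloc(p, 20)));
    free(p);
    return 0;
}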
----out----
0x734306a2460
0x734306a2460
0x734306a2460
0x734bce1e0e0
----end----

Only the 1st and the 4th calls to realloc returned a new address: the 1st allocated the buffer, and the 4th moved it to a new, larger one. The 2nd and 3rd calls left it in place.


My very humble benchmark did show that, on the malloc implementations I tried:

- using strlen() to add bytes one at a time was much slower than using stralloc (so keeping the size ->n is a really good idea, which most programming languages do),

- keeping track of the usable size (->a) of the buffer adds very little advantage over using realloc every time a string is added


Why the second point? Note that malloc()/realloc() is not a syscall!

Most malloc implementations (if not all) have a block describing the buffer just before it, so the "usable buffer size" is already saved, somewhere *before* the ->s pointer, and it is very fast for realloc() to pick it up:

Possible memory layout:
--------
-- ,---- S (char *, struct stralloc)
-- | --- A (size_t, struct stralloc)
-- | --- N (size_t, struct stralloc)
-- | ---
-- | --- [...]
-- | ---
-- | ---
-- | --- [...]
-- | --- (??? memory used by malloc)
-- | --- (??? memory used by malloc)
-- `> - s[0] (char, malloc()ed buffer)
-------- s[1] (char)
-------- s[2] (char)
-------- s[3] (char)
-------- [...]
--------

Somewhere in "??? memory used by malloc" (implementation-specific), the capacity of the buffer is stored, for internal use by malloc/realloc (which know precisely where to look).
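
One way to peek at that stored capacity is `malloc_usable_size()`, a non-standard extension declared in `<malloc.h>` by glibc and musl. It is not mentioned elsewhere in this thread and is not POSIX, so the sketch below assumes a Linux system with one of those libcs; `reserved_for` is a hypothetical helper name.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size(): glibc/musl extension, not POSIX */
#include <stdlib.h>

/* Return how many bytes the allocator actually reserved for a request of
 * `req` bytes, or 0 if the allocation failed. */
size_t reserved_for(size_t req)
{
    void *p = malloc(req);
    size_t usable;

    if (p == NULL)
        return 0;
    usable = malloc_usable_size(p);
    assert(usable >= req);      /* never less than what was asked for */
    free(p);
    return usable;
}
```

The exact value returned depends on the allocator, which is the point: the number exists, but only the implementation decides what it is.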

So I got rid of the ->a field (capacity) in the stralloc library, to only have ->s (string pointer) and ->n (total length) left.
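
A rough sketch of what such a two-field variant could look like, assuming it simply calls realloc() with the exact new length on every append and leaves any rounding to the allocator. `struct sbuf2` and `sbuf2_catb` are hypothetical names, not the actual modified library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-field string buffer: no capacity field. */
struct sbuf2 {
    char *s;   /* malloc()ed buffer, not NUL-terminated */
    size_t n;  /* bytes in use (also the size last requested) */
};

/* Append `len` bytes from `data`; 1 on success, 0 on failure. */
int sbuf2_catb(struct sbuf2 *sb, const char *data, size_t len)
{
    char *p;

    assert(sb != NULL);
    if (len == 0)
        return 1;               /* nothing to copy */
    if (len > (size_t)-1 - sb->n)
        return 0;               /* sb->n + len would overflow */
    p = realloc(sb->s, sb->n + len);   /* exact size, every time */
    if (p == NULL)
        return 0;               /* the old buffer is still valid */
    memcpy(p + sb->n, data, len);
    sb->s = p;
    sb->n += len;
    return 1;
}
```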

Besides that (which means I had to get over-pedantic), I rarely found a way to be more concise than DJB's libc while staying confident that I was not breaking something elsewhere by doing so.

Programming with DJB's libraries makes you an alien. A minimalist, efficient, safe and correct alien.
eadwardus
Members
(13-01-2020, 08:42 PM)josuah Wrote: I like the hierarchical naming.
It was chosen to avoid name collision (one of the standard libc "sins") and to organize related sections.

(13-01-2020, 09:12 PM)josuah Wrote: - keeping track of the usable size (->a) of the buffer adds very little advantage over using realloc every time a string is added
I disagree, because calling realloc with just the required space is not optimal, and it will shrink the buffer unnecessarily if reused (while not being possible to shrink when necessary). That said, if you intend to use only short strings, those problems are pretty much irrelevant.

(13-01-2020, 09:12 PM)josuah Wrote: Programming with DJB's libraries makes you an alien. A minimalist, efficient, safe and correct alien.
True. Usually all his code has a good design, and most of the people who followed his ways are writing code of similar quality (see: http://thedjbway.b0llix.net/friends.html)
josuah
Long time nixers
(15-01-2020, 04:31 PM)eadwardus Wrote: calling realloc with just the required space is not optimal

I think I expressed myself poorly. So I will express myself in terms of other people's code. :)

Here is the mechanism I am talking about, in DJB-style code (skalibs):

Code:
int stralloc_ready_tuned (stralloc *sa, size_t n, size_t base, size_t a, size_t b)
{
  [...]
  t = n + base + a * n / b ;
  [...]
}

#define stralloc_ready(sa, n) stralloc_ready_tuned(sa, (n), 8, 1, 8)

int stralloc_catb (stralloc *sa, char const *s, size_t n)
{
  if (!stralloc_readyplus(sa, n)) return 0 ;
  [...]
}
https://github.com/skarnet/skalibs/blob/...uned.c#L12
https://github.com/skarnet/skalibs/blob/...lloc.h#L21
https://github.com/skarnet/skalibs/blob/...tb.c#L6-L8
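
To get a feel for the numbers, here is the quoted expression taken at face value as a standalone function. `tuned_size` is a hypothetical name, and the elided parts of the real function (marked [...] above) are not reproduced, so this only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Size requested for n bytes: n + base + a * n / b, the expression quoted
 * above. Overflow checks are omitted in this sketch. */
size_t tuned_size(size_t n, size_t base, size_t a, size_t b)
{
    assert(b != 0);
    return n + base + a * n / b;
}
```

With the (8, 1, 8) parameters from the macro, a request for 100 bytes becomes 100 + 8 + 100/8 = 120 bytes, i.e. roughly one eighth of slack plus a small constant.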

Here is how it is already done in a few libcs.

Musl:
Code:
static int adjust_size(size_t *n)
{
    /* Result of pointer difference must fit in ptrdiff_t. */
    if (*n-1 > PTRDIFF_MAX - SIZE_ALIGN - PAGE_SIZE) {
        if (*n) {
            errno = ENOMEM;
            return -1;
        } else {
            *n = SIZE_ALIGN;
            return 0;
        }
    }
    *n = (*n + OVERHEAD + SIZE_ALIGN - 1) & SIZE_MASK;
    return 0;
}

void *realloc(void *p, size_t n)
{
    [...]
    if (adjust_size(&n) < 0) return 0;
    [...]
}
http://git.musl-libc.org/cgit/musl/tree/...loc.c#n377

OpenBSD:
Code:
#define PAGEROUND(x)  (((x) + (MALLOC_PAGEMASK)) & ~MALLOC_PAGEMASK)

static void *
orealloc(struct dir_info **argpool, void *p, size_t newsz, void *f)
{
    [...]
        size_t rnewsz = PAGEROUND(gnewsz);
    [...]
}
https://cvsweb.openbsd.org/src/lib/libc/...web-markup

Glibc:
Code:
/* pad request bytes into a usable size -- internal version */

#define request2size(req)                                         \
  (((req) + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE)  ?             \
   MINSIZE :                                                      \
   ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)

void *
__libc_realloc (void *oldmem, size_t bytes)
{
  [...]
    if (!checked_request2size (bytes, &nb))
    {
      __set_errno (ENOMEM);
      return NULL;
    }
  [...]
}
https://sourceware.org/git/?p=glibc.git;...HEAD#l3130
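
The three excerpts share one idiom: rounding a requested size up to a multiple of a power-of-two granule by adding a mask and then clearing the low bits. A minimal sketch, where `round_up` and its parameters are hypothetical names not taken from any of the libcs above:

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of `align`, which must be a power of two.
 * Overflow of x + align - 1 is not handled in this sketch. */
size_t round_up(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}
```

So a 13-byte request with a 16-byte granule is served from a 16-byte slot, which is why small growths can stay at the same address.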


But then, there still might be something I missed about that, which makes stralloc's (->a) size_t still useful. This happens to me often: I read some other code, and only then do I get why what I read a day earlier was there...

In any case, there is not much benefit besides a single function removed (in the case of skarnet's code), and no performance benefit! Just me choosing the color of the bike shed...
josuah
Long time nixers
http://thedjbway.b0llix.net/friends.html

Nice cluster of like-minded software authors.

I would also add https://www.fehcom.de/, which will be there at FOSDEM 2020:
https://fosdem.org/2020/schedule/speaker...fmann_feh/
eadwardus
Members
The cited code is doing memory alignment. The internal variable "->a" is used to check whether there is enough space to hold more data; the malloc implementations do a similar check:
musl: 382-401, 411-414, 424-428
openbsd: 1579-1617, 1678-1688
glibc: 3183-3222, 4573-4623
Those values are not exposed, so if you plan to use any heuristic to avoid brk/mmap calls (for example "->a * 2"), you will need to keep track of the current buffer size (which may be bigger than necessary).
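
A minimal sketch of such a heuristic, assuming a doubling policy like the "->a * 2" example and an arbitrary starting capacity of 16 bytes. `struct gbuf`, `gbuf_putc` and the `calls` counter are hypothetical and exist only to make the effect visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable buffer that tracks its own capacity. */
struct gbuf {
    char *s;
    size_t a;       /* capacity we asked for */
    size_t n;       /* bytes used */
    size_t calls;   /* number of realloc() calls made, for illustration */
};

/* Append one byte, doubling the capacity when full; 1 on success. */
int gbuf_putc(struct gbuf *g, char c)
{
    assert(g != NULL);
    if (g->n == g->a) {
        size_t want = g->a ? g->a * 2 : 16;
        char *p;

        if (want < g->a)
            return 0;           /* size_t overflow */
        p = realloc(g->s, want);
        if (p == NULL)
            return 0;           /* the old buffer is still valid */
        g->s = p;
        g->a = want;
        g->calls++;
    }
    g->s[g->n++] = c;
    return 1;
}
```

Appending 1000 bytes one at a time this way calls realloc() 7 times (16, 32, ..., 1024) instead of 1000, and that count does not depend on how a given libc rounds sizes internally.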
josuah
Long time nixers
Ah, thank you, now I understand: it is about portability. So that your software not only works portably, but also has roughly the same performance portably.

Another aspect of portability I did not think of!