nixers
My new distro "Glacies" - Printable Version


RE: My new distro "Glacies" - eadwardus - 11-12-2019

(10-12-2019, 08:48 PM)josuah Wrote: Hello, Dan Bernstein-style libc, happy to see you around! scanf()? No thank you!
I have never seen Dan Bernstein-style code crafting an Operating System, and I am eager to look at it.
Actually, I am a heretic to his ways: I use a formatted-string routine for output, although I don't plan to provide a similar one for input (skalibs is a similar library that follows the djb style more strictly, although it relies on the standard libc).

(10-12-2019, 08:52 PM)josuah Wrote: If you do not mind, would you happen to have a few tips for where to get started?
OSDev is a pretty good place to gather information about how to start, and is very likely the best one, because it has plenty of information for each step and cites a lot of other helpful sources. Following the implementation of a simple system may be the best way to get a grasp of how things work, allowing you to experiment further (OSDev has an "OS Implementations" section in its books page).


(10-12-2019, 08:52 PM)josuah Wrote: https://wiki.osdev.org/ ? XV6 ? Linux kernel source ? All of em ?
Reading "real" operating system code may prove unproductive, because you will get too many details (Linux code is mostly drivers). I think it is better to first understand what must be done in its most basic form; only then do those details start to have value. So smaller systems are better in this regard, for example: the already cited xv6 (this one is good, small and has a book), TempleOS (simpler design, all ring 0), Minix (Andrew Tanenbaum's books address it), etc.


RE: My new distro "Glacies" - eadwardus - 31-12-2019

Update: I wrote a slightly more detailed description of the project[0], with a few examples and videos (qemu+ncurses recorded with asciinema). Later I will make a more detailed post about Glacies itself, and then a few benchmarks and comparisons of the tools with others of similar purpose (coreutils, libc, package manager, etc.).
[0]: http://eadwardus.site/?blog/2019-12-31


RE: My new distro "Glacies" - josuah - 13-01-2020

I like the hierarchical naming.

C somehow lacks namespaces, but the '_' char does it well!

It is an idiom well adopted by the community for library namespaces, so that libraries integrate well with the end project.

Some projects end up using namespaces for splitting different sections of their library.

If you combine the two (let's say it is a general-purpose programming library with multiple sections), you get a 2-fold library namespace.
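
As a small illustration of that 2-fold naming, here is a sketch of a made-up library. The prefix "xyz" and the sections "mem" and "str" are hypothetical names chosen for the example; they are not taken from Glacies or from any DJB library.

```c
#include <stddef.h>

/* Hypothetical library "xyz" with two sections, "mem" and "str".
 * Every public name is spelled library_section_function, so it cannot
 * clash with identifiers of the program that links against it. */

/* Section xyz_mem_: raw memory helpers. */
void xyz_mem_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
}

/* Section xyz_str_: NUL-terminated string helpers. */
size_t xyz_str_len(const char *s)
{
    size_t n = 0;

    while (s[n])
        n++;
    return n;
}
```

A caller reads the origin of each function straight from its name: xyz_str_len("abc") belongs to the string section of xyz and returns 3.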


RE: My new distro "Glacies" - josuah - 13-01-2020

Stralloc from DJB : https://cr.yp.to/lib/stralloc.html
Recommended reading: the malloc(3) and realloc(3) man pages and the link above. That's it...

The libstralloc is a very tiny (~60 lines?) library to build strings onto a simple struct with a pointer to a malloc()ed buffer (->s), the available size of that buffer (->a), and the used size (->n).

You end up using stralloc_cats(sa, "text") without extra checking for "is there enough room in the string sa?":
if there is not enough room, stralloc_cats will realloc sa->s and record the new available size in sa->a.

Why have both ->a and ->n? Because when adding more memory, stralloc_cats() does a little planning and asks for more than just the new length of the string being built, so that it does not have to call realloc too often...

BUT the libc already does this! :)
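
To make the roles of ->s, ->a and ->n concrete, here is a minimal sketch of a stralloc-like builder. It is not DJB's code: the names sbuf, sbuf_readyplus and sbuf_cats are made up for the example, and the slack policy (need + need / 8 + 8) is an arbitrary assumption. Only the idea of "check the capacity, then ask for a bit more than needed" comes from the description above.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for a stralloc-like string builder. */
struct sbuf {
    char  *s;  /* malloc()ed buffer, NULL while empty */
    size_t a;  /* bytes allocated (available) in s */
    size_t n;  /* bytes of s currently in use */
};

/* Make sure `extra` more bytes fit after the used part.
 * Returns 1 on success, 0 on failure (the buffer is left untouched). */
int sbuf_readyplus(struct sbuf *sa, size_t extra)
{
    size_t need = sa->n + extra;
    size_t want;
    char *p;

    if (need < sa->n) return 0;          /* size_t overflow */
    if (need <= sa->a) return 1;         /* enough room: no realloc at all */

    want = need + need / 8 + 8;          /* assumed policy: ask for slack */
    if (want < need) want = need;        /* overflow: fall back to exact size */
    p = realloc(sa->s, want);
    if (!p) return 0;
    sa->s = p;
    sa->a = want;
    return 1;
}

/* Append the NUL-terminated string `str`, without its terminator. */
int sbuf_cats(struct sbuf *sa, const char *str)
{
    size_t len = strlen(str);

    if (len == 0) return 1;              /* nothing to copy */
    if (!sbuf_readyplus(sa, len)) return 0;
    memcpy(sa->s + sa->n, str, len);
    sa->n += len;
    return 1;
}
```

Starting from a zeroed struct, appending "hello" allocates 13 bytes under this policy, so a following append of ", world" (12 bytes used in total) fits without another realloc.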

Check by reallocating a string and printing its address:

----in----
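#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *p = NULL;

    /* Print the address returned by each realloc(); an unchanged address
       means the buffer was grown in place. %p expects a void *. */
    printf("%p\n", (void *)(p = realloc(p, 10)));
    printf("%p\n", (void *)(p = realloc(p, 11)));
    printf("%p\n", (void *)(p = realloc(p, 13)));
    printf("%p\n", (void *)(p = realloc(p, 20)));
    free(p);
    return 0;
}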
----out----
0x734306a2460
0x734306a2460
0x734306a2460
0x734bce1e0e0
----end----

Only the 1st and the 4th calls to realloc returned a new buffer (the 1st allocating it, the 4th moving the contents to a larger one); the 2nd and 3rd calls reused the existing buffer.


My very humble benchmark showed that, with the malloc implementation I tested:

- using strlen() to find the end while adding byte by byte was much slower than using stralloc (so keeping the size ->n is a really good idea, which most programming languages do),

- keeping track of the usable size (->a) of the buffer adds very little advantage over calling realloc every time a string is added.
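
The first point can be sketched as two ways of appending to the same buffer. This is not the benchmark itself, only an illustration of where the extra cost comes from; append_rescan and append_tracked are hypothetical names, and both assume the caller has already made the buffer large enough.

```c
#include <string.h>

/* Find the end by rescanning the whole string: every call costs time
 * proportional to the current length, so k appends scan O(k^2) bytes. */
void append_rescan(char *buf, const char *piece)
{
    size_t end = strlen(buf);

    memcpy(buf + end, piece, strlen(piece) + 1);   /* +1 copies the NUL */
}

/* Keep the used length next to the buffer (the role of ->n): the end
 * is known immediately, whatever the current length is. */
void append_tracked(char *buf, size_t *n, const char *piece)
{
    size_t len = strlen(piece);

    memcpy(buf + *n, piece, len + 1);              /* +1 copies the NUL */
    *n += len;
}
```

Both produce the same string; only the second avoids walking over everything already written before each append.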


Why the second point? Note that malloc()/realloc() is not a syscall!

Most malloc implementations (if not all) keep a block describing the buffer just before it, so the "usable buffer size" is already saved somewhere *before* the ->s pointer, and it is very fast for realloc() to pick it up:

Possible memory layout:
--------
-- ,---- S (char *, struct stralloc)
-- | --- A (size_t, struct stralloc)
-- | --- N (size_t, struct stralloc)
-- | ---
-- | --- [...]
-- | ---
-- | ---
-- | --- [...]
-- | --- (??? memory used by malloc)
-- | --- (??? memory used by malloc)
-- `> - s[0] (char, malloc()ed buffer)
-------- s[1] (char)
-------- s[2] (char)
-------- s[3] (char)
-------- [...]
--------

Somewhere in "??? memory used by malloc" (implementation-specific), the capacity of the buffer is stored for internal use by malloc/realloc (which know precisely where to look).
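
On some systems that hidden capacity can be peeked at. The sketch below uses malloc_usable_size(), which is a GNU extension declared in <malloc.h> (glibc and musl provide it); it is not part of ISO C or POSIX and is not mentioned elsewhere in this thread, so treat it as an assumption about the platform. The helper name reserved_for is made up.

```c
#include <malloc.h>   /* malloc_usable_size(): GNU extension, not portable */
#include <stdlib.h>

/* Return how many bytes the allocator really reserved for a request of
 * `req` bytes, or 0 if the allocation failed. */
size_t reserved_for(size_t req)
{
    void *p = malloc(req);
    size_t usable;

    if (!p) return 0;
    usable = malloc_usable_size(p);   /* read from the allocator's metadata */
    free(p);
    return usable;
}
```

The value returned is at least the size requested and is often larger; by how much depends entirely on the malloc implementation.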

So I got rid of the ->a field (capacity) in the stralloc library, leaving only ->s (string pointer) and ->n (total length).

Besides that (which means I had to get over-pedantic), I rarely found a way to be more concise than DJB's libc while staying confident that I was not breaking something elsewhere by doing so.
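
A sketch of what such a two-field variant could look like, assuming it simply calls realloc() with the exact new length on every append and leaves any rounding to the allocator. The names sbuf2 and sbuf2_catb are hypothetical; this is not the actual modified library.

```c
#include <stdlib.h>
#include <string.h>

/* Two-field variant: no capacity field, only pointer and used length. */
struct sbuf2 {
    char  *s;  /* malloc()ed buffer, NULL while empty */
    size_t n;  /* bytes in use, which is also the size last requested */
};

/* Append `len` bytes from `src`. Returns 1 on success, 0 on failure. */
int sbuf2_catb(struct sbuf2 *sa, const char *src, size_t len)
{
    char *p;

    if (len == 0) return 1;               /* nothing to do */
    if (sa->n + len < sa->n) return 0;    /* size_t overflow */

    /* Always ask for exactly the new length; whether this moves the
     * buffer or grows it in place is up to the malloc implementation. */
    p = realloc(sa->s, sa->n + len);
    if (!p) return 0;
    memcpy(p + sa->n, src, len);
    sa->s = p;
    sa->n += len;
    return 1;
}
```

The code is shorter, but how often the buffer actually moves now depends on the libc in use rather than on the library.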

Programming with DJB's libraries makes you an alien. A minimalist, efficient, safe and correct alien.


RE: My new distro "Glacies" - eadwardus - 15-01-2020

(13-01-2020, 08:42 PM)josuah Wrote: I like the hierarchical naming.
It was chosen to avoid name collision (one of the standard libc "sins") and to organize related sections.

(13-01-2020, 09:12 PM)josuah Wrote: - keeping track of the usable size (->a) of the buffer adds very little advantage over using realloc every time a string is added
I disagree, because calling realloc with just the required space is not optimal, and it will shrink the buffer unnecessarily if reused (while not being possible to shrink when necessary). That said, if you intend to use only short strings, those problems are pretty much irrelevant.

(13-01-2020, 09:12 PM)josuah Wrote: Programming with DJB's libraries makes you an alien. A minimalist, efficient, safe and correct alien.
True. Usually all of his code has a good design, and most of the people who followed his ways are writing code of similar quality (see: http://thedjbway.b0llix.net/friends.html)


RE: My new distro "Glacies" - josuah - 18-01-2020

(15-01-2020, 04:31 PM)eadwardus Wrote: calling realloc with just the required space is not optimal

I think I expressed myself poorly. So I will express myself in terms of other people's code. :)

Here is the mechanism I am talking about in DJB-style code (skalibs):

Code:
int stralloc_ready_tuned (stralloc *sa, size_t n, size_t base, size_t a, size_t b)
{
  [...]
  t = n + base + a * n / b ;
  [...]
}

#define stralloc_ready(sa, n) stralloc_ready_tuned(sa, (n), 8, 1, 8)

int stralloc_catb (stralloc *sa, char const *s, size_t n)
{
  if (!stralloc_readyplus(sa, n)) return 0 ;
  [...]
}
https://github.com/skarnet/skalibs/blob/master/src/libstddjb/stralloc_ready_tuned.c#L12
https://github.com/skarnet/skalibs/blob/master/src/include/skalibs/stralloc.h#L21
https://github.com/skarnet/skalibs/blob/master/src/libstddjb/stralloc_catb.c#L6-L8
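
To see how much slack that formula asks for, here is the same arithmetic isolated in a helper. The function name tuned_size is made up for the example, and the overflow handling elided by [...] in the excerpt is deliberately left out.

```c
#include <stddef.h>

/* Same arithmetic as the quoted line: t = n + base + a * n / b.
 * The division is an integer division, and no overflow checks are done. */
size_t tuned_size(size_t n, size_t base, size_t a, size_t b)
{
    return n + base + a * n / b;
}
```

With the defaults from the stralloc_ready macro (base = 8, a = 1, b = 8), a request for 100 bytes becomes 100 + 8 + 12 = 120, a request for 1000 bytes becomes 1133, and a request for 5 bytes becomes 13. That is roughly one eighth of extra room plus a small constant.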

Here is how it is already done in a few libcs.

Musl:
Code:
static int adjust_size(size_t *n)
{
    /* Result of pointer difference must fit in ptrdiff_t. */
    if (*n-1 > PTRDIFF_MAX - SIZE_ALIGN - PAGE_SIZE) {
        if (*n) {
            errno = ENOMEM;
            return -1;
        } else {
            *n = SIZE_ALIGN;
            return 0;
        }
    }
    *n = (*n + OVERHEAD + SIZE_ALIGN - 1) & SIZE_MASK;
    return 0;
}

void *realloc(void *p, size_t n)
{
    [...]
    if (adjust_size(&n) < 0) return 0;
    [...]
}
http://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c#n377

OpenBSD:
Code:
#define PAGEROUND(x)  (((x) + (MALLOC_PAGEMASK)) & ~MALLOC_PAGEMASK)

static void *
orealloc(struct dir_info **argpool, void *p, size_t newsz, void *f)
{
    [...]
        size_t rnewsz = PAGEROUND(gnewsz);
    [...]
}
https://cvsweb.openbsd.org/src/lib/libc/stdlib/malloc.c?rev=1.262&content-type=text/x-cvsweb-markup

Glibc:
Code:
/* pad request bytes into a usable size -- internal version */

#define request2size(req)                                         \
  (((req) + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE)  ?             \
   MINSIZE :                                                      \
   ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)

void *
__libc_realloc (void *oldmem, size_t bytes)
{
  [...]
    if (!checked_request2size (bytes, &nb))
    {
      __set_errno (ENOMEM);
      return NULL;
    }
  [...]
}
https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=f7cd29bc2f93e1082ee77800bd64a4b2a2897055;hb=HEAD#l3130
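
All three excerpts share the same bit trick: add (alignment - 1), then mask off the low bits. A generic version is sketched below; the name round_up is hypothetical, and the per-libc constants (OVERHEAD, SIZE_SZ, MINSIZE and so on) are left out.

```c
#include <stddef.h>

/* Round x up to the next multiple of `align`.
 * `align` must be a power of two; overflow near SIZE_MAX is not handled. */
size_t round_up(size_t x, size_t align)
{
    return (x + align - 1) & ~(align - 1);
}
```

For example, round_up(10, 16) is 16, round_up(17, 16) is 32, and round_up(4097, 4096) is 8192, which is the kind of rounding a page-based macro like PAGEROUND performs.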


But then, there still might be something I missed, which makes stralloc's (->a) size_t still useful. This often happens to me: I read some other code, and then I get why what I read a day earlier was there...

In any case, there is not much benefit besides a single function removed (in the case of skarnet's code), and no performance benefit! Just me choosing the color of the bike shed...


RE: My new distro "Glacies" - josuah - 18-01-2020

http://thedjbway.b0llix.net/friends.html

Nice cluster of like-minded software authors.

I would also add https://www.fehcom.de/, whose author will be at FOSDEM 2020:
https://fosdem.org/2020/schedule/speaker/erwin_hoffmann_feh/


RE: My new distro "Glacies" - eadwardus - 19-01-2020

The code you cited is doing memory alignment. The internal variable "->a" is used to check whether there is enough space to hold more data, while a similar check is done by the malloc implementation:
musl: 382-401, 411-414, 424-428
openbsd: 1579-1617, 1678-1688
glibc: 3183-3222, 4573-4623
Those values are not exposed, so if you plan to use any heuristic to avoid brk/mmap calls (for example "->a * 2"), you will need to keep track of the current buffer size (which may be bigger than necessary).
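
A minimal sketch of such a doubling heuristic, assuming a struct that tracks its own capacity. The names dbuf and dbuf_catb, the starting capacity of 16 and the reallocs counter are all made up for illustration; they are not taken from Glacies or skalibs.

```c
#include <stdlib.h>
#include <string.h>

struct dbuf {
    char    *s;         /* malloc()ed buffer, NULL while empty */
    size_t   a;         /* capacity tracked by the library itself */
    size_t   n;         /* bytes in use */
    unsigned reallocs;  /* how many times realloc() was actually called */
};

/* Append `len` bytes, doubling the capacity whenever it runs out.
 * Returns 1 on success, 0 on failure (the buffer is left untouched). */
int dbuf_catb(struct dbuf *d, const char *src, size_t len)
{
    size_t need = d->n + len;

    if (len == 0) return 1;
    if (need < d->n) return 0;                 /* size_t overflow */
    if (need > d->a) {
        size_t want = d->a ? d->a : 16;        /* assumed starting capacity */
        char *p;

        while (want < need) {
            if (want > (size_t)-1 / 2) {       /* doubling would overflow */
                want = need;
                break;
            }
            want *= 2;                         /* the "->a * 2" heuristic */
        }
        p = realloc(d->s, want);
        if (!p) return 0;
        d->s = p;
        d->a = want;
        d->reallocs++;
    }
    memcpy(d->s + d->n, src, len);
    d->n += len;
    return 1;
}
```

Appending 1000 single bytes to a zeroed struct calls realloc() only 7 times (capacities 16, 32, ..., 1024), and that count is the same on every libc because the decision is made from ->a, not from allocator internals.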


RE: My new distro "Glacies" - josuah - 25-01-2020

Ah, thank you, now I understand: it is about portability. So that your software not only works portably, but also has roughly the same performance portably.

Another aspect of portability I did not think of!