venam
Administrators
(This is part of the podcast discussion extension)

Unix file hierarchy

Link of the recording [ http://podcast.nixers.net/feed/download....10-251.mp3 https://raw.githubusercontent.com/nixers...-10-25.mp3 ]

Files, the predominant representation of everything on Unix, how are they scattered around?

--(Show Notes)--
https://archive.org/details/bstj57-6-1905 1978 Time-Sharing System research
http://cm.bell-labs.com/7thEdMan/v7vol1.pdf (~page 385)
https://en.wikipedia.org/wiki/Unix_filesystem
https://en.wikipedia.org/wiki/Filesystem...y_Standard
http://www.pathname.com/
http://refspecs.linuxfoundation.org/FHS_...index.html
https://wiki.linuxfoundation.org/lsb/fhs
http://www.tldp.org/LDP/Linux-Filesystem...rarchy.pdf
http://objectroot.org/articles/brief-history-of-hier/
http://gobolinux.org/index.php?page=doc/...s/clueless
http://gobolinux.org/index.php?page=k5
http://lists.busybox.net/pipermail/busyb...74114.html (You can follow up the mail thread)
http://askubuntu.com/questions/130186/wh...-directory
http://sabotage.tech/
http://morpheus.2f30.org/index.html (the suckless guys)
The man pages:
man 7 intro
proc(5) {linux}
hier(7)
http://www.pathname.com/fhs/pub/fhs-2.3.html
http://man.openbsd.org/OpenBSD-current/man7/hier.7
https://www.freebsd.org/cgi/man.cgi?query=hier
https://www.freebsd.org/cgi/man.cgi?quer...=2.9.1+BSD
https://www.freebsd.org/cgi/man.cgi?quer...ULTRIX+4.2
https://www.freebsd.org/cgi/man.cgi?quer...unOS+4.1.3
https://linux.die.net/man/7/hier
man 7 file-hierarchy
--
https://github.com/thelostt/ricing (ricing book)

Music: Soundtrack by kronstudio and it's GEM.
Check out their website, they have amazing tunes there.

If you want to contribute check this thread.
venam
Administrators
--(Transcript)--
On my blog
And here:
Unix file hierarchy

Files, the predominant representation of everything on Unix, how are
they scattered around?

-----


* What - The start and meanings
* Who owns it now and when
* Iterative makeup adjustments
* So what - The Issues
* Now what - Reviewing it
* Conclusion


-----

# What - The start and meanings

## Principle - What are files

The filesystem is represented as a rooted tree of directories where the
root is denoted by the '/' character.
When specifying that a file or directory is under another, a '/' is also
used as a separator. For example, /bin/init is a file named init under the
bin directory, which is itself under the root directory.

Everything is under that '/', it doesn't matter if it's a hard disk,
a network card, or your favorite game. They're all somewhere under
that root.

There are 3 main filetypes: ordinary files, directories, and special files
such as devices and virtual files.

However, Unix directories don't literally contain those files.
Instead they contain the file names and pair each name with an inode
number. The inode holds the file's metadata and points to its data. That
explains why we can have links, both hard and soft.
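
A minimal sketch of that name-to-inode pairing, in Python, assuming a filesystem that supports both hard and symbolic links. The file names are made up for the demo.

```python
import os
import tempfile


def inode_demo():
    """Create a file, a hard link and a symlink, then compare their inodes."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")  # hypothetical names
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")

        with open(original, "w") as f:
            f.write("hello\n")

        os.link(original, hard)     # a second name for the same inode
        os.symlink(original, soft)  # a new inode that only stores a path

        st_orig = os.stat(original)
        st_hard = os.stat(hard)
        st_soft = os.lstat(soft)    # lstat inspects the link itself

        return {
            "same_inode": st_orig.st_ino == st_hard.st_ino,
            "link_count": st_orig.st_nlink,
            "symlink_has_own_inode": st_soft.st_ino != st_orig.st_ino,
        }


if __name__ == "__main__":
    print(inode_demo())
```

The hard link is just another directory entry pointing at the same inode, which is why the link count goes up, while the soft link is a separate file whose content is a path.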

## Original Layout

So, with that in mind, what was the original layout, or directory structure,
under the root in the original Unix version?

Let's review the layout from the original UNIX Programmer's manual
hier(7) manpage. You can get it from the show notes.

/ root
/dev devices
/bin utility programs
/lib object libraries and other stuff
/etc essential data and dangerous maintenance utilities (passwd, group, motd, init, rc)
/tmp temporary files, usually on a fast device
/usr general-purpose directory, usually a mounted file system

This seems highly minimal compared to what we currently have.
Where do all those new directories come from, and why were they added?

Let's first blame someone by digging into who's in control of the
current standards.


# Who owns it now and when


Amongst all the Unix standards out there, such as the Single UNIX Specification
and the other material from The Open Group, the one that covers the hierarchy is the
Filesystem Hierarchy Standard, FHS.
It's under the governance of the Linux Foundation.

Like all standards it's meant as a guideline and it doesn't have to be
followed thoroughly, it's just a reference.

To this day only Linux distributions adhere to it, and thus
the BSDs and other unix-like variants don't necessarily preach its words.

So what are they trying to achieve and how?

They want the location of installed files and directories to be predictable,
to make it less chaotic, by specifying what every area of the system
should contain, the principle behind each one, and the minimum files and
directories required.

Let's unpack that idea of principle behind directories.

They wrote down standards for every one of the directories that should
appear under the root.
Another distinction they put forward is the one between shareable and
unshareable, and between static and variable, directories.

Typically, everything under /var is of variable size, /usr and /opt are
shareable and everything else is unshareable and static.

Following that logic, /usr and /opt can be left out of a system and the
remaining content should be enough to boot, restore, recover, and/or
repair the system.

Now let's jump into the real deal, what are they prescribing we should
definitely have under the root directory and what are they mentioning
that is special about those specific dirs.

Let's say here that instead of the original 6 directories there are now
14 directories, that's 8 brand new directories under the root.

```
Directory  Description
bin        Essential command binaries
           (They even have a list of must-have commands under that dir,
           such as cat, cp, date, dd, kill, ls, etc., the usual
           stuff that corresponds to system calls)
boot       Static files of the boot loader
           (This was there before, not as a directory but as a
           file. You might have wondered where the original Unix
           was booting from without that directory, so that
           explains it.)
dev        Device files
etc        Host-specific system configuration
lib        Essential shared libraries and kernel modules
media      Mount point for removable media
mnt        Mount point for mounting a filesystem temporarily
opt        Add-on application software packages
run        Data relevant to running processes
sbin       Essential system binaries
srv        Data for services provided by this system
tmp        Temporary files
usr        Secondary hierarchy
var        Variable data
```
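
The table above can be checked against a live system. Here is a small Python sketch, assuming only the 14 names listed in the table. The function names are made up for illustration, and a missing directory is not necessarily a violation, since not every system ships all of them.

```python
import os

# Top-level directory names taken from the table above.
FHS_TOP_LEVEL = ["bin", "boot", "dev", "etc", "lib", "media", "mnt",
                 "opt", "run", "sbin", "srv", "tmp", "usr", "var"]


def check_fhs(root="/"):
    """Map each listed directory name to True if it exists under root."""
    return {name: os.path.isdir(os.path.join(root, name))
            for name in FHS_TOP_LEVEL}


def extra_top_level(root="/"):
    """List directories under root that the table does not mention."""
    return sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
        and entry not in FHS_TOP_LEVEL
    )


if __name__ == "__main__":
    for name, present in check_fhs().items():
        print(f"/{name:<6} {'present' if present else 'missing'}")
    print("not in the table:", extra_top_level())
```

On most Linux distributions the second function will report things like home, proc, root, and sys, which the transcript's table does not list.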

Why did all those new directories appear?
What the heck happened?
Aren't we supposed to be minimalists, where does that come from?


# Iterative makeup adjustments


Where did all those new directories spurt out from?
It's all because of an infectious aging process that slowly started in
the early unix days.


## Back in the days

Remember the 6 directories under root in the original Unix.
Well, there's something that needs to be mentioned.
If you noticed, there's no /home so where did the home directory go?
/usr, the general purpose directory was in fact where that directory was
stored. It meant "user", and not "unix system resources".

This started because there was not enough space on the medium to store the
directories that grew fast, such as the user directory.

And thus /usr was mounted on a separate filesystem.

Early on, the biggest binaries were moved to that separate filesystem,
both because they weren't essential to boot the system and because
the first medium had run out of space.

That was when /usr/bin, /usr/lib, and /usr/tmp were introduced.
All of them non-essential for the minimal system but still needed by users.

This notion of splitting across several places became inherent.

After Unix got a bit popular outside of the Bell Labs, people wanted to
install their own utilities without infringing on the default installation.
That means keeping their custom programs while they can update the
whole system.

And thus they introduced another split under /usr, the /usr/local, which
contains a replica of the higher hierarchy /usr/local/bin, /usr/local/lib.

Yet again, another split happened when independent software vendors
wanted to offer their packages. They didn't want to interfere with users
and they didn't want to mess up the whole install, and thus placed their
packages under /usr/local/vendor, as another replica of the hierarchy:
/usr/local/vendor/bin, /usr/local/vendor/lib.


That's a lot of replicas of the same hierarchy in multiple places.

That's about /usr but there are many other directories that appeared.

Let's talk about /var, where did this come from?

We talked about variable-sized directories, right, and about problems
with disk space. This is where I'm going with this: the log, spool,
and tmp directories were all under /usr before, but because they
grew in size they were moved to a brand new directory under the root,
/var, so that it could be placed on another disk.

Size was a really big issue at the time, it brought more schism in
the hierarchy.

Having /usr on another disk and moving the binaries wasn't creating
enough space to store the programs.

Anyway, another thing with commercial vendors happened, they suddenly
wanted to leave behind the /usr/local/vendor hierarchy and move it to
/var/opt because it gave them the flexibility to offer packages that
couldn't be shared across multiple machines incidentally. Also, those
packages were more or less optional, think of it like a local repository
of proprietary packages.

Soon enough, vendors thought, insidiously, that it would be better
to store all those packages under /opt instead. Bringing yet another
directory under the root.

In the same manner, because of the new clutter created, new directories
started to appear under root, for example the home directory, which was
originally /usr and is now independent under root.
/usr had now become too crowded for what it was originally designed for.

Ironic, isn't it?

The fancy names "UNIX source repository" or "UNIX system resources"
are all made-up names, and it's too late to rename them anyway.

By the way, you can still install packages in your home directory, you
just have to install them in ~/.local and put the settings in ~/.config.
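
As a rough sketch of where those per-user locations resolve to, here is a Python helper that assumes the XDG Base Directory defaults (~/.config for settings, ~/.local/share for data, ~/.local/bin for executables). The function name and the dictionary keys are made up for illustration.

```python
import os


def xdg_dirs(env=None):
    """Resolve per-user directories, falling back to the XDG defaults."""
    env = os.environ if env is None else env
    home = env.get("HOME") or os.path.expanduser("~")
    return {
        # An unset or empty variable means "use the default".
        "config": env.get("XDG_CONFIG_HOME") or os.path.join(home, ".config"),
        "data": env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share"),
        # No variable is consulted here; this is the conventional location.
        "bin": os.path.join(home, ".local", "bin"),
    }


if __name__ == "__main__":
    for kind, path in xdg_dirs().items():
        print(f"{kind:<6} {path}")
```

For a locally installed program to be found, ~/.local/bin also has to be on your PATH, which many distributions set up only when the directory exists.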

More and more artificial justifications appeared and people are now
following them.
That's one reason why the FHS exists, to formalize those nebulous ideas
we have about those places on our system.

Legacy mixed with pompous new creative and destructive ideas.

So if you ask yourself where you should install your stuff or where
you should look when searching for specific things, you'd better take
a look at the FHS, because you do need a standard to be able to make
sense of this.


# So what - The Issues


The world didn't crumble, so what could be the issue?

Well, we've been stacking problems, piling them up year after year,
so it might crumble someday if we don't manage the mess.

There are many arguments on the current issues with the hierarchy.

Let's go over them.

One of them was already mentioned at the end of the previous section,
the confusion that this iterative dichotomy has brought.

The hierarchy has experienced a continuously increasing entropy on
multiple scales. Starting from the hierarchy of the original Unix Seventh
Edition everyone has put their hands on it.

It's hard to know where to place your executable because we are saturated
with different valid contestants for this place.

The confusion has another catalyst and that is legacy.

The history of /usr alone is mind-boggling: it was first the user directory,
then it was moved to another disk to host the big stuff, then the home directory
was removed from it, and now it is rarely, if ever, hosted
on a separate partition. And thus they had to reintroduce the sharing concept by
creating /usr/share, which would be the directory shared across multiple
architectures.
And again, this new directory under /usr is not commonly shared amongst
machines, bringing more questions about its existence.

On the topic of questioning the current existence of some directories,
many others are criticised. For example, the issues with size aren't
relevant as of today, and thus the distinction between /usr/bin and /bin
and others is minimal.

As far as booting with a minimal rescue system is concerned, it doesn't
make sense today either, as we can boot from external media to rescue
a system.

Moreover, having multiple binaries in multiple places poses the issue
of not knowing if you are invoking the right program at the right time,
as there can be many copies of the executable under different places.
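
To see the problem concretely, here is a small Python sketch that lists every copy of a command along the search path, in lookup order, roughly what `which -a` does. The function names and the `mytool` command are made up for the demo.

```python
import os
import tempfile


def find_all(name, path=None):
    """Return every executable called `name` on the search path, in order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # empty entries are skipped in this sketch
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


def demo():
    """Put the same command in two directories and look it up."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for d in (a, b):
            tool = os.path.join(d, "mytool")  # hypothetical command name
            with open(tool, "w") as f:
                f.write("#!/bin/sh\n")
            os.chmod(tool, 0o755)
        return find_all("mytool", a + os.pathsep + b), a, b


if __name__ == "__main__":
    # The first hit is the one the shell would actually run.
    print(find_all("ls"))
```

Only the first entry is ever invoked by name. The others are shadowed, which is exactly how you end up running a different copy than the one you expected.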

It also puts three things at stake: simplicity, maintainability, and
flexibility.


# Now what - Reviewing it


Ok, it's not that great, but are there alternatives?

There are some distributions such as morpheus linux and sabotage linux
that try to stay true to the original Unix spirit.

Morpheus Linux is maintained by the guys at suckless.org so it's peer
reviewed by people who take that topic to heart.

Sabotage Linux takes the approach a bit differently by completely omitting
the /usr and /sbin directories.

There are also those that take the time to review the concept and reshape
it, make it more reflective of what we need today, and bring it up to date.

I could find two of those projects:

* objectroot
* GoboLinux

Each of them redefines the hierarchy of the filesystem while at the same
time retaining backward compatibility, which sounds awesome.

GoboLinux has longer user-friendly names for the directories. For instance,
programs are all stored under /Programs, and under /Programs there is a
sub-directory for each program, which stores it by version.

One thing to notice with GoboLinux is that it only has 6 directories
under the root, just like the original Seventh Edition.

They are the following:

* Depot
* Mount
* System
* Files
* Programs
* Users

Also, it was created by the guy that wrote htop and LuaRocks, so cool.

objectroot is another approach.

While GoboLinux is a Linux distribution, objectroot is more of a new
set of rules, easier to apply on current distributions.

It has 5 directories under the root and they are:

/hosts: Operating system instances.
/org: Application and system software.
/users: Users' home directories.

/boot: Second-stage bootloader. Optional.
/mnt: Temporarily mounted filesystems. Optional.

/boot and /mnt being optional, that makes only 3 essential directories.

That seems minimal, why is that?
/users is pretty obvious to understand, but what about the others?

/org contains software that is shareable between machines.
/hosts brings in the concept of distributed computing: it contains
sub-directories with specific files for every machine, /hosts/self being
the machine you are sitting at.
Under that directory you find the typical stuff.

You can know more about those different hierarchies by taking a look at
their respective websites.

# Conclusion

That's about it.
Now you're enlightened about the dark history of today's Unix hierarchy.

There's not much you can do about it though.
The least you could possibly do is to take a look at the FHS and follow the
standards.

A good thing to mention here is that most unix variants have their own
hier(7) manpage and they're pretty similar, minus the little details.
So stick to that.

Also, don't get the bright idea of splitting stuff into a new directory
under the root, just no, stop that.

It just won't make it easier...
No, no, I said no, don't do that, bad idea.

Will we ever learn?
pranomostro
Long time nixers
See also sta.li, the forever-vaporware distro by suckless.

That sabotage.tech link is worthwhile, thanks.

What I find especially interesting is that files in unix are categorized by type or usage,
not by association. One could also have a system where everything
except the most basic system components is in something called /pkg in
a directory corresponding to the package name.

plan9 throws everything out and starts anew, with /sys/src (or was it /sys/lib/src?), /usr for users,
/n for virtual network files and /env for environment variable files.

And we will, of course, never learn. Just install your stuff to / and hope it works (it never does).

Ok, objectroot is really cool. They even reinvented mounting file systems from
other machines, only 10 years left until somebody discovers something
like plan9 namespaces! Hurray!

Edit: seriously, it looks like that guy knows plan9. He reinvents venti with
/hosts/self_time, for snapshots, and he sorta reinvents virtual network file systems.

Weird.
venam
Administrators
(25-10-2016, 04:32 PM)pranomostro Wrote: What I find especially interesting is that files in unix are categorized by type or usage,
not by association. One could also have a system where everything
except the most basic system components is in something called /pkg in
a directory corresponding to the package name.
Interesting, I disregarded the topic of categorization: why things are split into categories the way they are.
Apart from the alternatives and original layouts I didn't discuss much.
Maybe someone could unfold that.
venam
Administrators
Here's an interesting email reply I got from the man behind http://objectroot.org/ , Timo Lehtinen, which I think enriches the discussion.

Quote:
> There are a lot of things I've just brushed upon and didn't dig
> into. For instance, I haven't discussed anything regarding why
> we're restricting ourselves to the default hierarchy and why it's
> structured the way it is. That's the topic of categorization,
> which you've revisited with your project.
>
> What's your take on that?

The Unix model has been successful because of its component
architecture: if you have a better idea of how to implement, say, a
file compressor or some other type of data processing filter, just
write your own. And then others can plug it into their processing
pipelines without even thinking. This evolution has been going on for
40+ years now.

But, unfortunately, at the file system organisation level, similar
evolution has not been possible: programmers, sysadmins, and
installation scripts expect to find things in their traditional,
familiar places. I.e. you can't change things just for your needs,
as it will affect other people's wares as well.

> Someone on the extended thread, where we discuss the subject, has
> compared the way you mount filesystems in your hierarchy to the
> one of plan9.

What Plan 9 did was open our minds to the possibility that we can
innovate in the file system hierarchy. Coming from the creators of
Unix itself, it shook our minds loose, and let us view the file system
more as a database.

What my proposal does is move away from type-orientedness (organizing
files by their type) to containment by authority. This is to solve
namespace collisions (or at least delegate them further), and allow
more natural sandboxing, among other things.

My thinking has been affected by Plan 9, for sure, but more in overall
'empowerment' than in specific ideas.
pranomostro
Long time nixers
Aha, interesting. He is right in the way that we have to disrupt
(i hate that word, but here it fits) the unix file hierarchy.

(03-11-2016, 08:56 AM)venam Wrote: What my proposal does is move away from type-orientedness (organizing
files by their type) to containment by authority. This is to solve
namespace collisions (or at least delegate them further), and allow
more natural sandboxing, among other things.

This is a really good way of viewing it. We have that already, in a weird semi-way,
by creating directories like /home/bin and so on, but he found out the underlying theme
of this development, and extracted it.

Oh, and his twitter feed is hilarious, imho. He is really serious about white genocide and its risks.
venam
Administrators
Let's bump this thread, by mentioning a piece of software that is taking a new view on the hierarchy: systemd.
man 7 file-hierarchy describes a "modern" Linux hierarchy for systems that want to use systemd.
There isn't anything especially new in this hierarchy other than it including the XDG and systemd dirs.

What's new is that issuing systemd-path helps find what is where. For example, to get the binary search path without reading the $PATH environment variable, you can do:
Code:
systemd-path search-binaries
So now there's a disconnection from environment variables and instead we have a centralized mechanism. Software doesn't have to preload the environment anymore.
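
A rough Python sketch of how a program might consult that mechanism, assuming `systemd-path search-binaries` prints a colon-separated list of directories on one line. The function name is made up, and the fallback to $PATH is my own choice for systems without systemd, not something systemd prescribes.

```python
import os
import shutil
import subprocess


def search_binaries():
    """Return the binary search path as a list of directories.

    Asks `systemd-path search-binaries` when the tool is installed,
    otherwise falls back to the PATH environment variable.
    """
    tool = shutil.which("systemd-path")
    if tool:
        result = subprocess.run([tool, "search-binaries"],
                                capture_output=True, text=True)
        if result.returncode == 0 and result.stdout.strip():
            # Assumed output format: dir1:dir2:dir3
            return [p for p in result.stdout.strip().split(":") if p]
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


if __name__ == "__main__":
    for directory in search_binaries():
        print(directory)
```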

The decoupling is advantageous in a sense but it is arguably very Windows registry-like in my opinion and probably comes with the same issues. And as with all systemd software you definitely feel the eagerness of wrapping everything under an abstract "micro-service-like" framework so that you never touch the underlying system.
jkl
Long time nixers
As you have already noticed, much of systemd introduces Windows concepts into the Linux world. It never ceases to amaze me that this is a Good Thing™ now.

--
<mort> choosing a terrible license just to be spiteful towards others is possibly the most tux0r thing I've ever seen
mcol
Nixers
(02-02-2021, 06:22 PM)venam Wrote: The decoupling is advantageous in a sense but it is arguably very Windows registry-like in my opinion and probably comes with the same issues.

What issues might people be worried about when moving in this direction? What are those issues associated with the Windows registry (I'm not familiar with Windows)?
venam
Administrators
(06-02-2021, 07:33 AM)mcol Wrote: What issues might people be worried about when moving in this direction? What are those issues associated with the Windows registry
Having another centralized config or key-store mechanism is troublesome in a couple of ways. It's similar to the situation that GConf/gsettings/DConf brings where you have to learn a new system that can easily get cluttered. Soon you won't find or know how to get what you're looking for or which settings or configuration is where. That's unless everything starts to rely on that new centralized mechanism. These are normally good for enterprise services though.
However, here I don't think it's so much of an issue other than it being a not-so-new file hierarchy that programs will assume is fixed. As long as their concept of a hierarchy is anchored in reality this is fine, but once it diverges it can create discrepancies.
freem
Nixers
(06-02-2021, 08:24 AM)venam Wrote: Soon you won't find or know how to get what you're looking for or which settings or configuration is where. That's unless everything starts to rely on that new centralized mechanism.

Even with that... there's the lack of "hackability" issue. On Windows, even people with a _lot_ of knowledge can't easily mess with regedit: it's full of hashes which represent pointers, hidden values, and the like.
It's also a set of binaries disseminated in various (undocumented, of course) places, and can't be easily versioned.
A system you can't easily version, is also a system you can't easily patch, and thus, moving it toward a stable state will be harder (using tools like cfengine3, ansible, drist, rex, etc).
mcol
Nixers
Now software can check both the environment and systemd-path!