Logs in the Unix World - Servers Administration, Networking, & Virtualization
I wrote a transcript for this. It's a very old episode, not very deep and badly researched but still interesting.
It's especially interesting now that it's the subject of the day on IRC, because members are working on related projects (freem, mort). The transcript is clearer than the actual recording.

# Logs in the Unix World

What are logs and what do we do with them?

Logs are like the pieces of bread left on the road in the Hansel and Gretel story. They could lead you to the witch or help you save your brother. Logs record the activities of your programs, and they can be used for analysis, troubleshooting, debugging, checking for inappropriate activity, etc.

What should we log, where should we store those logs, and how should we store them? As in, what format should we store them in?

Let's start with software-based logs, the logs that are specific to a single piece of software. The developers took the liberty of writing their own logging system, storing the logs in a location of their choice, in a format they conceived. We can't really discuss these types of logs, as the subject is very broad and the developers can do absolutely whatever they want: print on screen or to files, use a binary or textual format, require third-party software to access them, define different levels of debugging with different meanings. Whatever goes. You're free to do whatever you want.

Let's discuss system-based logs instead. They are more widely used and more important than software-based logs. Unix is known for its flexible and powerful logging system. It enables you to record almost anything and to manipulate it to retrieve the info you require.

With system-based logs you have a centralized system that gathers the logs. That could mean that all the logs go to a single place and that they are easy to search and analyze. However, as we'll see, that's not entirely accurate. It's true that everything goes in one direction, to one system, but not that the logs themselves, as files, will end up in one location. We still refer to these as system-based logs.
The typical location the logs end up in is `/var/log`, but as we're going to see later on it could be any other place. The catch is that a single location is only useful if all the logs are in text format, which makes them searchable with simple text manipulation tools. When they are in one place you can `grep` them and use `awk`, `sed`, `cut`, etc. If they aren't in a single place it's going to be harder to do these things. Having logs centralized in a binary format makes them useless for the manipulation part. Multiple binary log files in one location is meaningless.

When the logs are centralized they can be kept on a remote disk as one big chunk. Let's say a partition `/log` that lives on a separate device. That makes it easier to replicate and back up, and we can avoid the loss of information. This decoupling also lets you debug things if the main server goes down, as the logs will still be accessible. This is useful in critical situations.

In the Unix world the main implementation of the centralized log system is called syslog. It provides general purpose logging. You send info to syslog and it logs it.

There are two views of centralization. It could mean that a single system takes all the messages and logs them, or it could mean that all the log files are in the same location.

syslog is the de facto standard, because everyone agrees to use it. There is no strict specification, only the basic RFCs published by the IETF (the Internet Engineering Task Force), such as RFC 3164 and later RFC 5424, so several implementations exist. Like anything in the open source world, everyone comes up with their own brand.

rsyslog is the main implementation, a lightweight daemon installed on most common Unix-like distributions. A few systems that have it: Fedora (was using it, switched to journald), openSUSE, Debian (journald), Ubuntu (journald), Red Hat (journald), Solaris, Gentoo, Arch Linux (switched to journald). Most have deprecated rsyslog in favor of journald.
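To illustrate the earlier point that text logs stay searchable with simple tools, here is a minimal Python sketch of roughly what a `grep` plus `cut` pipeline would do. The sample lines, the host name, and the helper names `parse_line` and `grep_tag` are made up for illustration, and the layout assumed is the traditional "timestamp host tag[pid]: message" one; real files may differ.

```python
import re

# Hypothetical sample lines in the traditional syslog text layout.
SAMPLE_LOG = """\
Jun 24 14:47:01 box sshd[812]: Failed password for root from 10.0.0.5
Jun 24 14:47:09 box cron[901]: (root) CMD (run-parts /etc/cron.hourly)
Jun 24 14:48:12 box sshd[812]: Accepted publickey for venam from 10.0.0.7
"""

# timestamp, host, tag, optional [pid], then the free-form message.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ [\d:]{8}) (\S+) ([^\[:]+)(?:\[(\d+)\])?: (.*)$"
)


def parse_line(line):
    """Split one log line into fields, or return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    timestamp, host, tag, pid, message = m.groups()
    return {"timestamp": timestamp, "host": host, "tag": tag,
            "pid": pid, "message": message}


def grep_tag(text, tag):
    """Return the messages logged under one tag, skipping unparsable lines."""
    results = []
    for line in text.splitlines():
        entry = parse_line(line)
        if entry is not None and entry["tag"] == tag:
            results.append(entry["message"])
    return results
```

Calling `grep_tag(SAMPLE_LOG, "sshd")` keeps only the two sshd messages. None of this is possible on a binary log without first going through a tool that understands the format.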
The second most common implementation of syslog is syslog-ng. In theory, anything can act as a syslogging system. Any queuing daemon that takes a message, queues it, and pipes it on the other end to a file or system could be used for logging. There are implementations using logstash (with elasticsearch), fluentd, anything.

Programs send their log entries to the syslog daemon, which then consults a configuration file, traditionally `/etc/syslog.conf`, or something else depending on the implementation you have. It checks whether the message matches something in the configuration, and if it does it writes the log to the corresponding file. From that you can imagine syslog as a routing system for logs. It queues the message, looks for a key matching it, and redirects it to the right file.

How does it write the message to the file and distribute those logs? It does so using "tags", which are defined as rules in the configuration file. Inside it you will find a "term" (what the syslog documentation calls the facility), a "priority", and a "selector". The facility is the identifier describing the application, process, or subsystem that generated the log. Ex: auth, ftp, kern, mail. The priority is the importance of the message, graded by levels defined in the RFC. Ex: notice, warning, err. The selector is what filters the logs and splits them by file: it is the matcher, built from the facility and priority, and it is paired with an action. The action that goes with the selector can be almost anything, like sending an email or pushing the log to another logging system.

How do you send messages to syslog from the command line? The `logger` command nudges the syslog daemon and, consequently, triggers the creation of logs. We can use it to debug syslog and to check that the configuration is set right. The format is roughly `logger -p <facility.priority> -t <tag> <message>`; with `-f <file>` the message is read from a file instead of the command line. Example: `logger -p local0.notice -t host -f somefile`.
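To make the facility.priority pair more concrete, here is a small Python sketch of how the two are combined into the numeric PRI value that prefixes a syslog message (facility code times 8, plus severity code, as numbered in the syslog RFCs). Only a subset of facilities is listed, and the helper names `pri`, `frame`, and `send_with_stdlib` are hypothetical; `frame` leaves out the timestamp and hostname that a real message carries.

```python
# Facility and severity codes as numbered in the syslog RFCs (facilities: subset).
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4,
              "local0": 16, "local7": 23}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}


def pri(selector):
    """Turn 'facility.severity' (e.g. 'local0.notice') into the PRI number."""
    facility, severity = selector.split(".", 1)
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def frame(selector, tag, message):
    """Build a simplified '<PRI>tag: message' line (no timestamp or host)."""
    return f"<{pri(selector)}>{tag}: {message}"


def send_with_stdlib(message):
    """Rough equivalent of `logger -p local0.notice -t host <message>`.

    Uses Python's standard syslog module (Unix only); not called here
    because it needs a running syslog daemon to have any visible effect.
    """
    import syslog
    syslog.openlog(ident="host", facility=syslog.LOG_LOCAL0)
    syslog.syslog(syslog.LOG_NOTICE, message)
    syslog.closelog()
```

For `local0.notice` this gives 16 * 8 + 5 = 133, so `frame("local0.notice", "host", "hello")` produces `<133>host: hello`.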
With the `logger` example above, an entry is added to the default file if we don't have any rules for local0. Here local0 is the facility and notice is the priority; they are joined with a dot ".".

Where are the logs stored? We said they were centralized, but that is ambiguous because it depends on the implementation and on the configuration. Usually they go in `/var/log`, but that's only because the system is configured that way by default. The routing rules in the config could write the logs in any other place. That's the catch: we said it's centralized, but that doesn't mean the log files are centralized in a common place. The same flexibility is also useful if you want to write to subdirectories within a centralized directory.

What happens when the directory syslog routes to doesn't exist? Usually, depending on the implementation, the syslog daemon will create the directory, but it needs the right permissions to do so. If it doesn't have the right access permissions, syslog will log an error about itself. So the logging system has logs for itself too.

Let's dig into the configuration. The format is flexible and all about routing. For rsyslog we have `/etc/rsyslog.conf`, usually with drop-in files in `/etc/rsyslog.d/` (it depends on the implementation). A typical syslog configuration file looks like a series of lines, each of which describes a kind of message received and how it will be handled. The format is divided into columns separated by whitespace, traditionally tabs.

On the far left is the facility.priority pair, the selector. A `*` works as a catch-all, for example `*.debug` matches every facility at that priority. On the far right is the location the log will be pushed to. This field is also called the rule field, as the location is specified as an action. These actions can be: a device (file), a user, a pipe, or another syslog host (`@`). In between these two, there are other less used options such as the template and output channel.
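The routing described above, selector on the left and action on the right, can be sketched in a few lines of Python. This is a simplified model, not how any particular daemon is implemented: the rules below are hypothetical, and it assumes the classic behavior where a rule for a given priority also matches more severe messages, while ignoring extras such as `;`, `,`, `=`, `!`, templates, and output channels.

```python
# Severities ordered from most severe (index 0) to least severe.
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

# Hypothetical rules in the classic "selector <whitespace> action" layout.
RULES = """\
# selector        action
mail.*            /var/log/mail.log
local0.notice     /var/log/local0.log
*.err             @loghost.example
"""


def parse_rules(text):
    """Return (facility, severity, action) tuples, skipping comments and blanks."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        selector, action = line.split(None, 1)
        facility, severity = selector.split(".", 1)
        rules.append((facility, severity, action))
    return rules


def matches(rule_facility, rule_severity, facility, severity):
    """A rule matches its facility (or '*') at its severity or anything more severe."""
    if rule_facility != "*" and rule_facility != facility:
        return False
    if rule_severity == "*":
        return True
    return SEVERITIES.index(severity) <= SEVERITIES.index(rule_severity)


def route(rules, facility, severity):
    """Return every action whose selector matches the message."""
    return [action for f, s, action in rules
            if matches(f, s, facility, severity)]
```

With these rules, `route(parse_rules(RULES), "local0", "notice")` returns only `/var/log/local0.log`, while a `local0.err` message would go both to that file and to the remote host, since more than one rule can match the same message.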
The template is the format in which the log is going to be saved (kind of like printf with some built-in variables such as hostname, time, etc.).

If you want extra flexibility points, you will be happy to hear that some syslogd implementations listen on a network socket too. So you can contact them over the network, which gives you remote logging. That means you can have a centralized machine whose job is to log for all the other machines.

Another concept in the log world is what to do when the log files get too big, which we refer to as log rotation. Logs grow very fast and very big and consume a large amount of disk space. Many utilities will come to your help, such as newsyslog and logrotate. Those tools are usually called through a cronjob because you want to repeat the rotation at a regular interval: compressing the logs, erasing old ones, etc. This could obviously be implemented from scratch using `stat`.

Let's get to systemd, the fun... not so fun... replacement for syslog. It has its own logging system called the journal. Therefore, a syslog daemon is no longer required for logging or for reading the logs. The logs are now stored in a binary format and you need a special command called `journalctl` to access them.

By default those logs are also stored in `/var/log`, but inside a subdirectory called `journal`. Unlike syslog, it's not going to recreate the directory if you erase it. If configured in a non-persistent way, systemd will instead store them in `/run/log/journal`; if configured in a persistent way, it will recreate the directory. The journal stores the logs in a binary format, which is lighter, but you can't read them using the usual tools. You are forced to use a dedicated tool to read them. The journal configuration file lets you make the files either persistent or not.

Taking a look in `/var/log/journal`, you'll see a directory named after the machine ID (a 32-character hex string that looks like an md5sum), and inside it there's a single file or a set of rotated files for all software.
That means logrotate becomes meaningless for the journal, as the system manages itself based on the size limit set. This is configured in `/etc/systemd/journald.conf`. In it you'll find settings related to compression, splitting, the syncing interval (for when it will actually write to disk), maximum disk use, maximum runtime use, whether to forward to syslog, storage, the level of priority, etc.

Forwarding to syslog should save you if you want to run syslog alongside systemd. Depending on the systemd version and distribution this may be enabled by default, and if syslog isn't running the journal simply goes back to its default behavior.

Now, targeting and monitoring logs. If the logs are in text format you can use text manipulation tools to do that. Otherwise, you have to use the dedicated software given to you. For reporting, which is mostly missing in journald but not in syslog, you can use `logwatch` to monitor system logs and email you in case something weird is happening. `awstats` can be used to analyze Apache web server logs. Text files are very flexible!

I hope you learned a thing or two about the logging system.
Messages In This Thread
Logs in the Unix World - by venam - 24-06-2016, 02:47 PM
RE: Logs in the Unix World - by josuah - 03-07-2016, 08:34 AM
RE: Logs in the Unix World - by venam - 16-08-2016, 05:55 AM
RE: Logs in the Unix World - by venam - 16-02-2021, 04:21 PM
RE: Logs in the Unix World - by freem - 17-02-2021, 09:38 AM
RE: Logs in the Unix World - by venam - 17-02-2021, 10:29 AM
RE: Logs in the Unix World - by jkl - 17-02-2021, 03:30 PM