Like all good operating systems, UNIX allows you the privilege of storing information indefinitely (or at least until the next disk crash) in abstract data containers called files. The organisation, placement and usage of these files comes under the general umbrella of the file hierarchy. As a system administrator, you will need to be very familiar with the file hierarchy. You will use it on a day to day basis as you maintain the system, install software and manage user accounts.
At first glance, the file hierarchy structure of a typical Linux host (we will use Linux as the basis of our discussion) may appear to have been devised by a demented genius who'd been remiss with their medication. Why, for example, does the root directory contain something like:
bin   etc   lost+found  root  usr
boot  home  mnt         sbin  var
dev   lib   proc        tmp
Why was it done like this?
Historically, the location of certain files and utilities has not always been standard (or fixed). This has led to problems with development and upgrading between different "distributions" of Linux [Linux is distributed from many sources; two major sources are the Slackware and Red Hat package sets]. The Linux directory structure (or file hierarchy) was based on existing flavours of UNIX, but as it evolved, certain inconsistencies developed. These were often small things like the location (or placement) of certain configuration files, but they resulted in difficulties porting software from host to host.
To combat this, a file standard was developed. This is an evolving process, to date resulting in a fairly static model for the Linux file hierarchy. In this chapter, we will examine how the Linux file hierarchy is structured, how each component relates to the overall OS and why certain files are placed in certain locations.
Linux File System Standard
The location and purposes of files and directories on a Linux machine are defined by the Linux File Hierarchy Standard. The Resource Materials section of the 85321 Web site contains a pointer to it.
The top level of the Linux file hierarchy is referred to as the root (or /). The root directory typically contains several other directories including:
| Directory | Contains |
|---|---|
| bin/ | Required boot-time binaries |
| boot/ | Boot configuration files for the OS loader and kernel image |
| dev/ | Device files |
| etc/ | System configuration files and scripts |
| home/ | User/Sub branch directories |
| lib/ | Main OS shared libraries and kernel modules |
| lost+found/ | Storage directory for "recovered" files |
| mnt/ | Temporary point to connect devices to |
| proc/ | Pseudo directory structure containing information about the kernel, currently running processes and resource allocation |
| root/ | Linux (non-standard) home directory for the root user. Alternate location being the / directory itself |
| sbin/ | System administration binaries and tools |
| tmp/ | Location of temporary files |
| usr/ | Difficult to define - it contains almost everything else including local binaries, libraries, applications and packages (including X Windows) |
| var/ | Variable data, usually machine specific. Includes spool directories for mail and news |

Table 4.1: Major Directories
Generally, the root directory should not contain anything else: it is considered bad form to create additional directories off the root or to place any other files there.
The name “root” is based on the analogous relationship between the UNIX file system structure and a tree! Quite simply, the file hierarchy is an inverted tree.
I can personally never visualise an upside down tree – what this phrase really means is that the “top” of the file hierarchy is at one point, like the root of a tree, while the bottom is spread out, like the branches of a tree. This is probably a silly analogy because if you turn a tree upside down, you have lots of spreading roots, dirt and several thousand very unhappy worms!
Every part of the file system eventually can be traced back to one central point, the root. The concept of a “root” structure has now been (partially) adopted by other operating systems such as Windows NT. However, unlike other operating systems, UNIX doesn't have any concept of “drives”. While this will be explained in detail in a later chapter, it is important to be aware of the following:
The file system may be spread over several physical devices; different parts of the file hierarchy may exist on totally separate partitions, hard disks, CD-ROMs, network file system shares, floppy disks and other devices.
This separation is transparent to the file system hierarchy, user and applications.
Different “parts” of the file system will be “connected” (or mounted) at startup; other parts will be dynamically attached as required.
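The points above can be explored with a minimal Python sketch. It assumes a POSIX host and uses a helper name of our own choosing (find_mount_point); the example paths are illustrative only. For any path, it walks up the hierarchy until it reaches the directory at which a file system was attached, which shows that several devices can sit behind a single tree:

```python
import os


def find_mount_point(path):
    """Walk up from path until a directory that is a mount point is reached."""
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


if __name__ == "__main__":
    # Example paths only; any path on the host would do.
    for example in ("/", "/home", "/usr/local", "/tmp"):
        print(example, "is stored on the file system mounted at",
              find_mount_point(example))
```

On a host with everything on one partition, every path reports /; on a host with a separate /home partition, /home reports itself.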
The remainder of this chapter examines some of the more important directory structures in the Linux file hierarchy.
The /home directory structure contains the home directories for most login-enabled users (some notable exceptions being the root user and, on some systems, the www/web user). While most small systems will contain user directories directly off the /home directory (for example, /home/jamiesob), on larger systems it is common to subdivide the home structure based on classes (or groups) of users, for example:
/home/admin      # Administrators
/home/finance    # Finance users
/home/humanres   # Human Resource users
/home/mgr        # Managers
/home/staff      # Other people
/root is the home directory for the root user. If, for some strange reason, the /root directory doesn't exist, then the root user will be logged in in the / directory - this is actually the traditional location for root users.
There is some debate as to allowing the root user to have a special directory as their login point - this idea encourages the root user to set up their .profile, use "user" programs like elm, tin and netscape (programs which require a home directory in which to place certain configuration files) and generally use the root account as a beefed up user account. A system administrator should never use the root account for day to day user-type interaction; the root account should be used for system administration purposes only.
Be aware that you must be extremely careful when allowing a user to have a home directory in a location other than the /home branch. The problem occurs when you, as a system administrator, have to back up the system - it is easy to miss a home directory if it isn't grouped with others in a common branch (like /home).
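One way to avoid missing such directories is to list them before planning a backup. The following is a rough Python sketch, not a definitive tool: the function name homes_outside_branch is our own, and it assumes the standard pwd module is available (as it is on UNIX hosts).

```python
import pwd


def homes_outside_branch(entries, branch="/home"):
    """Return (user, home) pairs whose home directory is not under branch."""
    prefix = branch.rstrip("/") + "/"
    return [(user, home) for user, home in entries
            if not home.startswith(prefix)]


if __name__ == "__main__":
    # pwd.getpwall() returns every account known to the system.
    accounts = [(entry.pw_name, entry.pw_dir) for entry in pwd.getpwall()]
    for user, home in homes_outside_branch(accounts):
        print(user, home)
```

Expect the output to include root and a number of system accounts as well as any real users; it is the real users in the list that need to be added to the backup plan.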
It is often slightly confusing to see that /usr and /var both contain similar directories:
/usr
X11R6  games             libexec  src
bin    i486-linux-libc5  local    tmp
dict   include           man
doc    info              sbin
etc    lib               share
/var
catman  local  log  preserve  spool
lib     lock   nis  run       tmp
It becomes even more confusing when you start examining the maze of links which intermingle the two major branches.
Links are a way of referencing a file or directory by many names and from many locations within the file hierarchy. They are effectively "pointers" to files - think of them as leaving a post-it note saying "see this file". Links will be explained in greater detail in the next chapter.
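As a small illustration of the "post-it note" idea, the sketch below creates a file and a symbolic link to it inside a temporary directory, then reads the file through the link. It only shows one kind of link, the file names are made up for the example, and it assumes a host where os.symlink is permitted:

```python
import os
import tempfile


def demonstrate_link():
    """Create a file and a symbolic link to it, then read the file via the link."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "original.txt")   # hypothetical file name
        link = os.path.join(workdir, "pointer.txt")      # hypothetical link name
        with open(target, "w") as handle:
            handle.write("see this file")
        os.symlink(target, link)          # the link now "points" at target
        with open(link) as handle:        # opening the link opens the target
            content = handle.read()
        return os.path.islink(link), os.readlink(link) == target, content


if __name__ == "__main__":
    print(demonstrate_link())
```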
To put it simply, /var is for VARiable data/files. /usr is for USeR accessible data, programs and libraries. Unfortunately, history has confused things - files which should have been placed in the /usr branch have been located in the /var branch and vice versa. Thus to "correct" things, a series of links have been put in place. Why the separation? Does it matter? The answer is: Yes, but No :)
Yes in the sense that the file standard dictates that the /usr branch should be able to be mounted (another way of saying "attached" to the file hierarchy - this will be covered in the next chapter) READ ONLY (thus can't contain variable data). The reasons for this are historical and came about because of something called NFS exporting.
NFS exporting is the process of one machine (a server) "exporting" its copy of the /usr structure (and others) to the network for other systems to use.
If several systems were "sharing" the same /usr structure, it would not be a good idea for them all to be writing logs and variable data to the same area! It is also used because minimal installations of Linux can use the /usr branch directly from the CDROM (a read-only device).
However, it is "No" in the sense that:
§ /usr is usually mounted READ-WRITE-EXECUTE on Linux systems anyway
§ In the author's experience, exporting /usr READ-ONLY via NFS isn't entirely successful without making some very non-standard modifications to the file hierarchy!
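To see how a particular host has actually mounted these branches, a short Python sketch along the following lines may help. It assumes a UNIX host where os.statvfs and os.ST_RDONLY are available; the function name is_mounted_read_only is our own.

```python
import os


def is_mounted_read_only(path):
    """Return True if the file system holding path is mounted read-only."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)


if __name__ == "__main__":
    for branch in ("/usr", "/var"):
        if os.path.exists(branch):
            state = "read-only" if is_mounted_read_only(branch) else "read-write"
            print(branch, "is mounted", state)
```

The check reports on the file system that holds the path, so if /usr is not a separate mount it simply reflects how the root file system is mounted.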
The following are a few highlights of the /var and /usr directory branches:
All software that is installed on a system after the operating system package itself should be placed in the /usr/local directory. Binary files should be located in /usr/local/bin (generally /usr/local/bin should be included in a user's PATH setting). Placing all installed software in this branch makes backups and upgrades of the system far easier - the system administrator can back up and restore the entire /usr/local branch with more ease than backing up and restoring software packages from multiple branches (such as /usr/src and /usr/bin).
An example of a /usr/local directory is listed below:
bin   games  lib       rsynth
cern  man    sbin      volume-1.11
info  mpeg   speak     www
etc   java   netscape  src
As you can see, there are a few standard directories (bin, lib and src) as well as some that contain installed programs.
Linux is a very popular platform for C/C++, Java and Perl program development. As we will discuss in later chapters, Linux also allows the system administrator to actually modify and recompile the kernel. Because of this, compilers, libraries and source directories are treated as "core" elements of the file hierarchy structure.
The /usr structure plays host to three important directories:
/usr/include holds most of the standard C/C++ header files - this directory will be referred to as the primary include directory in most Makefiles.
Makefiles are special script-like files that are processed by the make program for the purposes of compiling, linking and building programs.
/usr/lib holds most static libraries as well as hosting subdirectories containing libraries for other (non C/C++) languages including Perl and TCL. It also plays host to configuration information for ldconfig.
/usr/src holds the source files for most packages installed on the system. This is traditionally the location for the Linux source directory (/usr/src/linux), for example:
linux  linux-2.0.31  redhat
Unlike DOS/Windows based systems, most Linux programs come as source and are compiled and installed locally.
The /var/spool directory has the potential for causing a system administrator a bit of trouble as it is used to store (possibly) large volumes of temporary files associated with printing, mail and news. /var/spool may contain something like:
at    lp    lpd   mqueue  samba  uucppublic
cron  mail  rwho  uucp
In this case, there is a printer spool directory called lp (used for storing print requests for the printer lp) and a /var/spool/mail directory that contains files for each user’s incoming mail.
Keep an eye on the space consumed by the files and directories found in /var/spool. If a device (like the printer) isn't working or a large volume of e-mail has been sent to the system, then much of the hard drive space can be quickly consumed by files stored in this location.
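A rough way of keeping that eye on things is sketched below in Python. The function names are our own, and the figures are the sum of file sizes rather than the disk blocks actually allocated, so treat the output as an approximation. Directories the current user cannot read are skipped silently.

```python
import os


def directory_usage(path):
    """Return the total size in bytes of all regular files beneath path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def usage_by_subdirectory(path):
    """Map each immediate subdirectory of path to the bytes stored beneath it."""
    usage = {}
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            usage[entry] = directory_usage(full)
    return usage


if __name__ == "__main__":
    spool = "/var/spool"
    if os.path.isdir(spool):
        for name, size in usage_by_subdirectory(spool).items():
            print(f"{size:>12} bytes  {spool}/{name}")
```

Run as root, this gives one line per spool subdirectory (mail, lpd and so on), which makes a runaway queue easy to spot.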
X Windows provides UNIX with a very flexible graphical user interface. Tracing the X Windows file hierarchy can be very tedious, especially when you are trying to locate a particular configuration file or trying to remove a stale lock file.
A lock file is used to stop more than one instance of a program executing at once. A stale lock is a lock file that was not removed when a program terminated, thus stopping the same program from starting again.
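The idea of a stale lock can be sketched in Python under one loud assumption: that the lock file holds the process ID of the program that created it, written as text. Many programs follow this convention but not all do, so check before relying on it. The function name is_stale_lock and the example path are illustrative only.

```python
import os


def is_stale_lock(lock_path):
    """Return True if the PID recorded in lock_path names no running process.

    Assumes (hypothetically) that the lock file contains a PID as text.
    """
    with open(lock_path) as handle:
        pid = int(handle.read().strip())
    try:
        os.kill(pid, 0)        # signal 0 only checks that the process exists
    except ProcessLookupError:
        return True            # no such process: the lock was left behind
    except PermissionError:
        return False           # process exists but belongs to another user
    return False


if __name__ == "__main__":
    example = "/tmp/.X0-lock"  # example path only; may not exist on your host
    if os.path.exists(example):
        print(example, "stale" if is_stale_lock(example) else "in use")
```

A lock reported as stale can usually be removed so that the program will start again, but confirm that the program really is not running first.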
Most of X Windows is located in the /usr structure, with some references made to it in the /var structure.
Typically, most of the action is in the /usr/X11R6 directory (this is usually an alias or link to another directory depending on the release of X11 - the X Windows manager). This will contain:
bin  doc  include  lib  man
The main X Windows binaries are located in /usr/X11R6/bin. This may be accessed via an alias of /usr/bin/X11.
Configuration files for X Windows are located in /usr/X11R6/lib. To really confuse things, the X Windows configuration utility, xf86config, is located in /usr/X11R6/bin, while the configuration file it produces is located in /etc/X11 (XF86Config)!
Because of this, it is often very difficult to get an "overall picture" of how X Windows is working - my best advice is read up on it before you start modifying (or developing with) it.
A very common mistake amongst first time UNIX users is to incorrectly assume that all "bin" directories contain temporary files or files marked for deletion. This misunderstanding comes about because:
§ People associate the word "bin" with rubbish
§ Some unfortunate GUI based operating systems use little icons of "trash cans" for the purposes of storing deleted/temporary files.
However, bin is short for binary - binary or executable files. There are four major bin directories (none of which should be used for storing junk files :)
§ /bin
§ /sbin
§ /usr/bin
§ /usr/local/bin
Why so many?
All of the bin directories serve similar but distinct purposes; the division of binary files serves several purposes including ease of backups, administration and logical separation. Note that while most binaries on Linux systems are found in one of these four directories, not all are.
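Whether a given account can actually run programs from these directories depends on its PATH setting. The following Python sketch, with names of our own choosing, reports which of the four major bin directories appear in a PATH string:

```python
import os

# The four major bin directories named in the text.
MAJOR_BIN_DIRS = ("/bin", "/sbin", "/usr/bin", "/usr/local/bin")


def bin_dirs_in_path(path_value):
    """Report, for each major bin directory, whether it appears in a PATH string."""
    entries = [entry.rstrip("/")
               for entry in path_value.split(os.pathsep) if entry]
    return {directory: directory in entries for directory in MAJOR_BIN_DIRS}


if __name__ == "__main__":
    for directory, present in bin_dirs_in_path(os.environ.get("PATH", "")).items():
        print(directory, "in PATH" if present else "NOT in PATH")
```

Running it as a normal user and again as root is a quick way to compare the two PATH settings.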
The /bin directory must be present for the OS to boot. It contains utilities used during startup; a typical listing would look something like:
Mail      df             gzip      mount      stty
arch      dialog         head      mt         su
ash       dircolors      hostname  mt-GNU     sync
bash      dmesg          ipmask    mv         tar
cat       dnsdomainname  kill      netstat    tcsh
chgrp     domainname     killall   ping       telnet
chmod     domainname-yp  ln        ps         touch
chown     du             login     pwd        true
compress  echo           ls        red        ttysnoops
cp        ed             mail      rm         umount
cpio      false          mailx     rmdir      umssync
csh       free           mkdir     setserial  uname
cut       ftp            mkfifo    setterm    zcat
date      getoptprog     mknod     sh         zsh
dd        gunzip         more      sln
Note that this directory contains the shells and some basic file and text utilities (ls, pwd, cut, head, tail, ed etc). Ideally, the /bin directory will contain as few files as possible as this makes it easier to take a direct copy for recovery boot/root disks.
/sbin is literally "System Binaries". This directory contains files that should generally only be used by the root user, though the Linux file standard dictates that no access restrictions should be placed on normal users to these files. It should be noted that the PATH setting for the root user includes /sbin, while it is (by default) not included in the PATH of normal users.
The /sbin directory should contain essential system administration scripts and programs, including those concerned with user management, disk administration, system event control (restart and shutdown programs) and certain networking programs.
As a general rule, if users need to run a program, then it should not be located in /sbin. A typical directory listing of /sbin looks like:
adduser     ifconfig          mkfs.minix       rmmod
agetty      init              mklost+found     rmt
arp         insmod            mkswap           rootflags
badblocks   installpkg        mkxfs            route
bdflush     kbdrate           modprobe         runlevel
chattr      killall5          mount            setup
clock       ksyms             netconfig        setup.tty
debugfs     ldconfig          netconfig.color  shutdown
depmod      lilo              netconfig.tty    swapdev
dosfsck     liloconfig        pidof            swapoff
dumpe2fs    liloconfig-color  pkgtool          swapon
e2fsck      lsattr            pkgtool.tty      telinit
explodepkg  lsmod             plipconfig       tune2fs
fdisk       makebootdisk      ramsize          umount
fsck        makepkg           rarp             update
fsck.minix  mkdosfs           rdev             vidmode
genksyms    mke2fs            reboot           xfsck
halt        mkfs              removepkg
The very important ldconfig program is also located in /sbin. While not commonly used from the shell prompt, ldconfig is an essential program for the management of dynamic libraries (it is usually executed at boot time). It will often have to be manually run after library (and system) upgrades.
You should also be aware of:
/usr/sbin - used for non-essential admin tools.
/usr/local/sbin - locally installed admin tools.
The /usr/bin directory contains most of the user binaries - in other words, programs that users will run. It includes standard user applications including editors and email clients as well as compilers, games and various network applications.
A listing of this directory will contain some 400 odd files. Users should definitely have /usr/bin in their PATH setting.
To this point, we have examined directories that contain programs that are (in general) part of the actual operating system package. Programs that are installed by the system administrator after that point should be placed in /usr/local/bin. The main reason for doing this is to make it easier to back up installed programs during a system upgrade, or in the worst case, to restore a system after a crash.
The /usr/local/bin directory should only contain binaries and scripts - it should not contain subdirectories or configuration files.
/etc is one place where the root user will spend a lot of time. It is not only the home to the all important passwd file, but contains just about every configuration file for a system (including those for networking, X Windows and the file system).
The /etc branch also contains the skel, X11 and rc.d directories.
/etc/skel contains the skeleton user files that are placed in a user's directory when their account is created.
/etc/X11 contains configuration files for X Windows.
/etc/rc.d contains the rc directories - each directory is named rcn.d (where n is the run level) and may contain multiple files that will be executed at that particular run level. A sample listing of an /etc/rc.d directory looks something like:
init.d  rc.local    rc0.d  rc2.d  rc4.d  rc6.d
rc      rc.sysinit  rc1.d  rc3.d  rc5.d
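The rcn.d naming scheme is simple enough to sketch in Python. The function names below are our own, the base directory defaults to the /etc/rc.d layout shown above (other systems may place these directories elsewhere), and the files are sorted by name purely for readability:

```python
import os


def rc_directory(run_level, base="/etc/rc.d"):
    """Return the directory holding the files for a run level, e.g. rc3.d."""
    return os.path.join(base, f"rc{run_level}.d")


def files_for_run_level(run_level, base="/etc/rc.d"):
    """List the files found in the run level's directory, or [] if it is absent."""
    directory = rc_directory(run_level, base)
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    for level in range(7):
        print(rc_directory(level), files_for_run_level(level))
```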
Linux maintains a particular area in which to place logs (or files which contain records of events). This directory is /var/log.
This directory usually contains:
cron    lastlog    maillog.2   samba-log.  secure.2     uucp
cron.1  log.nmb    messages    samba.1     sendmail.st  wtmp
cron.2  log.smb    messages.1  samba.2     spooler      xferlog
dmesg   maillog    messages.2  secure      spooler.1    xferlog.1
httpd   maillog.1  samba       secure.1    spooler.2    xferlog.2
The /proc directory hierarchy contains files associated with the executing kernel. The files contained in this structure contain information about the state of the system's resource usage (how much memory, swap space and CPU is being used), information about each process and various other useful pieces of information. We will examine this directory structure in more depth in later chapters.
The /proc file system is the main source of information for a program called top. This is a very useful administration tool as it displays a "live" readout of the CPU and memory resources being used by each process on the system.
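Because the files under /proc are plain text, they can be read by ordinary programs as well as by top. The Python sketch below parses /proc/meminfo, assuming the "Name: value" line layout used by current Linux kernels (where most values are in kB); lines that do not fit that layout are skipped. The function name parse_meminfo is our own.

```python
import os


def parse_meminfo(text):
    """Turn the 'Name:   value kB' lines of /proc/meminfo into a dict of integers."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as handle:
            info = parse_meminfo(handle.read())
        for key in ("MemTotal", "MemFree", "SwapTotal", "SwapFree"):
            if key in info:
                print(key, info[key])
```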
We will be discussing /dev in detail in the next chapter. However, for the time being, you should be aware that this directory is the primary location for special files called device files.
Because Linux is a dynamic OS, there will no doubt be changes to its file system as well. Two current issues that face Linux are:
§ Porting Linux on to many architectures requires a common location for hardware independent data files and scripts - the current location is /usr/share - this may change.
§ The location of third-party commercial software on Linux systems - as Linux's popularity increases, more software developers will produce commercial software to install on Linux systems. For this to happen, a location in which this can be installed must be provided and enforced within the file system standard. Currently, /opt is the likely option.
Because of this, it is advisable to obtain and read the latest copy of the file system standard so as to be aware of the current issues. Other information sources are easily obtainable by searching the web.
You should also be aware that while (in general), the UNIX file hierarchy looks similar from version to version, it contains differences based on requirements and the history of the development of the operating system implementation.
Where are man pages kept? Explain the format of the man page directories. (Hint: I didn't explain this anywhere in this chapter - you may have to do some looking)
As a system administrator, you are going to install the following programs. In each case, state the likely location of each package:
§ Java compiler and libraries
§ DOOM (a loud, violent but extremely entertaining game)
§ A network sniffer (for use by the sys admin only)
§ A new kernel source
§ An X Windows manager binary specially optimised for your new monitor