Linux File Descriptors [In-Depth Tutorial]

In this tutorial we will explain everything you need to know about Linux File Descriptors.

In Linux, file descriptors are a fundamental concept used to represent and manage input and output streams between processes and files, sockets, pipes, and other sources or destinations of data. A file descriptor is a non-negative integer that serves as a handle for an open file or communication channel; its value is unique within the process that holds it.

The purpose of file descriptors is to provide a way for processes to read from and write to files and other input/output devices without needing to know the underlying implementation details of these devices. Instead, they can simply use the file descriptor to interact with the device in a standard way, regardless of its specific implementation.

File descriptors are a key building block for many core Linux utilities and commands, including shell scripts, system daemons, and user-level applications. Understanding how file descriptors work is therefore essential for anyone working with Linux systems and software development.

To check the file descriptors used by a running process in Linux, one can use the lsof (list open files) command. This command shows all open files and their corresponding file descriptors used by the specified process. Additionally, the /proc file system provides a mechanism for inspecting the file descriptors used by a particular process by looking at the contents of the /proc/[pid]/fd directory.

What are File Descriptors in Linux?

In Linux, file descriptors are a mechanism used to represent open files, sockets, pipes, and other input and output streams. A file descriptor is a non-negative integer value that serves as a unique identifier for a file or input/output channel that a process has opened.

File descriptors are used by processes to read data from and write data to files and other input/output devices. They are a fundamental concept in the Linux operating system, as they provide a standardized way for processes to interact with input/output devices, regardless of their implementation.

The three standard file descriptors in Linux are:

Standard Input (stdin): This is the file descriptor that represents the standard input stream, which is typically connected to the keyboard or terminal.
Standard Output (stdout): This is the file descriptor that represents the standard output stream, which is typically connected to the display or terminal.
Standard Error (stderr): This is the file descriptor that represents the standard error stream, which is used for error messages and diagnostic output.

File descriptors are also used for inter-process communication (IPC), where one process can pass a file descriptor to another process to share access to an open file or input/output channel.

Different types of File Descriptors

In Linux, three standard file descriptors are used by almost every program and command: Standard Input (stdin), Standard Output (stdout), and Standard Error (stderr). They are ordinary file descriptors that are reserved for these roles by convention.

Standard Input (stdin): This is a file descriptor that represents the standard input stream, which is used to read input data from the user or from another program. By default, stdin is connected to the keyboard, but it can be redirected to read input from a file or another program.
Standard Output (stdout): This is a file descriptor that represents the standard output stream, which is used to write output data to the user or to another program. By default, stdout is connected to the display or terminal, but it can be redirected to write output to a file or another program.
Standard Error (stderr): This is a file descriptor that represents the standard error stream, which is used to write error messages and diagnostic output. By default, stderr is also connected to the display or terminal, but it can be redirected to write error messages to a file or another program.

In Unix-like systems, these three file descriptors are commonly referred to as "the standard streams" or "the standard file descriptors." They are numbered 0 (stdin), 1 (stdout), and 2 (stderr). A new process normally inherits them from its parent, such as the shell that started it.

By redirecting these standard streams, programs can control where input, output, and error messages are sent, which allows for greater flexibility and automation in system administration and software development.
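As a small illustration of redirection, the sketch below sends the output and error streams of one command to separate files, then to a single file, and finally reads input from a file. The helper name emit and the temporary directory are made up for this example.

```shell
# Hypothetical helper: writes one line to stdout (FD 1) and one to stderr (FD 2).
emit() {
    echo "normal output"
    echo "error output" >&2
}

tmpdir=$(mktemp -d)

# Send FD 1 to out.log and FD 2 to err.log.
emit >"$tmpdir/out.log" 2>"$tmpdir/err.log"

# Send both streams to one file: FD 2 becomes a duplicate of FD 1.
emit >"$tmpdir/both.log" 2>&1

# Feed stdin (FD 0) from a file instead of the terminal.
read -r first_line <"$tmpdir/both.log"
echo "first line of both.log: $first_line"
```

The order of the redirections matters in the second call: 2>&1 has to come after the redirection of FD 1, otherwise stderr would still point at the terminal.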

System Calls for File Descriptors

In Linux, file descriptors are managed using system calls. Here are some of the most commonly used system calls for working with file descriptors:

open(): This system call is used to open a file, or to create it first if it does not exist and the O_CREAT flag is given. It takes a path name and a set of flags as arguments (plus a permission mode when a file may be created) and returns a file descriptor if the operation is successful, or -1 if it fails. The flags determine whether the file should be opened for reading, writing, or both, and whether it should be created if it does not already exist.
read(): This system call is used to read data from a file or other input/output channel. It takes a file descriptor, a buffer, and a number of bytes as arguments, and returns the number of bytes that were actually read. If the end of the file is reached, the return value will be 0.
write(): This system call is used to write data to a file or other input/output channel. It takes a file descriptor, a buffer, and a number of bytes as arguments, and returns the number of bytes that were actually written.
close(): This system call is used to close a file or input/output channel. It takes a file descriptor as an argument and returns 0 if the operation is successful.
dup(): This system call is used to duplicate a file descriptor. It takes a file descriptor as an argument and returns a new file descriptor that refers to the same open file or input/output channel.
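dup2(): This system call is used to duplicate a file descriptor onto a specific descriptor number. It takes two file descriptors as arguments, the existing one and the desired new one. If the new descriptor is already open, it is closed first; afterwards it refers to the same open file or input/output channel as the existing one. It returns the new file descriptor if the operation is successful.
fcntl(): This system call is used to perform various operations on a file descriptor, such as getting or setting file descriptor flags (for example FD_CLOEXEC), getting or setting file status flags (for example O_NONBLOCK), duplicating the descriptor, or managing file locks.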

These system calls provide a powerful and flexible interface for working with file descriptors in Linux. By using them, programmers can open, read from, write to, duplicate, and close file descriptors, as well as perform a wide range of other operations on them.
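The calls above are normally made from C, but a shell's redirection operators give a rough view of the same operations. In the sketch below, the descriptor numbers 7, 8, and 9 and the temporary file are arbitrary choices for illustration, and the mapping to system calls in the comments is approximate.

```shell
tmpfile=$(mktemp)

# Roughly open(): open tmpfile for writing and bind it to FD 7 in this shell.
exec 7>"$tmpfile"

# Roughly dup2(): make FD 8 refer to the same open file as FD 7.
exec 8>&7

# write(): both descriptors share one file offset, so the lines land in order.
echo "written via fd 7" >&7
echo "written via fd 8" >&8

# close(): release both descriptors.
exec 7>&- 8>&-

# open() and read(): reopen the file read-only on FD 9 and read its first line.
exec 9<"$tmpfile"
read -r first_line <&9
exec 9<&-

echo "$first_line"
```

Because FD 8 is a duplicate of FD 7, the second write continues where the first one stopped instead of overwriting it.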

How to check the number of FDs in use

1. Using the lsof command

The lsof (List Open Files) command can be used to display a list of all open files and their corresponding FDs for a specific process or for the entire system.

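Counting the lines of its output gives a rough upper bound on the number of open files in the system. The figure also includes the header line and entries that are not file descriptors, such as current directories, program text, and memory-mapped files: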

$ lsof | wc -l
124939

To get a similar count for a specific process, you can use the command lsof -p <pid> | wc -l, where <pid> is the process ID. This count also includes the header line and the non-descriptor entries of that process.

# lsof -p 1374 2>/dev/null | wc -l
457

2. Using the /proc file system

The /proc file system contains a variety of system and process information, including the open file descriptors of each process. To count the open FDs of a specific process, list its /proc/<pid>/fd directory, for example with ls /proc/<pid>/fd | wc -l. To see how many FDs the process is allowed to open, read the "Max open files" row of /proc/<pid>/limits, where <pid> is the process ID:

# cat /proc/1374/limits | grep "Max open files"
Max open files            1048576              1048576              files

The output shows that the soft and hard limits on open files for this process are both 1048576, so the process can have up to 1048576 files or input/output channels open at the same time. The last column gives the unit of measurement, in this case "files".

The first value after the limit name is the "soft" limit and the next one is the "hard" limit. The soft limit is the limit currently enforced for the process; the process can raise it up to the hard limit if needed. The hard limit is the ceiling for the soft limit, and an unprivileged process cannot raise it.
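The two /proc entries can be combined to compare what a process has open with what it is allowed to open. This is a minimal sketch: count_fds and soft_nofile_limit are hypothetical helper names, and reading another user's /proc/<pid>/fd usually requires root.

```shell
# Hypothetical helpers; they only read /proc/<pid>, which is Linux-specific.

# Print the number of descriptors a process currently has open.
count_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Print the soft "Max open files" limit of a process (4th field of that row).
soft_nofile_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

pid=$$   # the current shell; replace with any PID you are allowed to inspect
echo "open: $(count_fds "$pid"), soft limit: $(soft_nofile_limit "$pid")"
```

Unlike the lsof count, the number of entries in /proc/<pid>/fd covers only real file descriptors.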

3. Using the /proc/sys/fs/file-nr file

The /proc/sys/fs/file-nr file contains three numbers: the number of allocated file handles, the number of allocated but unused file handles, and the system-wide maximum number of file handles. To display these values and the number of file handles in use, you can use the commands:

# cat /proc/sys/fs/file-nr
7136    0   9223372036854775807

# cat /proc/sys/fs/file-nr | awk '{print $1 - $2}'
7328

The command cat /proc/sys/fs/file-nr displays the kernel's current usage statistics for file handles, the kernel objects behind open file descriptors. The output consists of three columns separated by tabs:

The first column shows the total number of file handles currently allocated in the system.
The second column shows the number of allocated file handles that are currently unused.
The third column shows the maximum number of file handles that can be allocated for the system.

In the output above, the first column shows that 7136 file handles are allocated. The second column shows 0, meaning that none of the allocated handles are unused. The third column shows that the maximum number of file handles is 9223372036854775807, which is the maximum value of a signed 64-bit integer. The awk command subtracts the second column from the first to get the number of handles in use; its result differs from 7136 here, presumably because the counters had changed by the time the second command ran.
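The three columns can also be read into named variables, which avoids having to remember the column order. A minimal sketch; the variable names are chosen for this example.

```shell
# Read the three fields of file-nr into named variables.
read -r allocated unused max_handles </proc/sys/fs/file-nr

# Handles in use are the allocated ones minus the allocated-but-unused ones.
in_use=$((allocated - unused))

echo "allocated=$allocated unused=$unused max=$max_handles in_use=$in_use"
```

Reading the file once keeps the three values consistent with each other, which two separate cat invocations do not guarantee.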

Check and Modify File Descriptor Limit

In Linux, there are limits on the number of file descriptors that can be open at any given time. These limits are set by the kernel and can vary depending on the system configuration. By default, the maximum number of open file descriptors per process is often set to 1024.

Method-1: Using the ulimit command (non-persistent)

The ulimit command can be used to set or display the file descriptor limit for the current shell session.

To check the current limit on the number of file descriptors, you can use the ulimit command with the -n option. For example, the command "ulimit -n" displays the soft limit on open file descriptors for the current shell and the processes it starts.

To adjust the limit, you can use the ulimit command with the -n option and a new value. For example, the command "ulimit -n 2048" sets the limit on open file descriptors for the current shell session to 2048.

An unprivileged user can lower the limits and can raise the soft limit up to the hard limit. Raising the hard limit requires root privileges.

Note that a change made with ulimit applies only to the current shell session and is lost when the session ends. To change limits persistently, you need to modify the system configuration files: /etc/security/limits.conf for per-user limits, or /etc/sysctl.conf for the system-wide maximum.
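The session-only scope of ulimit can be observed by changing the limit in a subshell and checking that the parent shell is unaffected. This is a sketch that assumes the hard limit is at least 64, which is an arbitrary value picked for the example.

```shell
parent_before=$(ulimit -Sn)

# A command substitution runs in a subshell, so the change stays local to it.
# Lowering the soft limit needs no special privileges.
child_soft=$(ulimit -Sn 64 && ulimit -Sn)

parent_after=$(ulimit -Sn)

echo "subshell soft limit: $child_soft"
echo "parent soft limit:   $parent_before (before), $parent_after (after)"
```

The -S option restricts the change to the soft limit, so the hard limit stays available as the ceiling for later adjustments.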

Method-2: Modifying the limits.conf file

In Linux operating systems, there is a limit (file descriptor limit) on the number of files each process of a user can have open at the same time. This limit is usually configured in the /etc/security/limits.conf file or in other files under the /etc/security/limits.d/ directory. In these files, you can set individual limit values for each user.

$ sudo vi /etc/security/limits.conf

<username>  hard  nofile  <limit>
foc         hard   nofile  1200
foc         soft   nofile  1000

Here we set the file descriptor limit of user <username>; the example lines do so for the user foc. The "soft" value is the limit that applies by default, and the "hard" value is the ceiling up to which the user may raise the soft limit. The "nofile" item stands for the maximum number of open file descriptors, and <limit> is the specified limit value. The new limits take effect the next time the user logs in.

Method-3: Modifying the sysctl.conf file

The sysctl.conf file allows administrators to modify various kernel parameters, including fs.file-max, the maximum number of file handles that the whole system can allocate. To set a new system-wide limit, open the file in an editor:

$ sudo vi /etc/sysctl.conf

Add the following line, then save the file and exit:

fs.file-max = 100000

Then apply the change (a reboot is not required):

$ sudo sysctl --system

Then you can verify the new FD limit with the following command:

$ cat /proc/sys/fs/file-max
100000

Method-4: Modifying the systemd configuration

If your system uses systemd, you can modify the file descriptor limit by creating a new service configuration file. For example, to set the maximum number of open file descriptors to 5000 for a specific service, you can create a new file /etc/systemd/system/myservice.service with the following content:

[Unit]
Description=My Service

[Service]
LimitNOFILE=5000
ExecStart=/path/to/myservice

[Install]
WantedBy=multi-user.target

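This sets the maximum number of open file descriptors to 5000 for the "myservice" service. After creating or changing the unit file, run systemctl daemon-reload and restart the service so that the new limit takes effect.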

Show user's open file limit

To show a user's open file limit in Linux, you can use the ulimit command with the -n option. The ulimit command is used to set or display resource limits, including the maximum number of open file descriptors. Here's how to use it:

Open a terminal window and type the following command and press Enter:

ulimit -n

This command will display the current file descriptor limit for the current shell session.

To check the file descriptor limit for a specific user, you can run the command as that user with the su command. For example, if you want to check the file descriptor limit for the user "john", you can type the following command and press Enter:

su - john -c 'ulimit -n'

This command will switch to the user "john" and display their current file descriptor limit.

To get the hard value, you can use:

su - john -c 'ulimit -Hn'
524288

The soft value is:

su - john -c 'ulimit -Sn'
1024

Description of parameters:

-S use the `soft' resource limit
-H use the `hard' resource limit
-n the maximum number of open file descriptors

Number of Open Files vs Open FDs. Are these the same?

In Linux, a file is represented by an inode, which contains information about the file's location, permissions, and other attributes. When a process opens a file, the kernel creates an open file description that refers to the inode, and returns a file descriptor (FD) that refers to that open file description. The process uses the FD to read from or write to the file.

A single file can therefore be reached through multiple FDs, all of which lead to the same inode. This means that the number of open files is not necessarily the same as the number of open file descriptors. For example, if a process opens the same file twice, it holds two FDs but only one distinct file is open.

When every FD refers to a different file, the two counts match: a process that opens three different files holds three FDs for three open files.
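The case of two descriptors for one file can be checked from a shell by comparing inode numbers. In this sketch the descriptor numbers 7 and 8 and the temporary file are arbitrary, and stat -c assumes the GNU or BusyBox version of stat.

```shell
tmpfile=$(mktemp)

# Open the same file twice: two descriptors, one underlying file.
exec 7<"$tmpfile"
exec 8<"$tmpfile"

# stat -L follows the /proc symlinks, so both report the inode of tmpfile.
inode_a=$(stat -L -c %i "/proc/$$/fd/7")
inode_b=$(stat -L -c %i "/proc/$$/fd/8")
echo "fd 7 -> inode $inode_a, fd 8 -> inode $inode_b"

# Close both descriptors again.
exec 7<&- 8<&-
```

Both descriptors count against the per-process limit even though they lead to the same inode.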

The distinction between open files and open file descriptors is important because the limits described in this tutorial count descriptors and file handles, not distinct files. Each process can hold at most as many FDs as its "Max open files" limit allows, and the system as a whole can allocate at most fs.file-max file handles, no matter how many different files are involved.

To avoid running out of open file descriptors, processes must carefully manage their use of FDs and close them when they are no longer needed. This is especially important in long-running processes, such as servers or daemons, that may open many files over time.

How to troubleshoot "Too many open files"?

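One common error related to file descriptors (FDs) in Linux is the "Too many open files" error (errno EMFILE). This error occurs when a process tries to open more file descriptors than its limit allows, either because it genuinely needs more FDs than the limit permits or because it is leaking FDs. A related error, "Too many open files in system" (ENFILE), is reported when the system-wide limit on file handles has been reached.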

The "Too many open files" error can cause a variety of issues, such as:

Programs or processes failing to open new files or input/output channels.
Programs or processes failing to write to or read from files or input/output channels.
System instability or crashes due to resource exhaustion.

To identify or fix issues related to running out of open FDs, you can take the following steps:

Check the current FD usage: You can use the lsof command to list all open files and their corresponding FDs for a specific process or for the entire system. For example, to list all open files and their corresponding FDs for the current shell, you can use the command lsof -p $$.
Check the maximum FD limit: You can use the ulimit command to display the maximum number of open FDs for the current shell session. For example, run ulimit -n. For a process that is already running, check the "Max open files" row in /proc/<pid>/limits.
Adjust the FD limit: If the current FD usage is close to the maximum FD limit, you may need to adjust the limit to avoid running out of FDs. You can use the ulimit command to raise the limit temporarily for the current shell and the processes started from it. For example, to set the maximum number of open FDs to 10000, you can use the command ulimit -n 10000.
Check for leaked FDs: A process leaks FDs when it opens files or input/output channels and never closes them, so its FD count keeps growing until it reaches the limit. To identify a leak, check the FD count of the process repeatedly with lsof -p <pid> or ls /proc/<pid>/fd and look for descriptors that accumulate over time. Leaked FDs can only be closed by the process that owns them, so the fix is to correct the application or, as a stopgap, to restart it.
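The leak check from the last step can be sketched by sampling the descriptor count of a process before and after it opens descriptors that are never closed. Here the current shell plays the role of the leaking process; count_fds is a hypothetical helper name and the descriptor numbers 7 and 8 are arbitrary.

```shell
# Hypothetical helper: count the descriptors a process has open (Linux /proc).
count_fds() {
    ls "/proc/$1/fd" | wc -l
}

before=$(count_fds $$)

# Simulate a leak: open two descriptors and leave them open.
exec 7</dev/null
exec 8</dev/null

after=$(count_fds $$)
leaked=$((after - before))
echo "descriptor count grew by $leaked"

# Only the owning process can close its descriptors, as done here.
exec 7<&- 8<&-
```

For a real service, the same helper can be run periodically against the PID of the service; a count that only ever grows is a strong hint of a leak.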

Summary

In Linux, file descriptors (FDs) are a fundamental concept in input/output operations. A file descriptor is a unique identifier that represents an open file or input/output channel, such as a socket or pipe. FDs are used to perform various operations on files and input/output devices, including reading, writing, and closing them.

There are three standard FDs in Linux: Standard Input (stdin), Standard Output (stdout), and Standard Error (stderr). These standard FDs are normally inherited by each process from its parent and can be redirected or piped to other input/output channels.

FDs are managed using system calls, such as open(), read(), write(), and close(). These system calls provide a flexible interface for working with file descriptors in Linux, allowing programmers to open, read from, write to, duplicate, and close FDs, as well as perform a wide range of other operations on them.

To avoid issues related to running out of FDs, it is important to monitor the FD usage, adjust the FD limit, and check for leaked FDs. By carefully managing the use of FDs, processes can ensure that they have access to the files and input/output channels they need, while avoiding resource exhaustion and system instability.

For more detailed information on changing the FD limit, you can check our "Working with ulimit in Linux Beginners Guide" article.

