Sorting files and directories by size is a routine task on Linux systems. For example:
- It helps you manage disk space efficiently.
- When a problem occurs, you may need to deal with an unusually large file.
- Knowing which files are large helps you plan and speed up copy and backup jobs.
If you want to sort files by size for these or other reasons, this article is for you.
To sort files by size using the du command, you can use the following command:
du -ah <directory_path> | grep -v "/$" | sort -hr
This command runs du with the -a and -h options to display the disk usage of every file and directory under the specified directory in human-readable format. The output of du is then piped to grep -v "/$". The -v option selects non-matching lines, and the /$ pattern matches lines that end with a forward slash. Note that du prints directory paths without a trailing slash, so this filter removes only the starting directory itself, and only when you typed its path with a trailing slash; subdirectories stay in the list. To list regular files only, combine find -type f with du as shown in Method-3. Finally, the output is piped to sort with the -h and -r options, which compare the human-readable sizes by value and reverse the order (largest first).
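If only regular files should appear in the list, one option is to let find select them before du measures anything. The sketch below assumes GNU find, du, and sort; list_files_by_size is a hypothetical helper name, and the demo files are created in a temporary directory purely for illustration.

```shell
# Sketch: list only regular files under a directory, largest first.
# list_files_by_size is a made-up helper name, not a standard command.
list_files_by_size() {
    # -type f keeps regular files only; "{} +" hands many paths to one du call.
    find "$1" -type f -exec du -h {} + | sort -hr
}

# Demo on a throwaway directory with three files of clearly different sizes.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
head -c 8192    /dev/urandom > "$demo_dir/a-small.txt"
head -c 262144  /dev/urandom > "$demo_dir/b-medium.txt"
head -c 2097152 /dev/urandom > "$demo_dir/sub/c-large.txt"

files_only=$(list_files_by_size "$demo_dir")
printf '%s\n' "$files_only"
rm -rf "$demo_dir"
```

The directory sub holds the largest file but never shows up as an entry of its own, because find passes only regular files to du.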
Method-1: Sort files by size
To display the 10 largest files and directories directly under /var/log, you can use the following command:
du -sh /var/log/* | sort -hr | head -n 10
4.2G /var/log/journal
952M /var/log/bcmt
303M /var/log/onekey.log
101M /var/log/sa
49M /var/log/messages-20230308
48M /var/log/messages-20230310
47M /var/log/messages-20230311
46M /var/log/messages-20230313
44M /var/log/messages-20230312
43M /var/log/messages-20230309
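The * glob in the command above skips hidden entries (names that start with a dot). A minimal sketch of a variant that includes them, assuming GNU find (for -mindepth and -maxdepth) and GNU sort; top_entries is a hypothetical helper name:

```shell
# Sketch: show the N largest entries directly inside a directory,
# including hidden ones. top_entries is a made-up helper name.
top_entries() {
    # $1 = directory, $2 = number of entries to show (defaults to 10)
    find "$1" -mindepth 1 -maxdepth 1 -exec du -sh {} + |
        sort -hr | head -n "${2:-10}"
}

# Demo: one hidden file, one regular file, one directory with a large file.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/c-large-dir"
head -c 8192    /dev/urandom > "$demo_dir/.a-hidden-small"
head -c 262144  /dev/urandom > "$demo_dir/b-medium.log"
head -c 2097152 /dev/urandom > "$demo_dir/c-large-dir/data.bin"

top_two=$(top_entries "$demo_dir" 2)
all_entries=$(top_entries "$demo_dir")
printf '%s\n' "$top_two"
rm -rf "$demo_dir"
```

Because du -s summarizes each argument, a directory is reported as one line with the total size of everything below it.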
Method-2: Sort by size for a specific file type
To sort files of a specific type by size using the du command, you can use the following command:
du -ah --block-size=GB /path/to/folder/*.<file_extension> | sort -hr
Replace <file_extension> with the file extension of the file type you want to sort (e.g., pdf, mp3, txt). This command displays the size of each file in gigabytes and sorts the files in reverse order (largest to smallest). Because du rounds sizes up to whole blocks, every non-empty file smaller than 1 GB is reported as 1GB; drop --block-size=GB to keep the human-readable sizes from -h and get a more useful ordering.
Here is a breakdown of the command:
- du: displays the disk usage of files and directories
- -ah: -a reports files as well as directories, and -h requests human-readable sizes (overridden here by --block-size, which comes later on the command line)
- --block-size=GB: reports sizes in units of gigabytes (10^9 bytes), rounded up
- /path/to/folder/*.<file_extension>: specifies the folder containing the files you want to sort and the file extension you want to filter for. The shell expands the * wildcard, which does not match hidden files
- sort -hr: sorts the output by size in reverse order. The -h flag compares human-readable sizes (e.g., 1.5G or 200M) by value, and the -r flag reverses the order of the sorting.
For example:
# du -ah --block-size=GB /var/log/*.log | sort -hr
1GB /var/log/yum.log
1GB /var/log/sudo.log
1GB /var/log/services.deny.log
1GB /var/log/secpam-boot.log
1GB /var/log/oneey.log
1GB /var/log/multus.log
1GB /var/log/hconfig.log
1GB /var/log/cloud-init.log
1GB /var/log/alternatives.log
0GB /var/log/kube-scheduler.log
0GB /var/log/kube-controller-manager.log
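With --block-size=GB every size is rounded up to a whole gigabyte, which is why the listing above shows only 1GB and 0GB. One way to get an exact ordering is to ask du for bytes and sort numerically. This is a sketch assuming GNU du; sizes_in_bytes is a hypothetical helper name and the demo files are created only for illustration.

```shell
# Sketch: compare files by exact disk usage in bytes.
# sizes_in_bytes is a made-up helper name.
sizes_in_bytes() {
    # --block-size=1 prints usage in bytes, so a plain numeric sort is enough.
    du -a --block-size=1 "$@" | sort -nr
}

demo_dir=$(mktemp -d)
head -c 8192    /dev/urandom > "$demo_dir/a-small.log"
head -c 262144  /dev/urandom > "$demo_dir/b-medium.log"
head -c 2097152 /dev/urandom > "$demo_dir/c-large.log"
head -c 1048576 /dev/urandom > "$demo_dir/notes.txt"

# The shell expands *.log, so notes.txt is never passed to du.
log_sizes=$(sizes_in_bytes "$demo_dir"/*.log)
printf '%s\n' "$log_sizes"
rm -rf "$demo_dir"
```

For a readable listing with the same ordering, du -ah piped to sort -hr (without --block-size) is the simpler choice.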
The du command descends into subdirectories by default, so no extra flag is needed for a recursive search. Point du at the directory and filter the output with grep:
du -ah --block-size=GB /path/to/folder/ | grep '\.<file_extension>$' | sort -hr
Replace <file_extension> with the file extension of the file type you want to sort (e.g., pdf, mp3, txt).
Here's a breakdown of the command:
- du: displays the disk usage of files and directories
- -ah: -a reports files as well as directories, and -h requests human-readable sizes (overridden here by --block-size, which comes later on the command line)
- --block-size=GB: reports sizes in units of gigabytes (10^9 bytes), rounded up
- /path/to/folder/: specifies the directory to search; du descends into its subdirectories
- grep '\.<file_extension>$': filters the output to show only paths that end with the specified file extension. The $ character anchors the match to the end of the line.
- sort -hr: sorts the output by size in reverse order. The -h flag compares human-readable sizes (e.g., 1.5G or 200M) by value, and the -r flag reverses the order of the sorting.
Here's an example:
# du -ah --block-size=GB /var/log | grep '\.journal$' | sort -hr
1GB /var/log/journal/d43c4bf3a461ef63cf1b638ac2415658/system.journal
1GB /var/log/journal/c45dfbc104ef4747414300b14b290379/user-1027.journal
1GB /var/log/journal/c45dfbc104ef4747414300b14b290379/system.journal
1GB /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000.journal
1GB /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000@e5d9578073d14d53976df4861debda2d-0000000000eed101-0005f64a55715959.journal
1GB /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000@e5d9578073d14d53976df4861debda2d-0000000000d2c6b7-0005f5e61a450544.journal
1GB /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000@e5d9578073d14d53976df4861debda2d-0000000000d25af2-0005f5e46f8545aa.journal
1GB /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system.journal
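The grep filter matches any path that ends in the extension, so a directory with a name like backup.journal would be listed too. A sketch of an alternative that lets find match regular files by name, assuming GNU find, du, and sort; files_by_extension is a hypothetical helper name:

```shell
# Sketch: recursively list files with a given extension, largest first.
# files_by_extension is a made-up helper name.
files_by_extension() {
    # $1 = directory, $2 = extension without the leading dot
    find "$1" -type f -name "*.$2" -exec du -h {} + | sort -hr
}

# Demo: two matching files, one non-matching file, and a directory
# whose name happens to end in .journal.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/nested" "$demo_dir/trap.journal"
head -c 8192    /dev/urandom > "$demo_dir/a-small.journal"
head -c 2097152 /dev/urandom > "$demo_dir/nested/c-large.journal"
head -c 262144  /dev/urandom > "$demo_dir/b-medium.log"
head -c 262144  /dev/urandom > "$demo_dir/trap.journal/inside.txt"

journals=$(files_by_extension "$demo_dir" journal)
printf '%s\n' "$journals"
rm -rf "$demo_dir"
```

Only the two regular .journal files are listed; the trap.journal directory and its contents are left out.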
Method-3: Sort files larger/smaller than a certain size
To sort and print files that are larger than a certain size, you can use the find command to locate the files, and then use the du and sort commands to sort the output by file size in human-readable format. Here's an example command that sorts and prints files larger than 100 MB in the /var/log directory:
find /var/log -type f -size +100M -exec du -h {} + | sort -hr
This command searches for all files in the /var/log directory (including subdirectories) that are larger than 100 MB, using the -size option with the + sign (meaning "more than") and the M suffix (mebibytes, 1,048,576 bytes each). The -type f option ensures that only regular files (not directories) are included in the search.
The find command then passes the list of matching files to the du command with the -h option to display the disk usage of each file in human-readable format. The {} + at the end of -exec lets find pass many file names to a single du invocation, which is faster than running du once per file.
Finally, the output of the du command is piped to sort with the -h and -r options to sort the output by size in reverse order (largest files first).
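The same pattern works for files below a limit by switching + to - in the -size test. The sketch below assumes GNU find, du, and sort; files_smaller_than is a hypothetical helper name. Keep in mind that find rounds each file size up to whole units before comparing, so -size -100k matches files of at most 99 KiB.

```shell
# Sketch: list regular files smaller than a limit, smallest first.
# files_smaller_than is a made-up helper name.
files_smaller_than() {
    # $1 = directory, $2 = limit in KiB (find's "k" unit is 1024 bytes)
    find "$1" -type f -size "-${2}k" -exec du -h {} + | sort -h
}

demo_dir=$(mktemp -d)
head -c 8192    /dev/urandom > "$demo_dir/a-small.dat"
head -c 65536   /dev/urandom > "$demo_dir/b-medium.dat"
head -c 2097152 /dev/urandom > "$demo_dir/c-large.dat"

# The 8 KiB and 64 KiB files are under the 100 KiB limit; the 2 MiB file is not.
small_files=$(files_smaller_than "$demo_dir" 100)
printf '%s\n' "$small_files"
rm -rf "$demo_dir"
```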
Alternatively, you can filter the output of du with awk to display only entries that are larger than a certain size.
du -ah /var/log | awk '$1 ~ /M$/ && $1+0 > 100' | sort -hr
This command uses du to display the disk usage of all files and directories under /var/log (including subdirectories) in human-readable format, followed by the path. The output is then piped to awk, which keeps only lines where the size is greater than 100 MB. The $1 ~ /M$/ condition checks that the first field (the size) ends with the "M" character, indicating that the size is in megabytes. The $1+0 > 100 condition converts the size to a number with the +0 expression and checks that it is greater than 100. Because the filter looks only at sizes with an M suffix, entries of 1 GB or more are skipped, and directories are listed along with files.
Finally, the output of awk is piped to sort with the -h and -r options to sort the output by size in reverse order (largest first).
To print entries larger than 100 MB (sizes shown with an M suffix):
# du -ah /var/log | awk '$1 ~ /M$/ && $1+0 > 100' | sort -hr
952M /var/log/bcmt/perf.log
952M /var/log/bcmt
303M /var/log/onekey.log
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-00000000010dd753-0005f6b8f7f5d5f2.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-00000000010bad54-0005f6b0d2f303f5.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000001097eb1-0005f6a92333bf1f.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-000000000107553d-0005f6a0e5fd5d16.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000001052a3b-0005f698e545dd52.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000001030140-0005f690d8d8847d.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-000000000100d93d-0005f688f03cca23.journal
129M /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000000fead20-0005f681cf19d9f3.journal
To print entries larger than 1 GB (sizes shown with a G suffix):
# du -ah /var/log | awk '$1 ~ /G$/ && $1+0 > 1' | sort -hr
6.0G /var/log
4.3G /var/log/journal
4.2G /var/log/journal/6c00eef24a1c43adaaca058897ebe201
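The suffix-based awk filters above each look at one unit at a time. A sketch of a unit-independent filter that compares byte counts instead, assuming du is run with --block-size=1 (GNU du); keep_larger_than is a hypothetical helper name, and the sample lines are invented for the demo rather than taken from a real system.

```shell
# Sketch: keep entries whose size in bytes exceeds a threshold, largest first.
# keep_larger_than is a made-up helper; it reads "bytes<TAB>path" lines on stdin.
keep_larger_than() {
    # $1 = threshold in bytes; "+ 0" forces a numeric comparison in awk.
    awk -v min="$1" '$1 + 0 > min + 0' | sort -nr
}

# Real usage would look like:
#   du -a --block-size=1 /var/log | keep_larger_than 104857600
# Below, invented sample lines stand in for du output so the result is fixed.
sample=$(printf '%s\t%s\n' \
    998244352  /demo/perf.log \
    4509715660 /demo/journal \
    52428800   /demo/messages)

over_100_mib=$(printf '%s\n' "$sample" | keep_larger_than 104857600)
printf '%s\n' "$over_100_mib"
```

With a 100 MiB threshold (104857600 bytes), the 4.2 GiB and 952 MiB sample entries both pass and the 50 MiB entry is dropped, regardless of which suffix du -h would have printed.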
Summary
The du command, combined with sort, can rank files by size in several ways. Because du descends into subdirectories by default, pointing it at a directory is enough to cover files in nested folders; no recursion flag is needed.
Another approach is to filter files by extension. By piping the output of du through grep, you can keep only the files with a given extension and then sort them by size.
Finally, the sort command controls the order of the results: sort -h lists the smallest entries first, and sort -hr lists the largest first.
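To illustrate the two orderings, the sketch below sorts a few fixed lines in the same size-then-path layout that du -h prints, assuming GNU sort for the -h option. The paths are invented for the demo.

```shell
# Sketch: ascending versus descending order with sort -h.
# The lines below imitate "du -h" output; the paths are made up.
sizes=$(printf '%s\t%s\n' \
    303M /demo/onekey.log \
    4.2G /demo/journal \
    49M  /demo/messages \
    952M /demo/bcmt)

ascending=$(printf '%s\n' "$sizes" | sort -h)    # smallest first
descending=$(printf '%s\n' "$sizes" | sort -hr)  # largest first

printf 'smallest first:\n%s\n' "$ascending"
printf 'largest first:\n%s\n' "$descending"
```

sort -h understands the unit suffixes, so 4.2G ranks above 952M even though 952 is the larger number.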