How to use "du" to sort by size? [SOLVED]


Written By - Omer Cakmak

Sorting files and directories by size is a routine task on Linux systems. For example:

  • It helps you manage disk space efficiently.
  • When a disk fills up, it shows which large files need attention.
  • Knowing which files are largest helps you plan file copies and backups.

If you want to sort files by size for these or other reasons, this article is for you.

To sort files based on size using the du command, you can use the following command:

du -ah <directory_path> | sort -hr

The -a option makes du report every file as well as every directory under the specified path (du descends into subdirectories by default), and -h prints the sizes in human-readable units such as K, M and G. The output is piped to sort, where -h compares those human-readable sizes correctly and -r reverses the order so that the largest entries come first. Note that du does not append a trailing slash to directory names, so directories cannot be filtered out by matching on the path (a filter such as grep -v "/$" leaves them in); to list only regular files, select them with find -type f as shown in Method-3.
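If only regular files are wanted, one option is to let find select them and hand them to du. The following is a minimal sketch rather than a definitive recipe: it builds a throwaway directory with mktemp so it can be run anywhere, and the file names and sizes are made up for illustration.

```shell
# Build a small throwaway tree so the pipelines have something to measure.
demo=$(mktemp -d)
mkdir "$demo/sub"
head -c 2097152 /dev/urandom > "$demo/sub/big.bin"   # 2 MiB
head -c 10240   /dev/urandom > "$demo/small.txt"     # 10 KiB

# Files and directories together, largest first.
all_entries=$(du -ah "$demo" | sort -hr)

# Regular files only: find selects them, du measures them.
files_only=$(find "$demo" -type f -exec du -h {} + | sort -hr)

printf '%s\n\n%s\n' "$all_entries" "$files_only"
rm -rf "$demo"
```

The first listing contains the two directories as well as the two files; the second contains the files only.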

 

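Method-1: Sort files by size

To display the 10 largest files and directories directly under /var/log, you can use the following command. The -s option prints one total per argument instead of listing everything inside it, and the * glob skips hidden entries: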

du -sh /var/log/* | sort -hr | head -n 10
4.2G    /var/log/journal
952M    /var/log/bcmt
303M    /var/log/onekey.log
101M    /var/log/sa
49M     /var/log/messages-20230308
48M     /var/log/messages-20230310
47M     /var/log/messages-20230311
46M     /var/log/messages-20230313
44M     /var/log/messages-20230312
43M     /var/log/messages-20230309
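Because the shell's * glob does not expand to hidden entries, dot-files are left out of the listing above. One way to include them, sketched here under the assumption that your find supports -mindepth and -maxdepth (GNU and BSD find do), is to let find enumerate the first-level entries. The helper name top_entries is hypothetical.

```shell
# top_entries DIR [N] (hypothetical helper): print the N largest entries
# directly under DIR, hidden ones included. N defaults to 10.
top_entries() {
    # find lists every first-level entry; du -s prints one total for each.
    find "$1" -mindepth 1 -maxdepth 1 -exec du -sh {} + | sort -hr | head -n "${2:-10}"
}

# Demo on a throwaway directory with made-up contents.
demo=$(mktemp -d)
head -c 1048576 /dev/urandom > "$demo/.hidden.bin"   # 1 MiB, hidden
head -c 1024    /dev/urandom > "$demo/visible.txt"   # 1 KiB
top_entries "$demo" 5
rm -rf "$demo"
```

On a real system the call would look like top_entries /var/log 10.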

 

Method-2: Sort by size for a specific file type

To sort files of a specific type by size using the du command, you can use the following command:

du -ah --block-size=GB /path/to/folder/*.<file_extension> | sort -hr

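Replace <file_extension> with the extension of the file type you want to sort (e.g., pdf, mp3, txt). This command displays the size of each file in gigabytes and sorts the files from largest to smallest. Because du rounds sizes up to the next whole block, any non-empty file smaller than 1 GB is reported as 1GB; drop --block-size=GB and rely on -h if you need finer resolution.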

Here is a breakdown of the command:

  • du: displays the disk usage of files and directories
  • -ah: -a reports files as well as directories, and -h requests human-readable sizes (here it is overridden by the --block-size option that follows it)
  • --block-size=GB: reports sizes in units of 1 GB (1000^3 bytes), rounded up
  • /path/to/folder/*.<file_extension>: specifies the path to the folder containing the files you want to sort and the file extension you want to filter for; the shell expands the pattern before du runs, and hidden files are not matched
  • sort -hr: sorts the output by size, with the h flag comparing human-readable sizes correctly (e.g., 1.5G ranks above 900M) and the r flag reversing the order so that the largest files come first.

For example:

# du -ah --block-size=GB /var/log/*.log | sort -hr
1GB     /var/log/yum.log
1GB     /var/log/sudo.log
1GB     /var/log/services.deny.log
1GB     /var/log/secpam-boot.log
1GB     /var/log/oneey.log
1GB     /var/log/multus.log
1GB     /var/log/hconfig.log
1GB     /var/log/cloud-init.log
1GB     /var/log/alternatives.log
0GB     /var/log/kube-scheduler.log
0GB     /var/log/kube-controller-manager.log
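The listing above reports 1GB for almost every file because du rounds each size up to a whole block. The following sketch, which assumes GNU du and uses a made-up 10 KiB temporary file, contrasts that with the -h output for the same file.

```shell
# A throwaway 10 KiB file; the size is arbitrary and only for illustration.
f=$(mktemp)
head -c 10240 /dev/urandom > "$f"

# Whole-gigabyte blocks: a small non-empty file is rounded up to one block.
coarse=$(du --block-size=GB "$f" | cut -f1)

# Human-readable units: the same file is reported in kilobytes.
fine=$(du -h "$f" | cut -f1)

printf 'block-size=GB: %s\nhuman-readable: %s\n' "$coarse" "$fine"
rm -f "$f"
```

For directories that hold mostly small files, du -h /path/to/folder/*.<file_extension> | sort -hr therefore tends to give a more useful ranking.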

The du command descends into subdirectories by default, so no extra flag is needed for a recursive search. Pass the directory itself and filter the output by extension with grep:

du -ah --block-size=GB /path/to/folder/ | grep '\.<file_extension>$' | sort -hr

Replace <file_extension> with the file extension of the file type you want to sort (e.g., pdf, mp3, txt).

Here's a breakdown of the command:

  • du: displays the disk usage of files and directories
  • -ah: -a reports files as well as directories, and -h requests human-readable sizes (here it is overridden by the --block-size option that follows it)
  • --block-size=GB: reports sizes in units of 1 GB (1000^3 bytes), rounded up
  • /path/to/folder/: specifies the path to the directory containing the files you want to sort
  • grep '\.<file_extension>$': filters the output to show only paths that end with the specified file extension. The $ character signifies the end of the line.
  • sort -hr: sorts the output by size, with the h flag comparing human-readable sizes correctly (e.g., 1.5G ranks above 900M) and the r flag reversing the order so that the largest files come first.

Here's an example:

# du -ah --block-size=GB /var/log | grep '\.journal$' | sort -hr
1GB     /var/log/journal/d43c4bf3a461ef63cf1b638ac2415658/system.journal
1GB     /var/log/journal/c45dfbc104ef4747414300b14b290379/user-1027.journal
1GB     /var/log/journal/c45dfbc104ef4747414300b14b290379/system.journal
1GB     /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000.journal
1GB     /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000@e5d9578073d14d53976df4861debda2d-0000000000eed101-0005f64a55715959.journal
1GB     /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000@e5d9578073d14d53976df4861debda2d-0000000000d2c6b7-0005f5e61a450544.journal
1GB     /var/log/journal/6c00eef24a1c43adaaca058897ebe201/user-1000@e5d9578073d14d53976df4861debda2d-0000000000d25af2-0005f5e46f8545aa.journal
1GB     /var/log/journal/6c00eef24a1c43adaaca058897ebe201/system.journal
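Filtering du's text output with grep also keeps any directory whose name happens to end in the extension. A possible alternative, shown here as a sketch with a hypothetical helper name (largest_by_ext) and made-up demo files, is to let find match regular files by name and pass them to du -h.

```shell
# largest_by_ext DIR EXT (hypothetical helper): list regular files under DIR
# whose names end in .EXT, largest first.
largest_by_ext() {
    find "$1" -type f -name "*.$2" -exec du -h {} + | sort -hr
}

# Demo tree with made-up contents.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/archive.log"              # a directory named like a log file
head -c 1048576 /dev/urandom > "$demo/a/big.log"    # 1 MiB
head -c 2048    /dev/urandom > "$demo/small.log"    # 2 KiB
head -c 4096    /dev/urandom > "$demo/notes.txt"    # different extension
by_ext=$(largest_by_ext "$demo" log)
printf '%s\n' "$by_ext"
rm -rf "$demo"
```

Only big.log and small.log are listed; the archive.log directory and notes.txt are skipped.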

 

Method-3: Sort files larger/smaller than a certain size

To sort and print files which are bigger than a certain size, you can use the find command to locate files of a specific size or larger, and then use the du and sort commands to sort the output by file size in human-readable format. Here's an example command that will sort and print files larger than 100 MB in the /var/log directory:

find /var/log -type f -size +100M -exec du -h {} + | sort -hr

This command searches the /var/log directory (including subdirectories) for files larger than 100 MiB. In the -size test, the + sign means "more than" and the M suffix means units of 1,048,576 bytes, so only files above that threshold match. The -type f option ensures that only regular files (not directories) are included in the search.

The find command then passes the list of matching files to the du command with the -h option to display the disk usage of each file in human-readable format. The {} and + symbols in the command allow find to pass multiple file names to du at once, which can improve performance.

Finally, the output of the du command is piped to sort command with the -h and -r options to sort the output by file size in human-readable format in reverse order (largest files first).
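The same pattern can be turned around to list files below a threshold. The sketch below uses a throwaway directory and made-up file sizes, with 1 MiB as the cut-off so that it runs quickly. One detail worth knowing: find rounds each file size up to a whole unit before comparing, so -size -1M matches only empty files; writing the limit in a smaller unit (-size -1024k) avoids that.

```shell
# Throwaway directory with one large and one small file (sizes are arbitrary).
dir=$(mktemp -d)
head -c 3145728 /dev/urandom > "$dir/large.bin"   # 3 MiB
head -c 20480   /dev/urandom > "$dir/small.bin"   # 20 KiB

# Files larger than 1 MiB, largest first.
larger=$(find "$dir" -type f -size +1M -exec du -h {} + | sort -hr)

# Files smaller than 1 MiB, smallest first. The limit is given in KiB
# because "-size -1M" would match only empty files after rounding.
smaller=$(find "$dir" -type f -size -1024k -exec du -h {} + | sort -h)

printf 'larger:\n%s\nsmaller:\n%s\n' "$larger" "$smaller"
rm -rf "$dir"
```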

 

Alternatively, you can filter the output of du with awk to display only entries that are larger than a certain size.

du -ah /var/log | awk '$1 ~ /M$/ && $1+0 > 100' | sort -hr

This command uses du to display the disk usage of all files and directories under /var/log (including subdirectories) in human-readable format, followed by the path. The output is then piped to awk, which keeps only the lines whose size is shown in megabytes and exceeds 100. The $1 ~ /M$/ condition checks if the first field (the size) ends with the "M" character, indicating that the size is in megabytes. The $1+0 > 100 condition checks if the size (converted to a numeric value using the +0 expression) is greater than 100. Because the filter matches only the M suffix, entries of 1 GB or more (shown with a G suffix) are not included, and because du -a reports directories as well, directories in that size range appear in the output too.

Finally, the output of awk is piped to sort command with the -h and -r options to sort the output by file size in human-readable format in reverse order (largest files first).

To print all files and directories larger than 100 MB but smaller than 1 GB:

# du -ah /var/log | awk '$1 ~ /M$/ && $1+0 > 100' | sort -hr
952M	/var/log/bcmt/perf.log
952M	/var/log/bcmt
303M	/var/log/onekey.log
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-00000000010dd753-0005f6b8f7f5d5f2.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-00000000010bad54-0005f6b0d2f303f5.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000001097eb1-0005f6a92333bf1f.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-000000000107553d-0005f6a0e5fd5d16.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000001052a3b-0005f698e545dd52.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000001030140-0005f690d8d8847d.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-000000000100d93d-0005f688f03cca23.journal
129M	/var/log/journal/6c00eef24a1c43adaaca058897ebe201/system@57a9a462eb7e41d182679b34f8a3c576-0000000000fead20-0005f681cf19d9f3.journal

To print all files and directories larger than 1 GB:

# du -ah /var/log | awk '$1 ~ /G$/ && $1+0 > 1' | sort -hr
6.0G	/var/log
4.3G	/var/log/journal
4.2G	/var/log/journal/6c00eef24a1c43adaaca058897ebe201

 

Summary

The du command is a powerful tool that, combined with sort, can rank files by size in various ways. One of the most common uses is to rank files recursively within a directory, which is useful when the files sit in nested folders. Because du descends into subdirectories by default, passing the directory together with the -a option is enough to measure every file beneath it.

Another way to use the du command to sort files by size is to filter files by their file extension. This method is useful when you want to sort files of a specific type by size. By using the grep command along with the du command, you can filter files by their extension and sort them by size.

In addition to these methods, the output of du can be sorted in ascending or descending order. This is useful when you want to list files from smallest to largest or vice versa. Piping du -h into sort -h gives ascending order, and adding the -r flag (sort -hr) gives descending order.
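The effect of the sort flags can be seen without touching the disk. The lines below are made-up sample data in the same "size, tab, path" layout that du -h prints; the sketch compares sort -h, sort -hr and a plain numeric sort, which ignores the unit suffix.

```shell
# Three sample lines (hypothetical sizes and names), tab-separated like du output.
sample=$(printf '%s\t%s\n' 4G big.dir 49M mid.log 952K small.log)

ascending=$(printf '%s\n' "$sample" | sort -h)     # 952K, 49M, 4G
descending=$(printf '%s\n' "$sample" | sort -hr)   # 4G, 49M, 952K
numeric=$(printf '%s\n' "$sample" | sort -n)       # 4G, 49M, 952K: units ignored

printf 'sort -h:\n%s\n\nsort -hr:\n%s\n\nsort -n:\n%s\n' "$ascending" "$descending" "$numeric"
```

The last listing shows why -h matters: with -n, 4G is treated as 4 and lands before 49M and 952K.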

 


 
