How to install and set up ZFS in Rocky Linux 9



Author: Omer Cakmak
Reviewer: Deepak Prasad

Introduction to ZFS

The file system landscape in Linux has been continuously evolving, and ZFS (Zettabyte File System) is one of the most advanced and robust options available today. Originally developed by Sun Microsystems for the Solaris operating system, ZFS offers a myriad of features such as data integrity, snapshots, and encryption, making it a go-to choice for many Linux enthusiasts and professionals alike.

Rocky Linux, a free and open-source enterprise-class operating system that provides a stable and reliable platform, has gained significant traction in the Linux community. In this article, we will guide you through the process of installing and setting up ZFS on your Rocky Linux 9 system. Whether you are a seasoned sysadmin or a Linux newcomer, this comprehensive tutorial will provide all the necessary information to help you unlock the full potential of ZFS, ensuring optimal performance and data protection for your storage needs. Let's dive into the world of ZFS and explore its powerful capabilities together!

 

Pre-requisites and System Requirements

Before diving into the installation process, it is important to ensure that your system meets the necessary hardware and software requirements for ZFS.

  • Minimum hardware requirements for ZFS:
    • At least 1 GB of RAM, though 2 GB or more is recommended for better performance
    • A 64-bit processor
    • Sufficient disk space for your intended storage needs
  • Recommended hardware for optimal performance:
    • At least 8 GB of RAM, preferably with ECC (Error-Correcting Code) support
    • A multi-core 64-bit processor
    • High-performance storage devices, such as SSDs
  • Software requirements for installing ZFS on Rocky Linux 9:
    • A Rocky Linux 9 installation with root or sudo privileges
    • Internet access to the EPEL and OpenZFS repositories
    • For the DKMS-based package: the "Development Tools" group and a kernel-devel package matching the running kernel (covered in the installation steps below)

 

Understanding DKMS and kABI Module

DKMS (Dynamic Kernel Module Support) and kABI (Kernel Application Binary Interface) are two different mechanisms for maintaining compatibility between kernel modules and the Linux kernel. They play an essential role in the management of kernel modules like ZFS when kernel updates occur.

 

DKMS:

Dynamic Kernel Module Support (DKMS) is a framework used to generate Linux kernel modules whose sources generally reside outside the kernel source tree. The primary purpose of DKMS is to simplify the process of rebuilding kernel modules when a new kernel version is installed or updated. DKMS automatically compiles and installs the required kernel module for the new kernel, ensuring that the module remains functional and compatible.

In the context of ZFS, DKMS is used to compile and manage the ZFS kernel module across different kernel versions. When a new kernel is installed, DKMS rebuilds the ZFS kernel module, allowing it to work seamlessly with the updated kernel.

 

kABI:

Kernel Application Binary Interface (kABI) is a stable interface provided by the Linux kernel that allows kernel modules to maintain binary compatibility across kernel updates. By adhering to kABI, kernel modules can continue to work with new kernel versions without requiring recompilation.

For ZFS, using a kABI-tracking kmod package means that the ZFS kernel module is built to be compatible with the kernel's kABI. This eliminates the need to recompile the module every time the kernel is updated, making the process more efficient.

 

 

Installing ZFS on Rocky Linux 9

Let's install ZFS, one of the best file systems, on Rocky Linux 9.

Step 1: Adding the required repositories

First, enable the EPEL (Extra Packages for Enterprise Linux) repository by running the following command:

sudo dnf install epel-release -y

Step 2: Installing the ZFS package

Now, add the official OpenZFS repository by installing the zfs-release package:

sudo dnf install -y https://zfsonlinux.org/epel/zfs-release-2-2.el9.noarch.rpm

 

Step 3: Install either the DKMS-based or the kABI-based ZFS package

Option-1: Using DKMS-based ZFS package

By default, the ZFS repository provides the DKMS package. To ensure you are using the DKMS repository, you can run the following command:

sudo dnf config-manager --enable zfs

Before installing the ZFS DKMS package, you need to install the necessary development tools and kernel development headers. The following two commands are required because the DKMS-based ZFS package builds the ZFS kernel module from source code, which needs specific development tools and kernel headers.

sudo dnf groupinstall "Development Tools" -y
sudo dnf install kernel-devel -y

The first command installs the Development Tools group, which provides compilers and build utilities such as gcc and make. The second installs the kernel-devel package, which provides the kernel headers and source files required for building external kernel modules like the ZFS kernel module. The kernel headers are necessary because they contain the definitions and structures needed to interface with the Linux kernel; without them, it is impossible to build a kernel module that can interact correctly with the kernel.

NOTE:
Make sure that the kernel and kernel-devel packages have the same version, or the system will fail to load the ZFS module at a later stage. Check uname -r and compare its version with the output of rpm -qa | grep -E '^kernel(-devel)?-[0-9]' to confirm that kernel and kernel-devel match.
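
For example, on a correctly prepared system the versions line up (the version strings below are illustrative):

$ uname -r
5.14.0-284.11.1.el9_2.x86_64
$ rpm -qa | grep -E '^kernel(-devel)?-[0-9]'
kernel-5.14.0-284.11.1.el9_2.x86_64
kernel-devel-5.14.0-284.11.1.el9_2.x86_64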

The execution of the following command can take several minutes depending on your system's performance, so wait patiently for it to complete:

sudo dnf install zfs-dkms -y
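
Once the installation finishes, you can confirm that DKMS has registered and built the module for your running kernel with dkms status (the exact output format varies between DKMS versions, and the version strings here are illustrative):

$ dkms status
zfs/2.1.11, 5.14.0-284.11.1.el9_2.x86_64, x86_64: installed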

 

Option-2: Using kABI-based ZFS package

By default, the ZFS repository provides the DKMS package. To use the kABI-tracking kmod package, you need to disable the DKMS repository and enable the kmod repository:

sudo dnf config-manager --disable zfs
sudo dnf config-manager --enable zfs-kmod

This installation can also take a few minutes depending on your system and network speed, so wait for it to complete:

sudo dnf install zfs -y
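
You can verify that the kmod packages were installed with rpm (the list below is abbreviated and the package versions are illustrative):

$ rpm -qa | grep zfs
zfs-release-2-2.el9.noarch
kmod-zfs-2.1.11-1.el9.x86_64
zfs-2.1.11-1.el9.x86_64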

Step 4: Load the ZFS kernel module

sudo modprobe zfs
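
Confirm that the module is loaded (the module size and use count below are illustrative):

$ lsmod | grep '^zfs'
zfs                  3698688  0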

To ensure that the ZFS module is loaded automatically at boot, add it to the /etc/modules-load.d/zfs.conf file:

echo "zfs" | sudo tee /etc/modules-load.d/zfs.conf

You have now installed ZFS on Rocky Linux 9. With the DKMS-based package, the ZFS kernel module is recompiled automatically whenever the kernel is updated, ensuring compatibility across kernel versions; with the kABI-tracking kmod package, the prebuilt module continues to work across compatible kernel updates without recompilation.

 

Basic ZFS Concepts and Terminologies

Pools, vdevs, and datasets:

  • Pools: A ZFS pool is a collection of storage devices that provide space for datasets. Pools allow you to combine multiple physical storage devices into a single logical storage unit.
  • vdevs (Virtual Devices): vdevs are the building blocks of ZFS pools. They can be single disks, mirrors, RAIDZ groups, or other types of devices. A ZFS pool consists of one or more vdevs.
  • Datasets: Datasets are the primary way to store and manage data in ZFS. They can be filesystems, volumes, or snapshots.

 

ZFS Filesystem:

  • A ZFS filesystem is a hierarchical storage system that behaves like a traditional Unix-style filesystem.
  • It supports all standard file and directory operations like creating, deleting, and modifying files and directories.
  • ZFS filesystems inherit properties from their parent datasets, such as compression, deduplication, and encryption settings.
  • ZFS filesystems can be snapshotted, cloned, and rolled back, making them suitable for managing versioned file storage and backups.
  • They automatically mount under the /pool_name/filesystem_name directory and can be accessed like any other directory.

 

ZFS Volume:

  • A ZFS volume is a block device that can be formatted with any filesystem, such as ext4 or XFS.
  • It is a fixed-size storage unit that acts as a virtual disk, presenting raw storage to other systems or applications.
  • ZFS volumes inherit applicable properties, such as compression, from their parent datasets and support ZFS features like snapshots and clones, although filesystem-level properties such as mountpoints do not apply to them.
  • They need to be formatted and mounted before they can be accessed and used.
  • ZFS volumes are useful for providing storage to applications that require block-level access, such as virtual machines, databases, or iSCSI targets.

 

Snapshots and clones:

  • Snapshots: A ZFS snapshot is a point-in-time copy of a dataset. Snapshots are space-efficient, as they only store the differences between the current state and the snapshot state.
  • Clones: A ZFS clone is a writable copy of a snapshot. It shares its storage space with the snapshot, only using additional space for the changes made to the clone.

 

RAID levels and redundancy:

ZFS supports various RAID levels and redundancy mechanisms, such as mirrors, RAIDZ (similar to RAID 5), RAIDZ2 (similar to RAID 6), and RAIDZ3. These configurations protect your data against disk failures and improve overall data integrity.


 

Creating and Managing ZFS Pools

ZFS pools are created using the zpool create command, followed by the pool name and the devices or vdevs that will be part of the pool. Pool names should be descriptive and follow a consistent naming convention for better organization and management.

Creating a ZFS pool:

To list the available disks, we can use the lsblk command:

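Illustrative lsblk output (device names and sizes will differ on your system):

$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   20G  0 disk
├─sda1   8:1    0    1G  0 part /boot
└─sda2   8:2    0   19G  0 part /
sdb      8:16   0    2G  0 disk
sdc      8:32   0    2G  0 disk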

Here we have three disks attached to the server: /dev/sda, /dev/sdb, and /dev/sdc. We can't use /dev/sda because it holds the operating system and system data, which leaves /dev/sdb and /dev/sdc for creating a pool.

sudo zpool create mypool mirror /dev/sdb /dev/sdc

This command creates a ZFS pool named mypool with a mirror vdev configuration using two disk devices /dev/sdb and /dev/sdc. The mirror keyword specifies that the pool will use a mirrored configuration, providing redundancy by keeping identical copies of data on both disks.

Alternative options:

  • Replace mirror with raidz or raidz2 for RAIDZ or RAIDZ2 configurations (typically built from at least three or four disks, respectively), which trade some of a mirror's performance for more usable capacity and, with raidz2, tolerance of two simultaneous disk failures; see the sketch after this list.
  • Add more disks to the mirror or RAIDZ configurations to increase storage capacity and fault tolerance.
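
As a hedged illustration, assuming a third spare disk /dev/sdd were available (it is not present in the example system above), a RAIDZ pool could be created like this:

sudo zpool create mypool raidz /dev/sdb /dev/sdc /dev/sdd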

Check the ZFS pool status after creating the pool:

$ sudo zpool status mypool
  pool: mypool
 state: ONLINE
config:

	NAME        STATE     READ WRITE CKSUM
	mypool      ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sdb     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0

errors: No known data errors

This command will show you the status of the mypool ZFS pool, including the pool state, disk devices, and any potential errors.

Creating a ZFS filesystem:

sudo zfs create mypool/myfilesystem

This command creates a new ZFS filesystem named myfilesystem inside the mypool pool. The filesystem can be used to store files and directories just like any other filesystem.

Verify the ZFS filesystem after creating it:

$ sudo zfs list mypool/myfilesystem
NAME                  USED  AVAIL     REFER  MOUNTPOINT
mypool/myfilesystem    24K  1.75G       24K  /mypool/myfilesystem

This command lists the mypool/myfilesystem dataset, showing the available space, used space, and other details.

Setting properties for the filesystem:

sudo zfs set compression=lz4 mypool/myfilesystem
sudo zfs set quota=50G mypool/myfilesystem

These commands set the compression and quota properties for the mypool/myfilesystem dataset.

  • compression=lz4: This property enables the LZ4 compression algorithm on the filesystem, which helps reduce storage usage by compressing data as it is written. Alternative compression algorithms include gzip (with levels from 1 to 9), zstd, and zle (zero-length encoding).
  • quota=50G: This property sets a storage quota of 50 GB for the filesystem, limiting its maximum storage space usage.

Check the properties of the filesystem after setting them:

$ sudo zfs get all mypool/myfilesystem
NAME                 PROPERTY              VALUE                  SOURCE
mypool/myfilesystem  type                  filesystem             -
mypool/myfilesystem  creation              Wed Apr 26 21:57 2023  -
mypool/myfilesystem  used                  24K                    -
mypool/myfilesystem  available             1.75G                  -
...

This command displays all the properties of the mypool/myfilesystem dataset, including the compression and quota settings you've applied.

Creating a ZFS volume:

sudo zfs create -V 1G mypool/myvolume

This command creates a new ZFS volume named myvolume inside the mypool pool with a size of 1 GB. ZFS volumes are block devices that can be used as virtual disks or for other purposes, such as iSCSI targets or LVM physical volumes.

Verify the ZFS volume after creating it:

$ sudo zfs list mypool/myvolume
NAME              USED  AVAIL     REFER  MOUNTPOINT
mypool/myvolume  1.03G  1.75G       12K  -

This command lists the mypool/myvolume dataset, showing the available space, used space, and other details.

Creating a snapshot of the filesystem:

sudo zfs snapshot mypool/myfilesystem@mysnapshot

This command creates a snapshot named mysnapshot of the mypool/myfilesystem dataset. Snapshots are point-in-time, read-only copies of a dataset that store the differences between the current state and the snapshot state. They are space-efficient and can be used for backups or to revert the dataset to a previous state.
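
For instance, to revert the filesystem to the state captured in this snapshot (note that rolling back discards any changes made after the snapshot was taken):

sudo zfs rollback mypool/myfilesystem@mysnapshot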

Check the snapshot after creating it:

$ sudo zfs list -t snapshot
NAME                             USED  AVAIL     REFER  MOUNTPOINT
mypool/myfilesystem@mysnapshot     0B      -       24K  -

This command lists all the ZFS snapshots on your system. You should see the mypool/myfilesystem@mysnapshot snapshot in the output.

Cloning a snapshot:

sudo zfs clone mypool/myfilesystem@mysnapshot mypool/myclonedfilesystem

This command creates a new filesystem named myclonedfilesystem as a clone of the mypool/myfilesystem@mysnapshot snapshot. Clones are writable copies of snapshots that share storage space with the original snapshot, only using additional space for the changes made to the clone. Clones can be useful for testing changes to data without affecting the original dataset or creating multiple independent copies of a dataset.
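
If you later decide to keep a clone permanently, you can promote it so that it no longer depends on its origin snapshot; after promotion the dependency is reversed and the origin snapshot belongs to the clone:

sudo zfs promote mypool/myclonedfilesystem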

Verify the cloned filesystem after cloning the snapshot:

$ sudo zfs list mypool/myclonedfilesystem
NAME                        USED  AVAIL     REFER  MOUNTPOINT
mypool/myclonedfilesystem     0B   734M       24K  /mypool/myclonedfilesystem

This command lists the mypool/myclonedfilesystem dataset, showing the available space, used space, and other details.

 

Accessing the ZFS filesystem

After creating the ZFS pool mypool, you can access and work with it by using the datasets (filesystems and volumes) created within the pool. In our previous examples, we created a filesystem named myfilesystem and a volume named myvolume within the mypool pool.

You can list the mounted filesystems using:

# df -h | grep mypool
mypool                     734M  128K  734M   1% /mypool
mypool/myfilesystem        734M  128K  734M   1% /mypool/myfilesystem
mypool/myclonedfilesystem  734M  128K  734M   1% /mypool/myclonedfilesystem

As you can see, by default ZFS filesystems are automatically mounted under /<pool>/<filesystem>, which is /mypool/myfilesystem here. You can interact with this filesystem like any other directory on your system. For example:

To create a new directory:

mkdir /mypool/myfilesystem/new_directory

To create a new file:

touch /mypool/myfilesystem/new_file.txt

To copy files to the ZFS filesystem:

cp /path/to/source/file /mypool/myfilesystem/

 

Working with the ZFS volume

A ZFS volume is a block device, and you need to format it with a filesystem before using it. For example, you can format the ZFS volume with the ext4 filesystem and mount it to use the storage:

Format the ZFS volume with ext4:

$ sudo mkfs.ext4 /dev/zvol/mypool/myvolume
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done                            
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: f11a72b8-27c9-4828-b895-455f18e35b5a
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

Create a mount point for the volume:

sudo mkdir /mnt/myvolume

Mount the ZFS volume to the mount point:

sudo mount /dev/zvol/mypool/myvolume /mnt/myvolume

Now you can access and work with the mounted ZFS volume like any other directory on your system.
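
If you want the volume mounted automatically at boot, here is a minimal sketch of an /etc/fstab entry; the nofail option is our own addition so that boot does not fail if the pool has not been imported yet:

/dev/zvol/mypool/myvolume  /mnt/myvolume  ext4  defaults,nofail  0  0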

 

Destroy or Delete a ZFS Pool

Here's the sequence of commands to destroy ZFS objects, considering the previous examples we've been working with:

Delete the cloned filesystem (if any):

sudo zfs destroy mypool/myclonedfilesystem

Delete snapshots:

sudo zfs destroy mypool/myfilesystem@mysnapshot

Delete the ZFS filesystem:

sudo zfs destroy mypool/myfilesystem

Delete the ZFS volume:

sudo zfs destroy mypool/myvolume

List all ZFS datasets (filesystems, volumes, snapshots, and clones):

$ sudo zfs list -t all
NAME     USED  AVAIL     REFER  MOUNTPOINT
mypool   159K  1.75G       24K  /mypool

Finally, if you want to destroy the ZFS pool itself:

sudo zpool destroy mypool

Check the ZFS pool status:

$ sudo zpool status mypool
cannot open 'mypool': no such pool

 

Troubleshooting

Fix: cannot destroy 'mypool/myfilesystem': filesystem has children use '-r' to destroy the following datasets: mypool/myfilesystem@mysnapshot

The error message indicates that the filesystem you're trying to delete (mypool/myfilesystem) has dependent child objects, such as snapshots or clones. In this case, you have a snapshot named mysnapshot. To delete the filesystem along with its dependent snapshots, you can use the -r flag, which stands for "recursive":

sudo zfs destroy -r mypool/myfilesystem

This command will delete the mypool/myfilesystem filesystem and all its dependent snapshots in one operation. Please note that you should exercise caution when using the -r flag, as it will delete all child objects associated with the specified filesystem, which may result in data loss.

 

Fix: cannot destroy 'mypool/myfilesystem': filesystem has dependent clones use '-R' to destroy the following datasets:

If you have a dependent clone, you need to use the -R flag instead of -r to destroy the filesystem and all its dependent clones and snapshots:

sudo zfs destroy -R mypool/myfilesystem

The -R flag performs a recursive destroy that also removes any dependent clones, even when they live outside the dataset's own hierarchy. This command will delete the mypool/myfilesystem filesystem along with all its dependent snapshots and clones, including mypool/myclonedfilesystem.

Please exercise caution when using the -R flag, as it will delete all dependent objects associated with the specified filesystem, which may result in data loss. Make sure to verify that you no longer need these objects before proceeding with the deletion.

 

Summary

In this article, we covered the basics of ZFS and how to install and set it up on Rocky Linux 9. We discussed the key concepts of ZFS, including pools, vdevs, datasets, snapshots, and clones.

To get started with ZFS, we installed the ZFS packages using either the DKMS-based module or the kABI-tracking kmod module; the DKMS variant recompiles the kernel module automatically when a new kernel is installed. We then created a mirrored ZFS pool from two devices and explored dataset properties such as compression and quotas.

Next, we demonstrated how to create and manage ZFS filesystems and volumes, including setting dataset properties and mounting them for use. We also covered advanced features such as creating and managing ZFS snapshots and clones.

Finally, we discussed how to delete ZFS objects, including volumes, filesystems, snapshots, and clones. We also provided tips on how to verify the success status of each command.

By following this guide, you should now have a good understanding of how to install, configure, and manage ZFS on Rocky Linux 9, and be able to use it to efficiently manage your storage needs.

 


 

Omer Cakmak

He is highly skilled at managing Debian, Ubuntu, CentOS, Oracle Linux, and Red Hat servers. Proficient in bash scripting, Ansible, and AWX central server management, he handles server operations on OpenStack, KVM, Proxmox, and VMware.
