Beginner's guide to how LVM works in Linux (architecture)

In this tutorial we will learn about the Logical Volume Manager (LVM) used in Linux and Unix. In the early days of Linux servers, storage was handled by creating partitions on disks. This approach works, but it has some disadvantages, the most important of which is that partitions are inflexible once they have been created. That is why the Logical Volume Manager was introduced. We will learn more about it in this tutorial.

1. LVM Architecture

Here is a diagram of LVM architecture where you can see there are multiple layers:

[Figure: LVM architecture diagram]

The lowest layer, the foundation, consists of Storage Devices. These can be any storage devices, such as complete disks, partitions, logical units (LUNs) on a storage-area network (SAN), and whatever else is made possible in modern storage topologies.
Optionally, these storage devices can be presented to the Logical Volume Manager (LVM) as Partitions instead of raw disks. In this diagram, as you can see, I have created partitions on each available disk. When you use partitions, the partition type should be set to Linux LVM, which has an ID of 8E in the fdisk partitioning tool.
Next, each individual partition is initialized as a "Physical Volume", which makes it usable in an LVM environment. So a Physical Volume is a one-to-one mapping of an individual partition.
A storage device that is a physical volume can be added to the "Volume Group", which is the abstraction of all available storage. “The abstraction” means that the volume group is not something that is fixed, but that it can be resized when needed, which makes it possible to add more space on the volume group level when volumes are running out of disk space. In layman's terms, if you are running out of space: add a new disk -> create a partition -> create a physical volume -> extend your volume group, and you have additional space for your Logical Volumes.
On top of the volume group are the "Logical Volumes", which represent the block devices that are presented to the system. These do not act on disks directly but get their disk space from the available disk space in the volume group. That means that a logical volume may consist of storage from multiple physical volumes. In this diagram, I have two logical volumes and still have some unallocated space in the volume group.
The actual File Systems are created on the logical volumes. As the logical volumes are flexible with regard to size, the file systems are flexible as well. If a file system is running out of disk space, it is relatively easy to extend the file system, or to reduce it if the file system allows that.
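
The "add a disk, extend the volume group, grow the logical volume" flow described above can be sketched as a toy model. This is only an illustration in Python and not how LVM is implemented: the class name VolumeGroup, the device names and the LV name are hypothetical, and sizes are plain MiB numbers with no metadata overhead.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeGroup:
    """Toy model of a volume group: a pool of space fed by physical volumes."""
    name: str
    pv_sizes_mib: dict = field(default_factory=dict)  # PV name -> size in MiB
    lv_sizes_mib: dict = field(default_factory=dict)  # LV name -> size in MiB

    @property
    def total_mib(self):
        return sum(self.pv_sizes_mib.values())

    @property
    def free_mib(self):
        return self.total_mib - sum(self.lv_sizes_mib.values())

    def add_pv(self, pv_name, size_mib):
        """Extending the VG: a new physical volume simply enlarges the pool."""
        self.pv_sizes_mib[pv_name] = size_mib

    def create_lv(self, lv_name, size_mib):
        if size_mib > self.free_mib:
            raise ValueError("not enough free space in volume group")
        self.lv_sizes_mib[lv_name] = size_mib

    def extend_lv(self, lv_name, extra_mib):
        if extra_mib > self.free_mib:
            raise ValueError("not enough free space in volume group")
        self.lv_sizes_mib[lv_name] += extra_mib


# Hypothetical devices, for illustration only.
vg = VolumeGroup("vg_demo")
vg.add_pv("/dev/sdb1", 200)
vg.add_pv("/dev/sdc1", 200)
vg.create_lv("lv_data", 300)   # spans both PVs
vg.add_pv("/dev/sdd1", 200)    # running low: add a disk to the pool
vg.extend_lv("lv_data", 150)   # now there is room to grow
print(vg.lv_sizes_mib, vg.free_mib)
```

The point of the model is that a logical volume only ever talks to the pool, so it does not care which disk the extra space came from.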

2. Features of Logical Volume Manager (LVM)

LVM has several features that make it a "must have" solution in a production environment.

The most important feature is LVM's flexibility in managing storage. Logical Volumes are not bound by the fixed boundaries of physical drives. They can be easily extended (or even reduced in some cases) based on the available space in the Volume Group. You also have the option to add additional disks and extend the size of the Volume Group to further increase the size of Logical Volumes.
Another important feature in LVM is the support for snapshots. A snapshot keeps the current state of a logical volume and can be used to revert to a previous situation, or to make a backup of the file system on the logical volume while the volume is open.
You also get an option to move data across different available Physical Volumes. This is highly useful when you notice failing hardware: you can add an extra disk, create a Physical Volume on it, and then use the pvmove command to move the data from the existing PV to the new one. Once the data is moved, you can safely remove the old Physical Volume from the Volume Group and replace the faulty hardware while the volumes stay online.
Instead of pvmove, you can also use lvconvert to mirror a Logical Volume onto another Physical Volume, which can also be used to migrate data between PVs.

3. LVM Components

These are the basic components which make up the Logical Volume Manager:

Physical volumes: These represent the raw disk space, typically disk partitions. Initializing a device as a Physical Volume places a LABEL near the start of the underlying physical storage device. This LABEL identifies the underlying block device as a Physical Volume that LVM can use for logical volumes.
Volume groups: These aggregate physical volumes together so that the disk space can be allocated to logical volumes.
Logical volumes: These represent the block devices that are presented to the system. They consume space that is allocated from volume groups.

4. How LVM works

Within a volume group, the disk space available for allocation is divided into fixed-size units called extents.
An extent is the smallest unit of space that can be allocated. Within a physical volume, extents are referred to as physical extents.
A logical volume is allocated in logical extents of the same size as the physical extents.
The extent size is thus the same for all logical volumes in the volume group. The default size is 4 MB. With the older LVM1 metadata format, a logical volume is limited to roughly 65,000 extents, so if you are going to create logical volumes bigger than 256 GB, you should use a larger extent size. (For example, a physical extent size of 512 MB raises the logical volume size limit to 32 Terabyte.) The current LVM2 format does not have this limit, although a very large number of extents can slow down the LVM tools.
The volume group maps the logical extents to physical extents. A file system created on a logical volume is mapped to a collection of logical extents, which in turn contain the blocks of the file system.
The job of LVM is to translate a file system block number to a logical extent number and an offset within the extent.
Next, it has to figure out which physical extent the logical extent maps to.
Once these translations are made, the LVM pseudo driver passes the read/write request to the hardware driver responsible for the physical disk on which the mapped physical extent exists.
In this manner, the offset within a logical extent is converted to an offset within a physical extent.
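
The translation steps above can be sketched in a few lines of Python. This is a simplified illustration, not LVM's actual code: the function name translate, the PV names and the extent map are made up for the example, and it works on byte offsets into the logical volume rather than file system block numbers.

```python
EXTENT_SIZE = 4 * 1024 * 1024  # default extent size of 4 MB, in bytes


def translate(lv_byte_offset, extent_map, extent_size=EXTENT_SIZE):
    """Map an offset inside a logical volume to (PV, physical extent, offset).

    extent_map[i] holds the (pv_name, physical_extent_index) that backs
    logical extent i.
    """
    # Step 1: logical extent number and offset within that extent.
    le_index, offset_in_extent = divmod(lv_byte_offset, extent_size)
    # Step 2: look up which physical extent backs this logical extent.
    pv_name, pe_index = extent_map[le_index]
    # Step 3: the offset inside the extent carries over unchanged.
    return pv_name, pe_index, offset_in_extent


# Hypothetical LV made of three extents spread over two PVs.
demo_map = [("pv0", 10), ("pv0", 11), ("pv1", 0)]
print(translate(9 * 1024 * 1024 + 512, demo_map))
```

Because logical and physical extents have the same size, only the extent number needs to be looked up; the offset within the extent never changes.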

5. Different types of Logical Volumes

There are different types of logical volumes which you can configure in your environment based on your requirements. Let me give you a brief overview of the different types available in Linux and Unix.

5.1 Linear Logical Volume

This is the basic and most used type of logical volume, wherein the physical volumes are concatenated to create one or more logical volumes.

Here the physical volumes are combined together to be used inside the volume group.
The physical extents from the Physical Volumes are mapped to the Logical Volumes. In this example both PVs have the same size, but this is not mandatory and the PVs can all have different sizes.
The default physical extent size is 4 MB, and we have two PVs of 200 MB each, i.e. 50 physical extents per PV.
Now you can create a Linear volume (or more than one) with a combined size of up to 400 MB, made of up to 100 logical extents. In this example I have shown a single logical volume using all of the available extents.
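
The numbers in this example can be reproduced with a short sketch. It is an idealized illustration under stated assumptions: the helper name linear_extent_map and the PV names are hypothetical, and the small amount of space that a real PV reserves for metadata is ignored.

```python
def linear_extent_map(pv_sizes_mib, extent_size_mib=4):
    """Concatenate the extents of each PV, in order, into one linear map.

    Returns a list where entry i is the (pv_name, physical_extent_index)
    backing logical extent i.
    """
    extent_map = []
    for pv_name, size_mib in pv_sizes_mib.items():
        pe_count = size_mib // extent_size_mib
        extent_map.extend((pv_name, pe) for pe in range(pe_count))
    return extent_map


# Two 200 MB PVs with 4 MB extents, as in the example above.
lv_map = linear_extent_map({"pv1": 200, "pv2": 200})
print(len(lv_map))             # number of logical extents
print(lv_map[49], lv_map[50])  # where the LV crosses from pv1 to pv2
```

In a linear volume the second PV is only touched once the extents of the first one have been used up.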

5.2 Striped Logical Volume

With a Linear Volume, the data from the Logical Volume is written to the individual Physical Volumes based on the availability of extents and size. With a Striped Logical Volume you can control the way the data is written across the Physical Volumes.

Striping gives comparatively better read/write performance, but you should watch the number of stripes you create, as too many can also degrade performance. The maximum number of stripes in LVM is 128. For physical disks connected via SAS, it is recommended to match the number of stripes to the number of disks, while for SAN storage there is generally no advantage in using stripes.

The following illustration shows data being striped across three physical volumes. In this figure:

the first stripe of data is written to the first physical volume
the second stripe of data is written to the second physical volume
the third stripe of data is written to the third physical volume
the fourth stripe of data is written to the first physical volume
...
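
The round-robin pattern in this list can be expressed as simple integer arithmetic. The sketch below is an illustration only: the function name stripe_target is hypothetical, PVs are identified by a zero-based index, and the 64 KB stripe size is just an example value.

```python
def stripe_target(lv_byte_offset, stripe_count, stripe_size):
    """Return (pv_index, offset_on_pv) for an offset in a striped LV.

    pv_index is zero-based; offset_on_pv is measured from the start of
    the space that this LV occupies on that PV.
    """
    stripe_index = lv_byte_offset // stripe_size  # which stripe overall
    pv_index = stripe_index % stripe_count        # round-robin over the PVs
    row = stripe_index // stripe_count            # full passes over all PVs
    offset_on_pv = row * stripe_size + lv_byte_offset % stripe_size
    return pv_index, offset_on_pv


STRIPE = 64 * 1024  # example stripe size
for n in range(4):
    pv, _ = stripe_target(n * STRIPE, stripe_count=3, stripe_size=STRIPE)
    print(f"stripe {n + 1} -> physical volume {pv + 1}")
```

Consecutive stripes land on different PVs, which is what allows reads and writes to be served by several disks at once.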

In a striped logical volume, the size of the stripe cannot exceed the size of an extent. The smallest block of storage that LVM can allocate is an extent.

When creating a striped volume, an extent is allocated from each physical volume at each level of the stripe. When there is space left over after allocating the logical volume, there will be a similar amount of space left on all the PVs in the stripe set.

For example, if the stripe is across 12 devices and the volume only needs 2.5 extents on each device, half an extent would be left spare on each of the 12 devices (6 x extent_size in total). In this case the logical volume is rounded up in size to consume 3 whole extents on all the devices, and the logical volume size is increased to match.

Let us take an example to understand how size is allocated in Striped Logical Volumes. Consider an LVM volume group with the following values:

Physical Volume count: 12
extent_size: 8192 # in 512-byte sectors = 4 Megabytes
extent_count: 36
stripe_count: 12
stripe_size: 128 # in 512-byte sectors = 64 Kilobytes

If we try to create a striped logical volume of 1 GB with these values, then:

Firstly, we need to determine how many Physical Extents (PEs) are required for this device:

1GB of storage / 4MB Physical Extent Size = 256 Physical Extents

Figure out how many Physical extents are required on each stripe:

256 extents / 12 stripe_count = 21.333 extents on each Physical Volume.

21.333 extents will be rounded up to 22 extents on each volume. Let's calculate the size of the rounded-up volume:

22 extents * 12 stripe_count = 264 total extents required.
264 extents * 4MB Physical Extent size = 1056MB = 1.03125GB

So our 1GB logical volume split across 12 physical volumes with a Physical Extent Size of 4MB will actually become a 1.03 GB volume.
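
The same calculation can be written as a small helper so you can try other sizes. This is a sketch of the arithmetic shown above, not an LVM API: the function name striped_lv_size is hypothetical, and it simply rounds up to whole extents per physical volume.

```python
def striped_lv_size(requested_mib, extent_size_mib, stripe_count):
    """Return (extents_per_pv, total_extents, actual_size_mib)."""
    # Whole extents needed for the requested size (ceiling division).
    requested_extents = -(-requested_mib // extent_size_mib)
    # Each PV in the stripe must contribute the same whole number of extents.
    extents_per_pv = -(-requested_extents // stripe_count)
    total_extents = extents_per_pv * stripe_count
    return extents_per_pv, total_extents, total_extents * extent_size_mib


per_pv, total, size_mib = striped_lv_size(1024, extent_size_mib=4, stripe_count=12)
print(per_pv, total, size_mib, size_mib / 1024)
```

For the 1 GB request above this gives 22 extents per PV, 264 extents in total and 1056 MB, matching the manual calculation.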

5.3 RAID Logical Volumes

I hope you are familiar with different types of RAID. LVM supports RAID0/1/4/5/6/10. An LVM RAID volume has the following characteristics:

RAID logical volumes created and managed by means of LVM leverage the MD kernel drivers.
RAID1 images can be temporarily split from the array and merged back into the array later.
LVM RAID volumes support snapshots.

In most cases I would recommend going for hardware RAID (or software RAID) instead of using LVM RAID. You can easily set up hardware or software RAID and then create linear logical volumes on top of it, so that you can still use all the features of LVM.

Red Hat has created an online tool "LVM RAID Calculator" to help you determine optimal parameters for creating LVM on RAID.

5.4 Thinly-Provisioned Logical Volumes

In all the types of logical volume we have looked at so far, a logical volume could only be created up to the number of extents available in the volume group. But Logical Volumes can be thin provisioned as well. This allows you to create logical volumes that are larger than the available extents.
Imagine you have combined Physical Volumes with a total storage size of 100 GB, and each LV created from them asks for 100 GB. The operator can allocate 100 GB to each of the Logical Volumes, but the allocated size is virtual: even though 100 GB is assigned, only the data actually written by each Logical Volume counts toward the overall usage.
A similar option is already used in virtual environments such as VirtualBox, ESXi and Workstation when allocating storage.
But it is important to NOTE that once the users start to fill up their allocated space, the storage can run out of space very quickly, as it is already over-committed.
To make sure that all available space can be used, LVM supports data discard. This allows for re-use of the space that was formerly used by a discarded file or other block range.
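
To see how over-commitment adds up, here is a small sketch that compares the virtual sizes handed out with the space actually consumed. The function name thin_pool_report, the LV names and the usage figures are invented for illustration; real thin pools also need some space for metadata, which is ignored here.

```python
def thin_pool_report(pool_size_gib, virtual_sizes_gib, used_gib):
    """Summarize a thin pool: what was promised versus what is really used."""
    provisioned = sum(virtual_sizes_gib.values())
    used = sum(used_gib.values())
    return {
        "provisioned_gib": provisioned,
        "used_gib": used,
        "overcommit_ratio": provisioned / pool_size_gib,
        "pool_used_percent": 100 * used / pool_size_gib,
        "free_gib": pool_size_gib - used,
    }


# A 100 GB pool with three thin LVs that each believe they have 100 GB.
report = thin_pool_report(
    pool_size_gib=100,
    virtual_sizes_gib={"lv_a": 100, "lv_b": 100, "lv_c": 100},
    used_gib={"lv_a": 10, "lv_b": 25, "lv_c": 5},
)
print(report)
```

An overcommit ratio above 1 means the pool cannot satisfy every volume if they all fill up, so the pool usage is the number to keep an eye on.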

5.5 Snapshot Volumes

We have already covered LVM snapshots in detail, with commands and examples, in a separate article.
A snapshot volume is associated with a logical volume and keeps track of changes made to the logical volume's data. It is a frozen, point-in-time image of the logical volume.
We can then merge the snapshot back into the logical volume to roll back the data.
To create an LVM snapshot, you need available space in the volume group. The snapshot must be able to store the original blocks of all files that have changed during the lifetime of the snapshot. For example, if you take a snapshot of an lvdata logical volume with a size of 5 GB, then in the worst case, where every block changes, you would need an additional unallocated 5 GB in the Volume Group for the snapshot; if only part of the data is expected to change, a smaller snapshot is enough.
While working with snapshots, you should be aware that a snapshot is not a replacement for a backup. Snapshots are linked to the original volume. If the original volume dies, the snapshot dies with it. A snapshot is a tool to help you create a reliable backup.
A snapshot is frozen at the moment of creation, but the real logical volume can still be used and will change. The snapshot logical volume must have enough space to keep the original data for every change made to the real LV during the lifetime of the snapshot.
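
The copy-on-write behaviour described above can be illustrated with a toy model. This is a deliberately simplified sketch and not LVM's on-disk format: the class name CowSnapshot is hypothetical, a volume is just a Python list of blocks, and the snapshot is marked unusable once its space runs out.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot of a list of blocks."""

    def __init__(self, origin, capacity_blocks):
        self.origin = origin             # live volume, modified in place
        self.capacity = capacity_blocks  # how many original blocks fit
        self.saved = {}                  # block index -> content at snapshot time
        self.valid = True

    def write_origin(self, index, data):
        """Write to the live volume, preserving the old block first if needed."""
        if self.valid and index not in self.saved:
            if len(self.saved) >= self.capacity:
                self.valid = False       # snapshot space exhausted
                self.saved.clear()
            else:
                self.saved[index] = self.origin[index]
        self.origin[index] = data

    def read(self, index):
        """Read a block as it looked when the snapshot was taken."""
        if not self.valid:
            raise RuntimeError("snapshot is full and no longer usable")
        return self.saved.get(index, self.origin[index])


volume = ["a", "b", "c", "d"]
snap = CowSnapshot(volume, capacity_blocks=2)
snap.write_origin(1, "B")            # old "b" is copied into the snapshot
print(volume[1], snap.read(1))       # live data versus snapshot view
```

Only blocks that change after the snapshot was taken consume snapshot space, which is why the required size depends on how much the original volume is modified.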

6. LVM Commands (Cheat Sheet)

Here I have consolidated some of the most frequently used Linux LVM commands:

Command	Purpose
lvmdiskscan	Displays all storage devices
vgscan	Scans all physical devices, searches for VGs
pvdata	Displays debugging information about a PV, reads the VGDA (LVM1 only)
pvscan	Scans all devices for PVs and lists them
pvcreate	Initializes a disk or partition (type 8e) as a PV
vgcreate	Creates VG using PVs
pvmove	Moves data from one PV to another inside one VG
vgreduce	Removes PV from VG
vgdisplay, pvdisplay, lvdisplay	Displays information about VG, PV or LV
vgchange	Activates or deactivates VG
vgexport	Makes VGs unknown to the system, used prior to importing them on a different system
vgimport	Imports VG from a different system
vgsplit	Moves PVs from an existing VG into a new VG
vgmerge	Merges two VGs
lvcreate	Creates LV inside VG
lvextend	Increases the size of LV
lvreduce	Decreases the size of LV

Summary

In this tutorial I gave you a brief overview of the Logical Volume Manager (LVM) used in Linux and Unix. I also explained the different types of logical volumes that are available. I have skipped the commands required to create Physical Volumes, Volume Groups and Logical Volumes for now and saved them for the next tutorial, as they would make this post much longer. You should now have an idea of how LVM works and of the components and terminology involved.

References

I have used following external references for this tutorial:

RHEL 8: Logical Volumes