How to generate cloud-init user-data file for Ubuntu 20.04 [Step-by-Step]



In this article I will share step-by-step instructions to generate the user-data file required by cloud-init. But before we jump into that, let me give a brief overview of user-data and the cloud-init concept.

 

What is cloud-init?

Ubuntu 20.04 introduces a new automated installation method (autoinstall) for the server installer, driven by a cloud-init configuration. It supersedes the preseed method of installation which was performed using debian-installer. The big advantage over debian-installer is that you don't need to answer every installation option: any option you leave out falls back to its default value and the installation proceeds. It fails only when an option has no default value, unlike debian-installer (preseed), which halts the installation for every missing input.

Since this is a new feature, good documentation is still scarce and debugging can be quite time-consuming. You can read more about this on the official cloud-init page.

 

How does cloud-init work?

There are five stages to a cloud-init boot:

  1. The generator is the first one, and the simplest one: it will determine whether we are even trying to run cloud-init, and based on that, whether it should enable or disable the processing of data files. Cloud-init will not run if there are kernel command-line directives to disable it, or if a file called /etc/cloud/cloud-init.disabled exists.
  2. The local phase tries to find the data that we included for the boot itself, and then it tries to create a running network configuration. This is a relatively simple task performed by a systemd service called cloud-init-local.service, which will run as soon as possible and will block the network until it's done. The concept of blocking services and targets is used a lot in cloud-init initialization; the reason is simple: to ensure system stability. Since cloud-init procedures modify a lot of core settings for a system, we cannot afford to let the usual startup scripts run and create a parallel configuration that could override the one created by cloud-init.
  3. The network phase is the next one, and it uses a separate service called cloud-init.service. This is the main service that will bring up the previously configured network and try to configure everything we scheduled in the data files. This will typically include grabbing all the files specified in our configuration, extracting them, and executing other preparation tasks. Disks will also be formatted and partitioned in this stage if such a configuration change is specified. Mount points will also get created, including those that are dynamic and specific to a particular cloud platform.
  4. The config stage follows, and it will configure the rest of the system, applying different parts of our configuration. It uses cloud-init modules to further configure our template. Now that the network is configured, it can be used to add repositories (the yum_repos or apt modules), add an SSH key (the ssh-import-id module), and perform similar tasks in preparation for the next phase, in which we can actually use the configuration done in this phase.
  5. The final stage is the part of the system boot that runs things that would probably belong in userland – installing the packages, the configuration management plugin deployment, and executing possible user scripts.

After all this has been done, the system will be completely configured and up and running.
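
The decision made in the generator stage can be sketched in a few lines of shell. This is only an illustration under my own assumptions, not the actual generator code: the function name cloud_init_disabled and its two parameters (a root directory and a kernel command-line file) are made up so that the check can be tried against a scratch directory, and cloud-init=disabled is the kernel command-line directive assumed here.

```shell
#!/bin/sh
# Sketch of the generator-stage check: would cloud-init be disabled?
# cloud_init_disabled, its root-directory argument and its cmdline-file
# argument are hypothetical; they exist only to make the check easy to try.
cloud_init_disabled() {
    root="${1:-/}"
    cmdline_file="${2:-/proc/cmdline}"

    # Marker file described in stage 1 above
    if [ -e "${root%/}/etc/cloud/cloud-init.disabled" ]; then
        echo "disabled: marker file present"
        return 0
    fi

    # Kernel command-line directive (assumed to be cloud-init=disabled)
    if [ -r "$cmdline_file" ] &&
        tr ' ' '\n' < "$cmdline_file" | grep -qx 'cloud-init=disabled'; then
        echo "disabled: kernel command line"
        return 0
    fi

    echo "enabled"
    return 1
}
```

Called without arguments, cloud_init_disabled inspects the running system. On an installed host, cloud-init status --long reports whether cloud-init is still running, done, or disabled.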

 

Understanding important cloud-init autoinstall configuration files

cloud-init consumes and acts upon user data, metadata, and vendor data.

  • user-data: This contains the directives required to perform an unattended automated installation, such as the packages to install, partition layout, network configuration, etc.
  • meta-data: This includes data associated with a specific datasource; for example, metadata can include a server name and instance ID.
  • vendor-data: This is optionally provided by the organization (for example, a cloud provider) and includes information that can customize the image to better fit the environment where the image runs.
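
To make the three files more concrete, here is a minimal sketch of laying out a seed directory by hand. The helper name make_seed_dir and the instance-id value are hypothetical, and I am assuming the file-based layout where user-data and meta-data sit side by side; vendor-data is left out because it is optional.

```shell
#!/bin/sh
# Sketch: lay out a seed directory with the files cloud-init reads.
# make_seed_dir and the default instance-id are hypothetical names.
make_seed_dir() {
    seed_dir="$1"
    user_data_src="$2"

    mkdir -p "$seed_dir" || return 1

    # user-data: the directives for the unattended installation
    cp "$user_data_src" "$seed_dir/user-data" || return 1

    # meta-data: datasource-specific details such as the instance ID
    printf 'instance-id: %s\n' "${3:-autoinstall-example}" > "$seed_dir/meta-data"

    # vendor-data is optional, so it is deliberately not created here
}
```

For example, make_seed_dir /tmp/seed /tmp/user-data would create /tmp/seed/user-data and /tmp/seed/meta-data.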

 

Which Linux variants support cloud-init?

The way cloud-init was conceived was to be as multiplatform as possible and to encompass as many operating systems as can reasonably be done. Currently, it supports the following:

  • Ubuntu
  • SLES/openSUSE
  • RHEL/CentOS
  • Fedora
  • Gentoo Linux
  • Debian
  • Arch Linux
  • FreeBSD

We enumerated the distributions above, but cloud-init, as its name suggests, is also cloud-aware: it is able to automatically detect and use almost any cloud environment.

 

Can we generate user-data automatically?

If you have a preseed file that is used with debian-installer, you can use autoinstall-generator to convert it into an autoinstall configuration file.

For example, here is an excerpt from a dummy seed file:

# No language support packages.
d-i     pkgsel/install-language-support boolean false
# Only ask the UTC question if there are other operating systems installed.
d-i     clock-setup/utc-auto    boolean true
# Verbose output and no boot splash screen.
d-i     debian-installer/quiet  boolean false
d-i     debian-installer/splash boolean false
# Install the debconf oem-config frontend (if in OEM mode).
d-i     oem-config-udeb/frontend        string debconf
# Wait for two seconds in grub
d-i     grub-installer/timeout  string 2
# Add the network and tasks oem-config steps by default.
oem-config      oem-config/steps        multiselect language, timezone, keyboard, user, network, tasks

# This makes partman automatically partition without confirmation, provided
# that you told it what to do using one of the methods above.
d-i partman-partitioning/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true

This is not complete, but it is enough to demonstrate how the conversion works. We will convert these parameters into an autoinstall configuration file.

But first we need to install snapd:

deepak@ubuntu:~$ sudo apt install snapd -y
[sudo] password for deepak:
Reading package lists... Done
Building dependency tree
Reading state information... Done
snapd is already the newest version (2.51.1+20.04ubuntu2).
snapd set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.

Next we will install autoinstall-generator using snap:

deepak@ubuntu:~$ sudo snap install autoinstall-generator
autoinstall-generator 0.1.0 from Dan Bungert (dbungert) installed

Now that autoinstall-generator is installed, we can start with our conversion:

root@ubuntu:~# autoinstall-generator sample-file.seed my-autoinstall.yaml

After the conversion, here is my autoinstall config file content:

root@ubuntu:~# cat my-autoinstall.yaml
apt:
  primary:
  - arches:
    - default
    uri: http://archive.ubuntu.com/ubuntu
debconf-selections: oem-config      oem-config/steps        multiselect language,
  timezone, keyboard, user, network, tasks
keyboard:
  layout: us
locale: en_US.UTF-8
network:
  ethernets:
    any:
      match:
        name: en*
  version: 2
version: 1
#  2: Unsupported: d-i auto-install/enable boolean true
#  3: Unsupported: d-i debconf/priority select critical
#  6:   Directive: d-i debian-installer/locale string en_US.UTF-8
#       Mapped to: {locale: en_US.UTF-8}
#  7: Unsupported: d-i localechooser/supported-locales multiselect en_US.UTF-8
#  9: Unsupported: d-i console-setup/ask_detect boolean false
# 10:   Directive: d-i keyboard-configuration/xkb-keymap select us
#       Mapped to: keyboard: {layout: us}
# 13:   Directive: d-i netcfg/choose_interface select auto
#       Mapped to: network:
#                    ethernets:
#                      any:
#                        match: {name: en*}
#                    version: 2
# 15: Unsupported: d-i netcfg/get_hostname string U1804n1
# 16: Unsupported: d-i netcfg/get_domain string unassigned-domain
# 18: Unsupported: d-i hw-detect/load_firmware boolean true
# 22: Unsupported: d-i mirror/country string manual
# 23:   Directive: d-i mirror/http/hostname string archive.ubuntu.com
# 24:         And: d-i mirror/http/directory string /ubuntu
#       Mapped to: apt:
#                    primary:
#                    - arches: [default]
#                      uri: http://archive.ubuntu.com/ubuntu
# 25: Unsupported: d-i mirror/http/proxy string
# 28: Unsupported: d-i     partman-auto/init_automatically_partition       string some_device_lvm
# 29: Unsupported: d-i     partman-auto/init_automatically_partition       seen false
# 31: Unsupported: d-i     pkgsel/language-pack-patterns   string
# 33: Unsupported: d-i     pkgsel/install-language-support boolean false
# 35: Unsupported: d-i     clock-setup/utc-auto    boolean true
# 37: Unsupported: d-i     debian-installer/quiet  boolean false
# 38: Unsupported: d-i     debian-installer/splash boolean false
# 40: Unsupported: d-i     oem-config-udeb/frontend        string debconf
# 42: Unsupported: d-i     grub-installer/timeout  string 2
# 44:   Directive: oem-config      oem-config/steps        multiselect language, timezone, keyboard, user, network, tasks
#       Mapped to: {debconf-selections: 'oem-config      oem-config/steps        multiselect language,
#                      timezone, keyboard, user, network, tasks'}
# 48: Unsupported: d-i partman-partitioning/confirm_write_new_label boolean true
# 49: Unsupported: d-i partman/choose_partition select finish
# 50: Unsupported: d-i partman/confirm boolean true
# 51: Unsupported: d-i partman/confirm_nooverwrite boolean true

This usage converts the preseed file into the data that belongs inside the autoinstall section. Depending on how the data will be loaded into the installer, you may need it in autoinstall or in cloud-config format. It also appends debug information to the end of the YAML in the form of comments.

Directives which are not supported by autoinstall are listed in those comments with the prefix Unsupported:.
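
With a longer preseed file it helps to pull the unsupported directives out of the debug comments, since these are the ones you have to handle by hand. A minimal sketch, assuming the comment format shown in the output above; list_unsupported is a hypothetical helper name, not part of autoinstall-generator.

```shell
#!/bin/sh
# Sketch: print the preseed directives that were reported as unsupported.
# list_unsupported is a hypothetical helper; it only reads the debug
# comments appended to the generated YAML.
list_unsupported() {
    # Strip the "# <line>: Unsupported: " prefix and print the directive
    sed -n 's/^# *[0-9][0-9]*: Unsupported: *//p' "$1"
}
```

Usage: list_unsupported my-autoinstall.yaml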

 

What will you need to generate user-data for autoinstall manually?

If you already know the user-data file syntax, all you need is an Ubuntu host where you can create and edit a file, and you can start preparing your user-data file right away. But that is rarely the case for users who are doing this for the first time. So we will bring up one server using cloud-init. Any server installed this way automatically has a /var/log/installer/autoinstall-user-data file containing the configuration used to bring up that particular server.

This is similar to what we had with Kickstart: we could log in to any Red Hat or CentOS server and find /root/anaconda-ks.cfg, which would act as a default template for any kickstart installation.

We also had system-config-kickstart, which used to create such a kickstart file for us. At the time of writing this article there is no such online tool which can generate user-data for you based on your requirements.

So coming back to the question, what will you need to create this user-data file?

  1. Ubuntu live-server image (which we will use to perform the installation using cloud-init)
  2. An extra VM or server or we can also use KVM (which will be used to bring up a server using cloud-init)

 

Step-1: Install cloud-init

We will need cloud-init package to validate our autoinstall configuration file. So let's go ahead and install it:

sudo apt-get install cloud-init

 

Step-2: Create an autoinstall config file (user-data)

We will create a small user-data file with only the user and password details, which will let us log in to the server once it is installed. We only intend to get a default user-data template here, which we can then reuse for automated installation of Ubuntu 20.04:

#cloud-config
autoinstall:
  version: 1
  identity:
    hostname: ubuntu-server
    password: "$1$RboTxu8T$jlgbAY9f9M7s3hcM7K/kO0"
    username: ubuntu
  interactive-sections:
    - storage

You can generate the encrypted password using the following command (the -1 option produces an MD5-based hash). Replace PASSWORD with your password:

openssl passwd -1 -stdin <<< PASSWORD
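
If your openssl build supports it (OpenSSL 1.1.1 or newer, as far as I know), the -6 option produces a SHA-512 based hash instead of the MD5-based one. I would expect the installer to accept it in the same password field, but I have not tested that in this walkthrough, so treat it as an assumption. hash_password is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: generate a SHA-512 crypt hash for the identity section.
# hash_password is a hypothetical helper; -6 needs a recent openssl.
hash_password() {
    # The password is fed on stdin, so it is not passed as an openssl argument
    printf '%s\n' "$1" | openssl passwd -6 -stdin
}
```

Usage: hash_password 'PASSWORD' prints a string starting with $6$ which goes into the password field.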

As mentioned earlier, cloud-init falls back to the default value for anything which is not provided in the user-data file. So I have intentionally marked storage as interactive, so that the installer stops and prompts for this section, where we can modify the partition layout.

Not every section in user-data can be interactive, so you must check the official documentation for this detail.

We have created user-data under /tmp:

root@ubuntu:~# ls -l /tmp/user-data
-rw-r--r-- 1 root root 2034 Jan 12 09:10 /tmp/user-data

 

Step-3: Validate the user-data configuration

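Next we must validate the autoinstall configuration file to make sure there are no syntax errors. The path below is the one from my setup; replace it with the location of your own user-data file (for example /tmp/user-data):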

# cloud-init devel schema --config-file /ks/user-data
Valid cloud-config: /ks/user-data

So it looks like our config data is in good shape.
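
A cheap pre-check before running the schema validation is to confirm that the file starts with the #cloud-config header, as our user-data does. The sketch below wraps both checks; has_cloud_config_header and validate_user_data are hypothetical helper names, and the schema command is the same one used above (I believe newer cloud-init releases also offer it as cloud-init schema, but that is an assumption you should verify on your version).

```shell
#!/bin/sh
# Sketch: pre-checks before handing the file to cloud-init.
# has_cloud_config_header and validate_user_data are hypothetical helpers.

# True if the first line of the file is the "#cloud-config" header.
has_cloud_config_header() {
    [ "$(head -n 1 "$1")" = "#cloud-config" ]
}

validate_user_data() {
    file="$1"
    if ! has_cloud_config_header "$file"; then
        echo "missing #cloud-config header: $file" >&2
        return 1
    fi
    if ! command -v cloud-init >/dev/null 2>&1; then
        echo "cloud-init not installed, skipping schema check" >&2
        return 0
    fi
    # Same command as in this step
    cloud-init devel schema --config-file "$file"
}
```

Usage: validate_user_data /tmp/user-data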

 

Step-4: Create autoinstall image

Next we will create an autoinstall image which can be used to bring up a server using cloud-init. You can create one using the steps shared at Ubuntu Autoinstall Generator.

The steps are fairly simple. You execute the script with the user-data file as input, and it creates an ISO image which you can use directly to bring up an Ubuntu server with the cloud-init autoinstall configuration file.

But first let's install the prerequisite packages:

# apt install xorriso sed curl gpg isolinux -y
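
Before running the generator script, you can quickly confirm that the commands it relies on are on PATH. This is a small sketch of my own; missing_tools is a hypothetical helper name and is unrelated to the check the script performs itself.

```shell
#!/bin/sh
# Sketch: report which of the given commands are not on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands provided by the packages installed above. As far as I know,
# isolinux ships boot files rather than a command, so it is not listed.
# missing_tools xorriso sed curl gpg
```

An empty output means all the listed commands were found.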

Next download the script from the GitHub page:

# wget https://raw.githubusercontent.com/covertsh/ubuntu-autoinstall-generator/main/ubuntu-autoinstall-generator.sh

Next execute the script as shown below. It will download the Ubuntu image, add the autoinstall configuration file to the image, and then generate an ISO file for us to use at the path specified with the -d option.

root@ubuntu:~# bash ubuntu-autoinstall-generator.sh -a -u /tmp/user-data -d /home/first-test.iso
[2020-12-23 15:05:07] 👶 Starting up...
[2020-12-23 15:05:07] 📁 Created temporary working directory /tmp/tmp.8djLqkox5z
[2020-12-23 15:05:07] 🔎 Checking for required utilities...
[2020-12-23 15:05:07] 👍 All required utilities are installed.
[2020-12-23 15:06:07] 🌎 Downloading current daily ISO image for Ubuntu 20.04 Focal Fossa...
[2020-12-23 15:08:11] 👍 Downloaded and saved to /home/user/ubuntu-original-2020-12-23.iso
[2020-12-23 15:08:11] 🌎 Downloading SHA256SUMS & SHA256SUMS.gpg files...
[2020-12-23 15:08:12] 🌎 Downloading and saving Ubuntu signing key...
[2020-12-23 15:08:12] 👍 Downloaded and saved to /home/user/843938DF228D22F7B3742BC0D94AA3F0EFE21092.keyring
[2020-12-23 15:08:12] 🔐 Verifying /home/user/ubuntu-original-2020-12-23.iso integrity and authenticity...
[2020-12-23 15:08:16] 👍 Verification succeeded.
[2020-12-23 15:05:07] 🔧 Extracting ISO image...
[2020-12-23 15:08:18] 👍 Extracted to /tmp/tmp.8djLqkox5z
[2020-12-23 15:08:18] 🧩 Adding autoinstall parameter to kernel command line...
[2020-12-23 15:08:18] 👍 Added parameter to UEFI and BIOS kernel command lines.
[2020-12-23 15:08:18] 🧩 Adding user-data and meta-data files...
[2020-12-23 15:08:18] 👍 Added data and configured kernel command line.
[2020-12-23 15:08:18] 👷 Updating /tmp/tmp.8djLqkox5z/md5sum.txt with hashes of modified files...
[2020-12-23 15:08:18] 👍 Updated hashes.
[2020-12-23 15:08:18] 📦 Repackaging extracted files into an ISO image...
[2020-12-23 15:08:25] 👍 Repackaged into /home/first-test.iso
[2020-12-23 15:08:25] ✅ Completed.
[2020-12-23 15:08:26] 🚽 Deleted temporary working directory /tmp/tmp.8djLqkox5z

Verify that the ISO is created successfully

root@ubuntu:/tmp# ls -l /home/first-test.iso
-rw-r--r-- 1 root root 1303511040 Jan 12 15:05 /home/first-test.iso

 

Step-5: Bring up a host or VM using cloud-init

The official Ubuntu documentation explains the steps to bring up a server using KVM on your local Ubuntu host. Unfortunately that didn't work for me, as my host did not support KVM and needed further BIOS changes to enable virtualization. If your host supports virtualization, you can follow those steps.

I simply mounted the ISO created in the previous step (first-test.iso) on my physical server and started the installation the good old-fashioned way.

Once the server boots, the integrity check will be performed.

Next, cloud-init will start initializing the autoinstall configuration file which we added to the image.

Once the initialization is complete, we get a prompt to configure our disk because, as you may remember, we marked storage as an interactive section. Here you can go ahead and configure the storage as per your requirements.

I will not cover this part in detail, but it is pretty straightforward. You can navigate using the keyboard arrow keys and set up your partition layout. We get the option to configure LVM, RAID, or both, depending on your requirements. Select Done once you have completed the configuration.

Select Continue to confirm the partition layout.

Now the installation has started.

The installation has completed and we now have a login prompt.

 

Step-6: Copy autoinstall-user-data from the target node to local host

Now we can log in using the credentials ubuntu/ubuntu which we added in our user-data file.

Use sudo su - to switch to root, which will prompt for the ubuntu user's password. Next, copy /var/log/installer/autoinstall-user-data from this node to your host server using scp.

After copying, we now have a working user-data file which we can use for automated installation:

root@ubuntu:/tmp# ls -l /tmp/autoinstall-user-data
-rw------- 1 root root 2230 Jan 12 15:55 /tmp/autoinstall-user-data

Here is the content of my autoinstall-user-data file:

root@ubuntu:/tmp# cat /tmp/autoinstall-user-data
#cloud-config
autoinstall:
  apt:
    geoip: true
    preserve_sources_list: false
    primary:
    - arches:
      - amd64
      - i386
      uri: http://archive.ubuntu.com/ubuntu
    - arches:
      - default
      uri: http://ports.ubuntu.com/ubuntu-ports
  identity:
    hostname: ubuntu-server
    password: $1$RboTxu8T$jlgbAY9f9M7s3hcM7K/kO0
    realname: ubuntu
    username: ubuntu
  kernel:
    package: linux-generic
  keyboard:
    layout: us
    toggle: null
    variant: ''
  locale: en_US.UTF-8
  network:
    ethernets:
      eno49:
        dhcp4: true
      eno50:
        dhcp4: true
    version: 2
  ssh:
    allow-pw: true
    authorized-keys: null
    install-server: false
  storage:
    config:
    - ptable: gpt
      serial: 3600508b1001c576619b6670156e25877
      wwn: '0x600508b1001c576619b6670156e25877'
      path: /dev/sda
      wipe: superblock
      preserve: false
      name: ''
      grub_device: true
      type: disk
      id: disk-sda
    - device: disk-sda
      size: 1048576
      flag: bios_grub
      number: 1
      preserve: false
      grub_device: false
      type: partition
      id: partition-3
    - device: disk-sda
      size: 1073741824
      wipe: superblock
      flag: ''
      number: 2
      preserve: false
      grub_device: false
      type: partition
      id: partition-4
    - fstype: ext4
      volume: partition-4
      preserve: false
      type: format
      id: format-2
    - device: disk-sda
      size: 899074228224
      wipe: superblock
      flag: ''
      number: 3
      preserve: false
      grub_device: false
      type: partition
      id: partition-5
    - name: ubuntu-vg
      devices:
      - partition-5
      preserve: false
      type: lvm_volgroup
      id: lvm_volgroup-1
    - name: ubuntu-lv
      volgroup: lvm_volgroup-1
      size: 107374182400B
      wipe: superblock
      preserve: false
      type: lvm_partition
      id: lvm_partition-1
    - fstype: ext4
      volume: lvm_partition-1
      preserve: false
      type: format
      id: format-3
    - path: /
      device: format-3
      type: mount
      id: mount-3
    - path: /boot
      device: format-2
      type: mount
      id: mount-2
  updates: security
  version: 1

 

Bonus Tips

In this autoinstall file we can delete the following entry, as it was causing issues for me during automated installation. You can still give it a try in your environment if you want:

  kernel:
    package: linux-generic

Modify the following entry:

  ssh:
    allow-pw: true
    authorized-keys: null
    install-server: false

To the following, as shown below, to enable the SSHD service:

  ssh:
    allow-pw: true
    authorized-keys: []
    install-server: true
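
Both changes can also be applied with sed instead of a text editor. This is only a sketch: tweak_user_data is a hypothetical helper name, and the expressions assume the exact indentation and values shown in my autoinstall-user-data above, so check the result before using it. The output goes to a new file and the original is left untouched.

```shell
#!/bin/sh
# Sketch: apply the two edits described above to a copy of the file.
# tweak_user_data is a hypothetical helper; the patterns assume the
# indentation and values from the file shown earlier in this article.
tweak_user_data() {
    in="$1"
    out="$2"
    # 1st expression: drop the kernel block (the "kernel:" line through "package:")
    # 2nd expression: turn "authorized-keys: null" into an empty list
    # 3rd expression: set install-server to true so the SSH server is installed
    sed -e '/^  kernel:$/,/^    package: /d' \
        -e 's/^\(    authorized-keys:\) null$/\1 []/' \
        -e 's/^\(    install-server:\) false$/\1 true/' \
        "$in" > "$out"
}
```

Usage: tweak_user_data /tmp/autoinstall-user-data /tmp/user-data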

That's all for now.

 

Summary

In this tutorial I gave you an overview of cloud-init and its usage. We learned the steps to generate our own autoinstall user-data configuration file, and how to convert an existing preseed file to the cloud-init supported YAML file structure. The steps explained in this tutorial may not be the best way, but they definitely work and save a lot of the debugging time otherwise needed to fix the user-data configuration file.

The best way to avoid deployment failures is to generate the user-data on the same VM/node/server where you actually plan to perform the automated installation using cloud-init. That way the risk of inconsistency is reduced, especially in relation to the disk layout.

 

Further Reading

More details on individual syntax used in autoinstaller config file
Automated Server Installs Config File Reference

 

Deepak Prasad



20 thoughts on “How to generate cloud-init user-data file for Ubuntu 20.04 [Step-by-Step]”

  1. Looks like 22.04 does not have isolinux directory and the xorriso command use that directory name with -b flag. Can you add a script that is using correct xorriso flag? Thanks.
