Solved 'No valid host was found. There are not enough hosts available'

I ran into this error while performing an overcloud deployment using TripleO. During a trial run of my OpenStack configuration, the deployment failed with the message "No valid host was found. There are not enough hosts available". This error can occur for various reasons, but one of the most common causes during overcloud deployment is a mismatch between the properties of the target bare metal node and the properties expected by the assigned flavor (profile).

Below I share the scenarios I have come across, along with a proposed solution for each.

How to fix "No valid host was found. There are not enough hosts available"

Error Message(s):

You may get the below error message on the console while performing an overcloud deployment from the undercloud director:



2018-08-14 06:42:59Z [overcloud.Controller.0.Controller]:
CREATE_FAILED ResourceInError: resources.Controller: Went to status
ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2018-08-14 06:42:59Z [overcloud.Controller.0]: CREATE_FAILED Resource
CREATE failed: ResourceInError: resources.Controller: Went to status
ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

Similarly, in nova-conductor.log you may find messages like the following:



nova-conductor.log:2018-08-12 22:50:24.855 11372 WARNING
nova.scheduler.utils [req-1bcec69e-4ad1-4460-bdd0-506e701dd81b ...]
Failed to compute_task_build_instances: No valid host was found.
There are not enough hosts available.

In nova-scheduler.log, a message like the following may appear:



nova-scheduler.log:2018-08-13 16:22:48.035 1475 INFO nova.filters
Filtering removed all hosts for the request with instance ID
'f6ec9c8e-4dd9-4995-92b7-b570bb06790f'.
Filter results: ['RetryFilter: (start: 3, end: 3)',
'TripleOCapabilitiesFilter: (start: 3, end: 3)',
'ComputeCapabilitiesFilter: (start: 3, end: 1)',
'AvailabilityZoneFilter: (start: 1, end: 1)', 'RamFilter: (start: 1, end: 1)',
'DiskFilter: (start: 1, end: 0)']
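The "Filter results" entry shows how many candidate hosts entered and left each scheduler filter, so the filters where the count drops point to the likely cause. As a rough sketch (the function name is my own, and it assumes the log format shown above), the following helper reads scheduler log text on stdin and lists only the filters that removed hosts:

```shell
# Hypothetical helper: list the scheduler filters that dropped hosts.
# Assumes entries of the form "Name: (start: N, end: M)" as in the sample log.
filters_that_dropped_hosts() {
    # Extract every "Name: (start: N, end: M)" fragment, one per line
    grep -oE '[A-Za-z]+: \(start: [0-9]+, end: [0-9]+\)' |
    # Split on non-alphanumerics: $1 = filter name, $3 = start, $5 = end
    awk -F'[^A-Za-z0-9]+' '($5 + 0) < ($3 + 0) {
        printf "%s removed %d host(s), %d left\n", $1, $3 - $5, $5
    }'
}
```

It could be run as `filters_that_dropped_hosts < /var/log/nova/nova-scheduler.log`. For the sample entry above it reports ComputeCapabilitiesFilter (3 hosts down to 1) and DiskFilter (1 host down to 0). A drop at ComputeCapabilitiesFilter usually suggests a mismatch between node capabilities and flavor properties (see Scenario 4), while a drop at DiskFilter suggests the flavor asks for more disk than the node reports.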

Analysis and Solution:

Scenario 1 (Check the maintenance mode status):

First of all, make sure that all the nodes are in the available provisioning state, are not in maintenance mode, and are not already used by an existing instance (Instance UUID is None).

[stack@undercloud-director ~]$ openstack baremetal node list

+--------------------------------------+---------------------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name                      | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+---------------------------+---------------+-------------+--------------------+-------------+
| 8ef9b862-9a4f-4961-813c-4b81be7c8e48 | overcloud-controller      | None          | power off   | available          | False       |
| 7995f1f2-4af7-4c5d-9099-fc928c4c73b3 | overcloud-compute.example | None          | power off   | available          | False       |
| 7c84cdf2-c5b2-47fb-a741-30c025b54183 | overcloud-ceph.example    | None          | power off   | available          | False       |
+--------------------------------------+---------------------------+---------------+-------------+--------------------+-------------+
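With many nodes it is easier to filter the list than to read the table. The sketch below is only an illustration: the function name is hypothetical, and it assumes each input line carries the UUID first, the maintenance flag last, and the provisioning state (which may contain spaces, e.g. "clean failed") in between.

```shell
# Hypothetical helper: print nodes that the scheduler cannot use.
# stdin: one node per line as "<uuid> <provisioning state> <maintenance>"
unschedulable_nodes() {
    awk '{
        uuid = $1
        maint = $NF
        # Everything between the first and last field is the state
        state = ""
        for (i = 2; i < NF; i++) state = state (i > 2 ? " " : "") $i
        if (state != "available" || maint != "False")
            printf "%s: state=%s maintenance=%s\n", uuid, state, maint
    }'
}
```

A possible invocation is `openstack baremetal node list -f value -c UUID -c "Provisioning State" -c Maintenance | unschedulable_nodes`. Check the column order your client prints before relying on it. A node reported in maintenance can be released with `openstack baremetal node maintenance unset <node>` once the underlying problem is fixed.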

Scenario 2 (Validate IPMI Connectivity):

Check that the undercloud node can connect to the overcloud hypervisors using the power credentials (e.g. ipmi_address, ipmi_username and ipmi_password). If the undercloud node fails to connect to an overcloud hypervisor, it will not be able to get the hypervisor node details.

[stack@undercloud-director ~]$ ipmitool -I lanplus -H 10.43.138.12 -L ADMINISTRATOR -p 6320 -U admin -R 3 -N 5 -P redhat power status
Chassis Power is off

[stack@undercloud-director ~]$ ipmitool -I lanplus -H 10.43.138.12 -L ADMINISTRATOR -p 6321 -U admin -R 3 -N 5 -P redhat power status
Chassis Power is off

[stack@undercloud-director ~]$ ipmitool -I lanplus -H 10.43.138.12 -L ADMINISTRATOR -p 6320 -U admin -R 3 -N 5 -P redhat power status
Chassis Power is off

If ipmitool fails to connect to a hypervisor, the add node functionality breaks during overcloud deployment.
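To avoid repeating the command for every node, the check can be wrapped in a loop. This is a minimal sketch under a few assumptions: the function name and the IPMITOOL override variable are hypothetical (the override only exists so the binary can be swapped out), and all nodes are assumed to share one IPMI host, user and password while differing only by port, as in my virtual setup above.

```shell
# Hypothetical helper: probe several IPMI ports on one host.
# usage: check_ipmi_ports <host> <user> <password> <port>...
check_ipmi_ports() {
    ipmi_tool="${IPMITOOL:-ipmitool}"   # assumption: override for testing
    ipmi_host="$1"; ipmi_user="$2"; ipmi_pass="$3"
    shift 3
    ipmi_failed=0
    for ipmi_port in "$@"; do
        # Same options as the manual check above; output is discarded
        if "$ipmi_tool" -I lanplus -H "$ipmi_host" -L ADMINISTRATOR \
               -p "$ipmi_port" -U "$ipmi_user" -R 3 -N 5 \
               -P "$ipmi_pass" power status >/dev/null 2>&1; then
            echo "port $ipmi_port: reachable"
        else
            echo "port $ipmi_port: UNREACHABLE"
            ipmi_failed=1
        fi
    done
    # Non-zero exit status if any port did not answer
    return "$ipmi_failed"
}
```

For the environment in this article the call would look like `check_ipmi_ports 10.43.138.12 admin redhat 6320 6321`. Any port reported as UNREACHABLE should be fixed before retrying the deployment.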

Scenario 3 (Compute services must be running):

Also make sure that you have Compute services running and enabled:

[stack@undercloud-director ~]$ openstack compute service list --service nova-compute
+----+--------------+-----------------------------+------+---------+-------+----------------------------+
| ID | Binary       | Host                        | Zone | Status  | State | Updated At                 |
+----+--------------+-----------------------------+------+---------+-------+----------------------------+
| 4  | nova-compute | undercloud-director.example | nova | enabled | up    | 2018-08-14T12:51:23.000000 |
+----+--------------+-----------------------------+------+---------+-------+----------------------------+

By default, a Compute service is automatically disabled after 10 consecutive build failures. This ensures that new build requests are not routed to a broken Compute service. If that is the case, fix the source of the failures first, then re-enable the service:

$ openstack compute service set --enable <COMPUTE HOST> nova-compute
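The same check can be scripted. The helper below is a small sketch (the function name is hypothetical) that assumes each input line holds the host, the status and the state in that order, and prints only the services that are disabled or down:

```shell
# Hypothetical helper: print Compute services that are not enabled and up.
# stdin: one service per line as "<host> <status> <state>"
unhealthy_compute_services() {
    awk '$2 != "enabled" || $3 != "up" {
        printf "%s: status=%s state=%s\n", $1, $2, $3
    }'
}
```

One way to feed it is `openstack compute service list --service nova-compute -f value -c Host -c Status -c State | unhealthy_compute_services`. Verify the column order printed by your client first. No output means every nova-compute service is enabled and up.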

Scenario 4 (Properties of node and flavors must be identical):

Finally, make sure the node's properties match the properties field of the assigned flavor. The node's properties field is populated at the introspection stage.

For example, here are my node's properties:

[stack@undercloud-director ~]$ openstack baremetal node show ece1651a-6adc-4826-9f77-5d47891c6c9b -c properties
+------------+------------------------------------------------------------------------------------------------------+
| Field      | Value                                                                                                |
+------------+------------------------------------------------------------------------------------------------------+
| properties | {u'memory_mb': u'10240', u'cpu_arch': u'x86_64', u'local_gb': u'49', u'cpus': u'4', u'capabilities': |
|            | u'profile:control,cpu_aes:true,cpu_hugepages:true,boot_option:local'}                                |
+------------+------------------------------------------------------------------------------------------------------+

As seen above, the profile for this node is control, so we need to check the properties of the "control" flavor using the below command:

[stack@undercloud-director ~]$ openstack flavor show control -c properties -f value
capabilities:boot_option='local', capabilities:cpu_aes='true', capabilities:cpu_hugepages='true', capabilities:profile='control', cpu_arch='x86_64'

As you can see, both property fields match.
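Comparing the two strings by eye gets error-prone once there are several capabilities. The following sketch automates the comparison. The function name is my own, and it assumes the two formats shown above: a comma-separated key:value list for the node, and comma-separated capabilities:key='value' pairs for the flavor. Newer client versions may print the flavor properties differently.

```shell
# Hypothetical helper: report flavor capabilities that the node lacks.
# $1: node capabilities, e.g. 'profile:control,boot_option:local'
# $2: flavor properties in the format printed by the flavor show command
missing_capabilities() {
    awk -v node="$1" -v flavor="$2" -v q="'" 'BEGIN {
        # Index the node capabilities as "key:value" strings
        n = split(node, caps, ",")
        for (i = 1; i <= n; i++) have[caps[i]] = 1
        status = 0
        m = split(flavor, props, /, */)
        for (i = 1; i <= m; i++) {
            p = props[i]
            # Only capabilities:* properties are compared here
            if (p !~ /^capabilities:/) continue
            sub(/^capabilities:/, "", p)
            gsub(q, "", p)     # drop the single quotes around the value
            sub(/=/, ":", p)   # key=value -> key:value
            if (!(p in have)) {
                print "missing on node: " p
                status = 1
            }
        }
        exit status
    }'
}
```

Called with the two strings from the output above, it prints nothing and returns 0. If the flavor asked for capabilities:profile='compute' instead, it would print "missing on node: profile:compute" and return 1. Other flavor properties such as cpu_arch are ignored by this sketch and still need to be checked separately.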

How to update "node's" property content in Openstack?

You can manually update the node's properties, but this is not recommended: the properties field is populated automatically during the introspection stage of overcloud configuration and should not be overwritten unless you know what you are doing. It is better to modify the flavor's properties to match the node. If you still need to update the node's properties, use the below command:

$ ironic node-update overcloud-controller.example add properties/capabilities='profile:control,cpu_aes:true,cpu_hugepages:true,boot_option:local'

How to update flavor's property field in Openstack?

To update the properties of a flavor, use the below command:

$ openstack flavor set --property "capabilities:profile"="control" --property "capabilities:cpu_aes"="true" --property "capabilities:cpu_hugepages"="true" --property "capabilities:boot_option"="local" control
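Since the recommendation is to make the flavor follow the node, the flavor command can be derived from the node's capabilities string instead of being typed by hand. The sketch below only prints the command so that it can be reviewed before running. The function name is hypothetical, and it assumes the comma-separated key:value format shown in the node properties.

```shell
# Hypothetical helper: print (not run) a flavor set command that mirrors
# the capabilities of a node.
# $1: flavor name, $2: node capabilities, e.g. 'profile:control,boot_option:local'
flavor_set_command() {
    awk -v flavor="$1" -v caps="$2" 'BEGIN {
        cmd = "openstack flavor set"
        n = split(caps, pair, ",")
        for (i = 1; i <= n; i++) {
            # Split each pair on the first colon only
            c = index(pair[i], ":")
            key = substr(pair[i], 1, c - 1)
            value = substr(pair[i], c + 1)
            cmd = cmd " --property \"capabilities:" key "\"=\"" value "\""
        }
        print cmd " " flavor
    }'
}
```

For example, `flavor_set_command control 'profile:control,boot_option:local'` prints `openstack flavor set --property "capabilities:profile"="control" --property "capabilities:boot_option"="local" control`. Review the printed command and run it manually if it looks right.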

NOTE

The link below lists more options and arguments that can be used with the above commands: OpenStack Ironic Nova flavor configuration

There can certainly be many more possibilities; these are the few I faced during my testing stage. If the above did not help, check the below log files for more information on the issue:

/var/log/ironic/*
/var/log/nova/*

Lastly, I hope the steps in this article to troubleshoot "No valid host was found" on OpenStack were helpful. Let me know your suggestions and feedback in the comment section.

Next: configure high availability on controller nodes and move keystone behind a load balancer.
