I hit this error while deploying an overcloud with TripleO. During a trial run of my OpenStack configuration, the deployment failed with the message "No valid host was found. There are not enough hosts available", which can have several causes. During overcloud deployment, one of the most common is a mismatch between the properties of the target node and the properties expected by the profile (flavor) assigned to it.
Below are the scenarios I could think of, each with a proposed solution.
How to fix "No valid host was found. There are not enough hosts available"
Error Message(s):
You may get the error message below on the console while performing an overcloud deployment from the undercloud director:
2018-08-14 06:42:59Z [overcloud.Controller.0]: CREATE_FAILED Resource CREATE failed: ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
Similarly in the nova-conductor.log, you may find below messages
In your nova-scheduler.log, below message may appear
Analysis and Solution:
Scenario 1 (Check the maintenance mode status):
First of all, make sure that all the nodes are in the available state, are not in maintenance mode, and are not already used by an existing instance.
[stack@undercloud-director ~]$ openstack baremetal node list
+--------------------------------------+---------------------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name                      | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+---------------------------+---------------+-------------+--------------------+-------------+
| 8ef9b862-9a4f-4961-813c-4b81be7c8e48 | overcloud-controller      | None          | power off   | available          | False       |
| 7995f1f2-4af7-4c5d-9099-fc928c4c73b3 | overcloud-compute.example | None          | power off   | available          | False       |
| 7c84cdf2-c5b2-47fb-a741-30c025b54183 | overcloud-ceph.example    | None          | power off   | available          | False       |
+--------------------------------------+---------------------------+---------------+-------------+--------------------+-------------+
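With more than a handful of nodes, reading the table by eye gets tedious. The sketch below filters the list down to nodes that are not ready for deployment. The function name find_unready_nodes is hypothetical, and it assumes the "-f value" output prints one node per line with the UUID first and the Maintenance flag last, written as True or False as in the table above.

```shell
# Hypothetical helper: reads "UUID <provisioning state> <maintenance>" lines
# on stdin and prints every node that is not ready for deployment.
find_unready_nodes() {
    awk '{
        uuid = $1
        maint = $NF
        # The provisioning state may contain spaces (e.g. "clean wait"),
        # so rebuild it from all fields between the first and the last.
        state = ""
        for (i = 2; i < NF; i++) state = state (i > 2 ? " " : "") $i
        if (state != "available" || maint != "False")
            printf "%s state=%s maintenance=%s\n", uuid, state, maint
    }'
}

# Usage on the undercloud (column names as shown in the table above):
# openstack baremetal node list -f value -c UUID \
#     -c "Provisioning State" -c Maintenance | find_unready_nodes
```

No output means every node is available and out of maintenance mode. A node that already hosts an instance is normally not in the available state, so it should show up here as well.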
Scenario 2 (Validate IPMI Connectivity):
Check whether the undercloud node can connect to the overcloud hypervisors using the power credentials (e.g. ipmi_address, ipmi_username and ipmi_password). If the undercloud node cannot connect to a hypervisor, it will not be able to get that hypervisor's node details.
[stack@undercloud-director ~]$ ipmitool -I lanplus -H 10.43.138.12 -L ADMINISTRATOR -p 6320 -U admin -R 3 -N 5 -P redhat power status
Chassis Power is off
[stack@undercloud-director ~]$ ipmitool -I lanplus -H 10.43.138.12 -L ADMINISTRATOR -p 6321 -U admin -R 3 -N 5 -P redhat power status
Chassis Power is off
[stack@undercloud-director ~]$ ipmitool -I lanplus -H 10.43.138.12 -L ADMINISTRATOR -p 6320 -U admin -R 3 -N 5 -P redhat power status
Chassis Power is off
If ipmitool fails to connect to a hypervisor, adding that node breaks during overcloud deployment.
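To repeat the same check across several ports without retyping the command, a small loop is enough. This is only a sketch: check_ipmi_ports is a hypothetical name, and the address, user, password and flags are copied from the example commands above, so replace them with your own power credentials.

```shell
# Hypothetical helper: run "power status" against each IPMI port of one host.
# Credentials and flags mirror the example commands above (assumptions).
check_ipmi_ports() {
    host=$1; shift
    failed=0
    for port in "$@"; do
        if ipmitool -I lanplus -H "$host" -L ADMINISTRATOR -p "$port" \
                -U admin -R 3 -N 5 -P redhat power status >/dev/null 2>&1; then
            echo "port $port: OK"
        else
            echo "port $port: FAILED"
            failed=1
        fi
    done
    # Non-zero exit status if any port could not be reached.
    return "$failed"
}

# Usage:
# check_ipmi_ports 10.43.138.12 6320 6321
```

Any port reported as FAILED points to a node whose power credentials or network path need fixing before the deployment is retried.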
Scenario 3 (Compute services must be running):
Also make sure that you have Compute services running and enabled:
[stack@undercloud-director ~]$ openstack compute service list --service nova-compute
+----+--------------+-----------------------------+------+---------+-------+----------------------------+
| ID | Binary       | Host                        | Zone | Status  | State | Updated At                 |
+----+--------------+-----------------------------+------+---------+-------+----------------------------+
| 4  | nova-compute | undercloud-director.example | nova | enabled | up    | 2018-08-14T12:51:23.000000 |
+----+--------------+-----------------------------+------+---------+-------+----------------------------+
By default, a Compute service is automatically disabled after 10 consecutive build failures. This ensures that new build requests are not routed to a broken Compute service. If that is the case, fix the source of the failures first, then re-enable the service:
$ openstack compute service set --enable <COMPUTE HOST> nova-compute
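A service has to be both enabled (Status) and up (State) to receive build requests. The sketch below prints any nova-compute service that fails either condition. The function name find_unhealthy_compute is hypothetical, and it assumes the "-f value" output prints Host, Status and State as three space-separated fields per line.

```shell
# Hypothetical helper: reads "<host> <status> <state>" lines on stdin and
# prints every service that is not both enabled and up.
find_unhealthy_compute() {
    awk '$2 != "enabled" || $3 != "up" {
        printf "%s status=%s state=%s\n", $1, $2, $3
    }'
}

# Usage on the undercloud:
# openstack compute service list --service nova-compute -f value \
#     -c Host -c Status -c State | find_unhealthy_compute
```

A line with status=disabled calls for the re-enable command above, while status=enabled with state=down means the service itself is not running on that host.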
Scenario 4 (Properties of node and flavors must be identical):
Finally, make sure the node's properties match the properties field of the assigned flavor exactly. The node's properties field is populated at the introspection stage.
For example, here are my node's properties:
[stack@undercloud-director ~]$ openstack baremetal node show ece1651a-6adc-4826-9f77-5d47891c6c9b -c properties
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| properties | {u'memory_mb': u'10240', u'cpu_arch': u'x86_64', u'local_gb': u'49', u'cpus': u'4', u'capabilities': |
| | u'profile:control,cpu_aes:true,cpu_hugepages:true,boot_option:local'} |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
As shown above, the profile for this node is control, so we need to check the properties of the "control" flavor using the command below.
[stack@undercloud-director ~]$ openstack flavor show control -c properties -f value
capabilities:boot_option='local', capabilities:cpu_aes='true', capabilities:cpu_hugepages='true', capabilities:profile='control', cpu_arch='x86_64'
As you can see, the capabilities and cpu_arch in both property fields match exactly.
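Comparing the two strings by eye is error-prone because the node and the flavor use different notations (key:value versus capabilities:key='value'). The sketch below normalizes both into sorted key=value lines and compares them. The function names are hypothetical, it only looks at the capabilities entries (cpu_arch is ignored), and it assumes the two string formats shown in the outputs above.

```shell
# Turn a node capabilities string such as 'profile:control,boot_option:local'
# into sorted key=value lines.
node_caps() {
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/:/=/' | sort
}

# Keep only the capabilities:* entries of a flavor properties string,
# strip the prefix and the quotes, and sort them the same way.
flavor_caps() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^ *capabilities://p' \
        | tr -d "'" | sort
}

# Succeeds only when both sides list exactly the same capabilities.
caps_match() {
    [ "$(node_caps "$1")" = "$(flavor_caps "$2")" ]
}

# Usage with the values from this article:
# caps_match 'profile:control,cpu_aes:true,cpu_hugepages:true,boot_option:local' \
#     "$(openstack flavor show control -c properties -f value)" && echo match
```

When caps_match fails, printing node_caps and flavor_caps side by side shows which capability differs.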
How to update a node's properties in OpenStack?
You can update the node's properties manually, but this is not recommended: the properties field is populated automatically during the introspection stage of overcloud configuration, and you should not overwrite or change it unless you know what you are doing. Prefer modifying the flavor's properties to match the node. If you still need to update the node's properties, use the command below:
$ ironic node-update overcloud-controller.example add properties/capabilities='profile:control,cpu_aes:true,cpu_hugepages:true,boot_option:local'
How to update a flavor's properties field in OpenStack?
To update the properties of a flavor, use the command below:
$ openstack flavor set --property "capabilities:profile"="control" --property "capabilities:cpu_aes"="true" --property "capabilities:cpu_hugepages"="true" --property "capabilities:boot_option"="local" control
NOTE:
There can certainly be more causes; these are the few I ran into during my testing. If the steps above did not help, check the log files below for more information on the issue:
/var/log/ironic/*
/var/log/nova/*
Lastly, I hope the steps in this article for troubleshooting "No valid host was found. There are not enough hosts available" on OpenStack were helpful. Let me know your suggestions and feedback in the comment section.
In my upcoming articles I will share the steps to manually configure high availability on your controller nodes and steps to move your keystone endpoint behind the load balancer to improve redundancy.
Check that you are not using scheduler_hints in your node-info.yaml.
I had a problem with “No valid host was found” when my file looked like this:
If you see this in the nova logs:
Node tagged None does not match requested node controller-0 host_passes /usr/lib/python3.6/site-packages/tripleo_common/filters/capabilities_filter.py:46
Try changing the node-info to something like:
After this change my deployment was successful.
Thank you for sharing!
I had a problem with 'No valid host was found' because the scheduler_hints in my node-info.yaml were wrong. My original file looked like this:
But then I saw the following message in the compute log:
'Node tagged None does not match requested node controller-0'
I then fixed my node-info.yaml to the format below:
And the deployment ran well.
Hello,
In scenario 3, the nova-compute status is enabled but the state is down. What should I do in this case?
Thanks
Jean
The nova-compute service must be UP, so you can manually try to start the service.