ganeti: Hail fails with FailN1 for an EXT instance allocation in 2.16
Hey,
A strange issue caught our attention while trying to update snf-ganeti to 2.16: hail fails to allocate an instance with the EXT disk template. This used to work in 2.15.
The output:
➜ ok0-mc1.dev /var/log/ganeti # gnt-instance add -I hail -o snf-image+jessie -B memory=2G,vcpus=4 -t ext --disk=0:size=10G,provider=rbd,name=alexvol --net 0:network=snf-net-1,ip=pool --no-name-check --no-ip-check test
Failure: prerequisites not met for this operation:
error type: insufficient_resources, error details:
Can't compute nodes using iallocator 'hail': Request failed: Group default (preferred): No valid allocation solutions, failure reasons: FailN1: 3
While with -n (specifying the node directly) everything works as expected:
➜ ok0-mc1.dev /var/log/ganeti # gnt-instance add -n ok0-00.dev.okeanos.grnet.gr -o snf-image+jessie -B memory=2G,vcpus=4 -t ext --disk=0:size=10G,provider=rbd,name=alexvol --net 0:network=snf-net-1,ip=pool --no-name-check --no-ip-check test
Thu Jun 8 11:50:54 2017 - INFO: NIC/0 inherits netparams ['snf-link-1', 'routed', u'']
Thu Jun 8 11:50:54 2017 - INFO: Chose IP <ip> from network snf-net-1
Thu Jun 8 11:50:55 2017 * disk 0, size 10.0G
Thu Jun 8 11:50:55 2017 * creating instance disks...
Thu Jun 8 11:50:56 2017 adding instance test to cluster config
Thu Jun 8 11:50:56 2017 adding disks to cluster config
Thu Jun 8 11:50:56 2017 - INFO: Waiting for instance test to sync disks
Thu Jun 8 11:50:57 2017 - INFO: Instance test's disks are in sync
Thu Jun 8 11:50:57 2017 - INFO: Waiting for instance test to sync disks
Thu Jun 8 11:50:57 2017 - INFO: Instance test's disks are in sync
Thu Jun 8 11:50:57 2017 * running the instance OS create scripts...
Thu Jun 8 11:51:53 2017 * starting instance...
Hail also works as expected with DRBD disks:
➜ ok0-mc1.dev /var/log/ganeti # gnt-instance add -I hail -o snf-image+jessie -B memory=2G,vcpus=4 -t drbd --disk=0:size=10G --net 0:network=snf-net-1,ip=pool --no-name-check --no-ip-check test
Thu Jun 8 14:15:51 2017 - INFO: Selected nodes for instance test via iallocator hail: ok0-01.dev.okeanos.grnet.gr, ok0-00.dev.okeanos.grnet.gr
Thu Jun 8 14:15:51 2017 - INFO: NIC/0 inherits netparams ['snf-link-1', 'routed', u'']
Thu Jun 8 14:15:51 2017 - INFO: Chose IP <ip> from network snf-net-1
Thu Jun 8 14:15:53 2017 * creating instance disks...
So, I tried to debug this with gnt-debug (thanks @apoikos!):
root@ok0-mc1:~# gnt-debug allocator --dir in --mode allocate --mem 2G --disks 1G -t ext -o no_such_os no_such_instance > h-alloc-ext.json
root@ok0-mc1:~# /usr/lib/ganeti/iallocators/hail -v -p h-alloc-ext.json
Received request: Allocate (Instance {name = "no_such_instance", alias = "no_such_instance", mem = 2048, dsk = 1024, disks = [Disk {dskSize = 1024, dskSpindles = Nothing}], vcpus = 1, runSt = Running, pNode = 0, sNode = 0, idx = -1, util = DynUtil {cpuWeight = 1.0, memWeight = 1.0, dskWeight = 1.0, netWeight = 1.0}, movable = True, autoBalance = True, diskTemplate = DTExt, spindleUse = 1, allTags = [], exclTags = [], dsrdLocTags = fromList [], locationScore = 0, arPolicy = ArNotEnabled, nics = [Nic {mac = Just "00:11:22:33:44:55", ip = Nothing, mode = Nothing, link = Nothing, bridge = Nothing, network = Nothing}], forthcoming = False}) (AllocDetails 1 Nothing) Nothing
Initial cluster status:
F Name t_mem n_mem i_mem x_mem f_mem u_mem r_mem t_dsk f_dsk pcpu vcpu pcnt scnt p_fmem p_fdsk r_cpu lCpu lMem lDsk lNet
ok0-02.dev.okeanos.grnet.gr 193809 4096 3072 -5372 192013 186641 7168 0 0 24 47 16 7 0.9630 1.0000 1.96 16.000 16.000 23.000 16.000
- ok0-mc1.dev.okeanos.grnet.gr 0 4096 0 -4096 0 -4096 0 0 0 0 0 0 0 -Infinity 1.0000 NaN 0.000 0.000 0.000 0.000
- ok0-mc2.dev.okeanos.grnet.gr 0 4096 0 -4096 0 -4096 0 0 0 0 0 0 0 -Infinity 1.0000 NaN 0.000 0.000 0.000 0.000
ok0-00.dev.okeanos.grnet.gr 193809 4096 5120 -7049 191642 184593 4096 0 0 24 49 17 5 0.9524 1.0000 2.04 17.000 17.000 22.000 17.000
- ok0-mc0.dev.okeanos.grnet.gr 0 4096 0 -4096 0 -4096 0 0 0 0 0 0 0 -Infinity 1.0000 NaN 0.000 0.000 0.000 0.000
ok0-01.dev.okeanos.grnet.gr 193809 4096 4096 -6023 191640 185617 8192 0 0 24 48 17 6 0.9577 1.0000 2.00 17.000 17.000 23.000 17.000
{"success":false,"info":"Request failed: Group default (preferred): No valid allocation solutions, failure reasons: FailN1: 3","result":[]}
Final cluster status:
F Name t_mem n_mem i_mem x_mem f_mem u_mem r_mem t_dsk f_dsk pcpu vcpu pcnt scnt p_fmem p_fdsk r_cpu lCpu lMem lDsk lNet
ok0-02.dev.okeanos.grnet.gr 193809 4096 3072 -5372 192013 186641 7168 0 0 24 47 16 7 0.9630 1.0000 1.96 16.000 16.000 23.000 16.000
- ok0-mc1.dev.okeanos.grnet.gr 0 4096 0 -4096 0 -4096 0 0 0 0 0 0 0 -Infinity 1.0000 NaN 0.000 0.000 0.000 0.000
- ok0-mc2.dev.okeanos.grnet.gr 0 4096 0 -4096 0 -4096 0 0 0 0 0 0 0 -Infinity 1.0000 NaN 0.000 0.000 0.000 0.000
ok0-00.dev.okeanos.grnet.gr 193809 4096 5120 -7049 191642 184593 4096 0 0 24 49 17 5 0.9524 1.0000 2.04 17.000 17.000 22.000 17.000
- ok0-mc0.dev.okeanos.grnet.gr 0 4096 0 -4096 0 -4096 0 0 0 0 0 0 0 -Infinity 1.0000 NaN 0.000 0.000 0.000 0.000
ok0-01.dev.okeanos.grnet.gr 193809 4096 4096 -6023 191640 185617 8192 0 0 24 48 17 6 0.9577 1.0000 2.00 17.000 17.000 23.000 17.000
While with DRBD:
root@ok0-mc1:~# gnt-debug allocator --dir in --mode allocate --mem 2G --disks 10G -t drbd -o no_such_os no_such_instance > h-alloc-drbd.json
root@ok0-mc1:~# /usr/lib/ganeti/iallocators/hail h-alloc-drbd.json
{"success":true,"info":"Request successful: Selected group: default, Group default (preferred): score: 0.57086361, successes 6, failures 0 () for node(s) ok0-02.dev.okeanos.grnet.gr/ok0-00.dev.okeanos.grnet.gr","result":["ok0-02.dev.okeanos.grnet.gr","ok0-00.dev.okeanos.grnet.gr"]}
(also, hail works as expected using --no-capacity-checks)
Trying to reduce the output of the gnt-debug allocator command, I noticed that the problem appears when there is an existing DRBD instance in the cluster and we try to allocate an EXT instance. A minimal input file for hail that triggers the error is here. This makes hail fail with FailN1 in 2.16, while it works fine in 2.15!
I tried to track this down, but I'm not sure I've got it straight! Let me give it a shot:
- When trying to allocate an EXT instance, the cluster status reports total and free disk space (`t_dsk`/`f_dsk`) as 0, since EXT resources are not "accountable".
- hail tests the allocation for N+1 redundancy by invoking `canEvacuateNode` (in `src/Ganeti/HTools/GlobalN1.hs`).
- `canEvacuateNode` tries to fail over the existing DRBD instances by invoking `move` (from `src/Ganeti/HTools/Cluster/Moves.hs`) with `Failover`.
- `move` invokes `applyMoveEx` (same file) with `force = True` and `Failover`.
- `applyMoveEx` calls `addPriEx`, which computes `new_dsk_forth` with `decIf uses_disk (fDskForth t) (Instance.dsk inst)`, where `uses_disk = Instance.usesLocalStorage inst` and `localStorageTemplates = [ T.DTDrbd8, T.DTPlain ]`.
- Since the node's free disk is reported as 0, this results in `new_dsk_forth <= 0`, which raises a `Bad T.FailDisk` (see the sketch after this list).
- The `FailDisk` surfaces as a `FailN1` because of the failed monadic computations in `collectionToSolution` (in `src/Ganeti/HTools/Cluster/AllocationSolution.hs`).
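To make the failure mode concrete, here is a minimal, self-contained Haskell sketch of the computation above. All names are simplified stand-ins for the real Ganeti code (only the shape of the computation is preserved), and the 10240 MiB disk size is just an example:

```haskell
-- Simplified stand-in for Ganeti's Bad T.FailDisk result.
data FailMode = FailDisk deriving Show

-- decIf, as used when computing new_dsk_forth in addPriEx: subtract
-- only if the instance uses local storage.
decIf :: Num a => Bool -> a -> a -> a
decIf True  val dec = val - dec
decIf False val _   = val

-- The instance being failed over is DRBD, and DTDrbd8 is in
-- localStorageTemplates, so usesLocalStorage is True.
usesLocalStorage :: Bool
usesLocalStorage = True

-- Forthcoming free disk on the target node: the EXT allocation reports
-- t_dsk/f_dsk as 0, so there is "no" local disk to subtract from.
fDskForth :: Int
fDskForth = 0

-- Computing new_dsk_forth therefore goes non-positive and fails.
newDskForth :: Int -> Either FailMode Int
newDskForth instDsk =
  let v = decIf usesLocalStorage fDskForth instDsk
  in if v <= 0 then Left FailDisk else Right v

main :: IO ()
main = print (newDskForth 10240)  -- prints: Left FailDisk
```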
This might be triggered by @aehlig's commit, which enables the global N+1 redundancy check in tryAlloc:
commit 02132b44fd7471c02891f11ce41971d301227b70
Author: Klaus Aehlig <aehlig@google.com>
Date: Fri Apr 17 14:54:28 2015 +0200
Make tryAlloc honor global N+1 redundancy
When looking for an allocation, make htools restrict to
those that are globally N+1 redundant. As checking for
N+1 redundancy is an expensive operation, we first
look for the best allocations and filter out later.
For the time being, we do not change the semantics of
iterateAlloc; i.e., for iterateAlloc we will pretend
that capacity checks are ignored.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
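If I read the commit message right, the approach is roughly "allocate first, filter for global N+1 redundancy afterwards". A rough sketch of that shape, with entirely made-up names (the real check involves canEvacuateNode from src/Ganeti/HTools/GlobalN1.hs, as described above):

```haskell
import Data.List (sortOn)

-- Made-up names; only the allocate-then-filter shape is meant.
data Candidate = Candidate
  { candScore :: Double    -- lower is better, like hail's cluster score
  , candNodes :: [String]  -- chosen primary/secondary nodes
  } deriving Show

-- Pick the best-scoring candidate that passes the (expensive)
-- redundancy predicate; if none passes, the whole request fails,
-- which is what surfaces as "No valid allocation solutions".
pickAllocation :: (Candidate -> Bool) -> [Candidate] -> Either String Candidate
pickAllocation isRedundant cands =
  case filter isRedundant (sortOn candScore cands) of
    []      -> Left "No valid allocation solutions"
    (c : _) -> Right c

main :: IO ()
main = do
  let cands = [ Candidate 0.57 ["node-a", "node-b"]
              , Candidate 0.61 ["node-a", "node-c"] ]
  -- A predicate that wrongly rejects everything (as the FailDisk above
  -- effectively does for EXT allocations) fails every request:
  print (pickAllocation (const False) cands)
  print (pickAllocation (const True)  cands)
```

That would also explain why --no-capacity-checks, which presumably skips this filtering step, makes the allocation succeed again.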
Then again, the above scenario might not make any sense! 😃 Anyway, we think this is a rather serious issue that might even be a blocker for the next release…
Let me know if I can help somehow!
On a side note, it seems a bit strange to me that n_mem differs between 2.15 and 2.16, but I'd say that this has nothing to do with the reported issue.
About this issue
- Original URL
- State: open
- Created 7 years ago
- Comments: 15 (11 by maintainers)
Hey fellas,
any update on this issue? Is there a fix coming?
As a workaround, I've added the `--no-capacity-checks` option to the `default-iallocator-params` cluster configuration option:

gnt-cluster modify --default-iallocator-params='--no-capacity-checks'

It seems to work, bypassing the new global N+1 checks.