firecracker: InstanceStart Causes Firecracker to Crash

Platform

  • Hardware: t1.small.x86 bare-metal server from Packet with 1 x Intel Atom C2550 @ 2.4 GHz processor, 8 GB RAM, 80 GB SSD
  • OS: Ubuntu 16.04.5 LTS
  • Kernel: 4.4.0-134-generic

Issue

Running the InstanceStart action causes the Firecracker process to crash with a logging error.

Error Message

018-11-28T06:51:31.336002913 [:ERROR:vmm/src/lib.rs:1157] Failed to log metrics on abort. Failed to log metrics. Logger was not initialized.:?

Steps to reproduce

  • Download the firecracker executable from Firecracker release page
  • Start firecracker process using ./firecracker --api-sock /tmp/firecracker.sock command. No messages seen here.
  • On a second console, download the kernel and rootfs to the same directory as the firecracker binary from links: kernel and rootfs
  • Set the guest kernel using the command:
    curl --unix-socket /tmp/firecracker.sock -i \
        -X PUT 'http://localhost/boot-source' \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{
            "kernel_image_path": "./hello-vmlinux.bin",
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
        }'
  • Set the guest rootfs using the command:
    curl --unix-socket /tmp/firecracker.sock -i \
        -X PUT 'http://localhost/drives/rootfs' \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{
            "drive_id": "rootfs",
            "path_on_host": "./hello-rootfs.ext4",
            "is_root_device": true,
            "is_read_only": false
        }'
  • Start the VM using the command:
    curl --unix-socket /tmp/firecracker.sock -i \
        -X PUT 'http://localhost/actions' \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d '{
            "action_type": "InstanceStart"
        }'

Summary

  • The logger is not initialised, so the process crashes when the relevant logging code is invoked
  • The InstanceStart action fails, leading to the above case (?)

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 48 (21 by maintainers)

Most upvoted comments

Thanks all for the logs. Seems like the culprit here is the KVM_EXIT_SHUTDOWN. We are investigating it and we will post here any update we have.

Tried to spawn a (nested) VM on an AMD (v)CPU, and got the same issue. Apparently, the problem is (the lack of) this CPU flag: pdpe1gb, i.e. 1GB (huge)pages support by the CPU.
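Whether a host (or nested guest) CPU advertises this flag can be checked from /proc/cpuinfo. The following is a minimal sketch, not part of Firecracker. The helper name has_cpu_flag is hypothetical, and its optional second argument exists only so the check can be run against a saved cpuinfo dump.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Hypothetical helper: succeeds (exit 0) if FLAG is listed as a whole word
# on a "flags" line of the cpuinfo file (default: /proc/cpuinfo).
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    grep -E '^flags[[:space:]]*:' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

if has_cpu_flag pdpe1gb; then
    echo "pdpe1gb found: CPU advertises 1GB pages"
else
    echo "pdpe1gb not found in /proc/cpuinfo"
fi
```

The whole-word match matters because many flag names are substrings of others (for example tsc and constant_tsc).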

The guys from solo5 had a similar issue and solved it by switching to 2MB pages for the guest page tables.

I checked crosvm and they have fixed this issue by doing the same thing. I applied their patch to firecracker and it seems to work.
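To illustrate the 2MB-page approach: the first 1GB of guest memory can be identity-mapped with 512 page-directory entries of 2MB each, instead of a single 1GB entry that depends on pdpe1gb. The sketch below only works through the arithmetic for those entries. It is not the crosvm or Firecracker patch itself, the function name pde_entry is hypothetical, and the flag value 0x83 is the standard x86-64 encoding of the present, writable, and page-size bits.

```shell
#!/bin/sh
# pde_entry INDEX
# Print the page-directory entry that identity-maps the INDEX-th 2MB page:
# physical address (INDEX << 21) ORed with flags 0x83
# (bit 0 = present, bit 1 = writable, bit 7 = page size, i.e. a 2MB page).
pde_entry() {
    printf '0x%x\n' $(( ($1 << 21) | 0x83 ))
}

# 512 entries x 2MB = 1GB; show the first two entries and the last one.
for i in 0 1 511; do
    echo "PDE[$i] = $(pde_entry "$i")"
done
```

Each entry advances the mapped address by 2MB (1 << 21), so entry 511 covers the final 2MB below the 1GB boundary.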

@psomas - confirmed that this works for my X7460 CPU

Patch against firecracker source: (REMOVED) This is now on master at https://github.com/firecracker-microvm/firecracker/commit/497b4fe358c5b20ede760dccec178aedba3c8de5

Update: the tests fail with this patch, so more work is needed.

@itwars - Firecracker doesn’t need root privileges. We recommend running it unprivileged.
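Running unprivileged requires that the user can open /dev/kvm for reading and writing. A minimal sketch of that check follows. The helper name can_use_kvm is hypothetical, and the setfacl hint is one common way to grant access, assuming the ACL tools are installed on the host.

```shell
#!/bin/sh
# can_use_kvm [DEVICE]
# Hypothetical helper: succeeds if the current user can read and write
# the KVM device node (default: /dev/kvm).
can_use_kvm() {
    dev=${1:-/dev/kvm}
    [ -r "$dev" ] && [ -w "$dev" ]
}

if can_use_kvm; then
    echo "OK: /dev/kvm is accessible to the current user"
else
    echo "No read/write access to /dev/kvm; one option (assumes setfacl is installed):"
    echo "  sudo setfacl -m u:\${USER}:rw /dev/kvm"
fi
```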

Hi, I figured out what’s going on! On Ubuntu it’s not:

sudo ./firecracker --api-sock /tmp/firecracker.sock

but

sudo ./firecracker --api-sock /tmp/firecracker.socket

So use the following steps:

sudo rm /tmp/firecracker.socket
sudo ./firecracker --api-sock /tmp/firecracker.socket &

then

sudo curl --unix-socket /tmp/firecracker.socket -i \
    -X PUT 'http://localhost/boot-source'   \
    -H 'Accept: application/json'           \
    -H 'Content-Type: application/json'     \
    -d '{
        "kernel_image_path": "./hello-vmlinux.bin",
        "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
    }'

sudo curl --unix-socket /tmp/firecracker.socket -i \
    -X PUT 'http://localhost/drives/rootfs' \
    -H 'Accept: application/json'           \
    -H 'Content-Type: application/json'     \
    -d '{
        "drive_id": "rootfs",
        "path_on_host": "./hello-rootfs.ext4",
        "is_root_device": true,
        "is_read_only": false
    }'

sudo curl --unix-socket /tmp/firecracker.socket -i \
    -X PUT 'http://localhost/actions'       \
    -H  'Accept: application/json'          \
    -H  'Content-Type: application/json'    \
    -d '{
        "action_type": "InstanceStart"
     }'

BR, Vincent

@dianpopa Are there daily or frequent snapshots that would include this patch-set? Or are local builds needed until the next release?

There are no daily or frequent snapshots. You can test it with a local build.

Just recompiled from master and it works on Scaleway C2S entry-baremetal now 👍

Hi all, the PR that is supposed to fix this issue has been merged (see #731), so I am closing this issue. If anybody still encounters the problem, feel free to reopen it.

Can anyone say whether this CPU should work? Because then I can stop digging 😄 (this is Scaleway C2S, so very low power – I’m super curious how far you can take such a low powered thing)

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 77
model name	: Intel(R) Atom(TM) CPU  C2550  @ 2.40GHz
stepping	: 8
microcode	: 0x12a
cpu MHz		: 2393.902
cache size	: 1024 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch ida arat epb dtherm retpoline kaiser tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 4787.80
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

@corvinux you are using a very old Intel processor (Merom based). This processor did not have support for EPT (extended page table) and various other features. Firecracker also does not work on a newer (but still quite old) Nehalem processor which does have EPT (see https://github.com/firecracker-microvm/firecracker/issues/597).

Depending on the hardware, KVM (in the Linux kernel) will offer some additional features/capabilities. There are plans to improve the capability feature check to provide better error messages (see https://github.com/firecracker-microvm/firecracker/issues/287)
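In the meantime, one way to see what KVM has enabled on a given host is to read the KVM module parameters. This is a sketch under assumptions: it presumes an Intel host with the kvm_intel module loaded and the EPT setting exposed at /sys/module/kvm_intel/parameters/ept, and the helper name kvm_param_enabled is hypothetical. It only reports the parameter's state and does not prove that Firecracker will run.

```shell
#!/bin/sh
# kvm_param_enabled FILE
# Hypothetical helper: succeeds if the module parameter file exists and
# holds an "enabled" value (Y, y or 1).
kvm_param_enabled() {
    [ -r "$1" ] || return 1
    case $(cat "$1") in
        Y|y|1) return 0 ;;
        *)     return 1 ;;
    esac
}

# Assumed path for Intel hosts; adjust for other KVM modules.
param=/sys/module/kvm_intel/parameters/ept
if kvm_param_enabled "$param"; then
    echo "EPT is enabled in KVM"
else
    echo "EPT is not enabled, or $param is not available"
fi
```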