KubeArmor: Visibility logs with `Result: Unknown error` for network operations

Bug Report

General Information

  • Environment description (GKE, VM-Kubeadm, vagrant-dev-env, minikube, microk8s, …) - any
  • Kernel version (run uname -a) - 5.15.0-46-generic
  • Orchestration system version in use (e.g. kubectl version, …) - 1.23.17
  • Link to relevant artifacts (policies, deployments scripts, …) - NA
  • Target containers/pods - NA

To Reproduce

  1. Deploy kubearmor
  2. Deploy multiubuntu (or any other server of your choice).
  3. Try to see Network operation logs using karmor logs --logFilter=all --operation=Network.
  4. Exec into one of the ubuntu containers and run curl localhost:8000; the running Python server accepting the connection generates a TCP_ACCEPT event.
  5. You’ll find a log with Result: Unknown error even though the request succeeds.

Expected behavior

The log should have Result: Passed instead.

Logs

{
  "Timestamp": 1678430968,
  "UpdatedTime": "2023-03-10T06:49:28.840099Z",
  "ClusterName": "default",
  "HostName": "kubearmor-dev-next",
  "NamespaceName": "multiubuntu",
  "PodName": "ubuntu-1-deployment-5bd4dff469-fwn2v",
  "Labels": "container=ubuntu-1,group=group-1",
  "ContainerID": "1584076e2121453ceae9de662c834d73c6bd9a2357432d77bf9527d441b00c13",
  "ContainerName": "ubuntu-1-container",
  "ContainerImage": "docker.io/kubearmor/ubuntu-w-utils:0.1@sha256:b4693b003ed1fbf7f5ef2c8b9b3f96fd853c30e1b39549cf98bd772fbd99e260",
  "ParentProcessName": "/bin/bash",
  "ProcessName": "/usr/bin/python2.7",
  "HostPPID": 8858,
  "HostPID": 8962,
  "PPID": 1,
  "PID": 7,
  "Type": "ContainerLog",
  "Source": "/usr/bin/python2.7",
  "Operation": "Network",
  "Resource": "remoteip=127.0.0.1 port=8000 protocol=TCP",
  "Data": "kprobe=tcp_accept domain=AF_INET",
  "Result": "Unknown error"
}

About this issue

  • State: closed
  • Created a year ago
  • Comments: 24 (24 by maintainers)

Most upvoted comments

/assign

Hey @xiao-jay, were you able to find anything?

I know nothing about eBPF. Please ask @stefin9898 to help solve this problem. I hope my previous comments can help you.

I tried exec-ing into the multiubuntu pod and running curl localhost:8000, but I get a tcp_connect event with a Passed result. For these very big retval numbers I found some regularity: if you convert them to binary, they have 48 bits and the first 16 bits are the same.

-11101100010010110110111000010110011101000000000
-11101100010010110100110101000010101110100000000
-11101100010010101001001000100011100001011000000

I am not familiar with BPF and the C code, so it’s hard for me to solve this problem alone. I found the code about accept and saw that retval is set by context.retval = PT_REGS_RC(ctx);, but I could not find any information about the PT_REGS_RC function, so could you please give me some help with the C code and BPF? @DelusionalOptimist, the code is in system_monitor.c:

SEC("kretprobe/__x64_sys_inet_csk_accept")
int kretprobe__inet_csk_accept(struct pt_regs *ctx)
{
    if (skip_syscall())
        return 0;

    struct sock *newsk = (struct sock *)PT_REGS_RC(ctx);
    if (newsk == NULL)
        return 0;

    // Code from https://github.com/iovisor/bcc/blob/master/tools/tcpaccept.py with adaptations
    u16 protocol = 1;
    int gso_max_segs_offset = offsetof(struct sock, sk_gso_max_segs);
    int sk_lingertime_offset = offsetof(struct sock, sk_lingertime);

    if (sk_lingertime_offset - gso_max_segs_offset == 2)
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 6, 0)
        protocol = READ_KERN(newsk->sk_protocol);
#else
        protocol = newsk->sk_protocol;
#endif
    else if (sk_lingertime_offset - gso_max_segs_offset == 4)
    // 4.10+ with little endian
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
        protocol = READ_KERN(*(u8 *)((u64)&newsk->sk_gso_max_segs - 3));
    else
        // pre-4.10 with little endian
        protocol = READ_KERN(*(u8 *)((u64)&newsk->sk_wmem_queued - 3));
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        // 4.10+ with big endian
        protocol = READ_KERN(*(u8 *)((u64)&newsk->sk_gso_max_segs - 1));
    else
        // pre-4.10 with big endian
        protocol = READ_KERN(*(u8 *)((u64)&newsk->sk_wmem_queued - 1));
#else
#error "Fix your compiler's __BYTE_ORDER__?!"
#endif

    if (protocol != IPPROTO_TCP)
        return 0;

    struct sock_common conn = READ_KERN(newsk->__sk_common);
    struct sockaddr_in sockv4;
    struct sockaddr_in6 sockv6;
    sys_context_t context = {};
    args_t args = {};
    u64 types = ARG_TYPE0(STR_T) | ARG_TYPE1(SOCKADDR_T);
    init_context(&context);
    context.argnum = get_arg_num(types);
    context.retval = PT_REGS_RC(ctx);

    if (context.retval >= 0 && drop_syscall(_NETWORK_PROBE))
    {
        return 0;
    }

    if (get_connection_info(&conn, &sockv4, &sockv6, &context, &args, _TCP_ACCEPT) != 0)
    {
        return 0;
    }

    args.args[0] = (unsigned long)conn.skc_prot->name;
    set_buffer_offset(DATA_BUF_TYPE, sizeof(sys_context_t));
    bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
    if (bufs_p == NULL)
        return 0;

    save_context_to_buffer(bufs_p, (void *)&context);
    save_args_to_buffer(types, &args);
    events_perf_submit(ctx);

    return 0;
}
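
To answer the PT_REGS_RC question above: it is a macro provided by the BPF helper headers (libbpf's bpf_tracing.h, and similarly in bcc) that reads the probed function's return-value register (rax on x86_64) out of the saved pt_regs, i.e. whatever the probed function returned. Since inet_csk_accept() returns a struct sock * rather than an errno, the value copied into context.retval here is a kernel virtual address. Interpreted as a signed 64-bit integer and printed in decimal, a canonical kernel address becomes a very large negative number whose top 16 bits are identical, which matches the 48-bit pattern in the values above. A minimal userspace sketch of that cast (the address below is made up purely for illustration):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Made-up kernel virtual address, standing in for the struct sock *
     * that a kretprobe on inet_csk_accept() sees as its return value. */
    uint64_t sock_ptr = 0xffff9a2b4c3d2e00ULL;

    /* system_monitor.c copies PT_REGS_RC(ctx) into the signed retval field. */
    int64_t retval = (int64_t)sock_ptr;

    /* Printed in decimal this is a huge negative number (about 48 significant
     * bits with identical top bits), not anything resembling an errno. */
    printf("retval = %lld\n", (long long)retval);
    return 0;
}

If that reading is right, the probe should not treat the returned pointer as a result code at all; recording success (for example retval = 0) whenever newsk is non-NULL would be one way out, though that is a guess rather than the confirmed fix.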

@xiao-jay right, which is the problem we’re trying to solve. You’re getting these logs from Calico as it makes tcp_accept syscalls. The syscalls pass successfully, so these logs should have Result: Passed. However, the logs show this big error code instead. You can also try it out by sending a curl request to a server (try nginx running in your setup). When the server accepts your request it will create a tcp_accept syscall, KubeArmor will capture it, and it will show up in the logs.
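
To make the Unknown error part concrete, here is an illustrative sketch only (result_from_retval is a hypothetical helper, not KubeArmor's actual feeder code) of how a consumer that maps negative retvals to errno strings falls back to a generic message when the value is really a pointer cast to a signed integer:

#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Hypothetical mapping from a kretprobe retval to a human-readable result,
 * roughly the way a log consumer might do it. */
static const char *result_from_retval(long long retval)
{
    if (retval >= 0)
        return "Passed";

    /* Negative values are assumed to be -errno; anything outside the usual
     * errno range (such as a kernel pointer cast to a signed integer)
     * cannot be resolved and falls through to a generic message. */
    if (-retval < 4096)
        return strerror((int)-retval);

    return "Unknown error";
}

int main(void)
{
    printf("%s\n", result_from_retval(0));                  /* Passed */
    printf("%s\n", result_from_retval(-EAGAIN));            /* Resource temporarily unavailable */
    printf("%s\n", result_from_retval(-112233445566778LL)); /* Unknown error */
    return 0;
}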

Thank you for your suggestion, I will try it now.

Yes, sure. I will first try to see what happens on my system.