jaeger-operator: spark-dependencies job is failing

Setup:

  • Installed the Elasticsearch cluster manually; authentication is not enabled.
  • jaeger-operator schedules the spark-dependencies job, but it fails with the error below (see the kubectl sketch after this list for pulling the logs).
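
The generated job can be located and its logs pulled with kubectl; a minimal sketch, assuming the operator's usual <jaeger-name>-spark-dependencies naming for the CronJob (the run suffix <id> is a placeholder):

$ kubectl get cronjobs
$ kubectl get jobs
$ kubectl logs job/jaegerqe-spark-dependencies-<id>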

Error:

19/05/10 03:59:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.io.IOException: failure to login
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:700)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:185)
	at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:172)
	at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:54)
	at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:40)
Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
	at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
	at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:675)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:185)
	at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:172)
	at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:54)
	at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:40)

	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:856)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:675)
	... 11 more
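
The NullPointerException ("invalid null input: name") comes from Hadoop's UnixLoginModule: it resolves the current OS user by looking the container's UID up in /etc/passwd, and when the pod runs under an arbitrary UID with no passwd entry (the OpenShift default, for example), that lookup returns null. A minimal sketch to confirm this from inside the job pod (the pod name is a placeholder):

$ kubectl exec <spark-dependencies-pod> -- id -u    # prints the numeric UID
$ kubectl exec <spark-dependencies-pod> -- whoami   # fails if that UID has no passwd entry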

CR file:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaegerqe
spec:
  ingress:
    security: none
  strategy: production
  collector:
    replicas: 1
    image: jaegertracing/jaeger-collector:1.11
    resources:
      requests:
        cpu: "1"
        memory: "512Mi"
      limits:
        cpu: "1"
        memory: "512Mi"
    options:
      log-level: info
      metrics-backend: prometheus
      collector:
        num-workers: 1
        queue-size: 100000
      es:
        bulk:
          size: 524288
          workers: 3
          flush-interval: 50ms
        tags-as-fields:
          all: false
  query:
    replicas: 1
    image: jaegertracing/jaeger-query:1.11
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
    options:
      log-level: info
      metrics-backend: prometheus
      query:
        port: 16686
  agent:
    strategy: sidecar
    image: jaegertracing/jaeger-agent:1.11
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
    options:
      log-level: info
      metrics-backend: prometheus
      processor:
        jaeger-compact:
          server-queue-size: 100000
          workers: 1000
  storage:
    type: elasticsearch
    esIndexCleaner:
      enabled: true
    dependencies:
      enabled: true
    options:
      es:
        server-urls: http://elasticsearch:9200
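
For reference, the same dependencies job can be run by hand against this CR's storage settings; a minimal sketch using the jaegertracing/spark-dependencies image and its documented STORAGE/ES_NODES environment variables:

# runs as the image's default user; adding e.g. --user 12345 should
# reproduce the "failure to login" under an unmapped UID
$ docker run --rm \
    -e STORAGE=elasticsearch \
    -e ES_NODES=http://elasticsearch:9200 \
    jaegertracing/spark-dependencies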

Elasticsearch is reachable with curl, so the failure is not ES connectivity:

$ curl http://elasticsearch:9200
{
  "name" : "elasticsearch-0",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "9-kQC9gMSvuuREL2iOVtfA",
  "version" : {
    "number" : "5.6.10",
    "build_hash" : "b727a60",
    "build_date" : "2018-06-06T15:48:34.860Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}
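
The indices the job reads can be checked the same way; a sketch using the _cat API (Jaeger writes per-day jaeger-span-* and jaeger-service-* indices):

$ curl 'http://elasticsearch:9200/_cat/indices/jaeger-*?v'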

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 22 (18 by maintainers)

Most upvoted comments

Not sure if this helps anybody, but I had the same problem in a different context while creating a custom Spark image for use in Kubernetes. The incantation that works for me is the following. Dockerfile (note especially the auth required line):

FROM bitnami/spark:3.1.1
ENV TINI_VERSION v0.19.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /usr/bin/tini
USER root
# Make tini executable; add the "auth required" pam_wheel rule for su;
# make /etc/passwd group-writable so the entrypoint can append an entry
# for the container's anonymous UID at runtime.
RUN chmod +x /usr/bin/tini && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
    chmod ugo+rwx -R /opt/bitnami/spark
USER 1001
ADD entrypoint.sh /opt/
ENTRYPOINT ["/opt/bitnami/scripts/spark/entrypoint.sh", "/opt/entrypoint.sh"]

entrypoint.sh

#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# echo commands to the terminal output
set -ex

# Check whether there is a passwd entry for the container UID
myuid=$(id -u)
mygid=$(id -g)
# turn off -e for getent because it will return error code in anonymous uid case
set +e
uidentry=$(getent passwd $myuid)
set -e

# If there is no passwd entry for the container UID, attempt to create one
if [ -z "$uidentry" ] ; then
    if [ -w /etc/passwd ] ; then
        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
    else
        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
    fi
fi

SPARK_CLASSPATH="$SPARK_CLASSPATH:${SPARK_HOME}/jars/*"
env | grep SPARK_JAVA_OPT_ | sort -t_ -k4 -n | sed 's/[^=]*=\(.*\)/\1/g' > /tmp/java_opts.txt
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt

if [ -n "$SPARK_EXTRA_CLASSPATH" ]; then
  SPARK_CLASSPATH="$SPARK_CLASSPATH:$SPARK_EXTRA_CLASSPATH"
fi

if [ "$PYSPARK_MAJOR_PYTHON_VERSION" == "2" ]; then
    pyv="$(python -V 2>&1)"
    export PYTHON_VERSION="${pyv:7}"
    export PYSPARK_PYTHON="python"
    export PYSPARK_DRIVER_PYTHON="python"
elif [ "$PYSPARK_MAJOR_PYTHON_VERSION" == "3" ]; then
    pyv3="$(python3 -V 2>&1)"
    export PYTHON_VERSION="${pyv3:7}"
    export PYSPARK_PYTHON="python3"
    export PYSPARK_DRIVER_PYTHON="python3"
fi

# If HADOOP_HOME is set and SPARK_DIST_CLASSPATH is not set, set it here so Hadoop jars are available to the executor.
# It does not set SPARK_DIST_CLASSPATH if already set, to avoid overriding customizations of this value from elsewhere e.g. Docker/K8s.
if [ -n "${HADOOP_HOME}"  ] && [ -z "${SPARK_DIST_CLASSPATH}"  ]; then
  export SPARK_DIST_CLASSPATH="$($HADOOP_HOME/bin/hadoop classpath)"
fi

if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
  SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
fi

case "$1" in
  driver)
    shift 1
    CMD=(
      "$SPARK_HOME/bin/spark-submit"
      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
      --deploy-mode client
      "$@"
    )
    ;;
  executor)
    shift 1
    CMD=(
      ${JAVA_HOME}/bin/java
      "${SPARK_EXECUTOR_JAVA_OPTS[@]}"
      -Xms$SPARK_EXECUTOR_MEMORY
      -Xmx$SPARK_EXECUTOR_MEMORY
      -cp "$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH"
      org.apache.spark.executor.CoarseGrainedExecutorBackend
      --driver-url $SPARK_DRIVER_URL
      --executor-id $SPARK_EXECUTOR_ID
      --cores $SPARK_EXECUTOR_CORES
      --app-id $SPARK_APPLICATION_ID
      --hostname $SPARK_EXECUTOR_POD_IP
    )
    ;;

  *)
    echo "Non-spark-on-k8s command provided, proceeding in pass-through mode..."
    CMD=("$@")
    ;;
esac

# Execute the container CMD under tini for better hygiene
exec /usr/bin/tini -s -- "${CMD[@]}"
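
To sanity-check the workaround, the image can be run locally under an arbitrary UID, the way Kubernetes/OpenShift would start it; a minimal sketch (the image tag is an assumption):

$ docker build -t spark-anyuid:3.1.1 .
# group 0 so the entrypoint may write /etc/passwd; any unmapped UID will do
$ docker run --rm --user 12345:0 spark-anyuid:3.1.1 id -un

With the generated passwd entry in place, id -un should resolve the UID instead of failing, which is exactly what Hadoop's login needs.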

I tried again with the operator-provisioned ES cluster and I see the same issue.

The self-provisioned ES cluster does not work; it's not supported at the moment.