argo-workflows: Argo Workflow not working with libreoffice

Summary

What happened/what you expected to happen? I need to use LibreOffice in headless mode to convert a docx file to PDF. This works well on vanilla k8s and Databricks, but when I do the same in Kubeflow, which uses Argo Workflows as its backend, it does not produce any output.

What version are you running? argoproj.io/v1alpha1 Kubeflow 1.4

Diagnostics

Paste the smallest workflow that reproduces the bug. We must be able to run the workflow.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: libreoffice-pv-claim
spec:
  storageClassName: gp2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: libreoffice
spec:
  containers:
    - name: libreoffice-container
      image: domnulnopcea/libreoffice-headless:latest
      command: ["libreoffice", "--headless", "--convert-to","pdf" ,"/tests/288.pptx","--outdir", "/tests"]
      volumeMounts:
        - mountPath: "/tests"
          name: libreoffice-storage
  volumes:
    - name: libreoffice-storage
      persistentVolumeClaim:
        claimName: libreoffice-pv-claim
  tolerations:
    - key: project
      operator: Equal
      value: cd-msr
      effect: NoSchedule
---
apiVersion: v1
kind: Pod
metadata:
  name: libreoffice-bash
spec:
  containers:
    - name: libreoffice-container
      image: ubuntu:18.04
      command: ["/bin/sleep", "3650d"]
      volumeMounts:
        - mountPath: "/tests"
          name: libreoffice-storage
  volumes:
    - name: libreoffice-storage
      persistentVolumeClaim:
        claimName: libreoffice-pv-claim
  tolerations:
    - key: project
      operator: Equal
      value: cd-msr
      effect: NoSchedule

This is the YAML I am using. I then manually copy the input files:

kubectl cp ./288.pptx libreoffice-bash:/tests/
kubectl cp ./dummy.pptx libreoffice-bash:/tests/

This works, but when I try to do the same in Kubeflow it doesn't. The script executes without producing any output file.

import kfp
import kfp.components as components
import kfp.dsl as dsl
from kfp.components import InputPath, OutputPath

@components.create_component_from_func
def download_file(s3_folder_path,object_name):
    input_file_path=s3_folder_path+"/"+object_name
    import subprocess
    subprocess.run('pip install boto3'.split())
    # Download file
    import boto3
    s3=boto3.client('s3')
    s3.download_file('qa-cd-msr-20220524050318415700000001', input_file_path, '/tmp/input.pptx')
    print(input_file_path + " file is downloaded...Executing libreoffice conversion")
    subprocess.run("ls -ltr /tmp".split())

def convert_to_pdf():
    import subprocess

    def exec_cmd(cmd) -> str:
        # Run the command and return stdout and stderr combined into one string.
        print("Executing "+cmd)
        result=subprocess.run(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout=result.stdout.decode('utf-8') + '\n'+ result.stderr.decode('utf-8')
        print("stdout: "+stdout)
        return stdout

    exec_cmd("libreoffice --headless --convert-to pdf /files/input.pptx --outdir /files")
    exec_cmd("ls -ltr /files")

convert_to_pdf_op = components.func_to_container_op(convert_to_pdf, base_image="domnulnopcea/libreoffice-headless:latest")

@dsl.pipeline(
    name="Libreoffice",
    description="Libreoffice",
)
def sample_pipeline(s3_folder_path:str="/mpsr/decks", object_name:str="Adcetris_master_40.pptx"):
    vop = dsl.VolumeOp(
        name="create-pvc",
        resource_name="my-pvc",
        modes=dsl.VOLUME_MODE_RWO,
        size="1Gi"
    )
    download = download_file(s3_folder_path,object_name).add_pvolumes({"/tmp": vop.volume})
    convert = convert_to_pdf_op().add_pvolumes({"/files": download.pvolume})
    convert.execution_options.caching_strategy.max_cache_staleness = "P0D"
    convert.after(download)
client = kfp.Client()
experiment = client.create_experiment(
    name="Libreoffice", 
    description="Libreoffice",
    namespace="cd-msr"
) 
client.create_run_from_pipeline_func(
    sample_pipeline, 
    arguments={"s3_folder_path":"/mpsr/decks","object_name":"dummy1.pptx"}, 
    run_name="libreoffice", 
    experiment_name="Libreoffice"
)

Output :

(screenshot of the output; image not available)

Ignore the error here. I was also getting it in vanilla k8s, but the output file is still produced there.
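Since the conversion step finishes without leaving a file behind, it may help to record the exit code and check for the expected PDF explicitly rather than only printing the combined output. The following is a minimal sketch of that idea, not part of the original pipeline: `StepResult`, `run_and_check`, `conversion_cmd` and `expected_pdf` are hypothetical helper names, and it assumes LibreOffice names the output after the input file's stem with a `.pdf` suffix. It needs Python 3.7 or later for `text=True`.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class StepResult:
    """What a single command left behind (hypothetical helper type)."""
    returncode: int
    stdout: str
    stderr: str
    output_exists: bool


def run_and_check(cmd: List[str], expected_output: Path) -> StepResult:
    """Run cmd, capture both streams, and report whether expected_output exists afterwards."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    return StepResult(proc.returncode, proc.stdout, proc.stderr, expected_output.is_file())


def conversion_cmd(input_path: Path, outdir: Path) -> List[str]:
    """Same flags as the pipeline step, built as a list so paths with spaces survive."""
    return ["libreoffice", "--headless", "--convert-to", "pdf",
            str(input_path), "--outdir", str(outdir)]


def expected_pdf(input_path: Path, outdir: Path) -> Path:
    """Assumption: the converted file is <input stem>.pdf inside outdir."""
    return outdir / (input_path.stem + ".pdf")
```

Inside the component, this could be called as `run_and_check(conversion_cmd(Path("/files/input.pptx"), Path("/files")), expected_pdf(Path("/files/input.pptx"), Path("/files")))`. Printing the result would then show whether the command failed (non-zero `returncode`) or exited cleanly without writing the file.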

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 7
  • Comments: 15 (5 by maintainers)

Most upvoted comments

We upgraded the platform to Kubeflow 1.5, and it works there. Thanks.

No, argoproj/argoexec:latest.