argo-workflows: Argo Workflow not working with libreoffice
Summary
What happened/what you expected to happen? I need to use headless LibreOffice to convert a docx file to PDF. This works fine in vanilla k8s and in Databricks, but when I do the same in Kubeflow, which uses Argo Workflows as its backend, it produces no output.
What version are you running? argoproj.io/v1alpha1, Kubeflow 1.4
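For context, the conversion step being wrapped is a single headless LibreOffice invocation. A minimal sketch of the equivalent call in Python (the file paths are illustrative, taken from the repro below):

import subprocess

# Run the headless conversion and print stdout/stderr; LibreOffice tends to
# fail silently, so always inspect both streams.
result = subprocess.run(
    ["libreoffice", "--headless", "--convert-to", "pdf",
     "/tests/288.pptx", "--outdir", "/tests"],
    capture_output=True, text=True,
)
print(result.stdout, result.stderr)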
Diagnostics
Paste the smallest workflow that reproduces the bug. We must be able to run the workflow.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: libreoffice-pv-claim
spec:
  storageClassName: gp2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: libreoffice
spec:
  containers:
    - name: libreoffice-container
      image: domnulnopcea/libreoffice-headless:latest
      command: ["libreoffice", "--headless", "--convert-to", "pdf", "/tests/288.pptx", "--outdir", "/tests"]
      volumeMounts:
        - mountPath: "/tests"
          name: libreoffice-storage
  volumes:
    - name: libreoffice-storage
      persistentVolumeClaim:
        claimName: libreoffice-pv-claim
  tolerations:
    - key: project
      operator: Equal
      value: cd-msr
      effect: NoSchedule
---
apiVersion: v1
kind: Pod
metadata:
  name: libreoffice-bash
spec:
  containers:
    - name: libreoffice-container
      image: ubuntu:18.04
      command: ["/bin/sleep", "3650d"]
      volumeMounts:
        - mountPath: "/tests"
          name: libreoffice-storage
  volumes:
    - name: libreoffice-storage
      persistentVolumeClaim:
        claimName: libreoffice-pv-claim
  tolerations:
    - key: project
      operator: Equal
      value: cd-msr
      effect: NoSchedule
This is the YAML I am using. I then manually copy the input files into the shared volume:
kubectl cp ./288.pptx libreoffice-bash:/tests/
kubectl cp ./dummy.pptx libreoffice-bash:/tests/
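To verify the vanilla-k8s run, one way is to list the shared volume from the long-running pod; a small sketch (the pod name and path come from the manifests above, the pod_exec helper itself is hypothetical):

import subprocess

# Hypothetical helper: run a command inside a pod via kubectl exec.
def pod_exec(pod, *cmd):
    return subprocess.run(["kubectl", "exec", pod, "--", *cmd],
                          capture_output=True, text=True)

# The copied .pptx files and the generated .pdf should both show up here.
print(pod_exec("libreoffice-bash", "ls", "-ltr", "/tests").stdout)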
This works, but when I try to do the same in Kubeflow it doesn't: the script executes without producing any output file.
import kfp
import kfp.components as components
import kfp.dsl as dsl
from kfp.components import InputPath, OutputPath


@components.create_component_from_func
def download_file(s3_folder_path, object_name):
    input_file_path = s3_folder_path + "/" + object_name
    import subprocess
    subprocess.run('pip install boto3'.split())
    # Download file
    import boto3
    s3 = boto3.client('s3')
    s3.download_file('qa-cd-msr-20220524050318415700000001', input_file_path, '/tmp/input.pptx')
    print(input_file_path + " file is downloaded...Executing libreoffice conversion")
    subprocess.run("ls -ltr /tmp".split())


def convert_to_pdf():
    import subprocess

    def exec_cmd(cmd) -> str:
        print("Executing " + cmd)
        result = subprocess.run(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout = result.stdout.decode('utf-8') + '\n' + result.stderr.decode('utf-8')
        print("stdout: " + stdout)
        return stdout

    exec_cmd("libreoffice --headless --convert-to pdf /files/input.pptx --outdir /files")
    exec_cmd("ls -ltr /files")


convert_to_pdf_op = components.func_to_container_op(convert_to_pdf, base_image="domnulnopcea/libreoffice-headless:latest")


@dsl.pipeline(
    name="Libreoffice",
    description="Libreoffice",
)
def sample_pipeline(s3_folder_path: str = "/mpsr/decks", object_name: str = "Adcetris_master_40.pptx"):
    vop = dsl.VolumeOp(
        name="create-pvc",
        resource_name="my-pvc",
        modes=dsl.VOLUME_MODE_RWO,
        size="1Gi"
    )
    download = download_file(s3_folder_path, object_name).add_pvolumes({"/tmp": vop.volume})
    convert = convert_to_pdf_op().add_pvolumes({"/files": download.pvolume})
    convert.execution_options.caching_strategy.max_cache_staleness = "P0D"
    convert.after(download)


client = kfp.Client()
experiment = client.create_experiment(
    name="Libreoffice",
    description="Libreoffice",
    namespace="cd-msr"
)
client.create_run_from_pipeline_func(
    sample_pipeline,
    arguments={"s3_folder_path": "/mpsr/decks", "object_name": "dummy1.pptx"},
    run_name="libreoffice",
    experiment_name="Libreoffice"
)
Output:
(screenshot omitted)
Ignore the error shown there; I was also getting it in vanilla k8s, but the output file is still produced there.
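One thing worth checking under Argo's executor, independent of the Kubeflow upgrade mentioned below: headless LibreOffice exits silently without writing anything when it cannot create its user profile, which happens when HOME is unset or read-only in the container. A hedged sketch of that workaround against the pipeline above, assuming the KFP v1 ContainerOp API; the /tmp paths are illustrative assumptions, not a confirmed fix for this issue:

import kfp.components as components
import kfp.dsl as dsl
from kubernetes.client import V1EnvVar


def convert_to_pdf():
    import subprocess
    # -env:UserInstallation points LibreOffice at a writable profile directory;
    # without one, the headless conversion can exit silently with no output.
    cmd = ("libreoffice --headless -env:UserInstallation=file:///tmp/louser "
           "--convert-to pdf /files/input.pptx --outdir /files")
    result = subprocess.run(cmd.split(), capture_output=True, text=True)
    print(result.stdout, result.stderr)


convert_to_pdf_op = components.func_to_container_op(
    convert_to_pdf, base_image="domnulnopcea/libreoffice-headless:latest")


@dsl.pipeline(name="libreoffice-home-fix")
def pipeline_with_writable_home():
    convert = convert_to_pdf_op()
    # Also give the container a writable HOME, since some images leave it
    # unset or pointing at a read-only path.
    convert.add_env_variable(V1EnvVar(name="HOME", value="/tmp"))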
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 7
- Comments: 15 (5 by maintainers)
We upgraded the platform to Kubeflow 1.5. It is working there. Thanks.
No, argoproj/argoexec:latest.