pulumi: Python Outputs to string do not work as documented

The methods for converting Pulumi Output values to strings do not work, or at least are not documented properly.

__main__.py

➜ cat __main__.py
import pulumi
from pulumi import StackReference, Config, export, output
from typing import Any, Optional

if __name__ == '__main__':
    config = Config()
    company_stack = StackReference(config.require("companyStack"))
    company_name = company_stack.get_output("companyName")
    export("Company Name:", company_name)
    exported_name = company_name.apply(
                                      lambda company_name: company_name
                                     )
    export("exported name:", exported_name)

Output:

➜ pulumi up
Previewing update (tools):

     Type                 Name                           Plan     
     pulumi:pulumi:Stack  SIPTools-INFRASTRUCTURE-tools           
 
Resources:
    1 unchanged

Do you want to perform this update? yes
Updating (tools):

     Type                 Name                           Status     
     pulumi:pulumi:Stack  SIPTools-INFRASTRUCTURE-tools             
 
Resources:
    1 unchanged

Duration: 1s

Permalink: https://app.pulumi.com/sean.brady/SIPTools-INFRASTRUCTURE/tools/updates/34
warning: A new version of Pulumi is available. To upgrade from version '1.7.1' to '1.8.1', visit https://pulumi.com/docs/reference/install/ for manual instructions and release notes.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 41 (16 by maintainers)

Most upvoted comments

As you can see, that’s pretty gnarly. Having to wrap the majority of my S3 bucket creation logic in a nested callback function, simply because the .hex attribute isn’t available to me yet, just seems gross. Maybe this is more elegant in TypeScript, but in Python it’s just not. I suspect this is mainly because the lambda keyword in Python was never designed for multi-line closures, whereas in TypeScript it’s doable. In Python, therefore, your def ends up before the thing you’re trying to do, which adds to the confusion around callbacks and the async nature of it all.

@jmgilman One solution is to create a custom provider or function, the following is a small example of a function:

import base64
import os

import jinja2

def render_user_data(address, token) -> str:
  # Render the Jinja2 template, then base64-encode the result.
  with open(os.path.join(os.path.dirname(__file__), "files/init.sh.j2"), 'r') as f:
    init = jinja2.Template(f.read()).render(vault_address=address, vault_token=token)
  init_bytes = base64.b64encode(init.encode("utf-8"))
  return str(init_bytes, "utf-8")

Now the function only needs to be called inside an Output.all(...).apply callback:

address = os.environ['VAULT_ADDR']
token = token.client_token  # The output must be called from the provider of the pulumi vault

user_data = Output.all(address, token).apply(lambda args: render_user_data(args[0], args[1]))

During the first preview, you cannot see the output in a diagnostic message, because the resource and its data have not been created yet and are not in the state. The generated value is still passed to any resource that needs it.

I hope this example helps, and that Pulumi can add documentation on how to generate rendered documents, either with jinja2 or natively as Terraform does.

Thanks for the clarifications and feedback. I agree we can definitely work toward making the documentation clearer in the way you’re describing. Currently, the place this is best described is https://www.pulumi.com/docs/intro/concepts/programming-model/#outputs (and we’re happy to take PRs with any suggestions you might have in https://github.com/pulumi/docs, from which that document is built). We’re also considering ways to rewrite parts of this guide to clarify some of the confusing areas you ran into.

Your apply_all suggestion is an interesting one. I wonder about its ergonomics, though, as it seems like it would encourage nesting multiple callback contexts instead of handling them on an as-needed basis?

OK, using the clarified tip from lukehoban, I was able to get a somewhat-working prototype. Pasting it here so others can benefit…

from pulumi import StackReference

def check_id(id):
    if id == 'i-0847ff862f76c38ac':
        print('debug:  strings are equal, EC2 id is {}'.format(id))

        # This Exception will be ignored by pulumi engine.  Strange.
        # raise Exception("Can't run pulumi from this EC2")

        # So, I have to use exit() to break out of the program, but very ugly
        exit(1)

stack = StackReference('dev_steve')
ec2_id_output = stack.get_output('ec2_instance_id')
ec2_id_output.apply(check_id)

This will abort the stack update if the stack’s current EC2 instance is i-0847ff862f76c38ac, with a long, unfriendly error and stack trace. However, I am unable to raise my own Exception. So it’s not elegant, but it meets the minimum requirements for what I was originally trying to do. And I have a better understanding of how it works. Thanks, lukehoban!

But it still feels like, if one can get an output as a string from a one-liner like pulumi stack output -s dev_steve ec2_instance_id, there should be a way to duplicate that functionality in a stand-alone piece of code. I guess the ideal solution would be a Python method that could return the same output info as the CLI, without the overhead of having to run it inside a callback inside an output inside a program inside the pulumi engine inside an update command. So I’m still wondering: if the CLI can get an output as a string, why can’t I write a Python program that does the same? Guess I could always write a Python method that runs the CLI and captures/parses the stdout. 😃
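The fallback mentioned above, running the CLI and capturing its stdout, could be sketched roughly as follows. This is only an illustration, not an official Pulumi API: the stack_output helper and its run parameter are hypothetical names, and the command is the same one-liner quoted above (pulumi stack output -s STACK NAME).

```python
import subprocess
from typing import Callable


def stack_output(stack: str, name: str,
                 run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> str:
    """Return one stack output as a plain str by shelling out to the Pulumi CLI.

    `run` is injectable (a hypothetical convenience) so the helper can be
    exercised without a real Pulumi installation.
    """
    # Same command as the one-liner from the thread.
    cmd = ["pulumi", "stack", "output", "-s", stack, name]
    # check=True raises CalledProcessError if the CLI exits non-zero.
    result = run(cmd, capture_output=True, text=True, check=True)
    # The CLI prints the value followed by a newline; strip it off.
    return result.stdout.strip()
```

A call such as stack_output("dev_steve", "ec2_instance_id") would then hand back an ordinary string. Note that this reads whatever is already stored in the stack’s state, so it only helps for stacks that have been deployed; it does not resolve values during a preview or a running update.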

No, that doesn’t do anything. I already attempted that. I think the problem here is that Pulumi just fails to do lifting sometimes. It seems to only work when it feels like it. Sometimes I don’t have to use a callback (such as in your example, even with Output.concat()), sometimes I do. I wish I could determine the “switch” that causes Pulumi to lift some things and not others.

This issue goes back to the OP. How to get the “real” value of an Output[T] in Pulumi is incredibly vague. It makes no sense that the .id attribute is always functional, but other shit that you passed into the Input[T] (which becomes an Output[T]) isn’t available for consumption without a callback. Again, I feel like Pulumi needs some mechanism to say “stop the world, I need this value NOW, so force all dependent awaits to resolve.” Even if you use depends_on in ResourceOptions, you still can’t get the values without the callback…

Having to wrap tons of code in callbacks for some attributes and not others is frustrating to say the least.

I had this problem with the Kubernetes provider as well, especially when I wanted to pull .metadata.name off of a created resource. Sometimes it would work, sometimes it wouldn’t. The docs are just incredibly unclear on all of this. If I’m supposed to wrap my entire damn program in an Output.all(resource_foo.name, resource_foo.field_a, resource_bar.name, resource_bar.field_b).apply(), doesn’t that defeat the purpose of being able to hit the getters and setters of a CustomResource directly??? I am confused here. I think a lot of the issues here come down to async programming in general, and trying to wrap one’s head around it, but Pulumi offers no helper functions for forcing the async await to happen outside the scope of a callback (which would be super beneficial in some instances).

The problem this creates is that if you don’t know “what” futures you are going to need ahead of time, you end up programming a bunch of callbacks to handle a myriad of strange cases. Sometimes, in another class, I just want to be able to, say, grab subnet.availability_zone_id without having to push the Output future through a callback… just do it now.

The actual apply() method even says: “We don’t return the actual object here, we return the future again so we can track dependencies.” OK? What if I want you to stop the world, right now, and attempt to make these unknowns available to me in the same thread? I know this goes against the async convention, but it’s still useful in some circumstances.

Maybe I’m babbling…

I think we’re talking over each other then… the value of bucket_name is actually a Future all the way until the point it hits the Input[T] for the Bucket() constructor…

The issue I’m having is getting this particular output resolved immediately, and subsequently available to other things. For example, with a simple print, you can see that __repr__ still uses the future instead of awaiting the async value. The same happens with __str__.

import pulumi_random as random
from pulumi import Output

random_bucket_name = random.RandomId(name_key, byte_length=2)
bucket_name = Output.concat(random_bucket_name.hex)
print("Bucket name: ", bucket_name)

This, unfortunately, STILL prints out the Output:

Bucket name:  <pulumi.output.Output object at 0x10e074b80>
Bucket name:  <pulumi.output.Output object at 0x10e0821f0>

Honestly, I think that’s the whole point of confusion in this thread. There are those who assume everything scraped off of a resource is going to be reused as an Input[T] (which would obviously work, since the resource itself would handle resolving the future), and those who literally just want to get the value into, say, a native Python type for another method, or to build a hash, etc., and consume it without having to wonder what value they’re going to get.

import logging

from pulumi import export
from pulumi_aws import s3

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s %(message)s',
                    filename='./example.log',
                    filemode='w')              
logging.debug('Starting process')

bucket = s3.Bucket('test-bucket')

# Get the bucket's domain name
bucket_dns = bucket.bucket_domain_name

# Log it (if/when it becomes available)
bucket_dns.apply(logging.debug)

# Export it as an output of the stack so that it 
# can be retrieved outside the program with: 
# `pulumi stack output bucket_dns`.
export('bucket_dns', bucket_dns)

$ pulumi up
...
$ cat ./example.log 
2020-01-17 16:26:06,370 DEBUG Starting process
2020-01-17 16:26:14,467 DEBUG test-bucket-b9fd3b1.s3.amazonaws.com
$ pulumi stack output bucket_dns
test-bucket-b9fd3b1.s3.amazonaws.com

BTW - If your core question is ultimately “How do I turn an Output[str] into a str?”, the answer is that you cannot directly do this. An Output[str] represents a value that may at some point in the future (after the resource is provisioned) have a value, but may not yet have that value (during preview, or during an update before creation or replacement). So you can only use the value of the Output inside an apply which will run when the value becomes available, and will hand it back to your callback to transform into a new value, which will then be a new Output (like the bucket_uri above or the bucket policy in the example I linked). This is discussed in detail in the docs at: https://www.pulumi.com/docs/intro/concepts/programming-model/#outputs.
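To make the “value that may not exist yet” idea concrete, here is a deliberately simplified toy model in plain Python. It is not Pulumi’s implementation: ToyOutput and its resolve method are hypothetical names invented for this sketch, and the real Output also tracks dependencies, secrets, and unknown values during preview. It only illustrates why str() cannot yield the value while apply can.

```python
from typing import Any, Callable, List


class ToyOutput:
    """A toy stand-in for an Output: a value that may not be known yet."""

    def __init__(self) -> None:
        self._known = False
        self._value: Any = None
        self._callbacks: List[Callable[[Any], None]] = []

    def resolve(self, value: Any) -> None:
        # Called once the "resource" finally has a real value.
        self._known = True
        self._value = value
        for callback in self._callbacks:
            callback(value)

    def apply(self, func: Callable[[Any], Any]) -> "ToyOutput":
        # Return a new ToyOutput right away; func runs only when the
        # underlying value becomes known.
        result = ToyOutput()

        def run(value: Any) -> None:
            result.resolve(func(value))

        if self._known:
            run(self._value)
        else:
            self._callbacks.append(run)
        return result


bucket_dns = ToyOutput()
seen: List[str] = []

# Build a derived value and register a consumer before anything is known.
url = bucket_dns.apply(lambda dns: "https://" + dns)
url.apply(seen.append)

text_before = str(bucket_dns)  # default object repr, no value inside
seen_before = list(seen)       # still empty: no callback has run yet

# Later, the value arrives and the callback chain fires.
bucket_dns.resolve("test-bucket-b9fd3b1.s3.amazonaws.com")
```

Before resolve is called, nothing in the program can see the domain name, so seen_before is empty and text_before is just an object repr. Once the value arrives, the chained callbacks run and seen holds the derived URL. This mirrors the shape of the behaviour described above, where the transformed result of apply is itself another Output.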