azure-sdk-for-python: Failing to find Subscription ID when targeting AzureUSGovernment Tenant

  • Package Name: MLClient
  • Package Version: SDK V2
  • Operating System: AML Compute Instance STANDARD_DS11_V2
  • Python Version: Python 3.10

Describe the bug

After initializing an MLClient instance, calling any of its methods results in the error below.

To Reproduce

Steps to reproduce the behavior:

Pre-requirements:

  1. Have an AzureUSGovernment tenant and subscription
  2. Have an AML Workspace created, along with a Compute Instance
  3. Have a Service Principal created in the above subscription, and given a “Contributor” role assignment to the AML Workspace
  4. Run a notebook in AML on the compute instance, after replacing the placeholder values for the environment variables:
from azure.ai.ml.entities import AmlCompute
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, AzureAuthorityHosts, EnvironmentCredential

import os
import traceback

# Set ENV Variables

os.environ["AZURE_CLIENT_SECRET"] = "<value>"
os.environ["AZURE_CLIENT_ID"] = "<value>"
os.environ["AZURE_TENANT_ID"] = "<value>"
os.environ["AZURE_AUTHORITY_HOST"] = AzureAuthorityHosts.AZURE_GOVERNMENT


credentials = DefaultAzureCredential(
    interactive_browser_tenant_id=os.environ["AZURE_TENANT_ID"],
    authority=AzureAuthorityHosts.AZURE_GOVERNMENT
    )

ml_client = MLClient(
    credential=credentials,
    subscription_id="<value>",
    resource_group_name="<value>",
    workspace_name="<value>",
    cloud="AzureUSGovernment",
)

# Name assigned to the compute cluster
cpu_compute_target = "cpu-cluster-2"

try:
    # let's see if the compute target already exists
    cpu_cluster = ml_client.compute.get(cpu_compute_target)
    print(
        f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
    )

except Exception:
    print("Creating a new cpu compute target...")

    # Let's create the Azure ML compute object with the intended parameters
    cpu_cluster = AmlCompute(
        name=cpu_compute_target,
        # Azure ML Compute is the on-demand VM service
        type="amlcompute",
        # VM Family
        size="STANDARD_DS3_V2",
        # Minimum running nodes when there is no job running
        min_instances=0,
        # Nodes in cluster
        max_instances=4,
        # How many seconds will the node running after the job termination
        idle_time_before_scale_down=180,
        # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination
        tier="Dedicated",
    )

    # Now, we pass the object to MLClient's create_or_update method
    cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster)

print(
    f"AMLCompute with name {cpu_cluster.name} is created, the compute size is {cpu_cluster.size}"
)

Expected behavior

The above code should either create a new CPU cluster or print the message "You already have a cluster named {cpu_compute_target}, we'll reuse it as is."

Screenshots

The actual behavior is an error:

ResourceNotFoundError: (SubscriptionNotFound) The subscription 'xxxxxxxxxxxxxxxx' could not be found.
Code: SubscriptionNotFound
Message: The subscription 'xxxxxxxxxxxxxxxx' could not be found.

The stack trace is:

Creating a new cpu compute target...
---------------------------------------------------------------------------
ResourceNotFoundError                     Traceback (most recent call last)
Input In [8], in <cell line: 13>()
     13 try:
     14     # let's see if the compute target already exists
---> 15     cpu_cluster = ml_client_6.compute.get(cpu_compute_target)
     16     print(
     17         f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
     18     )

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:169, in monitor_with_activity.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    168 with log_activity(logger, activity_name or f.__name__, activity_type, custom_dimensions):
--> 169     return f(*args, **kwargs)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_operations/compute_operations.py:75, in ComputeOperations.get(self, name)
     67 """Get a compute resource
     68 
     69 :param name: Name of the compute
   (...)
     72 :rtype: Compute
     73 """
---> 75 response, rest_obj = self._operation.get(
     76     self._operation_scope.resource_group_name,
     77     self._workspace_name,
     78     name,
     79     cls=get_http_response_and_deserialized_from_pipeline_response,
     80 )
     81 # TODO: Remove warning logging after 05/31/2022 (Task 1776012)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:83, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     82 if span_impl_type is None:
---> 83     return func(*args, **kwargs)
     85 # Merge span is parameter is set, but only if no explicit parent are passed

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_restclient/v2022_01_01_preview/operations/_compute_operations.py:577, in ComputeOperations.get(self, resource_group_name, workspace_name, compute_name, **kwargs)
    576 if response.status_code not in [200]:
--> 577     map_error(status_code=response.status_code, response=response, error_map=error_map)
    578     error = self._deserialize.failsafe_deserialize(_models.ErrorResponse, pipeline_response)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/exceptions.py:105, in map_error(status_code, response, error_map)
    104 error = error_type(response=response)
--> 105 raise error

ResourceNotFoundError: (SubscriptionNotFound) The subscription '50ff9458-6372-4522-8227-327043deaef5' could not be found.
Code: SubscriptionNotFound
Message: The subscription '50ff9458-6372-4522-8227-327043deaef5' could not be found.

During handling of the above exception, another exception occurred:

ResourceNotFoundError                     Traceback (most recent call last)
Input In [8], in <cell line: 13>()
     24     cpu_cluster = AmlCompute(
     25         name=cpu_compute_target,
     26         # Azure ML Compute is the on-demand VM service
   (...)
     37         tier="Dedicated",
     38     )
     40     # Now, we pass the object to MLClient's create_or_update method
---> 41     cpu_cluster = ml_client_6.compute.begin_create_or_update(cpu_cluster)
     43 print(
     44     f"AMLCompute with name {cpu_cluster.name} is created, the compute size is {cpu_cluster.size}"
     45 )

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:169, in monitor_with_activity.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    166 @functools.wraps(f)
    167 def wrapper(*args, **kwargs):
    168     with log_activity(logger, activity_name or f.__name__, activity_type, custom_dimensions):
--> 169         return f(*args, **kwargs)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_operations/compute_operations.py:116, in ComputeOperations.begin_create_or_update(self, compute, **kwargs)
    107 @monitor_with_activity(logger, "Compute.BeginCreateOrUpdate", ActivityType.PUBLICAPI)
    108 def begin_create_or_update(self, compute: Compute, **kwargs: Any) -> LROPoller:
    109     """Create a compute
    110 
    111     :param compute: Compute definition.
   (...)
    114     :rtype: LROPoller
    115     """
--> 116     compute.location = self._get_workspace_location()
    117     compute._set_full_subnet_name(self._operation_scope.subscription_id, self._operation_scope.resource_group_name)
    119     compute_rest_obj = compute._to_rest_object()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_operations/compute_operations.py:308, in ComputeOperations._get_workspace_location(self)
    307 def _get_workspace_location(self) -> str:
--> 308     workspace = self._workspace_operations.get(self._resource_group_name, self._workspace_name)
    309     return workspace.location

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:83, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     81 span_impl_type = settings.tracing_implementation()
     82 if span_impl_type is None:
---> 83     return func(*args, **kwargs)
     85 # Merge span is parameter is set, but only if no explicit parent are passed
     86 if merge_span and not passed_in_parent:

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/ml/_restclient/v2022_01_01_preview/operations/_workspaces_operations.py:615, in WorkspacesOperations.get(self, resource_group_name, workspace_name, **kwargs)
    612 response = pipeline_response.http_response
    614 if response.status_code not in [200]:
--> 615     map_error(status_code=response.status_code, response=response, error_map=error_map)
    616     error = self._deserialize.failsafe_deserialize(_models.ErrorResponse, pipeline_response)
    617     raise HttpResponseError(response=response, model=error, error_format=ARMErrorFormat)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/exceptions.py:105, in map_error(status_code, response, error_map)
    103     return
    104 error = error_type(response=response)
--> 105 raise error
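
Note that the `except Exception` in the repro catches every failure, so the SubscriptionNotFound raised by `compute.get` is handled as if the cluster simply did not exist, and the code then attempts to create it. A minimal sketch of telling the two cases apart, assuming the error text keeps the `(Code) message` shape shown in the trace above. `extract_error_code` and `is_missing_subscription` are hypothetical helper names for illustration, not part of the SDK:

```python
import re
from typing import Optional

# Matches the leading "(Code)" prefix seen in the trace, e.g.
# "(SubscriptionNotFound) The subscription '...' could not be found."
_CODE_PREFIX = re.compile(r"^\((?P<code>[A-Za-z0-9_]+)\)")


def extract_error_code(message: str) -> Optional[str]:
    """Return the service error code at the start of an error message, if any."""
    match = _CODE_PREFIX.match(message.strip())
    return match.group("code") if match else None


def is_missing_subscription(exc: BaseException) -> bool:
    """True when the exception text reports SubscriptionNotFound."""
    return extract_error_code(str(exc)) == "SubscriptionNotFound"
```

In the repro, the `except` branch could call `is_missing_subscription(exc)` and re-raise when it returns True, so that a subscription or cloud problem is not mistaken for a missing cluster.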

Additional context

I looked through the source code in the _azure_environments.py and _ml_client.py files to infer which environment variables and values I needed to pass to the MLClient constructor. However, something doesn’t appear to be working correctly.
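
As a quick sanity check on the values described above, the sketch below compares the environment variables against the cloud name passed to MLClient. It is not how the SDK resolves clouds: the `CLOUD_AUTHORITY_HOSTS` table and `find_cloud_mismatches` are names made up for this example, and the sketch assumes that an unset `AZURE_AUTHORITY_HOST` means the public cloud authority is used. It only inspects configuration and does not contact Azure:

```python
from typing import Dict, List, Mapping

# Login hosts for two clouds. The table layout is this sketch's own.
CLOUD_AUTHORITY_HOSTS: Dict[str, str] = {
    "AzureCloud": "login.microsoftonline.com",
    "AzureUSGovernment": "login.microsoftonline.us",
}

REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")


def _host(value: str) -> str:
    """Normalise 'https://host/' or 'host' to a bare lower-case host name."""
    return value.strip().lower().split("://", 1)[-1].rstrip("/")


def find_cloud_mismatches(cloud: str, env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    expected = CLOUD_AUTHORITY_HOSTS.get(cloud)
    if expected is None:
        return [f"unknown cloud name: {cloud!r}"]

    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    # Assumption: with no authority host set, the public cloud is the default.
    authority = env.get("AZURE_AUTHORITY_HOST") or CLOUD_AUTHORITY_HOSTS["AzureCloud"]
    if _host(authority) != expected:
        problems.append(
            f"AZURE_AUTHORITY_HOST is {_host(authority)!r}, expected {expected!r} for {cloud}"
        )
    return problems
```

For example, `find_cloud_mismatches("AzureUSGovernment", os.environ)` can be run in the notebook before constructing the client. An empty result only rules out these particular mismatches; it does not confirm that the subscription is reachable.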

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 18 (7 by maintainers)

Most upvoted comments

The new CI image has been released with the SDK v2 package installed from PyPI. Please create a new Compute Instance.
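
Since the fix arrived through a new Compute Instance image, it may help to confirm which package versions a given instance actually has. A small sketch using only the standard library; `installed_version` is a helper name chosen for this example, and the list of packages to check is an assumption about what is relevant here:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as published on PyPI.
    for name in ("azure-ai-ml", "azure-identity", "azure-core"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this in a notebook cell on both the old and the newly created Compute Instance shows whether the two environments differ.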