Ignored IAM role passed to Pipeline Steps breaks Local Mode #5031

Open
@lorenzwalthert

Describe the feature you'd like

In non-local mode, I can pass a role to each step of my SageMaker Pipeline, and that role will be assumed by the step. This helps scope permissions narrowly and works around the IAM limits of 10 managed policies per role and 10,240 characters for inline policies.

    processor_select_delivery_periods = sagemaker.processing.Processor(
        entrypoint=["bash", "processing.sh"],
        role=role_arn, # <- pass role for specific step
        instance_type="ml.m5.large",
        instance_count=1,
        sagemaker_session=signals.sagemaker.pipeline_session.cache,
    )

    step_args = processor_select_delivery_periods.run(
        arguments=...,
    )

However, this argument does not seem to be respected in local mode. Even worse, not even the IAM role defined at the pipeline level seems to be respected:

        upsert_response = pipeline.upsert(
            role_arn='arn:/xxx/yzzc',
            description=description or "",
            tags=tags,
        )

The credentials seem to be purely generated from the sagemaker_session passed to Pipeline:

    sagemaker.workflow.pipeline.Pipeline(
        name=str(name),
        parameters=list(parameters),
        steps=[step_a, step_b],
        sagemaker_session=sagemaker_session,
    )

As I understand it, local mode's docker-compose file (one per step) passes credentials to the container via environment variables (AWS_*). Before generating these credentials, one would need to use sagemaker_session to assume the role passed to the step and then derive the credentials from that role instead of directly from the session.
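A minimal sketch of the idea, not the SDK's actual implementation: given an STS client, assume the step's role and turn the temporary credentials into the AWS_* environment variables that would go into the container. The helper name `credentials_env_for_role` and the session name are hypothetical; `assume_role` and the shape of its response are the standard boto3 STS API.

```python
from typing import Any, Dict


def credentials_env_for_role(sts_client: Any, role_arn: str, session_name: str) -> Dict[str, str]:
    """Assume `role_arn` and return the AWS_* variables for a container environment.

    `sts_client` is expected to behave like a boto3 STS client. This helper is
    hypothetical and only illustrates the requested behavior.
    """
    # STS returns temporary credentials scoped to the assumed role.
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


# Possible usage (assumption: the session exposes a boto3 session as `boto_session`):
# sts = sagemaker_session.boto_session.client("sts")
# env = credentials_env_for_role(sts, role_arn, "local-mode-step")
```

With something like this, each step's container would receive credentials for its own role, matching what happens in non-local mode. The identity behind sagemaker_session would need permission to assume each step role.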

How would this feature be used? Please describe.

The inconsistency between local and remote execution breaks the highly efficient local developer workflow.

Describe alternatives you've considered

  • Refrain from using step-specific roles, i.e. use one role for all steps. Drawback: the IAM policy limits mentioned above.
  • Split my pipeline into multiple pipelines, since it contains many steps that are only loosely related. Drawback: this needs another orchestration tool on top.

Additional context

I opened an AWS Support case: 174841854500356.
