email: get_payload(decode=True) doesn't handle Content-Transfer-Encoding with trailing white space #98188

Closed
@rpcross

Description

If the Content-Transfer-Encoding header field of a message part has trailing whitespace, for example "base64 ", get_payload(decode=True) does not return the properly decoded payload.

Here is a minimal code example. Sample message file attached.

import email
from email import policy

with open('msg.txt', 'rb') as f:
    msg = email.message_from_binary_file(f, policy=policy.default)

parts = list(msg.walk())
# Expected b'Hello. Testing'; the base64 text comes back undecoded instead.
parts[1].get_payload(decode=True)
> b'SGVsbG8uIFRlc3Rpbmc=\n'

The parsed Content-Transfer-Encoding header's "cte" attribute has the trailing whitespace stripped, but its string value does not.

>>> header = parts[1].get('content-transfer-encoding')
>>> header.cte
'base64'
>>> str(header)
'base64 '
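The attached msg.txt is not reproduced in this report, so here is a self-contained sketch of the same check. The message bytes are a hypothetical stand-in (the boundary name and headers are assumptions, not the contents of the attachment); the only detail that matters is the space after "base64". What `get_payload(decode=True)` prints depends on the Python version in use, so no expected output is shown for it.

```python
from email import message_from_bytes, policy

# Hypothetical stand-in for the attached msg.txt.
# Note the trailing space after "base64" in the part's header.
RAW_MESSAGE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="BOUNDARY"\r\n'
    b"\r\n"
    b"--BOUNDARY\r\n"
    b"Content-Type: text/plain; charset=us-ascii\r\n"
    b"Content-Transfer-Encoding: base64 \r\n"
    b"\r\n"
    b"SGVsbG8uIFRlc3Rpbmc=\r\n"
    b"--BOUNDARY--\r\n"
)

msg = message_from_bytes(RAW_MESSAGE, policy=policy.default)
part = list(msg.walk())[1]  # index 0 is the multipart container
header = part.get("content-transfer-encoding")

print(repr(header.cte))               # parsed token, whitespace stripped
print(repr(str(header)))              # string form of the header value
print(part.get_payload(decode=True))  # undecoded base64 text on affected versions
```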

The string value, rather than the "cte" attribute, appears to be what get_payload() uses when it picks a decoder:
https://github.com/python/cpython/blob/main/Lib/email/message.py#L289
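Until the comparison is made whitespace-tolerant, one possible workaround is to normalize the encoding name yourself and decode the raw payload directly. This is a minimal sketch, not the library's fix: `decode_payload_lenient` is a hypothetical helper name, it only handles base64 and quoted-printable, and it assumes the message was parsed from bytes so that the stored payload round-trips through ASCII with `surrogateescape`.

```python
import base64
import quopri
from email import message_from_bytes, policy


def decode_payload_lenient(part):
    """Decode a non-multipart part, tolerating whitespace around the CTE value.

    Hypothetical helper for illustration; returns None for multipart parts.
    """
    if part.is_multipart():
        return None
    header = part.get("content-transfer-encoding", "")
    # Prefer the parsed token (policy.default); otherwise normalize the string.
    cte = getattr(header, "cte", None) or str(header).strip().lower()
    raw = part.get_payload(decode=False)
    # Assumption: the payload came from a bytes parser, so this encode is lossless.
    data = raw.encode("ascii", "surrogateescape") if isinstance(raw, str) else raw
    if cte == "base64":
        return base64.b64decode(data)
    if cte == "quoted-printable":
        return quopri.decodestring(data)
    return data  # 7bit, 8bit, binary, or unknown: return the bytes unchanged


# Hypothetical single-part message with a trailing space after "base64".
SINGLE_PART = (
    b"Content-Type: text/plain; charset=us-ascii\r\n"
    b"Content-Transfer-Encoding: base64 \r\n"
    b"\r\n"
    b"SGVsbG8uIFRlc3Rpbmc=\r\n"
)

single = message_from_bytes(SINGLE_PART, policy=policy.default)
print(decode_payload_lenient(single))  # b'Hello. Testing'
```

The helper falls back to the stripped, lowercased string when the header object has no "cte" attribute, so it should also work with messages parsed under the default compat32 policy.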

  • CPython versions tested on: 3.9.13, 3.10.7
  • Operating system and architecture: macOS 12.6 Intel

msg.txt

Labels: stdlib (Python modules in the Lib dir), topic-email, type-bug (An unexpected behavior, bug, or error)