gh-132661: docs: add a t-string tutorial #137213

Draft
wants to merge 1 commit into
base: main
1 change: 1 addition & 0 deletions Doc/library/string.templatelib.rst
@@ -10,6 +10,7 @@

.. seealso::

   * :ref:`T-strings tutorial <tut-t-strings>`
   * :ref:`Format strings <f-strings>`
   * :ref:`T-string literal syntax <t-strings>`

192 changes: 192 additions & 0 deletions Doc/tutorial/inputoutput.rst
@@ -34,6 +34,22 @@ printing space-separated values. There are several ways to format output.
>>> f'Results of the {year} {event}'
'Results of the 2016 Referendum'

* When greater control is needed, :ref:`template string literals <tut-t-strings>`
  can be useful. T-strings -- which begin with ``t`` or ``T`` -- share the
  same syntax as f-strings but, unlike f-strings, produce a
  :class:`~string.templatelib.Template` instance rather than a simple ``str``.
  Templates give you access to the static and interpolated (in curly braces)
  parts of the string *before* they are combined into a final string.

  ::

     >>> name = "World"
     >>> template = t"Hello {name}!"
     >>> template.strings
     ('Hello ', '!')
     >>> template.values
     ('World',)

* The :meth:`str.format` method of strings requires more manual
effort. You'll still use ``{`` and ``}`` to mark where a variable
will be substituted and can provide detailed formatting directives,
@@ -161,6 +177,182 @@ See :ref:`self-documenting expressions <bpo-36817-whatsnew>` for more information
on the ``=`` specifier. For a reference on these format specifications, see
the reference guide for the :ref:`formatspec`.

.. _tut-t-strings:

Template String Literals
-------------------------

:ref:`Template string literals <t-strings>` (t-strings for short) extend
:ref:`f-strings <tut-f-strings>` by giving you access to the static and
interpolated parts of a string *before* they are combined into a final string.
This gives you greater control over how the string is formatted.

The most common way to create a :class:`~string.templatelib.Template` instance
is to use the :ref:`t-string literal syntax <t-strings>`. This syntax is
identical to that of :ref:`f-strings` except that it uses a ``t`` instead of
an ``f``:

>>> name = "World"
>>> template = t"Hello {name}!"
>>> template.strings
('Hello ', '!')
>>> template.values
('World',)

:class:`!Template` instances are iterable, yielding each string and
:class:`~string.templatelib.Interpolation` in order:

.. testsetup::

   name = "World"
   template = t"Hello {name}!"

.. doctest::

   >>> list(template)
   ['Hello ', Interpolation('World', 'name', None, ''), '!']

Interpolations represent expressions inside a t-string. They contain the
evaluated value of the expression (``'World'`` in this example), the text of
the original expression (``'name'``), and optional conversion and format
specification attributes.

Templates can be processed in a variety of ways. For instance, here's code that
converts static strings to lowercase and interpolated values to uppercase:

>>> from string.templatelib import Template
>>>
>>> def lower_upper(template: Template) -> str:
...     return ''.join(
...         part.lower() if isinstance(part, str) else part.value.upper()
...         for part in template
...     )
...
>>> name = "World"
>>> template = t"Hello {name}!"
>>> lower_upper(template)
'hello WORLD!'

Template strings are particularly useful for sanitizing user input. Imagine
we're building a web application that has user profile pages. Perhaps the
``User`` class is defined like this:

>>> from dataclasses import dataclass
>>>
>>> @dataclass
... class User:
...     name: str
...

Imagine using f-strings to generate HTML for the ``User``:

.. testsetup::

   class User:
       name: str
       def __init__(self, name: str):
           self.name = name

.. doctest::

   >>> # Warning: this is dangerous code. Don't do this!
   >>> def user_html(user: User) -> str:
   ...     return f"<div><h1>{user.name}</h1></div>"
   ...

This code is dangerous because our website lets users type in their own names.
If a user types in a name like ``"<script>alert('evil');</script>"``, the
browser will execute that script when someone else visits their profile page.
This is called a *cross-site scripting (XSS) vulnerability*, and it is a form
of *injection vulnerability*. Injection vulnerabilities occur when user input
is included in a program without proper sanitization, allowing malicious code
to be executed. The same sorts of vulnerabilities can occur when user input is
included in SQL queries, command lines, or other contexts where the input is
interpreted as code.
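
To see the problem concretely, the short sketch below renders a profile for a
user with a malicious name. The function name ``user_html_unsafe`` is
hypothetical and simply mirrors the dangerous f-string version; the point is
that the ``<script>`` element reaches the output unchanged.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str


def user_html_unsafe(user: User) -> str:
    # Dangerous: the raw name is pasted straight into the markup.
    return f"<div><h1>{user.name}</h1></div>"


page = user_html_unsafe(User(name="<script>alert('evil');</script>"))
print(page)  # <div><h1><script>alert('evil');</script></h1></div>
```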

To prevent this, instead of using f-strings, we can use t-strings. Let's
update our ``user_html()`` function to return a :class:`~string.templatelib.Template`:

>>> from string.templatelib import Template
>>>
>>> def user_html(user: User) -> Template:
...     return t"<div><h1>{user.name}</h1></div>"
...

Now let's implement a function that sanitizes *any* HTML :class:`!Template`:

>>> from html import escape
>>> from string.templatelib import Template
>>>
>>> def sanitize_html_template(template: Template) -> str:
...     return ''.join(
...         part if isinstance(part, str) else escape(part.value)
...         for part in template
...     )
...

This function iterates over the parts of the :class:`!Template`, escaping any
interpolated values using the :func:`html.escape` function, which converts
special characters like ``<``, ``>``, and ``&`` into their HTML-safe
equivalents.
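
For reference, here is what :func:`html.escape` does on its own. One thing to
keep in mind: it expects a string, so a sanitizer that may receive other
interpolated values (numbers, for example) would likely need to convert them
with ``str()`` first. The variable names below are only for illustration.

```python
from html import escape

# Angle brackets and ampersands are always replaced.
markup = escape("<b>Tom & Jerry</b>")
print(markup)  # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;

# Quotes are replaced too, unless quote=False is passed.
quoted = escape("say \"hi\" & 'bye'")
print(quoted)  # say &quot;hi&quot; &amp; &#x27;bye&#x27;
unquoted = escape('say "hi"', quote=False)
print(unquoted)  # say "hi"

# Non-string values need an explicit conversion before escaping.
number = escape(str(42))
print(number)  # 42
```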

Now we can tie it all together:

.. testsetup::

   from dataclasses import dataclass
   from string.templatelib import Template
   from html import escape
   @dataclass
   class User:
       name: str
   def sanitize_html_template(template: Template) -> str:
       return ''.join(
           part if isinstance(part, str) else escape(part.value)
           for part in template
       )
   def user_html(user: User) -> Template:
       return t"<div><h1>{user.name}</h1></div>"

.. doctest::

   >>> evil_user = User(name="<script>alert('evil');</script>")
   >>> template = user_html(evil_user)
   >>> safe = sanitize_html_template(template)
   >>> print(safe)
   <div><h1>&lt;script&gt;alert(&#x27;evil&#x27;);&lt;/script&gt;</h1></div>

We are no longer vulnerable to XSS attacks because we are escaping the
interpolated values before they are included in the rendered HTML.

Of course, code that processes :class:`!Template` instances is not limited to
returning a simple string. For instance, we could imagine defining a more
complex ``parse_html()`` function that returns a structured representation of
the HTML:

>>> from dataclasses import dataclass
>>> from string.templatelib import Template
>>> from html.parser import HTMLParser
>>>
>>> @dataclass
... class Element:
...     tag: str
...     attributes: dict[str, str]
...     children: list[str | Element]
...
>>> def parse_html(template: Template) -> Element:
...     """
...     Uses Python's built-in HTMLParser to parse the template,
...     handle any interpolated values, and return a tree of
...     Element instances.
...     """
...     ...
...

A full implementation of this function would be quite complex and is not
provided here. That said, the fact that a function like ``parse_html()`` can
be written at all showcases the flexibility and power of t-strings.

.. _tut-string-format:

The String format() Method