[BUG] Application gets stuck while manually stopping in async environment #3397

@0x4c47

Description

Steps to Reproduce

  1. Create an async environment with signal handling and task cancellation to run the Telegram bot in. Working code example:
import asyncio
import logging
import signal
import telegram.ext
import telegram.ext.filters

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("custom")
app = telegram.ext.ApplicationBuilder().token("<token here>").build()

async def main():
    loop = asyncio.get_event_loop()
    loop.add_signal_handler(signal.SIGTERM, lambda: asyncio.create_task(shutdown(loop)))

    await app.initialize()
    await app.start()
    await app.updater.start_polling()
    try:
        while True:
            log.info("Now running in main loop...")
            await asyncio.sleep(10000)
    except asyncio.CancelledError:
        log.warning("Telegram wait loop got cancelled, stopping...")
        await app.updater.stop()
        await app.stop()
        await app.shutdown()

async def shutdown(loop):
    log.warning("Received SIGTERM, shutting down...")
    tasks = [task for task in asyncio.all_tasks() if task is not asyncio.current_task()]
    [task.cancel() for task in tasks]
    log.warning("Waiting for async tasks to be cancelled...")
    await asyncio.gather(*tasks)

asyncio.run(main())
  2. Send a SIGTERM to the process: kill -s SIGTERM <pid>

Expected behaviour

Application stops correctly.

Actual behaviour

Application gets stuck while stopping and never exits (see the logs below). After the last log line, nothing happens, no matter how long I wait. I have to kill the process manually with another signal such as SIGKILL, which stops it ungracefully.

Operating System

Linux (tested on Fedora 37, Ubuntu 22.04 and within a Debian docker container)

Version of Python, python-telegram-bot & dependencies

python-telegram-bot 20.0a6
Bot API 6.3
Python 3.11.0 (main, Oct 24 2022, 00:00:00) [GCC 12.2.1 20220819 (Red Hat 12.2.1-2)]

Relevant log output

[...]
INFO:custom:Now running in main loop...
DEBUG:telegram.ext._updater:Start network loop retry getting Updates
DEBUG:telegram._bot:Entering: get_updates
WARNING:custom:Received SIGTERM, shutting down...
WARNING:custom:Waiting for async tasks to be cancelled...
WARNING:custom:Telegram wait loop got cancelled, stopping...
DEBUG:telegram.ext._updater:Stopping Updater
DEBUG:telegram.ext._updater:Waiting background polling task to finish up.
DEBUG:telegram.ext._updater:Network loop retry getting Updates was cancelled
DEBUG:telegram.ext._updater:Updater.stop() is complete
INFO:telegram.ext._application:Application is stopping. This might take a moment.
DEBUG:telegram.ext._application:Waiting for update_queue to join

Additional Context

I noticed that my Docker containers with Telegram bots always take ~10 seconds to stop, which suggests that SIGTERM isn't enough to stop the container and a SIGKILL is needed. That's when I started digging deeper and found this.

I have other applications "embedded" in the same asyncio termination signal handling and they're stopped as expected.

Without the signal handling and task cancelling, this issue doesn't seem to appear. So it seems to be related to either the application getting the SIGTERM or the asyncio tasks getting cancelled.
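
The last log line ("Waiting for update_queue to join") combined with the blanket task cancellation suggests one possible mechanism: if a task that consumes a queue is cancelled, a later `join()` on that queue can never complete. The sketch below reproduces that effect with plain asyncio only. The `consumer` task and the "stop sentinel" item are hypothetical stand-ins; whether python-telegram-bot's `Application.stop()` works this way internally is an assumption, not something the logs confirm.

```python
import asyncio


async def consumer(queue: asyncio.Queue) -> None:
    # Hypothetical stand-in for a library-internal worker that drains the queue.
    while True:
        await queue.get()
        queue.task_done()


async def demo() -> str:
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(consumer(queue))
    await asyncio.sleep(0)  # let the worker start and block on get()

    # Roughly what the shutdown() handler in the report does to every task.
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)

    # Assumed "stop" step: enqueue an item and wait until it has been processed.
    queue.put_nowait("stop sentinel")
    try:
        # Nobody calls task_done() anymore; without the timeout this waits forever.
        await asyncio.wait_for(queue.join(), timeout=0.5)
    except asyncio.TimeoutError:
        return "join() never completed"
    return "join() completed"


result = asyncio.run(demo())
print(result)  # join() never completed
```

If this is what happens, the hang would come from cancelling tasks that the library still needs for its own shutdown, rather than from the signal itself.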

I'm not 100% sure that I'm cancelling asyncio tasks correctly, but since the same approach works for other applications, it seems odd that PTB hangs instead of stopping.

Code example that isn't affected by this issue:

import asyncio
import logging
import telegram.ext
import telegram.ext.filters

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("custom")
app = telegram.ext.ApplicationBuilder().token("<token here>").build()

async def main():
    await app.initialize()
    await app.start()
    await app.updater.start_polling()
    log.info("Now running for 5 seconds...")
    await asyncio.sleep(5)
    log.warning("Trying to stop now...")
    await app.updater.stop()
    await app.stop()
    await app.shutdown()

asyncio.run(main())
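
The unaffected example stops the components in order, from the same task that started them, and never cancels tasks it did not create. A minimal sketch of a signal-driven variant that keeps that property is shown here: the handler only sets an `asyncio.Event`. The helper name `run_until_signalled` and its parameters are hypothetical, and this has not been verified against python-telegram-bot 20.0a6.

```python
import asyncio
import signal
from typing import Awaitable, Callable, Iterable


async def run_until_signalled(
    start: Callable[[], Awaitable[None]],
    stop: Callable[[], Awaitable[None]],
    stop_event: asyncio.Event,
    signals: Iterable[signal.Signals] = (signal.SIGTERM,),
) -> None:
    """Run start(), wait until stop_event is set, then run stop() in this task."""
    loop = asyncio.get_running_loop()
    installed = []
    for sig in signals:
        # The handler only sets a flag; it does not cancel any tasks.
        loop.add_signal_handler(sig, stop_event.set)
        installed.append(sig)
    await start()
    try:
        await stop_event.wait()
    finally:
        for sig in installed:
            loop.remove_signal_handler(sig)
        # Ordered shutdown runs while all other tasks are still alive.
        await stop()
```

With the `app` object from the report, `start` would await `app.initialize()`, `app.start()` and `app.updater.start_polling()`, and `stop` would await `app.updater.stop()`, `app.stop()` and `app.shutdown()`, in that order. Note that `loop.add_signal_handler` is only available on Unix and only from the main thread.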
