Skip to content

TimedRotatingFileHandler rotating on use not time #84649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
markhallett mannequin opened this issue May 1, 2020 · 14 comments
Open

TimedRotatingFileHandler rotating on use not time #84649

markhallett mannequin opened this issue May 1, 2020 · 14 comments
Labels
3.8 (EOL) end of life stdlib Python modules in the Lib dir

Comments

@markhallett
Copy link
Mannequin

markhallett mannequin commented May 1, 2020

BPO 40469
Nosy @vsajip, @stevendaprano, @martonivan
PRs
  • gh-84649: Make TimedRotatingFileHandler use CTIME instead of MTIME #24660
  • Files
  • logging_config.ymal
  • app.py: Eg code to use the ymal file
  • config.json: JSON version of test config
  • log_40469.py: Modified test script
  • log_40469_new_instance.py
  • log_40469_single.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-05-01.15:02:02.675>
    labels = ['3.8']
    title = 'TimedRotatingFileHandler rotating on use not time'
    updated_at = <Date 2022-03-31.20:48:10.169>
    user = 'https://bugs.python.org/markhallett'

    bugs.python.org fields:

    activity = <Date 2022-03-31.20:48:10.169>
    actor = 'steven.daprano'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2020-05-01.15:02:02.675>
    creator = 'markhallett'
    dependencies = []
    files = ['49111', '49112', '50196', '50197', '50208', '50209']
    hgrepos = []
    issue_num = 40469
    keywords = []
    message_count = 11.0
    messages = ['367849', '367922', '398684', '399306', '399481', '399496', '399499', '399500', '399506', '399531', '416458']
    nosy_count = 4.0
    nosy_names = ['vinay.sajip', 'steven.daprano', 'markhallett', 'martonivan']
    pr_nums = ['24660']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue40469'
    versions = ['Python 3.8']

    @markhallett
    Copy link
    Mannequin Author

    markhallett mannequin commented May 1, 2020

    Using the attached confing ymal file, the log file only rolls if the log file is not written to for over a minute, logging regulary (eg every 30s) prevents the correct rolling of the log.

    @markhallett markhallett mannequin added 3.8 (EOL) end of life labels May 1, 2020
    @markhallett
    Copy link
    Mannequin Author

    markhallett mannequin commented May 2, 2020

    Eg code to run is now attached

    @vsajip
    Copy link
    Member

    vsajip commented Aug 1, 2021

    With this slightly modified test script and the JSON configuration based on the YAML, I got the following results after running for a while:

    $ more info.log*
    ::::::::::::::
    info.log
    ::::::::::::::
    2021-08-01 07:13:45 INFO Message no. 9
    2021-08-01 07:14:15 INFO Message no. 10
    ::::::::::::::
    info.log.2021-08-01_07-09
    ::::::::::::::
    2021-08-01 07:09:45 INFO Message no. 1
    2021-08-01 07:10:15 INFO Message no. 2
    ::::::::::::::
    info.log.2021-08-01_07-10
    ::::::::::::::
    2021-08-01 07:10:45 INFO Message no. 3
    2021-08-01 07:11:15 INFO Message no. 4
    ::::::::::::::
    info.log.2021-08-01_07-11
    ::::::::::::::
    2021-08-01 07:11:45 INFO Message no. 5
    2021-08-01 07:12:15 INFO Message no. 6
    ::::::::::::::
    info.log.2021-08-01_07-12
    ::::::::::::::
    2021-08-01 07:12:45 INFO Message no. 7
    2021-08-01 07:13:15 INFO Message no. 8

    The test script logs every 30 seconds, so I don't see what the problem is. Can you describe it more clearly?

    @martonivan
    Copy link
    Mannequin

    martonivan mannequin commented Aug 10, 2021

    The assumed behaviour of TimedRotatingFileHandler is to rotate log files older than configured. Even when the script is executed multiple times.

    self.rolloverAt (the variable that defines when the rollover should be performed) is set after each rollover and right after initializing the file handler instance.

    If the instance is initialized once (like in your script) and a rollover is performed without having the object destroyed, the logger works fine, the rollover is done and the next round is scheduled properly.

    The case is not so lucky if the script runs multiple time or the logger itself is initialized multiple times during one execution. In this case, since the MTIME is read each time when the TimedRotatingFileHandler's init is called, and the file is modified (by having a new line added to the end of the file). The next execution will read the new MTIME and will never perform any rollover.

    I've slightly modified your example script to demonstrate these use-cases.

    Example 1: Log a single line with the script, but execute it multiple times with delays between the execution!
    In bash: for i in {1..101}; do python log_40469_single.py $i; sleep 30; done

    Example 2: Log multiple lines, but reinitiate the logger object between the events!
    See log_40469_new_instance.py

    @vsajip
    Copy link
    Member

    vsajip commented Aug 12, 2021

    The assumed behaviour of TimedRotatingFileHandler is to rotate log files older than configured. Even when the script is executed multiple times.

    Ah, but is it? The purpose of TimedRotatingFileHandler is to rotate files based on time intervals, and is normally assumed to be for long-running processes (where the rotation allows one to focus on logs output during a period of interest). What's the practical, real-world use-case for the example scripts you added?

    @martonivan
    Copy link
    Mannequin

    martonivan mannequin commented Aug 13, 2021

    Well, there is nothing in the documentation that would say the logrotation can be only performed by continously running scripts, thats why. :)

    There are scripts or even daemons written in Python that need to be restarted from time to time. If the logrotate period is greater than restart period the log files will never be rotated, however they could be.

    We were looking for something that works like the logrotate do, but without involving any external logic, but the one written in python. And TimedLogrotateHandler appears to be one of those, unless it checks the MTIME of the files instead of the CTIME.

    What is the logic behind checking the MTIME instead of CTIME when one is calculating the age of a file when tries to calculate whether it has to be rotated or not? If the only purpose of this initial rollover calculation would be to start a predefined long period (as you suggest), now() (time of execution) should used instead and no file attribute should be involved. From the code I have to assume that the original author also tried to prepare for the cases where the execution is interrupted and/or restarted and (s)he wanted the TimedLogrotateHandler to be resumed where it was during the previous execution. Dnn't you agree?

    @vsajip
    Copy link
    Member

    vsajip commented Aug 13, 2021

    Well, there is nothing in the documentation that would say the logrotation can be only performed by continously running scripts, thats why. :)

    Well, what's the need to rotate based on time when you just run scripts sporadically or they're very short-lived? Not trying to argue, just trying to understand.

    There are scripts or even daemons written in Python that need to be restarted from time to time. If the logrotate period is greater than restart period the log files will never be rotated, however they could be.

    Yes, but generally daemons are long-lived relative to the rotation interval, aren't they? "From time to time" certainly implies that to me.

    What is the logic behind checking the MTIME instead of CTIME when one is calculating the age of a file when tries to calculate whether it has to be rotated or not?

    I can't remember, that code is over 12 years old and that specific decision wasn't documented :-(

    From the code I have to assume that the original author also tried to prepare for the cases where the execution is interrupted and/or restarted and (s)he wanted the TimedLogrotateHandler to be resumed where it was during the previous execution. Dnn't you agree?

    I take it you mean TimedRotatingFileHandler. Certainly it assumes that the script might be restarted, which is why it opens the file in append mode.

    If the computation using MTIME were changed to use CTIME, that would be a behaviour change, which could conceivably break someone's code that relied on current behaviour (unlikely, but you just never know). I've no other objection to using CTIME rather than MTIME - perhaps I'm just being hyper-cautious about breakage?

    @vsajip
    Copy link
    Member

    vsajip commented Aug 13, 2021

    I just checked - the relevant code (using MTIME) was added in 2010, as a fix for bpo-8117 - around the time that Python 2.6 was released.

    @martonivan
    Copy link
    Mannequin

    martonivan mannequin commented Aug 13, 2021

    Well, what's the need to rotate based on time when you just run scripts sporadically or they're very short-lived? Not trying to argue, just trying to understand.
    Yes, but generally daemons are long-lived relative to the rotation interval, aren't they? "From time to time" certainly implies that to me.

    In one of our use-cases, where I faced with this bug/behaviour of TimedRotatingFileHandler, we are using it in a daemon for daily logrotate. The aim is to have the logs separated day by day, just like the majority of our other components running on a linux. We've just realized that some days are missing from the series of logs and the investigation led to the service restart and the MTIME checking that we are talking about here. When a logger is configured to rotate daily one just simply assume that it will rotate daily regardless the daemon restart. (As en extreme(?) example, when a the system - and the daemon of course - is restarted each day - or just never running for a whole day - and btw it logs often, the log file won't be ever rotated and can inrease over every limits.)

    I've quickly googled around and found someone who execute some logic from cron and expects the loghandler to rotate his files, that wouldn't happened for the same reason:
    https://stackoverflow.com/questions/30569164/logging-handler-timedrotatingfilehandler-never-rotates-the-log

    I just checked - the relevant code (using MTIME) was added in 2010, as a fix for bpo-8117 - around the time that Python 2.6 was released.
    As I understand bpo-8117, the goal was to do the rollover even after script restarts. (To be frank, only an extreme case was mentioned in the issue opener entry, when the rollover period had been reached when the script wasn't running.) I believe for this purpose MTIME wasn't the best choice, but using CTIME instead would make the script do what is expected to.

    I take it you mean TimedRotatingFileHandler. Certainly it assumes that the script might be restarted, which is why it opens the file in append mode.

    If the computation using MTIME were changed to use CTIME, that would be a behaviour change, which could conceivably break someone's code that relied on current behaviour (unlikely, but you just never know). I've no other objection to using CTIME rather than MTIME - perhaps I'm just being hyper-cautious about breakage?

    You are far better and more experienced in maintaining widely used opensource codebases so I wouldn't argue with you, but... :)

    • In case of scripts that create the log file for themselves the change would have no effect.
    • In case of scripts that are using already existing files for logging (by opening them for append) could change the behaviour, but just slighly. The first rollover would happen before adding the first new log line if the file age reached the configured time, and not when the age of last log line written to the file before having the script started reached the same value. No further rotation would be effected.

    I _believe_ no one expects the current initial behaviour in case of resumed log files, and for sure, the documentation doesn't cover this part so I don't know how to proceed.

    I _believe_ (again) that changing the behaviour wouldn't break the script outside, but rather would help them working the way how the developers originally assumed they did.

    @vsajip
    Copy link
    Member

    vsajip commented Aug 13, 2021

    You are far better and more experienced in maintaining widely used opensource codebases so I wouldn't argue with you, but... :)

    Well, if a change introduces a problem, it will be my lookout, won't it? Just as the original patch for bpo-8117 is proving to be :-)

    OK, how about this? We update the PR to add a new use_ctime=False keyword argument to TimedRotatingHandler.__init__(). If nothing else changes, the behaviour is as it is currently. If use_ctime=True is passed in the instantiation, then the constructor will use CTIME rather than MTIME. Will that work for you? I'd be OK with that approach.

    @stevendaprano
    Copy link
    Member

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 26, 2023
    @JChristensen
    Copy link

    I found this issue as I use logging with short-running scripts that are scheduled to run regularly. I had a script that was rotating weekly and working fine. I was surprised when I tried another script where I expected it to rotate daily and it did not.

    While I can see that things may be working as designed and documented, consider the apparent inconsistency created by the following two examples.

    The first works as expected, rotating the log file weekly. The second (where creating the handler on line 11 is the only significant change) I expected to rotate daily, but never does.

    #!/usr/bin/python3
    # This script is to be scheduled to execute and write to the log daily.
    # It should rotate the log files weekly.
    
    import logging
    import logging.handlers
    
    logger = logging.getLogger('daily_logger')
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.TimedRotatingFileHandler(
            'daily.log', when='w5', backupCount=4)
    logger.addHandler(handler)
    formatter = logging.Formatter(
            fmt='%(asctime)s:%(levelname)s:%(name)s:%(message)s',
            datefmt='%Y-%m-%d %H:%M:%S')
    handler.setFormatter(formatter)
    logger.info('Hello, world!')
    #!/usr/bin/python3
    # This script is to be scheduled to execute and write to the log hourly.
    # It should rotate the log files daily.
    
    import logging
    import logging.handlers
    
    logger = logging.getLogger('hourly_logger')
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.TimedRotatingFileHandler(
            'hourly.log', when='d', backupCount=4)
    logger.addHandler(handler)
    formatter = logging.Formatter(
            fmt='%(asctime)s:%(levelname)s:%(name)s:%(message)s',
            datefmt='%Y-%m-%d %H:%M:%S')
    handler.setFormatter(formatter)
    logger.info('Hello, world!')

    @serhiy-storchaka
    Copy link
    Member

    I was just about to create the same issue. It is not so uncommon use case -- run a script by schedule instead of as a daemon and want to have a timed logs rotation. Even if you have a daemon, if it crashes more often than the rotation interval, this will prevent the logs rotation.

    I do not think that the new use_ctime option is needed.

    martonivan added a commit to martonivan/cpython that referenced this issue Mar 7, 2024
    The TimedRotatingFileHandler previously only used st_mtime attribute of the
    log file to detect whether it has to be rotate yet or not. In cases when the
    file is changed within the rotatation period the st_mtime is also updated
    to the current time and the rotation never happens.
    
    It's more appropriate to check the file creation time (st_ctime) instead.
    Whenever available, the more appropriate st_birthtime will be in use. (This
    feature is available on FreeBSD, MacOS and Windows at the moment.) If
    the st_mtime would be newer than st_ctime (e.g.: because the inode
    related to the file has been changed without any file content
    modification), then the earliest attribute will be used.
    martonivan added a commit to martonivan/cpython that referenced this issue Mar 7, 2024
    The TimedRotatingFileHandler previously only used st_mtime attribute of the
    log file to detect whether it has to be rotate yet or not. In cases when the
    file is changed within the rotatation period the st_mtime is also updated
    to the current time and the rotation never happens.
    
    It's more appropriate to check the file creation time (st_ctime) instead.
    Whenever available, the more appropriate st_birthtime will be in use. (This
    feature is available on FreeBSD, MacOS and Windows at the moment.) If
    the st_mtime would be newer than st_ctime (e.g.: because the inode
    related to the file has been changed without any file content
    modification), then the earliest attribute will be used.
    martonivan added a commit to martonivan/cpython that referenced this issue Mar 7, 2024
    The TimedRotatingFileHandler previously only used st_mtime attribute of the
    log file to detect whether it has to be rotate yet or not. In cases when the
    file is changed within the rotatation period the st_mtime is also updated
    to the current time and the rotation never happens.
    
    It's more appropriate to check the file creation time (st_ctime) instead.
    Whenever available, the more appropriate st_birthtime will be in use. (This
    feature is available on FreeBSD, MacOS and Windows at the moment.) If
    the st_mtime would be newer than st_ctime (e.g.: because the inode
    related to the file has been changed without any file content
    modification), then the earliest attribute will be used.
    @DmitriSaburov
    Copy link

    Let me say a couple of words too) I faced rotation bug about 10 years ago. The correction based on ctime not seems good to me.
    Lets see my real case. I have a program (on windows) running every 5 minutes by scheduler. And I want log rotation once per month (30 days). Due to the fact that base log creation time will not changing(windows feature :) every time program run the rotation will occur. Not what I want.
    Ten years ago I made a workaround(inherited class) that analyze ROTATED logs list in order to determine when rotation happened. For first rotation to work I used creation time.
    Something like that:
    In the end of MyTimedRotatingFileHandler.init:
    self.rolloverAt = self.computeRollover(self.getlastrollover())
    where:

        def getlastrollover(self):
            """
            Get last rollover moment based on rolled logs
            :return: timestamp of last rollover or CREATE time of current log or current time
            """
            dirName, baseName = os.path.split(self.baseFilename)
            fileNames = os.listdir(dirName)
            result = []
            prefix = baseName + "."
            plen = len(prefix)
            for fileName in fileNames:
                if fileName[:plen] == prefix:
                    suffix = fileName[plen:]
                    if self.extMatch.match(suffix):
                        f2add = os.path.join(dirName, fileName)
                        mdft = os.stat(f2add).st_mtime
                        result.append((f2add, mdft))
            if len(result):
                sortedresult = sorted(result, key=lambda x: x[1])
                t = sortedresult[len(sortedresult)-1][1]
                del sortedresult
            else:
                if os.path.exists(self.baseFilename):
                    t = os.stat(self.baseFilename)[ST_CTIME]  # before first rolling check creation date
                else:
                    t = int(time.time())
             return t
    

    May be this can be useful for somebody.
    Best regards.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life stdlib Python modules in the Lib dir
    Projects
    Status: No status
    Development

    No branches or pull requests

    6 participants