Python Primers - Moving Old Files To ZIP Archives

In this Python Primer, we will walk through the process of moving all files in the current directory that are older than a certain date into a ZIP file. We will archive files that were created more than one week ago. This one-week cutoff is configurable, of course!

We will use the datetime, os and zipfile standard library modules to achieve this.


Objectives

In this post, we will learn how to:

  • Create ZIP files from a list of files
  • Filter all files in a directory down to only those created before a particular date/time

Code

Create a script called archive_files.py (or any name you like), and place this file within a directory that contains many files. We're going to add Python code to this file that will allow us to find all files created over a week ago, and move them to a ZIP file.

Firstly, we need to acquire a timestamp that represents one week prior to the current date (i.e. one week prior to the time that the script is executed). We are going to use the datetime module to get this timestamp. Add the following code to your Python file.

from datetime import datetime, timedelta
import os

# Get a datetime object for exactly one week prior
last_week = datetime.now() - timedelta(days=7)
# print(last_week)

# convert datetime object to timestamp
timestamp = datetime.timestamp(last_week)

print(timestamp)  # 1640179392.563389

We can get a datetime object by calling datetime.now() and subtracting timedelta(days=7). This gives us a datetime object for exactly one week prior to the current date/time.

Next, we convert that object to a timestamp using the datetime.timestamp() method, which returns a timestamp for one week ago.

The timestamp represents the number of seconds that have passed since 00:00:00 UTC on 1 January 1970 (which is known as the Unix epoch).
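
To make the relationship between the datetime object and the timestamp concrete, here is a minimal sketch. The names restored and seconds_in_week are illustrative and not part of the script; last_week.timestamp() is the same call as datetime.timestamp(last_week) used above.

```python
from datetime import datetime, timedelta

# One week before now, as a datetime and as seconds since the Unix epoch
last_week = datetime.now() - timedelta(days=7)
timestamp = last_week.timestamp()

# fromtimestamp() reverses the conversion (back to local time)
restored = datetime.fromtimestamp(timestamp)
print(restored)

# A one-week cutoff is 7 * 24 * 60 * 60 seconds before the current time
seconds_in_week = timedelta(days=7).total_seconds()
print(seconds_in_week)  # 604800.0
```

Changing the argument to timedelta (for example, timedelta(days=30)) is all it takes to adjust the cutoff.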

Now that we have our timestamp, we are going to get a list of all files in the current directory, using the following code:

files = [f for f in os.listdir() if os.path.isfile(f)]

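This code is explained in this post, and returns a list of all files in the current directory. To use a different directory, pass an absolute or a relative path to the os.listdir() function. Note that os.listdir() returns bare file names, so for another directory each name must also be joined with that directory (using os.path.join()) before it is passed to os.path.isfile().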
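
As a rough sketch of listing files in a directory other than the current one, the hypothetical helper list_files below joins each name with the directory before testing it. The temporary directory and the file names in it exist only for the demonstration.

```python
import os
import tempfile

def list_files(directory="."):
    """Return paths of the regular files directly inside `directory`."""
    paths = []
    for name in os.listdir(directory):
        # os.listdir() returns bare names, so join them with the directory
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            paths.append(path)
    return paths

# Demonstration on a throwaway directory with one file and one sub-directory
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "notes.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))
    found = list_files(tmp)

print([os.path.basename(p) for p in found])  # ['notes.txt']
```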

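Now that we have files, we need to find out when they were created. We can use the os.stat(f) function to get statistics about a file, including its st_ctime attribute. On Windows, st_ctime is the file's creation time; on Unix systems such as Linux and macOS, it is the time of the last metadata change, which is often, but not always, the same as the creation time. Code for this is below.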

# iterate over the files and get their creation times
for file in files:
    created = os.stat(file).st_ctime
    print(created)

This will print the creation timestamps for all of the files in the directory. We can now compare these timestamps to the timestamp that represents one week ago, and add all the files created over a week ago to a list, as below.

# add files to list if created more than a week ago
old = []
for file in files:
    created = os.stat(file).st_ctime
    if created < timestamp:
        old.append(file)

print(old)

This will print out all the files in the directory that were created over a week ago. And now that we have this list, we can add all of the files within it to a ZIP archive with the following code.

from zipfile import ZipFile

with ZipFile('old.zip', 'w') as zipfile:
    for f in old:
        zipfile.write(f)

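We use the zipfile module and open a ZipFile object using a context manager, and write each file in our list of old files to the archive called old.zip. Note that this copies the files into the archive; the originals remain in the directory until they are deleted.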
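
By default, ZipFile stores files without compressing them. The sketch below, which is an optional variation rather than part of the script, passes compression=ZIP_DEFLATED and then re-opens the archive to check its contents. The file name report.txt and the temporary directory are made up for the demonstration, and ZIP_DEFLATED assumes the zlib module is available.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

with tempfile.TemporaryDirectory() as tmp:
    # Create a sample file with highly repetitive content
    source = os.path.join(tmp, "report.txt")
    with open(source, "w") as fh:
        fh.write("hello " * 1000)

    archive_path = os.path.join(tmp, "old.zip")
    # ZIP_DEFLATED compresses entries; the default, ZIP_STORED, does not
    with ZipFile(archive_path, "w", compression=ZIP_DEFLATED) as archive:
        # arcname controls the name stored inside the archive
        archive.write(source, arcname="report.txt")

    # Re-open the archive to confirm what it contains
    with ZipFile(archive_path) as archive:
        names = archive.namelist()
        info = archive.getinfo("report.txt")

print(names)  # ['report.txt']
print(info.compress_size < info.file_size)  # True
```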

That's it! Easy as that. The final code is shown below.

from datetime import datetime, timedelta
import os
from zipfile import ZipFile

# Get a datetime object for exactly one week prior
last_week = datetime.now() - timedelta(days=7)

# convert datetime object to timestamp
timestamp = datetime.timestamp(last_week)

# get list of files in current directory
files = [f for f in os.listdir() if os.path.isfile(f)]

# Add files to list if created more than a week ago
old = []
for file in files:
    created = os.stat(file).st_ctime
    if created < timestamp:
        old.append(file)

# Add old files to ZIP archive
with ZipFile('old.zip', 'w') as zipfile:
    for file in old:
        zipfile.write(file)
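
Because st_ctime is not a true creation time on every platform, a more portable script might look for a birth time first. The helper below is a hypothetical sketch, not part of the script above; falling back to the modification time where no creation time is available is an assumption about what "old" should mean, and you may prefer a different rule.

```python
import os
import tempfile

def creation_time(path):
    """Best-effort creation timestamp for `path` (hypothetical helper)."""
    stats = os.stat(path)
    # st_birthtime is only present on some platforms (macOS, for example)
    birthtime = getattr(stats, "st_birthtime", None)
    if birthtime is not None:
        return birthtime
    if os.name == "nt":
        # On Windows, st_ctime holds the creation time
        return stats.st_ctime
    # Elsewhere (e.g. Linux), fall back to the last modification time
    return stats.st_mtime

# Demonstration on a temporary file
with tempfile.NamedTemporaryFile() as tmp:
    created = creation_time(tmp.name)

print(created)
```

In the script, os.stat(file).st_ctime could then be swapped for creation_time(file).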

Bonus: Convert Loop to Generator Expression

One modification we might make, to make our code more concise, is to use a generator expression (or alternatively, a list comprehension) to define our variable old. We can change the following lines:

old = []
for file in files:
    created = os.stat(file).st_ctime
    if created < timestamp:
        old.append(file)

To a single-line generator expression:

old = (f for f in files if os.stat(f).st_ctime < timestamp)

This shortens five lines of code to a single line, and is more concise and 'Pythonic'. The resulting object, old, is a generator, which we can loop over as before to add files to our ZIP archive. The final code is shown below.

from datetime import datetime, timedelta
import os
from zipfile import ZipFile

# Get a datetime object for exactly one week prior
last_week = datetime.now() - timedelta(days=7)

# convert datetime object to timestamp
timestamp = datetime.timestamp(last_week)

# get list of files in current directory
files = [f for f in os.listdir() if os.path.isfile(f)]

# Create generator that yields files created more than a week ago
old = (f for f in files if os.stat(f).st_ctime < timestamp)

# Add old files to ZIP archive
with ZipFile('old.zip', 'w') as zipfile:
    for file in old:
        zipfile.write(file)
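
One thing to keep in mind with the generator version is that a generator can only be iterated once. The small sketch below illustrates this with made-up file names, and a set membership test standing in for the os.stat() comparison.

```python
files = ["a.txt", "b.txt", "c.txt"]
# Stand-in for the os.stat() check: pretend only these files are old
old_names = {"a.txt", "c.txt"}

old = (f for f in files if f in old_names)

first_pass = list(old)   # consumes the generator
second_pass = list(old)  # nothing left to yield

print(first_pass)   # ['a.txt', 'c.txt']
print(second_pass)  # []
```

If the list of old files is needed more than once (for example, to archive the files and then delete them), a list comprehension with square brackets is the safer choice.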

Summary

In this post, we have demonstrated how to use Python to archive old files into a ZIP file. From here, you can do any number of tasks. Some suggested extensions to this script include:

  • Accept the target directory as a command-line argument.
  • Accept the ZIP file name as a command-line argument.
  • Implement the timestamp code with the 'time' module and timestamp arithmetic [hint: look at the time.time() function].
  • Delete the old files from the directory after saving to the ZIP archive (be careful with this!)
  • Programmatically send the final ZIP file to cloud storage such as AWS S3 (see the boto3 library).
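
As a starting point, the first four extensions could be sketched roughly as follows. This is one possible approach under stated assumptions, not a definitive implementation: the function names archive_old_files and parse_args, and the option names --zip-name, --days and --delete, are all invented for this example. The demonstration runs in a temporary directory so that no real files are deleted.

```python
import argparse
import os
import tempfile
import time
from zipfile import ZipFile

SECONDS_PER_DAY = 24 * 60 * 60

def archive_old_files(directory, zip_name, days=7, delete=False):
    """Archive files in `directory` older than `days` days; return their names."""
    # Timestamp arithmetic with time.time() instead of datetime
    cutoff = time.time() - days * SECONDS_PER_DAY
    old = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Skip directories and the archive itself
        if name == zip_name or not os.path.isfile(path):
            continue
        if os.stat(path).st_ctime < cutoff:
            old.append(name)
    with ZipFile(os.path.join(directory, zip_name), "w") as archive:
        for name in old:
            archive.write(os.path.join(directory, name), arcname=name)
    if delete:
        # Only remove originals once the archive has been written and closed
        for name in old:
            os.remove(os.path.join(directory, name))
    return old

def parse_args(argv=None):
    """Parse the (hypothetical) command-line options for the script."""
    parser = argparse.ArgumentParser(description="Archive old files into a ZIP")
    parser.add_argument("directory", nargs="?", default=".")
    parser.add_argument("--zip-name", default="old.zip")
    parser.add_argument("--days", type=float, default=7)
    parser.add_argument("--delete", action="store_true")
    return parser.parse_args(argv)

# Demonstration in a throwaway directory. days=-1 is passed directly to put
# the cutoff in the future, so the freshly created file counts as "old".
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "stale.log"), "w").close()
    args = parse_args([tmp, "--delete"])
    archived = archive_old_files(args.directory, args.zip_name,
                                 days=-1, delete=args.delete)
    remaining = sorted(os.listdir(tmp))

print(archived)   # ['stale.log']
print(remaining)  # ['old.zip']
```

In a real script, you would call parse_args() with no arguments so that it reads sys.argv, and pass args.days through unchanged.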

