Reading a file directory with Python

Published: Jul 24, 2021

Last updated: Jul 24, 2021

This is Day 5 of the #100DaysOfPython challenge.

This post uses the glob library to read the files in a base directory and put them into a list.

Prerequisites

Familiarity with Pipenv. See here for my post on Pipenv.
Familiarity with JupyterLab. See here for my post on JupyterLab.
A list of icons (or just any files) in an icons directory.

Getting started

Let's create the hello-read-dir-python directory and add an icons directory to read from.

# Make the `hello-read-dir-python` directory
$ mkdir hello-read-dir-python
$ cd hello-read-dir-python
# Create a folder to place your icons
$ mkdir icons

# Init the virtual environment
$ pipenv --three
$ pipenv install pillow
$ pipenv install --dev jupyterlab

Be sure to put some files into the icons directory. They do not need to be images, but the code in this post targets that folder.

Now we can start up the notebook server.

# Startup the notebook server
$ pipenv run jupyter-lab
# ... Server is now running on http://localhost:8888/lab

The server will now be up and running.

Creating the notebook

Once on http://localhost:8888/lab, create a new Python 3 notebook from the launcher.

Ensure that this notebook is saved in hello-read-dir-python/docs/<your-file-name>.

We will create two cells to handle two parts of this script:

A cell to import the required packages.
A cell to read the directory.

Importing the packages

We will use the glob library to read the directory.

The glob library returns a list of all the paths that match a glob pattern. It can also recurse through subdirectories when the pattern contains ** and recursive=True is passed.
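As a minimal sketch of how glob patterns behave (the temporary directory and file names below are made up for illustration), a plain `*` only matches entries directly inside one directory, while descending into subdirectories needs the `**` pattern together with `recursive=True`:

```python
import glob
import os
import tempfile


def demo_glob_patterns():
    """Build a throwaway directory tree and return the matches for three patterns."""
    with tempfile.TemporaryDirectory() as base:
        # Hypothetical layout: two files at the top level, one in a subfolder
        os.makedirs(os.path.join(base, "nested"))
        for rel in ("git.png", "notes.txt", os.path.join("nested", "aws.png")):
            open(os.path.join(base, rel), "w").close()

        def names(pattern, **kwargs):
            # Return paths relative to base, sorted, so results are comparable
            matches = glob.glob(os.path.join(base, pattern), **kwargs)
            return sorted(os.path.relpath(m, base) for m in matches)

        return {
            "everything": names("*"),
            "top_level_png": names("*.png"),
            "recursive_png": names("**/*.png", recursive=True),
        }


results = demo_glob_patterns()
print(results)
```

Note that `*` also matches the `nested` folder itself, so a pattern such as `*.png` is the safer choice when only files of one type are wanted.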

We also use functions from the os.path module to build a path to our icons directory relative to the notebook.

Update hello-read-dir-python/docs/<your-file-name> to have the following:

import glob
from os.path import join, dirname, abspath

Reading the directory

With the glob library imported, we can use it to read the directory and let it do the heavy lifting for us.

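# A notebook has no __file__ variable, so the string "__file__" is a stand-in:
# abspath resolves it against the working directory (normally the notebook's folder)
icons_dir = join(dirname(abspath("__file__")), '../icons')
# Collect every entry directly inside the icons directory
icon_paths = glob.glob(f"{icons_dir}/*")
print(icon_paths)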

In my case, the following is printed to the console for the four files that I have in that folder:

['/path/to/hello-read-dir-python/docs/../icons/git.png', '/path/to/hello-read-dir-python/docs/../icons/python.png', '/path/to/hello-read-dir-python/docs/../icons/aws.png', '/path/to/hello-read-dir-python/docs/../icons/dok.png']
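The `..` segments in those paths are harmless, but they can be collapsed if cleaner output is preferred. One possible approach, sketched here with a hypothetical helper named `tidy_paths`, is to pass each path through `os.path.normpath` and sort the result, since the order returned by `glob.glob` depends on the file system:

```python
from os.path import normpath


def tidy_paths(paths):
    """Collapse '..' segments and return the paths in sorted order."""
    return sorted(normpath(p) for p in paths)


# Example input shaped like the output above (placeholder paths)
raw = [
    "/path/to/hello-read-dir-python/docs/../icons/python.png",
    "/path/to/hello-read-dir-python/docs/../icons/git.png",
]
print(tidy_paths(raw))
```

`normpath` works purely on the path string, so it may change what a path points to if symbolic links are involved.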

With that list assigned to a variable, we can iterate through it and do whatever we need with those files, e.g. manipulate the images, read the file contents, write file contents back and more.
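As one possible illustration of that kind of loop, the sketch below walks a list of paths and collects each file's name and size in bytes. The `summarise_files` helper and the demo files are hypothetical and only use the standard library. Swapping the body of the loop for image manipulation or file reads would follow the same shape.

```python
import glob
import os
import tempfile


def summarise_files(paths):
    """Return (file name, size in bytes) pairs for every regular file in paths."""
    summary = []
    for path in sorted(paths):
        if os.path.isfile(path):  # skip any subdirectories the pattern matched
            summary.append((os.path.basename(path), os.path.getsize(path)))
    return summary


# Demo against a throwaway directory so the example runs anywhere
with tempfile.TemporaryDirectory() as demo_dir:
    for name, content in (("git.png", b"abc"), ("python.png", b"hello")):
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(content)
    demo_summary = summarise_files(glob.glob(os.path.join(demo_dir, "*")))

print(demo_summary)
```

In the notebook, the same function could be called on the result of `glob.glob(f"{icons_dir}/*")`.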

Summary

Today's post demonstrated how to use the glob library to programmatically get all the matching files within a folder.

I use techniques such as this within my own blog repository to grab relevant markdown and image files to generate the blog post images or do things with the blog metadata (like creating the list for my home page).