Logging the activity of programs is extremely useful. It’s most common in programs that run services, such as web servers. They log requests and authentication events, often as a crowded stream of messages that requires further software to analyse. But even if you have a small project, it is worth keeping track of what’s going on.

Most people use the handy print function to log the activity of their program. It’s very common to see code such as print("Loading data..."), but these print statements do not scale well and can get messy to keep track of. Following the online reference, here is how to structure your project and use the logging module in Python.

# Suppose this is our main.py,
# the file we use to start our program
import argparse
import logging

# This is another module that we use logging in
import mylib

parser = argparse.ArgumentParser(description="Very descriptive.")
parser.add_argument("--log", choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'], default="INFO", help="Log level.")
ARGS = parser.parse_args()

# Set up logging
# Choose ONE of these options: basicConfig() does nothing if the
# root logger already has handlers, so a second call is ignored
# Option 1: set the level of logging required
# and print to standard error by default
logging.basicConfig(level=getattr(logging, ARGS.log))
# Option 2: similar to above but save to file
# (the encoding argument needs Python 3.9 or later)
# logging.basicConfig(filename='example.log',
#                     encoding='utf-8',
#                     level=getattr(logging, ARGS.log))

# This line creates the logger with the module name
# It's pretty much this line that allows us to scale
# logging across our application and keep track of
# who's logging what
logger = logging.getLogger(__name__)

logger.debug("Program guts open for inspection.")
logger.info("This is some useful event.")
logger.warning("Be careful, something is up.")
logger.error("Something went wrong, you should investigate.")
logger.critical("Panic, hopefully we didn't blow something up.")

mylib.myfunc()

Now in all other files, you can simply set up logging using:

# some other module mylib.py
import logging

# Here the module name will be prefixed to the logs
logger = logging.getLogger(__name__)

def myfunc():
    logger.debug("Program guts open for inspection.")
    logger.info("This is some information.")
    logger.warning("Be careful, something is up.")
    logger.error("Something went wrong, you should investigate.")
    logger.critical("Panic, hopefully we didn't blow something up.")

The most important part is really logging.getLogger(__name__), which gives each module its own logger, named after the module. This way you can keep track of who is logging what, where "who" means which module.
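To make the idea of per-module loggers concrete, here is a minimal sketch. The names "myapp.data" and "myapp.train" are made up for illustration (in a real module you would pass __name__), and the output is captured in memory only so that it is easy to inspect.

```python
import io
import logging

# Capture output in memory so the example is easy to inspect;
# a real program would normally log to stderr or a file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(name)s is the logger name, which is what tells us "who" logged.
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# Hypothetical module names; in a real module you would pass __name__.
data_logger = logging.getLogger("myapp.data")
train_logger = logging.getLogger("myapp.train")

data_logger.info("Loaded dataset.")
train_logger.warning("Training stopped early.")

# Asking for the same name again returns the same logger object.
same_logger = logging.getLogger("myapp.data")

output = stream.getvalue()
print(output)
```

Each line of output carries the name of the logger that produced it, so messages from different modules stay distinguishable even when they end up in the same stream.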

In the above example, we set the logging level, essentially what information we want our application to report to us. There are 5 main levels of logging:

  1. logger.debug() concerns itself with spilling out operational information that might help with debugging the application. I often use it to report metadata such as the size of datasets. If the application does not work, I can enable the debug level logs by setting level=logging.DEBUG and investigate whether, for example, the dataset is empty.
  2. logger.info() reports expected activity. Loaded data, logged user in, useful events. Everything is fine and I’m letting you know these things are happening. These messages are very useful because if something does go wrong, you can see from the logs the last legitimate step the application completed.
  3. logger.warning() talks about potential issues that you should investigate. The most common example is deprecation warnings. In my projects, I often use them to report activity the program could handle but that the user really shouldn’t do. For example, warn the user that they are about to do something stupid. Or that a file does not exist and I will create one for them. Another common example is a model whose training finished early, so that the user can check whether that is okay.
  4. logger.error() is for when something actually went wrong. The programmer should investigate and fix it. In my projects this is often an action that the user requested and that is invalid, but after which the program can carry on. I use it when I catch an exception but can continue running the program. If you find yourself suppressing an error but resuming execution, you should report it!
  5. logger.critical() tells about a bomb that has already exploded. Honestly, I rarely use it, since any runtime error that is not handled will just terminate the program and print the error message.
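The levels above act as a threshold: messages below the configured level are dropped. Here is a small sketch of that filtering, together with reporting a caught exception at the error level. The logger name "levels_demo" and the helper parse_row_count are hypothetical, made up for this example.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("levels_demo")  # hypothetical name
logger.addHandler(handler)
logger.propagate = False  # keep the demo output away from the root logger
logger.setLevel(logging.WARNING)

logger.debug("Dataset has 0 rows.")  # dropped: below WARNING
logger.info("Loaded data.")          # dropped: below WARNING
logger.warning("File missing, creating it.")


def parse_row_count(text):
    """Hypothetical helper: return an int, or None when text is invalid."""
    try:
        return int(text)
    except ValueError:
        # We suppress the exception and carry on, so we report it.
        logger.error("Invalid row count %r, skipping.", text)
        return None


result = parse_row_count("ten")
lines = stream.getvalue().splitlines()

# Lowering the threshold lets the debug message through.
logger.setLevel(logging.DEBUG)
logger.debug("Dataset has 0 rows.")
debug_lines = stream.getvalue().splitlines()
```

If you also want the traceback in the log, logger.exception() can be called inside the except block; it logs at the error level and appends the exception information.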

As a final note, use your intuition on what to log. Don’t log everything and don’t log nothing. Think about what the critical steps of your application are and what information might be useful to know at what level. For most machine learning projects I’ve worked on, these include the steps surrounding loading, processing and saving datasets; starting, resuming and terminating training loops; and evaluation. Keep it simple and tidy.
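As a rough illustration of choosing levels for such steps, here is a toy sketch. The functions load_data and train, the fake loss and the logger name "pipeline_demo" are all invented for this example and do not come from any real project.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("pipeline_demo")  # hypothetical name
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(logging.INFO)


def load_data():
    """Hypothetical loader returning a toy dataset."""
    data = [1.0, 2.0, 3.0, 4.0]
    logger.info("Loaded dataset.")
    logger.debug("Dataset size: %d", len(data))  # metadata, debug only
    return data


def train(data, max_epochs=5, tolerance=0.2):
    """Toy loop: halve a fake loss each epoch and stop once it is small."""
    logger.info("Starting training for up to %d epochs.", max_epochs)
    loss = sum(data) / len(data)
    for epoch in range(1, max_epochs + 1):
        loss /= 2
        logger.debug("Epoch %d loss %.4f", epoch, loss)
        if loss < tolerance:
            logger.warning("Training finished early at epoch %d.", epoch)
            break
    logger.info("Training finished.")
    return loss


final_loss = train(load_data())
lines = stream.getvalue().splitlines()
```

At the INFO level the log shows only the main steps and the early-stop warning; switching to DEBUG would add the dataset size and the per-epoch loss.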