Creating a command line tool

Most of the examples we have seen up to now used Python through the interactive IPython environment. Often, however, it is useful to run your script from the command line. In this chapter we will see how to write Python code that can be used both from the command line and from the interactive interpreter.

The difference with the command line

Let us examine for a minute the following file, example.py:

# filename: example.py
def simple_function():
    print("This is inside the function.")

print("This is outside the function.")

Imagine you run the following script from the command line:

python example.py

The output will be:

This is outside the function.

When a Python file is run from the command line, all the commands in the file are executed. In our case the function is first defined, and then the print call is executed. The function itself is never called, so its message does not appear.

Now imagine that you want to use simple_function from the interactive interpreter:

>>> import example
This is outside the function.

>>> example.simple_function()
This is inside the function.

What just happened? When we imported the file, all the code inside it was executed, including the print call outside the function.

This is not ideal. We would like to import and use a function from a file without executing any of its other commands.

Keeping some commands only for the command line

The way to keep some commands only for the command line is to use the following code:

if __name__ == "__main__":
    <your commands>

Each Python module has a built-in attribute called __name__. When the module is the main program (i.e. it is run from the command line), the variable __name__ gets the string value "__main__".

Instead, when the module is imported the variable __name__ holds the name of the imported module.

You can check this with this very simple file:

# filename: test_name.py
print("The __name__ variable is: " + __name__)

Then try both running it from the command line and importing it:

python test_name.py
The __name__ variable is: __main__

and:

>>> import test_name
The __name__ variable is: test_name
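The same behavior can be observed with any imported module, not only with our own files. As a small sketch, the snippet below prints the __name__ of two standard library modules (json and glob, chosen here only for illustration) next to the __name__ of the file that imports them:

```python
import glob
import json

# An imported module reports its own module name.
print(json.__name__)  # json
print(glob.__name__)  # glob

# The file that is being run reports "__main__" when it is started
# from the command line, and its module name when it is imported.
print(__name__)
```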

We can now change the example.py file to print something only when run from the command line:

# filename: example.py
def simple_function():
    print("This is inside the function.")

if __name__ == "__main__":
    print("This is outside the function.")

When we import this file nothing is printed. The variable __name__ is not equal to "__main__", so the print call is never run.
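A common convention, which this chapter does not require, is to collect the commands meant for the command line in a function of their own and call only that function inside the check. The sketch below is a variant of example.py written this way; the function name main is an assumption made for illustration, any name would work:

```python
# filename: example.py (variant with a main() function, for illustration)
def simple_function():
    print("This is inside the function.")


def main():
    """Commands that should run only from the command line."""
    print("This is outside the function.")
    simple_function()


if __name__ == "__main__":
    main()
```

With this layout the commands are still skipped on import, but they can also be run on purpose from the interpreter by calling example.main().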

Reading user input

One way of making your script more flexible is to ask the user for input parameters that control the program execution. In Python 3 this is done with the built-in input function (in Python 2 the equivalent function is called raw_input):

if __name__ == "__main__":
    color = input("What color is your car? ")
    print("Your car is %s" % color)

The input function always returns a string, so if you are expecting a number you have to do the conversion yourself.
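The conversion to a number can fail when the user types something else, so it is usually wrapped in a try/except. Below is a minimal sketch, assuming Python 3, where the built-in is called input (raw_input in Python 2). The helper ask_for_number and its read parameter are hypothetical names introduced for this example; read exists only so that the function can be tried without typing at the keyboard:

```python
def ask_for_number(prompt, read=input):
    """Ask until the user types a whole number and return it as an int.

    `read` is the function used to get the text; by default the built-in
    input, but any function taking a prompt and returning a string works.
    """
    while True:
        text = read(prompt)
        try:
            return int(text)
        except ValueError:
            # int() raises ValueError for text such as "abc" or "3.5"
            print("%r is not a whole number, please try again." % text)
```

Inside the if __name__ == "__main__": block it could be used as doors = ask_for_number("How many doors does your car have? ").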

Providing arguments for the command line

A command line tool becomes even more useful when it accepts arguments on the command line that control its behavior. Command line arguments also allow the script to be used in batch scripts or in a chain of commands, which is not possible when interactive user input is required.

We will see here how we can read the command line arguments in our script and change the behavior of our program accordingly.

We will write a small program that counts the number of '.txt' files in a directory. We start by writing a function that does just this and save it in a file. A simple implementation could be:

# filename: count_txt.py
import os
import glob  # Module to read the content of the directory

def count_txt(directory):
    """Count the number of txt files in the provided directory."""

    # Find the search path in a system-independent way
    # this will give e.g. /my/dir/*.txt
    search_str = os.path.join(directory, '*.txt')

    # Get a list with all the files
    files = glob.glob(search_str)

    # Count the files in the list
    number_of_files = len(files)

    return number_of_files

The above function gets a list of all the files with the .txt ending in a directory and returns their number. We now want to provide the directory name on the command line, e.g.:

python count_txt.py /my/dir/

One way to do this is to use the sys module. This module provides the list sys.argv, which contains the command line arguments. The name of the script is stored in the first position, sys.argv[0], the first argument is at position 1, and so on. The simplest way to use this is:

# filename: count_txt.py (cont.)
import sys  # Add sys to read the command line arguments

if __name__ == "__main__":
    directory = sys.argv[1]  # Get the first command line argument
    number_of_files = count_txt(directory)

    print("The number of txt files is %s." % number_of_files)

You can now run the script from the command line (changing, of course, the path to a real directory):

python count_txt.py /home/user/mydir/
The number of txt files is 3.
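Reading sys.argv[1] directly fails with an IndexError when the script is started without an argument. As an alternative to the sys.argv approach described above, the standard library module argparse can do the parsing and also generates a --help message. The following is only a sketch of that alternative: the helper names parse_args and main, and the choice to fall back to the current directory when no argument is given, are assumptions made for this example.

```python
# filename: count_txt.py (hypothetical variant using argparse)
import argparse
import glob
import os


def count_txt(directory):
    """Count the number of txt files in the provided directory."""
    return len(glob.glob(os.path.join(directory, "*.txt")))


def parse_args(argv=None):
    """Parse the arguments; with argv=None argparse reads sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Count the .txt files in a directory.")
    # nargs="?" makes the argument optional; "." is the current directory
    parser.add_argument("directory", nargs="?", default=".",
                        help="directory to search (default: current one)")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    print("The number of txt files is %s." % count_txt(args.directory))


if __name__ == "__main__":
    main()
```

Because parse_args accepts an explicit list, the parsing can also be tried from the interactive interpreter, e.g. parse_args(["/my/dir/"]).directory gives back the string "/my/dir/".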