Python

Python is a programming language created by Guido van Rossum in the late 1980s.  As the developer and long-time overseer of the Python language, he was designated the “Benevolent Dictator for Life” (BDFL), a role he stepped down from in 2018.  He worked at Google on Python-related topics from 2005 to 2012.  The name is a reference to Monty Python’s Flying Circus.

Python is an open source project with an emphasis on group development.  Anyone can propose a modification or update to Python by submitting a PEP, or Python Enhancement Proposal.  (The process is described in the very first PEP.)  The Python community as a whole is friendly and helpful.  The Python forums are a great place for discussion of anything related to Python, and there are web sites that provide free documentation on the use of Python.

The design goals of Python make it suitable for learning (and teaching), prototyping, and general scripting.  It can be used instead of awk or Perl.  (However, as with Perl, it is not mandated by POSIX; only awk is.)  Python has a much simpler syntax than Perl, and comes with a large standard library (compared to awk).  It is interpreted, and so is portable to any system with an interpreter installed.  (Qu: what is an interpreter?  What is a compiler?)  Several GUI toolkits are available for Python, including the old Tk toolkit (bundled with Python) and the more modern Qt toolkit.  Thus, it is easy to create GUI scripts with Python.  Python supports simple scripts, scripts with functions, and “object-oriented” scripts suitable for large-scale development.

For these reasons, “Python is the new Perl” (that is, it has become quite popular).  Today many new scripts are written in Python rather than Perl.  (Consider the Red Hat installer program, Anaconda: it’s a big Python script, hence the name.)  One reason for this switch is that Perl has grown too complex to learn easily.  I still use Perl, but usually by reusing someone else’s code that I found on CPAN or borrowed from some Perl cookbook (I own some).

Python has had three major versions; the latest is not compatible with the earlier versions, and is often installed as “python3” while version 2 is installed as “python”.  If you want to use Qt for the GUI stuff, you need to install that, in addition to “PyQt” (the Python library providing access to Qt).  I installed the following packages: python, python3, PyQt-examples, PyQt4, python3-PyQt, python-tools (provides idle, but only for python2; for Python3 (“idle3”), install python3-tools), tkinter, and python3-tkinter .

Python Basics

Variables are created when you first assign something to one.  The naming rules are simple: letters, digits, and underscores, and can’t start with a digit.  Python keeps lists of variables and the “objects” they refer to.  These lists are called namespaces.  When using some variable, Python will look for its name in the local namespace, then the global namespace, and then the built-in namespace.  (This is only important if you create functions, or have multiple modules of code.)
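Here is a minimal sketch of that lookup order.  The names greeting and show_lookup are made up for this illustration; only len is a real built-in.

```python
# "greeting" and "show_lookup" are hypothetical names for this sketch.
greeting = "hello"           # stored in the global (module) namespace

def show_lookup():
    greeting = "howdy"       # a different variable, in the local namespace
    # Python finds the local "greeting" first; "len" is not local or
    # global, so it is found last, in the built-in namespace.
    return greeting, len(greeting)

local_result = show_lookup()
print(local_result)          # ('howdy', 5)
print(greeting)              # hello  (the global variable is unchanged)
```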

Strings (text) can be quoted with either single or double quotes.

Comments are the same as for other scripting languages: a “#” starts a comment, through the end of that line.

Statements, as in shell and awk, end with a newline or a semicolon.

Unlike most other languages, Python doesn’t use curly braces to indicate a “block” of code.  Instead, all statements in a block are indented by the same amount.

Here are some simple examples:

$ python3
Python 3.2.1 (default, Jul 11 2011, 18:55:33)
[GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> name = input("Enter your name: ")
Enter your name: Hymie
>>> print( "Hello, " + name)
Hello, Hymie
>>> ^D
$

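Notice you can use “+” to concatenate strings.  You can also do normal math operations.  Use “**” for exponentiation.  “/” does floating-point division, even on integers.  To get an integer result (the way shell arithmetic does) use “//” instead; note that “//” rounds down (floor division), so it differs from the shell’s truncation when the result is negative.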
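The operators above can be tried out with a short script like the following.  The variable names are only for illustration.

```python
# Illustrative variable names; the operators are standard Python 3.
joined = "Hello, " + "Hymie"   # string concatenation
power = 2 ** 10                # exponentiation: 1024
ratio = 7 / 2                  # floating-point division: 3.5
floor_pos = 7 // 2             # integer (floor) division: 3
floor_neg = -7 // 2            # rounds down, giving -4
trunc_neg = int(-7 / 2)        # truncates toward zero, like the shell: -3

print(joined, power, ratio, floor_pos, floor_neg, trunc_neg)
```

The last two lines show the one case where “//” and shell arithmetic disagree: a negative result.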

Python also supports complex numbers:  ((0+1j) ** 2).real # “-1.0”.  (You can also use “.imag” to extract the imaginary part.)

Also notice the interactive Python prompt of “>>>”.  At the prompt, you can type the name of any variable, or type in any expression, and Python will tell you the value.  An EOF (^D, or on DOS, ^Z) indicates the end; you can also use the exit() function.

Besides this basic mode of work, you can install a simple Python shell, or IDE, called IDLE.  This has a few features such as syntax coloring, auto-completion, and a command history.  It can work with python3, but currently requires a bit of customization to make that happen.

If you start a block, or continue a long line, the prompt changes to “...”.  The block ends when the next line starts with the previous block’s indent.  (In the interactive mode, you need to enter a blank line to end the block.)  Here’s an example of an if statement (similar to the shell’s if statement you have learned):

>>> name = input("Enter your name: ")
Enter your name: wayne
>>> if name == "wayne":
...   print("Welcome, Wayne!")
... else:
...   print( "Go away!")
...
Welcome, Wayne!

>>>

The end of the block was indicated by typing a line with no (or a different) indent; here, I just hit enter in column one and that was the end of the if statement.

Notice the colons.  Blocks always start with a line that ends in a colon.  Also note the else has the same indent (none in this example) as the if line.
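In a script file the same rules apply, except that no blank line is needed to end a block.  Here is a small sketch, with the name set directly (a made-up value) rather than read with input(), that also shows one block nested inside another:

```python
# "name" is given a hypothetical value here instead of using input().
name = "hymie"

if name == "wayne":
    message = "Welcome, Wayne!"
else:
    message = "Go away!"
    if name == "":
        # A nested block is indented one level deeper than its parent.
        message = "You didn't enter a name!"

print(message)    # Go away!
```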

Loops

The blocking and indenting works the same way with loops.  A loop is a way to repeat a block of code.  This is handy to do the same set of steps for each line in a file, for each cell in a table, for each user, for each command line argument, for each file in a directory, etc.  (As we will see, the shell and other languages such as awk have similar if statements and loops.)

Python has a while loop and a for loop, similar to those in other languages:

>>> for num in range(5):
...   print( num )
...
0
1
2
3
4
>>> stooges = ['Moe', 'Larry', 'Curly' ]
>>> for stooge in stooges:
...    print( stooge )
...
Moe
Larry
Curly
>>> num
4
>>> while num > 0:
...   print( num )
...   num = num - 1
...
4
3
2
1
>>>

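(We could have used “num -= 1” in the while loop above, but not “--num”; Python has no decrement operator, so “--num” just negates num twice.)  To look up the range() function, try “help(range)”.  Note, typing a range() call alone at the interactive prompt won’t show the resulting numbers, since in Python 3 range() returns a range object rather than a list.  Try “list(range(5))”.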

In Python, for loops always look like this: “for variable in list:”, where list can be any sequence (a list, a string, a range, and so on).  Python will execute the (indented) block that follows once for each value in the list, setting variable to that value before executing the block.  A while loop is a bit different; you specify some Boolean expression (one that evaluates to True or False).  If True, the block is executed.  Then the expression is evaluated again.  This continues until the expression evaluates to False.
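As a rough sketch of both loop types in script form (the variable names are invented for this example):

```python
# A for loop works on any sequence; here, the characters of a string.
letters = []
for ch in "abc":
    letters.append(ch)         # append() adds one item to the end of a list

# A while loop re-tests its Boolean expression before every pass.
countdown = []
num = 3
while num > 0:
    countdown.append(num)
    num -= 1                   # same as: num = num - 1

print(letters)                 # ['a', 'b', 'c']
print(countdown)               # [3, 2, 1]
```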

Lists and Strings

Python includes a rich set of operations that work on lists (arrays) and strings:

>>> name = "Wayne"
>>> name[0]
'W'
>>> name[-1]
'e'
>>> name[1:]
'ayne'
>>> name[0:-1]
'Wayn'
>>> toppings = ["meatball", "pepperoni", "sausage", "anchovies"]
>>> toppings[:-1]  # I hate anchovies!
['meatball', 'pepperoni', 'sausage']
>>> toppings[:2] + [ "pineapple" ] + toppings[2:]
['meatball', 'pepperoni', 'pineapple', 'sausage', 'anchovies']
>>> toppings
['meatball', 'pepperoni', 'sausage', 'anchovies']
>>> toppings[1:3] = [ "pineapple" ]
>>> toppings
['meatball', 'pineapple', 'anchovies']
>>> 'pineapple' in toppings
True
>>> 'sausage' in toppings
False

As you can see, you can easily obtain a slice from any list, add to a list, and insert or replace parts of a list.  You can check list membership.  The elements of lists can be anything, even other lists.
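Besides slices, lists have methods that change them in place.  The following sketch uses a few of the standard ones (append, insert) plus the del statement, and shows a list that contains other lists; the data is made up.

```python
# Hypothetical data; append(), insert(), del, and len() are standard.
toppings = ["meatball", "pepperoni", "sausage"]
toppings.append("mushroom")        # add to the end
toppings.insert(1, "pineapple")    # insert before index 1
del toppings[0]                    # remove the first item

pizzas = [["cheese"], toppings]    # a list whose elements are lists

print(toppings)        # ['pineapple', 'pepperoni', 'sausage', 'mushroom']
print(len(toppings))   # 4
print(pizzas[1][0])    # pineapple
```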

Python also supports associative arrays or hashes; they are called dictionaries:

>>> stooge_says = { 'Moe':'Oh, a wise-guy!',
... 'Larry':'Hey Moe!', 'Curly':'Woob-woo-woo!' }
>>> stooge_says['Moe']
'Oh, a wise-guy!'
>>>

Python also has read-only (“immutable”) lists, called tuples.  (It also has “sets”.)  While important, they won’t be discussed further here.

Notice the use of single-quotes; Python doesn’t care.  Also notice how the long line was continued, without extra indenting.
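Dictionaries can be changed and searched much like lists.  Here is a short sketch reusing the stooges data; the “Shemp” lookups and the default text are made up to show what happens with a missing key.

```python
stooge_says = {'Moe': 'Oh, a wise-guy!', 'Larry': 'Hey Moe!'}
stooge_says['Curly'] = 'Woob-woo-woo!'      # add (or replace) an entry

has_shemp = 'Shemp' in stooge_says           # "in" tests the keys
shemp_line = stooge_says.get('Shemp', '(silence)')   # default if missing

# Loop over the key/value pairs, in sorted key order.
lines_said = []
for stooge, line in sorted(stooge_says.items()):
    lines_said.append(stooge + ' says: ' + line)

print(has_shemp)        # False
print(shemp_line)       # (silence)
print(lines_said[0])    # Curly says: Woob-woo-woo!
```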

Python supports printf-like formatting of strings, in two ways:

>>> print( '{0:5d}{1:5d}' .format(3,4) )
    3    4
>>> print( '%5d%5d' % (3,4) )
    3    4
>>>
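The same two styles handle strings, alignment, and floating-point precision as well.  A small sketch, with a made-up label and value:

```python
# Left-align a label in 8 columns, then a number in 6 columns
# with 2 digits after the decimal point.
row_new = '{0:<8}{1:>6.2f}'.format('pi', 3.14159)   # format() style
row_old = '%-8s%6.2f' % ('pi', 3.14159)             # printf style

print(row_new)
print(row_old)
print(row_new == row_old)    # True
```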

Unlike Perl and awk, Python doesn’t include regular expressions in the language directly.  Instead they are provided by the re module.
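Here is a brief sketch of three commonly used functions from the re module (findall, search, and sub).  The sample text and patterns are invented for this illustration.

```python
import re

line = "Larry: 555-1234, Moe: 555-9876"    # hypothetical sample text

# findall() returns every non-overlapping match as a list of strings.
numbers = re.findall(r"\d{3}-\d{4}", line)

# search() returns a match object for the first match (or None).
match = re.search(r"(\w+): (\d{3}-\d{4})", line)
first_name = match.group(1)                # text of the first (...) group

# sub() replaces every match; here each digit becomes "#".
cleaned = re.sub(r"\d", "#", line)

print(numbers)       # ['555-1234', '555-9876']
print(first_name)    # Larry
print(cleaned)       # Larry: ###-####, Moe: ###-####
```

The r"..." (raw string) notation keeps Python from interpreting the backslashes before the re module sees them.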

Functions and Modules

Python allows you to define your own functions.  That allows you to avoid copy and paste, when you need to do the same sub-task from different places in your script.  You define the code once, and then invoke it (or call it) from different places in your script.  In addition, a lengthy script is often easier to understand if you break it down into shorter, simpler functions, each of which does some sub-task.

Functions can invoke other functions.  Functions can be passed arguments (just like any utility).  Here’s a simple example:

>>> def times2(item):
...   """ times2( number )
...       Returns number * 2
...   """
...   return item * 2
...
>>> times2( 3 )
6
>>> times2( 'Foo' )
'FooFoo'
>>>

Notice how the multiply works for strings too.  Python also allows you to define default values for arguments, and to name them.  Note the docstring.  Functions can start with a documentation string, showing others (or yourself in the future) how the function is meant to be used.  The docstring is shown if you use the built-in help function.
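As a sketch of default values and named arguments, here is a made-up variation on times2, called times, with a second argument that defaults to 2:

```python
def times(item, factor=2):
    """ times( item, factor=2 )
        Returns item * factor
    """
    return item * factor

doubled = times(3)                       # factor defaults to 2
explicit = times(3, 4)                   # positional arguments
named = times(factor=3, item='Foo')      # named arguments, in any order

print(doubled, explicit, named)          # 6 12 FooFooFoo
print(times.__doc__)                     # the docstring that help(times) shows
```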

You can do math on Boolean values too: x = True; y = not x; z = x * 3.  (True acts like one and False like zero in expressions.  Note this is rarely useful.)

Python files that define things are called modules.  Module files should have the extension “.py”.  You can import these to use the defined functions and variables.  Your system comes with many modules.  Here’s one example:

>>> import random
>>> random.randint(1,10)
3
>>>

In addition to a plain import, you can use “from module import name”, which lets you use functions from modules without qualifying them with the module name.  You could repeat the above example this way:

>>> from random import randint
>>> randint(1,10)
6
>>>

(You can import all the functions from a module, using “*” instead of a function name.  That imports all names from the module, except those that start with an underscore.)  Python makes it easy to create and use your own modules.  Like functions, modules often start with a docstring.  You can use dir(module) to see what is defined in module.
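For example, this sketch uses dir() on the standard math module and filters out the names that start with an underscore, which are the ones a star import of that module would skip.  The variable public_names is just an illustrative name.

```python
import math

# Collect the names in the math module that don't start with "_".
public_names = []
for name in dir(math):
    if not name.startswith("_"):
        public_names.append(name)

print("sqrt" in public_names)       # True
print("__doc__" in public_names)    # False

from math import sqrt               # import one name without qualifying it
print(sqrt(16))                     # 4.0
```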

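When looking for a module, Python will look for a file named module.py in the directory of the running script (or the current directory, in interactive mode), then in the directories listed in the PYTHONPATH environment variable, and then in the built-in default locations.  You can view the resulting search path with: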

>>> import sys
>>> sys.path
['', '/usr/lib/python32.zip', '/usr/lib/python3.2', '/usr/lib/python3.2/plat-linux2', '/usr/lib/python3.2/lib-dynload', '/usr/lib/python3.2/site-packages']

It is easy to make a file that can be used as a module (and thus imported), or as a script.  You wrap the statements in a function, typically called main, and then have an if statement at the end that says if run as a script, run main.  Here’s an example called modscript.py:

#!/usr/bin/python3
""" This file is both a script and a module """

def hello ():
   print( "hello" )
if __name__ == "__main__": hello()

This module can be either used as a script, or imported as a module:

$ ./modscript.py
hello
$ python3
>>> import modscript
>>> modscript.hello()
hello
>>> from modscript import *
>>> hello()
hello
>>> ^D
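A slightly larger sketch of the same idea, this time with the script’s statements wrapped in a function called main.  The optional name argument is an addition for this illustration, not part of modscript.py.

```python
#!/usr/bin/python3
""" Sketch: a file that is both a script and a module, using main() """
import sys

def hello(name="world"):
    """ Returns a greeting for name. """
    return "hello, " + name

def main(args):
    """ Runs only when the file is used as a script. """
    if args:
        print(hello(args[0]))    # greet the first command line argument
    else:
        print(hello())

if __name__ == "__main__":
    main(sys.argv[1:])           # skip argv[0], the script's own name
```

When the file is imported, __name__ is the module’s name rather than "__main__", so main is not run and only the definitions become available.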

Python Examples

To get help on any topic, keyword, module, or function, start the interactive help system:

>>> help()
...info about help...
help> quit
>>>

(If you add documentation in the right way to your own modules, this works for those too.)

Here’s an example that shows how to work with files in Python:

>>> import os
>>> os.getcwd()
'/home/wpollock'
>>> f = open( "myfile.txt", "w" )
>>> f.write("Hello from the world of Python!\n")
32
>>> f.close()
>>> f = open( "myfile.txt", "r" )
>>> text = f.readline()
>>> f.close()
>>> text
'Hello from the world of Python!\n'
>>> ^D
$ cat myfile.txt
Hello from the world of Python!
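A common alternative is the with statement, which closes the file automatically when its block ends.  The sketch below writes to a temporary directory (an assumption made here so that no real file is overwritten) and then reads the file back one line at a time.

```python
import os
import tempfile

# Use a scratch directory so this sketch doesn't touch your own files.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

with open(path, "w") as f:           # f is closed when the block ends
    f.write("Hello from the world of Python!\n")
    f.write("A second line.\n")

lines = []
with open(path) as f:                # the default mode is "r" (read)
    for line in f:                   # a file can be looped over line by line
        lines.append(line.rstrip("\n"))

print(lines)
```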

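Here’s a simplified version of the Unix wc utility, in Python.  (Its counts can differ slightly from those of the real wc: splitting on "\n" yields one extra, empty item when the file ends with a newline, and the newline characters themselves aren’t counted.)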

$ cat wc.py
#!/usr/bin/python3
""" Reads a file and shows the number of lines,
    words, and characters.
"""
import sys
infile = open( sys.argv[1] )
lines = infile.read().split("\n")
num_lines = len(lines)
num_words = 0; num_chars = 0
for line in lines:
   words = line.split()
   num_words += len(words)
   num_chars += len(line)
print( num_lines, num_words, num_chars )
$ ./wc.py wc.py
16 50 345
$
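Here is a sketch of a variation that reads the file one line at a time, which usually matches the real wc more closely.  The function name count_file is made up; note it counts characters rather than bytes, and it counts a final line that lacks a newline, which wc does not.

```python
#!/usr/bin/python3
""" Sketch of a wc-like counter that reads a file line by line. """
import sys

def count_file(path):
    """ Returns (lines, words, characters) for the named file. """
    num_lines = num_words = num_chars = 0
    with open(path) as infile:
        for line in infile:              # each line keeps its "\n"
            num_lines += 1
            num_words += len(line.split())
            num_chars += len(line)       # so newlines are counted too
    return num_lines, num_words, num_chars

if __name__ == "__main__" and len(sys.argv) > 1:
    print(*count_file(sys.argv[1]))      # "*" unpacks the three counts
```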