Notes On: Python

Installation

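Head to the official Python website and download the relevant installation package for your system. There are usually two production versions available. Most examples in these notes use Python 2 syntax, such as the print statement, with some Python 3 differences noted.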

Core structure

Commands can be written and executed in the Python interpreter, or placed in a program file with the .py file extension. Double-clicking the Python file will then run it. Generally, lines starting with >>>, and the output lines immediately following them, are written in the interpreter.

The first key point to remember is that Python does not need semicolons at the end of statements, and that whitespace is significant. Be sure to indent code when required, especially in the bodies of control statements and functions.

Comments

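Start single-line comments with the hash character #. Python has no dedicated multi-line comment syntax. A string literal wrapped in three consecutive double quotes, and not assigned to anything, is commonly used instead, i.e.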

""" This is a
multi-line comment """

Numbers and Maths

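Numbers (integers or floats) behave similarly to those in other programming languages, though take care with integer division. In Python 2, dividing one integer by another discards the fractional part. For example,

>>> 12/3
4                   # Exact
>>> 13/3
4                   # Truncated (Python 2); Python 3 returns 4.333333333333333
>>> 13.0/3.0
4.333333333333333   # Correct; so are 13.0/3, 13/3.0 and 13/3.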
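For comparison, here is a minimal sketch of how division behaves in Python 3, where / always returns a float and // performs floor division. The variable names are illustrative only.

```python
# Python 3 division, for comparison with the Python 2 session above
true_quotient = 13 / 3       # true division always returns a float
floor_quotient = 13 // 3     # floor division discards the fractional part
remainder = 13 % 3           # remainder left over after floor division

print(true_quotient, floor_quotient, remainder)
```

The // operator also exists in Python 2, so it is a reasonable choice whenever truncation is actually intended.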

Mathematical operators are the usual + - * / %, with ** to denote the exponentiation operator, such as

>>> 2 ** 3
8

Variables

Python variables are dynamically typed and case-sensitive. The boolean values are written with a capital first letter: True and False.
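A small sketch of what this means in practice. The names count, Count and is_ready are made up for illustration.

```python
count = 10            # count refers to an integer
count = "ten"         # the same name can later refer to a string
Count = 10.0          # a different variable: names are case-sensitive
is_ready = True       # booleans are spelled True and False

print(count, Count, is_ready)
```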

Strings

Use the + operator to concatenate strings or string literals. The print statement (Python 2) prints strings:

>>> print "This is a string"
This is a string
>>> print "This is the number " + str(10)  # Concatenate a string and a number with the str() function
This is the number 10
>>> print "This is the number " + `10`     # Backticks are Python 2 shorthand for repr(); removed in Python 3
This is the number 10

The str() function performs explicit string conversion: it converts anything that isn't a string into a string. By contrast, putting quotes around a sequence of characters creates a string literal directly, with no conversion involved.

Another function is len(), which returns the length of a sequence of characters. Other basic string methods include .upper() and .lower(); note that these use dot notation.

>>> a = "Hello"
>>> b = "World"
>>> len(a)
5
>>> a + b
'HelloWorld'
>>> a.upper()
'HELLO'
>>> a.startswith("Hell")
True
>>> a.replace("H","M")
'Mello'

Remember that strings in Python are immutable, meaning that you cannot change them once they have been created; methods such as .replace() return a new string. Strings can be formatted with the % operator via the following syntax:

print "%s" % (variable)

where s represents a string. For example,

print "My name is: %s, and likes %s" % (name, likes)

where name and likes are variables of string type.
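A runnable sketch of the formatting above, written for Python 3. The sample values are invented, and the str.format() line is an addition that these notes do not otherwise cover.

```python
name = "Dave"        # hypothetical sample values
likes = "Python"

# %-formatting: each %s is replaced, in order, by a value from the tuple
old_style = "My name is: %s, and likes %s" % (name, likes)

# %d formats an integer and %.2f a float rounded to two decimal places
numbers = "%d items at %.2f each" % (3, 1.5)

# str.format() is an alternative that uses {} placeholders
new_style = "My name is: {}, and likes {}".format(name, likes)

print(old_style)
print(numbers)
```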

Conversions

Built-in functions exist to convert values between data types, e.g. int(), float(), and str().
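A short sketch of these conversions in Python 3 syntax. The helper to_int_or_none is a hypothetical name, used only to show that int() raises ValueError on text it cannot parse.

```python
whole = int("42")          # string to integer
truncated = int(3.99)      # float to integer: truncates toward zero
decimal = float("2.5")     # string to float
text = str(10)             # integer to string


def to_int_or_none(value):
    """Return value converted to an int, or None if it cannot be parsed."""
    try:
        return int(value)
    except ValueError:
        return None


print(whole + 1, truncated, decimal, text + "!")
```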

Raw Input

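To ask for user input in Python 2, use the following commands:

first_input  = raw_input("Your name please:")         # Always returns the typed text as a string
second_input = input("Give me something, anything:")  # Evaluates the typed text as a Python expression
# Python 3: raw_input() was removed and input() always returns a string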

Control Flow

The syntax for conditional statements is

if condition1:
    pass  # Do something
elif condition2:
    pass  # Do something else
else:
    pass  # Do another thing if condition1 and condition2 both fail

Note: Remember the colons!

There are six comparators: ==, !=, <, <=, >, and >=.
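To make the template concrete, here is a minimal sketch that combines if/elif/else with a few of the comparators. The function name and the temperature thresholds are arbitrary choices for illustration.

```python
def describe_temperature(celsius):
    """Classify a temperature using if / elif / else."""
    if celsius < 0:
        return "freezing"
    elif celsius <= 15:
        return "cool"
    else:
        return "warm"


print(describe_temperature(-5))
print(describe_temperature(15))
print(describe_temperature(30))
```

Only the first branch whose condition is true runs, so the order of the tests matters.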

Loops

There are two key looping constructs.

A while loop:

n = 10
while n > 0:
    print 'T-minus', n
    n = n - 1
print 'Blastoff!'

and a for loop:

for i in range(5):
    print i

names = ['Dave','Paula','Thomas','Lewis']
for name in names:
    print name
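The same two constructs can be sketched in a form that collects results instead of printing them, which makes them easier to check. This is Python 3 syntax, and countdown is a hypothetical helper name.

```python
def countdown(start):
    """Return start, start - 1, ..., 1 using a while loop."""
    steps = []
    n = start
    while n > 0:              # stops once n reaches 0
        steps.append(n)
        n = n - 1
    return steps


squares = []
for i in range(5):            # range(5) yields 0, 1, 2, 3, 4
    squares.append(i * i)

print(countdown(3), squares)
```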

Printing

To print to stdout, we use the print command (Python 2)

print x
print x, y, z
print "Your name is", name
print x,                         # Omits newline

or the print function (Python 3)

print(x)
print(x,y,z)
print("Your name is", name)
print(x,end=' ')                 # Omits newline

Files

Opening a file

f = open("foo.txt","r")          # Open for reading
g = open("bar.txt","w")          # Open for writing

To read data

data = f.read()                  # Read all data

To write text to a file

g.write("some text\n")

Reading a file one line at a time

f = open("foo.txt","r")
for line in f:
    # Process the line
    ...
f.close()
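A self-contained sketch of writing and then reading a file, using the with statement so that each file is closed automatically. The temporary directory and the file name example.txt are assumptions made so the example can run anywhere.

```python
import os
import tempfile

# Hypothetical file in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as out:          # closed automatically when the block ends
    out.write("first line\n")
    out.write("second line\n")

with open(path, "r") as src:
    lines = [line.rstrip("\n") for line in src]   # read one line at a time

print(lines)
```

With this form there is no need to call close() explicitly, even if an error occurs inside the block.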

Functions

Generally, there are three parts to a function in Python: an optional descriptive comment, the header, and the body.

# This multi-line comment
# is entirely optional, but describes
# what the function does
def name_of_function():           # This is the Header
    pass                          # This is the Body; replace pass with real statements

To call the function, just type name_of_function(). Parameters may be placed, as a comma separated list, in the parentheses.

Use splat arguments if you do not know how many arguments a function will take:

def many_arguments(*args):
    pass  # Do something; args is a tuple of the values passed in

many_arguments("arg1", "arg2", "arg3")

To specify an optional argument, or provide a default value for a parameter, define it in the function header:

def function_name(req_arg, opt_arg="optional"):
    pass  # Do something
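A minimal sketch showing both ideas with real function bodies. The names total and greet are made up for illustration.

```python
def total(*args):
    """Add up any number of numeric arguments."""
    result = 0
    for value in args:        # args is a tuple of everything passed in
        result = result + value
    return result


def greet(name, greeting="Hello"):
    """Build a greeting; the greeting parameter is optional."""
    return greeting + ", " + name + "!"


print(total(1, 2, 3))
print(greet("Dave"))
print(greet("Dave", greeting="Hi"))
```

Parameters with default values must come after the required ones in the header.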

Data Structures

Tuples

A tuple is a read-only collection of related values grouped together:

record = ('123', 456, 789)        # Avoid the name tuple, which would hide the built-in type

You can then unpack the tuple into variables:

a, b, c = record
    # a = '123'
    # b = 456
    # c = 789
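A short sketch of what read-only means, and of a common use for unpacking: returning several values from one function. The names point and min_max are hypothetical.

```python
point = (3, 4)
x, y = point                      # unpack into two variables

mutation_failed = False
try:
    point[0] = 10                 # tuples cannot be modified in place
except TypeError:
    mutation_failed = True


def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)


low, high = min_max([5, 2, 9])
print(x, y, mutation_failed, low, high)
```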

Dictionaries

A dictionary is a collection of values indexed by 'keys':

data = {                          # Avoid the name dict, which would hide the built-in type
    'alpha': '123',
    'beta':  456,
    'gamma': 789
}

>>> data['alpha']
'123'
>>> data['beta'] = 1011
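A few more everyday dictionary operations, sketched in Python 3 syntax. The name stock and the extra keys are invented for the example.

```python
stock = {'alpha': '123', 'beta': 456, 'gamma': 789}

stock['beta'] = 1011                   # replace the value for an existing key
stock['delta'] = 0                     # add a new key
missing = stock.get('omega', 'n/a')    # get() returns a default instead of raising KeyError
has_alpha = 'alpha' in stock           # membership test looks at the keys
del stock['gamma']                     # remove a key

for key in sorted(stock):              # iterate over the keys in sorted order
    print(key, stock[key])
```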

Lists

A list is an ordered sequence of items, usually of the same type. Python allows mixed types in one list, but this is not normally done.

names = ['Dave', 'Paula', 'Thomas']

>>> len(names)
3
>>> names.append('Lewis')
>>> names
['Dave', 'Paula', 'Thomas', 'Lewis']
>>> names[0]
'Dave'

We can create a new list by applying an operation to each element of a sequence. This is called a list comprehension:

>>> a = [1,2,3,4,5]
>>> b = [2*x for x in a]
>>> b
[2, 4, 6, 8, 10]
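A list comprehension can also filter elements with a trailing if clause. A minimal sketch, with made-up variable names:

```python
a = [1, 2, 3, 4, 5]

doubled = [2 * x for x in a]                      # apply an operation to every element
doubled_evens = [2 * x for x in a if x % 2 == 0]  # keep only the even elements first

names = ['Dave', 'Paula', 'Thomas']
lengths = [len(name) for name in names]           # works on any sequence

print(doubled, doubled_evens, lengths)
```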

Sets

A set is an unordered collection of unique items, useful for detecting duplicates or related tasks.

ids = set(['123','456','789'])

>>> ids.add('101')
>>> ids.remove('123')
>>> '456' in ids
True
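As a sketch of the duplicate-detection use mentioned above, the hypothetical helper below keeps a set of items already seen and reports any item that turns up again.

```python
def find_duplicates(items):
    """Return the set of items that occur more than once."""
    seen = set()
    duplicates = set()
    for item in items:
        if item in seen:          # membership tests on a set are fast
            duplicates.add(item)
        else:
            seen.add(item)
    return duplicates


ids = ['123', '456', '123', '789', '456']
unique_ids = set(ids)             # building a set drops repeated values

print(find_duplicates(ids), len(unique_ids))
```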

Modules

A module is a file that contains definitions, including variables and functions, that you can use. You can import a module with the import statement.

Math

>>> import math

and call a function such as math.sqrt(), where the math. prefix tells Python to look for the function sqrt() in the math module.

Note that in your program, you can assign a custom name to a function. For example,

>>> mysqrt = math.sqrt
>>> mysqrt(9)               # Use new custom name for function
3.0

You can also import a single function from a module, such as just the sqrt function:

from math import sqrt

You can then call the function by sqrt(), noting that we do not need the math. prefix.

To import everything from a module use an asterisk

from math import *

but be careful with this: every public name in the module is copied into your namespace and can silently replace a name you have already defined.
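One possible alternative to the asterisk import is to give the module or function a short alias with the as keyword, which keeps the origin of each name visible. The aliases m and square_root are arbitrary choices for this sketch.

```python
import math as m                          # alias for the whole module
from math import sqrt as square_root      # alias for a single function

hypotenuse = m.sqrt(3 ** 2 + 4 ** 2)      # call through the module alias
same_value = square_root(25)              # call through the function alias

print(hypotenuse, same_value)
```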

URLs

To read data from the web, we can import the urllib module.

import urllib                # Python 2; on Python 3 import urllib.request and call urllib.request.urlopen()

u = urllib.urlopen("http://www.python.org")
data = u.read()

Parsing XML

Parsing a document into a tree

from xml.etree.ElementTree import parse

doc = parse('data.xml')

Useful methods include findall('elname'), findtext('elname')
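A minimal sketch of findall() and findtext() in use, in Python 3 syntax. The XML content and the element names catalog, book, title and price are invented, and the document is parsed from an in-memory byte stream rather than from data.xml so the example is self-contained.

```python
import io
from xml.etree.ElementTree import parse

# Hypothetical document; parse() accepts a file name or a file-like object
xml_bytes = b"""<catalog>
  <book><title>First</title><price>10</price></book>
  <book><title>Second</title><price>12</price></book>
</catalog>"""

doc = parse(io.BytesIO(xml_bytes))

books = doc.findall('book')                          # matching children of the root
titles = [book.findtext('title') for book in books]  # text of the first matching child
prices = [int(book.findtext('price')) for book in books]
isbn = books[0].findtext('isbn', default='unknown')  # default when no element matches

print(titles, prices, isbn)
```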

Tips & Tricks

from collections import Counter

words = ['yes','but','no','but','yes']
wordcounts = Counter(words)

>>> wordcounts['yes']
2
>>> wordcounts.most_common()
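[('yes', 2), ('but', 2), ('no', 1)]

# Sort a list of records in place by one of their fields
records.sort(key=lambda p: p['COMPLETION DATA'])
records.sort(key=lambda p: p['ZIP'])

# groupby() only groups adjacent items, so sort by the grouping key first
records.sort(key=lambda r: r['ZIP'])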

from itertools import groupby
groups = groupby(records, key=lambda r: r['ZIP'])
for zipcode, group in groups:
    for r in group:
        # All records with same zip-code
        ...

# Build an index that maps each zip code to its list of records
from collections import defaultdict

zip_index = defaultdict(list)
for r in records:
    zip_index[r['ZIP']].append(r)

# zip_index now has the form:
# {
#     '60640': [ rec, rec, ... ],
#     '60637': [ rec, rec, rec, ... ],
#     ...
# }
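The grouping snippets above assume an existing records list. Here is a runnable sketch with three invented records, each a dictionary with hypothetical ZIP and NAME fields, showing that groupby() and defaultdict arrive at the same grouping.

```python
from collections import defaultdict
from itertools import groupby

# Made-up sample data standing in for the real records
records = [
    {'ZIP': '60640', 'NAME': 'a'},
    {'ZIP': '60637', 'NAME': 'b'},
    {'ZIP': '60640', 'NAME': 'c'},
]

# groupby() only merges adjacent items, so sort by the same key first
records.sort(key=lambda r: r['ZIP'])
grouped = {}
for zipcode, group in groupby(records, key=lambda r: r['ZIP']):
    grouped[zipcode] = [r['NAME'] for r in group]

# defaultdict(list) creates an empty list the first time a key is used
zip_index = defaultdict(list)
for r in records:
    zip_index[r['ZIP']].append(r['NAME'])

print(grouped)
print(dict(zip_index))
```

The defaultdict version does not need the records to be sorted, which can make it the simpler choice when order does not matter.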

Bibliography & Resources