Sue Evans & Will Murnane
Problems:
>>> values = [7, 6, 9, 3, 4, 1] >>> values.sort() >>> minimum = values[0] >>> size = len(values) >>> maximum = values[size - 1] >>> minimum 1 >>> maximum 9
>>> scores = [3, 3, 3, 3.5, 3.5, 0, 3, 2, 3, 3.5, 3.5, 2, 0] >>> scores.sort() >>> labScore = 0 >>> size = len(scores) >>> for i in range(3, size): ... labScore += scores[i] ... >>> labScore 31.0 >>>
>>> gymScores = [9, 8, 9, 9, 8, 10, 10, 9, 9, 8] >>> gymScores.sort() >>> total = 0 >>> size = len(gymScores) >>> for i in range(1, size - 1): ... total += gymScores[i] ... >>> average = float (total) / (size - 2) >>> average 8.875
def uniq(aList): aList.sort() result = [] lastItem = None for item in aList: if item != lastItem: result.append(item) lastItem = item return result print uniq([1,2,4,2,3,5,2,1,3]) [1, 2, 3, 4, 5]
Suppose we have a list of things that we want to sort by something other than just the natural ordering—for example, if we had a list of strings that we want to sort by length instead of alphabetically. We could take this list and turn it into a list of things whose natural sort order puts the items of the original list into the order we want them in. Here's what it looks like in code:
>>> stringList = ["a", "apple", "hello", "trucks", "junk"] >>> decorated = [] >>> for item in stringList: ... decorated.append((len(item), item)) ... >>> print decorated [(1, 'a'), (5, 'apple'), (5, 'hello'), (6, 'trucks'), (4, 'junk')]
Now if we sort this list, the strings we started with are in order by length.
>>> decorated.sort() >>> print decorated [(1, 'a'), (4, 'junk'), (5, 'apple'), (5, 'hello'), (6, 'trucks')]
Then we can take this list and undecorate it (remove the counts and leave only the original strings) which will still be in sorted order.
>>> undecorated = [] >>> for item in decorated: ... undecorated.append(item[1]) ... >>> undecorated ['a', 'junk', 'apple', 'hello', 'trucks']
This method, as suggested by the variable names we've used, is called
"decorate-sort-undecorate" or "the Schwartzian transform".
It's a common method of sorting by orders other than the default one.
The decorate-sort-undecorate method shown in the last slide works, but it's a little messy. It takes up temporary space as it does its work that is bigger than the original list of items. This can be a concern if a lot of items are present in the list. To alleviate this problem, Python (and many other languages) have built-in functionality that lets us do this non-default sort without the extra steps.
Suppose we are tracking how many students get a particular grade, and we want to sort the results in various ways. We've got a dictionary built like this:
grades = {'A': 13, 'B': 27, 'C': 11, 'D': 4, 'F': 0}
Now we want to display this data in two ways: first, sorted by grade. We can extract the elements of the dictionary to a list by using the items method, and then sort it:
>>> grades.items() [('A', 13), ('C', 11), ('B', 27), ('D', 4), ('F', 0)] >>> sorted(grades.items()) [('A', 13), ('B', 27), ('C', 11), ('D', 4), ('F', 0)]
So, we can easily sort by letter grade. But, suppose we wanted to see
which grades were most common?
In other words, we want to sort by the second part of each item. To do
this, we could write a function that gets the second part of an item, and
pass it to the sort function:
>>> def getSortKey(item): ... return item[1] ... >>> sorted(grades.items(), key = getSortKey) [('F', 0), ('D', 4), ('C', 11), ('A', 13), ('B', 27)]
Now the list is sorted by the second attribute of the items. This is a common thing to do, so Python has a function that does the same thing as getSortKey. We can load and use it like this:
>>> from operator import itemgetter >>> sorted(grades.items(), key = itemgetter(1)) [('F', 0), ('D', 4), ('C', 11), ('A', 13), ('B', 27)]
As you can see, the itemgetter function takes an argument that specifies which attribute of the items it should return for comparison.
What would the earlier "sort strings by length" example look like with a key function?
>>> stringList = ["apple", "a", "hello", "trucks", "junk"] >>> sorted(stringList, key = len) ['a', 'junk', 'apple', 'hello', 'trucks']
Sometimes it's not easy to come up with a function that generates a sort key for a particular sorted order that we want. We can also use a cmp function that will be given two items to compare. As an example, here's what the default cmp function does with integers:
>>> cmp(5, 3) 1 >>> cmp(5, 5) 0 >>> cmp(5, 7) -1
So our function returns a positive number if the first item is larger,
zero if they're equal, or a negative number if the first item is smaller.
Suppose we wanted to sort the grades dictionary as before, but this time using a comparison function:
>>> def myCmp(a, b): ... if a[1] > b[1]: ... return 1 ... if a[1] < b[1]: ... return -1 ... return 0 ... >>> sorted(grades.items(), cmp = myCmp) [('A', 13), ('B', 27), ('C', 11), ('D', 4), ('F', 0)]
This may seem like a trivial example, since we could use a key function. But some languages don't have an equivalent to the key function, so we want to introduce many ways of accomplishing this task.
Let's try a more complicated example. Suppose we have a list of lists: each item in the list is itself a list of the answers a student gave on the midterm exam. We have a list of the correct answers, and we want to sort the students by the grade they got.
ANSWER_KEY = [ ... ] def byGrade(student1, student2): score1 = score2 = 0 for question in range(len(student1)): if student1[question] == ANSWER_KEY[question]: score1 += 1 if student2[question] == ANSWER_KEY[question]: score2 += 1 return cmp(score1, score2)
Three ways to get the list sorted in the opposite order:
>>> aList = [4,1,6,3,8,2,5] >>> aList.sort() >>> aList [1, 2, 3, 4, 5, 6, 8] >>> aList.reverse() >>> aList [8, 6, 5, 4, 3, 2, 1]This works, but it doesn't feel quite right; we just spent a lot of time putting it in one order, then we go back and swap everything around.
>>> def reverseCmp(a, b): ... if a[1] > b[1]: ... return -1 ... if a[1] < b[1]: ... return 1 ... return 0 >>> aList.sort(cmp = reverseCmp) >>> aList [8, 6, 5, 4, 3, 2, 1]This, too, works, but there's one more thing we could do to optimize it.
>>> aList.sort(reverse = True) >>> aList [8, 6, 5, 4, 3, 2, 1]
There are times when it would be convenient for your program to be able to get information from the operating system's command line. This allows programs to be run in batch mode where the output from one program can be the input for another, etc.
Any number of arguments can be passed from the command line into your program.
Here's an example of code that uses command-line arguments:
import sys num = len(sys.argv) for i in range(num): print "sys.argv[%d] = %s" % (i, sys.argv[i])
Here's the output:
linuxserver1.cs.umbc.edu[160] python commandLine.py 2 foo 7.5 bar snoopy jazz sys.argv[0] = commandLine.py sys.argv[1] = 2 sys.argv[2] = foo sys.argv[3] = 7.5 sys.argv[4] = bar sys.argv[5] = snoopy sys.argv[6] = jazz linuxserver1.cs.umbc.edu[161]
This example uses command line arguments to give the program the name of the input file to use and the name of the output file to write during processing. This is a very common use of command line arguments.
If your program needs command line arguments in order to run, then you should have a clearly marked usage instructions in your file header comment to explain how to run the program.
# commandLine.py # Sue Evans # 11/17/09 # All sections # bogar@cs.umbc.edu # # This program illustrates using command-line arguments # It also uses file-handling, strings, lists & dictionaries # # This program requires command line arguments which are the # filename of the input file and the filename of the output # file, in that order. import sys import string ANSWER_KEY = ['T','T','F','a','b','c'] def main(): # make sure there are 3 arguments on the command line # exit if not if len(sys.argv) != 3: print "This program requires command line arguments." print "The first argument is the filename of the input file." print "The second argument is the filename of the output file." sys.exit() grades = {} # open file for input infile = open(sys.argv[1], "r") # for each student read in a line, strip it and # split it into the student's name and a string of his answers for line in infile: student = string.strip(line) student = string.split(student) size = len(student[1]) # change the string of answers to a list of answers answers = [] for i in range(2, size, 4): answers.append(student[1][i]) # make the student's name the key and his list of answers the value grades[student[0]] = answers # calculate the student's score score = 0 size = len(answers) for question in range(size): if answers[question] == ANSWER_KEY[question]: score += 1 # change the value to being just the score instead of a # list of answers grades[student[0]] = score # close the infile and open the outfile infile.close() outfile = open(sys.argv[2], "w") # make a list of the keys and sort them names = grades.keys() names.sort() # write the sorted student's names and their scores to the outfile for name in names: outfile.write(name + "\t" + str(grades[name]) + "\n") # close the output file outfile.close() main()
Here's the input file, answers.txt :
Barnes,Beth ['T','T','F','a','b','c'] Carson,Ed ['T','F','T','a','b','b']
Let's run it!
linuxserver1.cs.umbc.edu[205] python commandLine.py answers.txt grades.out linuxserver1.cs.umbc.edu[206]
Here's grades.out :
Barnes,Beth 6 Carson,Ed 3
Since I chose to use a dictionary, things became out of order immediately. Therefore, getting a list of keys and sorting them was necessary to get the roster back into sorted order by the students' last names.
Write a program that will add values passed in as command-line arguments and will print their sum. The user may enter as many values as they choose on the command line.