Held Wednesday, February 9, 2000
Overview
Today, we'll visit the issue of sorting: turning a collection
of elements into a collection in which smaller elements precede larger
elements. Our focus will primarily be on sorting arrays.
Question 11 for today's class: Describe how to put a pile of books in alphabetical order by author.
Question 12 for Friday's class: Computer programmers often emphasize efficiency as the primary criterion for evaluating algorithms. What effects do you think this emphasis has?
Notes
- A few of you have expressed frustration at writing algorithms.
We may spend a few minutes delving into those issues. Do you
also find it frustrating to give directions or write down recipes?
If not, what's the difference? Is it that we're trying to use a
formal language? Is it that we can't rely on common sense?
- Don't forget that we meet in Science 2424 (down the hall) on Friday.
- Readings for Friday's class:
Dewdney 35 (Sequential Sorting), 40 (Heaps and Merges)
Contents
Summary
- The problem of sorting
- Handouts:
- Your answers to question 11:
Describe how to put a pile of books in alphabetical order by author.
- Typically, computer scientists look at collections of problems and
attempt to find appropriate generalizations of these problems
(or their subproblems).
- By solving the generalized problem, you solve a number of
related problems.
- One problem that seems to crop up a lot is that of sorting.
Given a list, array, vector, sequence, or file of comparable elements,
put the elements in order.
- "In order" means that each element is no bigger than
the next element. (You can also sort in decreasing order, in
which case each element is no smaller than the next element.)
- You also need to ensure that all elements in the original
list are in the sorted list.
- In evaluating sorting methods, we should concern ourselves with
both the running time and the amount of extra storage (beyond the
original array) that are required.
- Most often, in-memory sorting is accomplished by repeatedly
swapping elements. However, this is not the
only way in which sorting can be done.
- It's often best to ground sorting algorithms in practical
experience.
- I'll bring in a stack of my CDs and we'll consider ways to
sort them.
- First, I'll just hand you stacks of CDs and ask you to
describe what you're doing.
- We'll continue by looking at the instructions you wrote for today.
- Because sorting is such an important task, computer scientists
(and normal people, too) have developed a number of techniques
that are commonly used for sorting.
- Selection sort is among the simpler and more natural
methods for sorting.
- In this sorting algorithm, you segment the array into two
subparts, a sorted part and an unsorted part. You repeatedly find the
largest of the unsorted elements, and put that at the
beginning of the sorted part. This continues until there are no
unsorted elements.
- What's the running time of this algorithm? To sort an array of
n elements, we have to find the largest element in that array
in O(n) steps, and then continue with the remaining n-1 elements.
That's n + (n-1) + ... + 1 = n(n+1)/2 steps in all, which is O(n^2).
- What's the extra memory required by this algorithm (ignoring the
extra memory for recursive calls)? It's more or less O(1), since
we only allocate a few extra variables and no extra vectors.
- How much extra memory is required for recursive method calls? This is a
tail-recursive algorithm, so there shouldn't be any, provided the
language implementation reuses the stack frame for tail calls.
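
The selection sort described above can be sketched in Python as follows. This is a minimal illustration, not the version used in class: it is written as a loop rather than as the tail-recursive formulation, and the names `selection_sort`, `boundary`, and `largest` are made up for the example.

```python
def selection_sort(values):
    """Sort the list `values` in place, smallest first, and return it."""
    # values[boundary:] is the sorted part; values[:boundary] is unsorted.
    for boundary in range(len(values), 1, -1):
        # Find the index of the largest unsorted element.
        largest = 0
        for i in range(1, boundary):
            if values[i] > values[largest]:
                largest = i
        # Swap it into the last unsorted slot, which then becomes
        # the beginning of the sorted part.
        values[largest], values[boundary - 1] = values[boundary - 1], values[largest]
    return values
```

The only extra storage is a handful of index variables, which matches the O(1) estimate above.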
- Bubble sort is a lot like selection sort except that instead of
finding the largest element and moving it to the end, you repeatedly swap
adjacent out-of-order elements, thereby "bubbling" the largest value
to the end.
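
As a rough sketch of that idea in Python (the function and variable names here are illustrative assumptions, not taken from the readings):

```python
def bubble_sort(values):
    """Sort the list `values` in place, smallest first, and return it."""
    # After each outer pass, the largest remaining value has bubbled
    # to position `end`, so the next pass can stop one cell earlier.
    for end in range(len(values) - 1, 0, -1):
        for i in range(end):
            # Only neighboring elements are ever compared or swapped.
            if values[i] > values[i + 1]:
                values[i], values[i + 1] = values[i + 1], values[i]
    return values
```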
- Another simple sorting technique is insertion sort.
- Insertion sort operates by segmenting the list into unsorted and sorted
portions, and repeatedly removing the first element from the unsorted portion
and inserting it into the correct place in the sorted portion.
- This may be likened to the way typical card players sort their
hands.
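
One way to write insertion sort in Python is sketched below. It assumes the sorted portion sits at the front of the list and the unsorted portion after it; the names are hypothetical.

```python
def insertion_sort(values):
    """Sort the list `values` in place, smallest first, and return it."""
    # values[:first_unsorted] is the sorted portion.
    for first_unsorted in range(1, len(values)):
        # Remove the first element of the unsorted portion ...
        item = values[first_unsorted]
        pos = first_unsorted
        # ... shift larger sorted elements one cell to the right ...
        while pos > 0 and values[pos - 1] > item:
            values[pos] = values[pos - 1]
            pos -= 1
        # ... and drop the element into the gap that opens up.
        values[pos] = item
    return values
```

This is much like a card player sliding a new card leftward past the larger cards already in hand.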
- All three of these are O(N^2). Which should
we choose?
- In this case, it turns out that the constants do make a
difference. Typically selection sort and insertion sort
run much more quickly than does bubble sort.
- Would we ever want to use bubble sort? Yes.
- Sometimes we can only swap neighboring elements. Consider
a situation in which you can only store two objects in memory,
and the rest in a file. Here's how we might do one round of
bubbling up.
  Open the input file
  Open a temporary file for the "more sorted" version
  Let largestSoFar = the first element in the input file
  While (elements remain in the input file)
    Let nextElement = the next element in the input file
    If nextElement < largestSoFar then
      Write nextElement to the temporary file
    Else
      Write largestSoFar to the temporary file
      Set largestSoFar to nextElement
  // We've read the whole file (N elements), but only written
  // N-1 elements. Write the last one.
  Write largestSoFar to the temporary file
  Close the input file
  Close the temporary file
  Replace the input file with the temporary file
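
For those who want to try it, here is a hedged Python rendering of that one round. It assumes the file holds one integer per line with no blank lines, which the notes do not specify, and `bubble_pass` is a made-up name.

```python
import os
import tempfile


def bubble_pass(path):
    """Do one bubbling round over a file of integers, one per line."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as source, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as temp:
        largest_so_far = None  # at most two elements are held at once
        for line in source:
            next_element = int(line)
            if largest_so_far is None:
                largest_so_far = next_element
            elif next_element < largest_so_far:
                temp.write(f"{next_element}\n")
            else:
                temp.write(f"{largest_so_far}\n")
                largest_so_far = next_element
        # Everything but the largest element has been written; write it last.
        if largest_so_far is not None:
            temp.write(f"{largest_so_far}\n")
    # Replace the input file with the "more sorted" temporary file.
    os.replace(temp.name, path)
```

Calling `bubble_pass` repeatedly, up to N-1 times for a file of N elements, sorts the file completely.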
- Are there other times we might want to use bubble sort? It turns
out that bubble sort is nice on some parallel computers. You can
swap N/2 pairs of adjacent elements in one step, alternating between
two kinds of rounds until the cells are sorted.
- Round 1: all the cells numbered 2*i swap with 2*i+1 (if out of
order)
- Round 2: all the cells numbered 2*i swap with 2*i-1 (if out of
order)
- We may try acting out this last sorting routine.
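
The two alternating rounds can be simulated on an ordinary computer as sketched below. This is an assumption-laden illustration: the inner loop runs its comparisons one after another, whereas a parallel machine would do all of a round's swaps at once, and the name `odd_even_transposition_sort` is ours. The sketch runs N rounds, which is enough for this scheme to sort N cells.

```python
def odd_even_transposition_sort(values):
    """Sort the list `values` in place using alternating rounds of neighbor swaps."""
    n = len(values)
    for round_number in range(n):
        # Even-numbered rounds pair cells (0,1), (2,3), ...;
        # odd-numbered rounds pair cells (1,2), (3,4), ...
        start = round_number % 2
        # The pairs within a round do not overlap, so on parallel
        # hardware these comparisons could all happen in one step.
        for left in range(start, n - 1, 2):
            if values[left] > values[left + 1]:
                values[left], values[left + 1] = values[left + 1], values[left]
    return values
```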