Monday 2 December 2013

Sorting Algorithms (Selection Sort, Quick Sort, Merge Sort, and Efficiency)

Sorting Algorithms

This is the last blog post that I will be writing for CSC148H1F. I want to take this opportunity to say that I really enjoyed Danny's pedagogical methods in this course.

In this post, I'll go over a few sorting algorithms, discuss their efficiency, and look at how they can be improved.

Selection Sort:
Let's say we want to sort a list of n elements. At each step, the first 'k' elements are already sorted. We then look at the remaining 'n-k' elements, find the smallest one, and put it in the 'k+1'st position.

Efficiency:
So we first scan n-1 elements, then n-2, then n-3, ..., down to 2 and finally 1. That is (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons, so the runtime of this algorithm is O(n^2). This is quite inefficient, since many of those comparisons are unnecessary.

Implementation:
The following is the code that was provided in lab #9 for selection sort:

def find_min(L, i):
    """Return the index of the smallest item in L[i:]."""

    smallest = i
    for j in range(i + 1, len(L)):
        if L[j] < L[smallest]:
            smallest = j
    return smallest

# find_min using built-in min, tuple comparisons, and enumerate
# should be faster...
#def find_min(L, i):
    #"""Return index of smallest item in L[i:]."""
    #return min([(k, j) for j, k in enumerate(L[i:])])[1] + i

def selection_sort(L):
    """Sort the items in L in non-decreasing order."""

    i = 0
    while i != len(L) - 1:

        # Find the smallest remaining item and move it to index i.
        smallest = find_min(L, i)
        L[smallest], L[i] = L[i], L[smallest]
        i += 1
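
Here's a quick usage check of the code above (my own toy example, not from the lab handout):

L = [5, 3, 8, 1, 9, 2]
selection_sort(L)
print(L)  # [1, 2, 3, 5, 8, 9]
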
Merge Sort:
We all learned about the merge sort algorithm in class as one of our main examples of a divide-and-conquer algorithm. As we saw in lab 9, it can be implemented in two ways.
1) You can divide the list into two parts, call merge sort on both of them, and then merge the two sorted lists.
2) The second method is what you would call an in-place sort. Instead of making copies of each part of the list and sorting those, it sorts within the original list by passing index ranges around, hence the name. (The merge step still builds a temporary list for the range being merged, but no sublist copies are made at every recursive call.) This significantly reduces the amount of memory the program uses.

Efficiency:
The runtime of merge sort is of order O(n lg n): we halve the problem lg n times, and each level of merging makes at most n comparisons.
There is no significant difference between the runtimes of the two methods; it is mainly the memory usage that differs (see the timing sketch after the implementations).

Implementation:
The following is the implementation of both methods. The code is from lab 9:

############ Mergesort 1 ############

def _merge_1(left, right):
    """Merge sorted lists left and right into a new list and return that new
    list."""

    result = []

    i, j = 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    result += left[i:]
    result += right[j:]
    return result

def _mergesort_1(L):
    """Return a new list that is a sorted version of list L."""

    if len(L) < 2:
        return L
    else:
        middle = len(L) // 2
        left = _mergesort_1(L[:middle])
        right = _mergesort_1(L[middle:])
    return _merge_1(left, right)

def mergesort_1(L):
    """Sort list L."""

    L[:] = _mergesort_1(L)

############ Mergesort 2 ############

def _merge_2(L, i, mid, j):
    """Merge the two sorted halves L[i:mid + 1] and L[mid + 1:j + 1] and return
    them in a new list. Notice that L[mid] belongs in the left half and L[j]
    belongs in the right half -- the indices are inclusive."""

    result = []
    left = i
    right = mid + 1

    # Done when left > mid or when right > j -- we've finished one of the
    # halves.
    while left <= mid and right <= j:
        if L[left] < L[right]:
            result.append(L[left])
            left += 1
        else:
            result.append(L[right])
            right += 1

    # Items left: L[left:mid + 1]
    #             L[right:j + 1]
    return result + L[left:mid + 1] + L[right:j + 1]

def _mergesort_2(L, i, j):
    """Sort the items in L[i] through L[j] in non-decreasing order."""

    if i < j:
        mid = (i + j) // 2
        _mergesort_2(L, i, mid)
        _mergesort_2(L, mid + 1, j)
        L[i:j + 1] = _merge_2(L, i, mid, j)

def mergesort_2(L):
    """Sort list L."""

    _mergesort_2(L, 0, len(L) - 1)
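
To back up the claim that the two merge sorts run in roughly the same time, here is a small timing sketch of my own (not from the lab); the exact numbers will of course vary from machine to machine:

# Rough timing comparison of the two merge sorts on the same random data.
import random
import timeit

data = [random.randint(0, 10**6) for _ in range(10**4)]

for sort in (mergesort_1, mergesort_2):
    # Sort a fresh copy on every run so each call sees unsorted input.
    t = timeit.timeit(lambda: sort(list(data)), number=20)
    print(sort.__name__, round(t, 3), "seconds for 20 runs")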


Quick Sort:
Quick Sort was covered in class with a pretty lazy/elegant one-line implementation ("Python one-liners" <3). We were presented with 2 other implementations of quick sort in lab 9.
1) This one tries to do an in-place quicksort, but falls short. The way the partitioning is done is consistent with an in-place quicksort algorithm; however, after partitioning, we make a copy of the left side and a copy of the right side and call quicksort on those two. (Ugh, so close to being an in-place sort!)
2) This one, though, is very nice. It follows a similar partitioning pattern and does not make any additional copies of any part of the original list. (Beautiful :') )

Efficiency:
Both of these algorithms have an average-case runtime of order O(n lg n). However, the second one, being an in-place sort, saves a whole lot on memory usage.

Implementation:
Here are the implementations from lab 9:

############ Quicksort 1 ############

def _partition_1(L):
    """Rearrange L so that items < L[0] are at the beginning and items >= L[0]
    are at the end, and return the new index for L[0]."""

    v = L[0]
    i = 1
    j = len(L) - 1
    while i <= j:
        if L[i] < v:
            i += 1
        else:
            L[i], L[j] = L[j], L[i]
            j -= 1

    L[0], L[j] = L[j], L[0]
    return j

def _quicksort_1(L):
    """Sort L by partitioning it around the first item, then recursing."""

    if len(L) <= 1:
        return L
    else:
        pivot = _partition_1(L)
        left = L[:pivot]
        right = L[pivot + 1:]
        return _quicksort_1(left) + [L[pivot]] + _quicksort_1(right)

def quicksort_1(L):
    """Sort list L."""

    L[:] = _quicksort_1(L)

# functional version
#def quicksort(L):
#    if len(L) > 1 :
#        return (quicksort([x for x in L[1:] if x < L[0]])
#                + [L[0]] + quicksort([x for x in L[1:] if x >= L[0]]))
#    else :
#        return L
#
#def quicksort_1(L):
#    L[:] = quicksort(L) # to sort original list

############ Quicksort 2 ############

def _partition_2(L, i, j):
    """Rearrange L[i:j] so that items < L[i] are at the beginning and items
    >= L[i] are at the end, and return the new index for L[i]."""

    v = L[i]
    k = i + 1
    j -= 1
    while k <= j:
        if L[k] < v:
            k += 1
        else:
            L[k], L[j] = L[j], L[k]
            j -= 1

    L[i], L[j] = L[j], L[i]
    return j

def _quicksort_2(L, i, j):
    """Sort L[i:j] by partitioning it around the first item, then recursing."""

    if i < j:
        pivot = _partition_2(L, i, j)
        _quicksort_2(L, i, pivot)
        _quicksort_2(L, pivot + 1, j)

def quicksort_2(L):
    """Sort list L."""

    _quicksort_2(L, 0, len(L))
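
To see the partitioning from method 2 in action, here is a small usage sketch of my own (the exact arrangement of the smaller items depends on the swaps the partition performs):

# One partitioning step around the first item (4), then the full sort.
L = [4, 8, 1, 6, 3, 9, 2]
p = _partition_2(L, 0, len(L))
print(L, p)   # [3, 2, 1, 4, 9, 6, 8] 3 -- items < 4 before index 3, items >= 4 after it

quicksort_2(L)
print(L)      # [1, 2, 3, 4, 6, 8, 9]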

Monday 25 November 2013

Linear Merge Sort

Linear Merge Sort

I probably just blew your mind with the title, right? :D But no, I'm not kidding. Today it suddenly occurred to me that I can do a linear merge sort. However, there's a catch. I've shifted from serial programming to parallel programming.

Yesterday, I started learning about parallel programming on the GPU using CUDA, through an online course at Udacity (https://www.udacity.com/course/cs344). If you haven't heard of Udacity, it's an amazing website where you can learn a whole lot about computer science for the beautiful price of Free.

Now, here's my algorithm:
  1. Let's say my initial list is [a1, a2, ..., aN] (N elements, and assume N is a power of 2 to keep things simple).
  2. The CPU then runs N/2 threads on the GPU that do the merging step of merge sort on pairs of elements of the list: a1 & a2, a3 & a4, ...
  3. Then the CPU runs N/(2^2) threads on the GPU that merge lists of size 2: [a1, a2] & [a3, a4], [a5, a6] & [a7, a8], ...
  4. Then the same with N/(2^3) threads on the GPU that merge lists of size 4.
  5. And so on, until N/(2^(lg N)) = 1 thread on the GPU does the final merge. (A rough serial sketch of these passes follows below.)
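
Here is a rough serial Python sketch of that pass structure (my own illustration, not CUDA code); on the GPU, every call to merge() inside one pass would run as its own thread, while here they simply run one after another:

def merge(left, right):
    """Standard merge of two sorted lists."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    return result + left[i:] + right[j:]

def bottom_up_mergesort(L):
    """Simulate the lg N merging passes, one pass at a time."""
    runs = [[x] for x in L]                      # pass 0: N runs of size 1
    while len(runs) > 1:
        # One "GPU pass": merge the runs pairwise.
        merged = [merge(runs[k], runs[k + 1]) for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:                        # an odd run out is carried forward
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []

print(bottom_up_mergesort([5, 2, 9, 1, 7, 3]))   # [1, 2, 3, 5, 7, 9]
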
Efficiency Analysis:
The CPU sends commands to the GPU lg N times. The time the "i"th run of the GPU takes is at most the time for 2*2^i comparisons, since all of the merges in that pass happen at the same time. Hence, the time complexity of the entire program (disregarding the time it takes to copy array values from the CPU to the GPU and vice versa) can be calculated using the following equation:

sum from i = 1 to lg N of 2*2^i  =  2*(2^(lg N + 1) - 2)  =  4N - 4

Hence, the algorithm is of order O(N), AKA LINEAR.

Tuesday 19 November 2013

Regular Expressions

Regular Expressions

Assignment 2 was about a simplified version of regular expressions. Being curious, I studied regexes a bit more. Regexes are mainly used for pattern matching with strings, or string matching, e.g. checking whether a string has the format of a valid email address.

Here, I want to provide a list of the most common meta-characters that are used in regexes:
  • ".":  Despite what you might think after having completed assignment 2, the dot does not represent the joining of two expressions. It means any character except a new line ( a literal dot would be [.]). If you want to show the joining of two expressions, you just put them one after the other without anything in between.
  • "( )": Groups a series of pattern elements to a single element.
  • "+": Matches the element or pattern preceding it one or more times.
  • "?": Matches the element or pattern preceding it zero or one times.
  • "*": Matches the preceding pattern or element zero or more times.
  • "{M,N}": Matches the preceding pattern or element at least M times and at most N times.
  • "[...]": A set of possible character matches.
  • "|": Alternate possibilities.
  • "\b": I'm not sure what this one means. This is what Wikipedia is telling me, "Matches a zero-width boundary between a word-class character (see next) and either a non-word class character or an edge."
  • "\w": Matches an alphanumeric character.
  • "\W": Matches a non-alphanumeric character.
  • "\s": Matches a whitespace character.
  • "\S": Matches anything but a whitespace.
  • "\d": Matches a digit.
  • "\D": Matches a non-digit.
  • "^": Matches the beginning of a line or string.
  • "$": Matches the end of a line or string.
  • "\A": Matches the beginning of a string.
  • "\z": Matches the end of a string.
  • "[^...]": Matches every character except the ones inside the brackets.
So, now you can understand what the following regex means:
  • \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
Let's go through this step by step:
  1. A word boundary (\b)
  2. Any alphanumeric character or one of the following: ., _, %, +, -
  3. Step 2 repeated 1 or more times
  4. The "@" character
  5. Any alphanumeric character, ".", or "-", repeated 1 or more times
  6. A literal "." character (escaped as \.)
  7. A letter
  8. Step 7 repeated 2 to 4 times
  9. A word boundary (\b)
As you might have guessed, this regex checks if a string has the format of a valid email address. WOOHOO! WE ROCK! :D
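
Here's a quick way to try this pattern out with Python's re module (my own snippet; re.IGNORECASE lets the A-Z ranges match lowercase letters too):

import re

EMAIL = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)

print(bool(EMAIL.search("contact me at someone@example.com")))  # True
print(bool(EMAIL.search("not an email: someone@localhost")))    # False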

Saturday 9 November 2013

A Few Nice References

A Few Nice References

I had a difficult time trying to figure out what to write about this week. I decided to share a couple of references that I've found quite useful (and that I'm hoping to find the time to use).
Street name: CLRS
  • Written by four well-known computer science professors, this is one of the best references for studying algorithms. This (fairly large) book teaches you about the foundations of algorithms, sorting, data structures, advanced design and analysis techniques, advanced data structures, graph algorithms, and even some interesting special topics. This book is definitely worth the $90 you spend. (http://mitpress.mit.edu/books/introduction-algorithms)

  • Another book that I really want to start reading is Artificial Intelligence: A Modern Approach. This book is one of the most widely used texts on the topic of AI. If you're also interested in artificial intelligence, U of T has an amazingly large number of AI courses for undergraduates. (http://aima.cs.berkeley.edu/)

  • Time for a few cool websites. These websites all have amazing tutorials in so many different topics (and some of them not only in CS).
    • Udacity.com: This website's courses are all about CS: web development, algorithms, AI, ... I've started taking the parallel programming course, and I have to say, the videos are very well made. There are also mini quizzes throughout the lectures that help you grasp the topic.
    • edX: All the best universities put up free courses on this website, anything from psychology at Harvard to electrical engineering at MIT. One course that I highly recommend is MIT's introduction to electrical engineering.

Sunday 3 November 2013

The Master Theorem

The Master Theorem

When covering various sorting algorithms in class, we talked a little about the order of an algorithm. Some of these analyses were straightforward, such as selection sort and insertion sort. But other sorting algorithms such as merge sort and quicksort are harder to analyze due to their recursive structure. When analyzing the runtime of recursive functions, we are often faced with a recurrence relation that we need to solve. The master theorem provides a solution to many of these recurrences.

Let's say the following is the skeleton of our function:

def T(n):
    if n > 1:
        # f(n) operations of work are done here, followed by
        # a recursive calls, each on a subproblem of size n/b:
        T(n / b)
        T(n / b)
        # ... a calls in total ...
        T(n / b)

The recurrence formula for this method will be: (a & b constants, a >= 1, b > 1)
T(n) = aT(n/b) + f(n)
Now the master theorem says that:
  1. If f(n) = Θ(n^c) where c < log_b(a), then T(n) = Θ(n^(log_b(a)))
  2. If f(n) = Θ(n^c * (lg n)^k) where c = log_b(a), then T(n) = Θ(n^c * (lg n)^(k+1))
  3. If f(n) = Θ(n^c) where c > log_b(a), then T(n) = Θ(f(n))
When faced with one of the above cases, the master theorem provides a mechanical way to solve the recurrence.

Example:
  • T(n) = 9T(n/3) + n:
    • a = 9
    • b = 3
    • f(n) = n
    • log_3(9) = 2, and f(n) = n = Θ(n^1) with 1 < 2, so this is case 1
    • Hence, T(n) = Θ(n^(log_3(9))) = Θ(n^2)
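
To make the "mechanical" part concrete, here is a small Python sketch of my own (not from the course) that classifies a recurrence of the form T(n) = a*T(n/b) + Θ(n^c) using the three cases above:

import math

def master_theorem(a, b, c):
    """Return the order of T(n) = a*T(n/b) + Theta(n^c) as a string."""
    critical = math.log(a, b)                 # log_b(a)
    if abs(c - critical) < 1e-9:              # c == log_b(a), up to float error
        return "Theta(n^%g * lg n)" % c       # case 2 (with k = 0)
    elif c < critical:
        return "Theta(n^%g)" % critical       # case 1
    else:
        return "Theta(n^%g)" % c              # case 3

print(master_theorem(9, 3, 1))   # T(n) = 9T(n/3) + n  ->  Theta(n^2)
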
Sources:
1) Introduction to Algorithms, 3rd edition, MIT Press

Wednesday 30 October 2013

BST + Recursion = Happy

 BST + Recursion = Happy

This week's lab was the very first lab in which my partner and I had to stay till the very end and still didn't finish. Recursion makes life so much easier! I remember in the first week when Danny jokingly said that by the end of the semester, we would all want to use recursion on every problem. Well, after this lab, I have to say that I have been converted to recursionism (well, to some extent). Implementing the count_less() function iteratively actually required a significant amount of work, whereas the recursive solution was as easy as 3.1415... (a little math humor :D)
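
For reference, here is roughly what the recursive version looks like; this is my own sketch with a minimal node class, not the tree class from the lab:

class BSTNode:
    """A minimal BST node: everything in the left subtree is smaller than
    value, everything in the right subtree is larger."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def count_less(node, v):
    """Return how many values in the BST rooted at node are less than v."""
    if node is None:
        return 0
    if node.value < v:
        # This node counts, and smaller values can sit in either subtree.
        return 1 + count_less(node.left, v) + count_less(node.right, v)
    # node.value >= v, so nothing in the right subtree can be < v.
    return count_less(node.left, v)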

I'm also starting to grow fond of BSTs, and of trees in general. Off the top of my head, with binary search trees we can:
  • Search in O(lg N) time (well, if the tree is fairly balanced)
  • Find the minimum
  • Find the maximum
  • Find the inorder successor
  • Find the inorder predecessor
  • Do Huffman encoding, using a binary tree to get prefix-free binary codes
  • And so many more things (as we've been seeing in class, exercises, labs, and assignment 2)...
I was also looking at different types of trees and found so many interesting things:
  • There are different ways to balance a binary search tree:
    • AVL trees
    • Red-black trees
  • And there are many more kinds of trees, each tailored for a specific purpose:
    • Radix trees: useful for constructing dictionaries with keys that can be expressed as strings
    • There are prefix trees and suffix trees, and there are even trees that can stand in for hash tables!

Thursday 24 October 2013

Dynamic Programming

Dynamic Programming

Dynamic Programming is another divide-and-conquer style method of solving problems. As with recursion, we make use of solutions to smaller cases of the problem to solve larger cases. The difference is that in DP (the street name for Dynamic Programming) we store the solutions to the smaller cases to avoid recomputing them. At times, DP is significantly more efficient than a plain recursive solution to the same problem. Here is a problem that runs significantly faster when using DP than when using plain recursion:

Example:
Suppose we have a rod of length n. We can cut this rod into smaller rods of integer lengths, and a rod of each length sells at a given price. What is the maximum revenue we can get by cutting up the rod?

Prices:
Length    1     2     3     4     5     6     7     8     9     10
Price     $3    $7    $9    $12   $15   $19   $21   $24   $28   $30

Solution Using Recursion:


def cut_rod(n, p):
    """
    Recursive Solution to the rod cutting problem
    """

    if n == 0:
        return 0
    return max((p[i] + cut_rod(n-i-1, p)) for i in range(n))

if __name__ == '__main__':

    p = [3, 7, 9, 12, 15, 19, 21, 24, 28, 30]
    print(cut_rod(10, p))

After running, this code prints out 35 as the maximum revenue that we can get by dividing a rod of length 10. Playing around with different ways of cutting the rod, we can see that this is indeed true.

10 = 2 + 2 + 2 + 2 + 2
Maximum Revenue = $7 + $7 + $7 + $7 + $7 = $35

Solution Using Dynamic Programming:

def cut_rod(n, p, r):
    """
    Dynamic Programming Solution to the rod cutting problem
    """

    if n == 0:
        return 0
    elif r[n-1] != -1:
        return r[n-1]
    r[n-1] = max((p[n-1-i] + cut_rod(i, p, r)) for i in range(n))
    return r[n-1]

if __name__ == '__main__':

    p = [3, 7, 9, 12, 15, 19, 21, 24, 28, 30]
    r = [-1 for x in range(10)]
    print(cut_rod(10, p, r))

This solution produces the same output as the recursive one, 35. The difference is that instead of simply making the recursive call to solve each smaller case, we first check whether it has already been solved and reuse the stored answer if it has.

Runtime Analysis:

By going through the code for the first method and analyzing its recursion tree, we can see that its runtime is of order O(2^n). Analyzing the second solution tells us that its runtime is of order O(n^2), which is dramatically smaller!
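
Here is a rough, self-contained sketch of how timings like the ones below could be collected (the two functions are just renamed copies of the versions above, so they can live in one file):

import time

p = [3, 7, 9, 12, 15, 19, 21, 24, 28, 30]

def cut_rod_rec(n, p):
    # Same as the recursive version above, just renamed.
    if n == 0:
        return 0
    return max(p[i] + cut_rod_rec(n - i - 1, p) for i in range(n))

def cut_rod_dp(n, p, r):
    # Same as the DP version above, just renamed.
    if n == 0:
        return 0
    if r[n - 1] != -1:
        return r[n - 1]
    r[n - 1] = max(p[n - 1 - i] + cut_rod_dp(i, p, r) for i in range(n))
    return r[n - 1]

def timed(f):
    """Return how many seconds the call f() takes."""
    start = time.perf_counter()
    f()
    return time.perf_counter() - start

n = 10
print("recursion:", timed(lambda: cut_rod_rec(n, p)))
print("DP:       ", timed(lambda: cut_rod_dp(n, p, [-1] * n)))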

Sample Runtimes:
Rod length n            1        5        10       15       20       25
Recursion               0.001    0.001    0.002    0.036    1.334    42.684
Dynamic Programming     0.001    0.001    0.001    0.001    0.001    0.001