SU-CS161 OCT092025 — Jemoka Knowledge Base

Motivating Questions Can we do better than O\left(n \log n\right) sort? What’s our model of computation? What are some reasonable models of computation? comparison based sorting merge sort, quicksort, insertion sort. you can’t access values directly you can only compare two items (i.e. if something is better / smaller than another) Any deterministic, comparison-based sorting algorithm must take \Omega\left(n \log\left(n\right)\right) steps of computation. Proof idea—any comparison based sorting algorithm gives rise to a decision tree. The decision tree must have this many steps. The structure of the tree is as follows: nodes are each comparison decision, each nodes has 2 children (one for each ordering between two items), leaf nodes represents the output. The deepest leaf gives the lower bound. For input with length n, it should have n! leafs depending on the choice of input ordering. The shallowest such tree would have \log \left(n!\right) depth (i.e. a perfectly balanced tree). n! is about \left(\frac{n}{e}\right)^{n}. Taking the log of that gives that our depth is likely n \log \left(n\right). heuristic based sorting Many times the content of the thing we are sorting has well ordering (i.e. they are numbers). There are then O\left(n\right) sorting algorithms. counting sort Find the maximum of input list is O\left(n\right). Then allocate 1…\max buckets, each a linked list. FIFO insert items into the buckets by iterating through the list again. Then dump the buckets in sorted order. This is rather silly if you have a very large number (so you would initialize way to many sparsely-populated buckets). radix sort Insight: numbers that are “bigger” than each other has each digit equal or bigger than the other digit. run counting sort on the least significant digit run counting sort on the second least significant digit …. repeat ad nausium …. your list is sorted And so, proof: IH: after the kth iteration, the array is sorted by the first k significant digits. base case: after 0th digit induction: for two elements x,y \in A such that x<y, we want to show that up to k digit is sorted case 1: x_1 < y_1, we desire x_2x_1 < y_2y_1 (x would be in an earlier bucket than y after counting sort, so pulling out the list is just good) case 2: x_1 = y_1, we desire x_2 x_1 < y_2 y_1 (because of FIFO condition, if the first digit was inserted based on x_1 < y_1, then it will be pulled out ditto) improving efficiency You can just start encoding the inputs in increasingly smaller bases. general running time For n integers, M maximum value in input list, is base r. number of iterations: d = \lfloor \log_{r}\left(M\right) \rfloor + 1 time per iterations O\left(n+r\right) (r buckets, n items into each) total time: O\left(d\cdot\left(n+r\right)\right) = O\left(\left(\lfloor \log_{r}\left(M\right)\rfloor +1\right)\cdot\left(n+r\right)\right) We typically choose r = n, which gives:

\begin{equation} O\left(n \cdot\left( \lfloor \log_{n}\left(M\right) \rfloor +1 \right)\right) \end{equation}

if M \leq n^{c} for constant c, this would be O\left(n\right) if M = 2^{n}, this is O\left(n^{2}\right)