
Commit 8cf172d (1 parent: 3f6f565)

update notes Thu Jul 7 10:45:50 EDT 2022

3 files changed, 129 insertions(+), 1 deletion(-)


README.md

Lines changed: 0 additions & 1 deletion

```diff
@@ -10,5 +10,4 @@ The notes are taken from the *Algorithms* course on Coursera ([Part I](https://w
 Resources link:
 - [Algorithms, 4th Edition](https://algs4.cs.princeton.edu/home/)
 - [Lecture PPT](https://algs4.cs.princeton.edu/lectures/)
-- [Mathematics for Computer Science](https://ocw.mit.edu/courses/6-042j-mathematics-for-computer-science-fall-2010/) (MIT Open Course)
 - [Introduction to Algorithms](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-fall-2011/) (MIT Open Course)
```

part-II/week4/boggle/readme.md

Lines changed: 84 additions & 0 deletions
# Assignment Note

>Write a program to play the word game Boggle®.
>
> [The Boggle game](https://coursera.cs.princeton.edu/algs4/assignments/boggle/specification.php). Boggle is a word game designed by Allan Turoff and distributed by Hasbro. It involves a board made up of 16 cubic dice, where each die has a letter printed on each of its 6 sides. At the beginning of the game, the 16 dice are shaken and randomly distributed into a 4-by-4 tray, with only the top sides of the dice visible. The players compete to accumulate points by building valid words from the dice, according to these rules:
> - A valid word must be composed by following a sequence of adjacent dice—two dice are adjacent if they are horizontal, vertical, or diagonal neighbors.
> - A valid word can use each die at most once.
> - A valid word must contain at least 3 letters.
> - A valid word must be in the dictionary (which typically does not contain proper nouns).

See also: [Nifty Boggle](https://www-cs-faculty.stanford.edu/~zelenski/boggle/)

```
ASSESSMENT SUMMARY

Compilation: PASSED
API: PASSED

SpotBugs: PASSED
PMD: PASSED
Checkstyle: PASSED

Correctness: 13/13 tests passed
Memory: 3/3 tests passed
Timing: 9/9 tests passed

Aggregate score: 100.00%
[ Compilation: 5%, API: 5%, Style: 0%, Correctness: 60%, Timing: 10%, Memory: 20% ]
```

From FAQ: [Possible Optimizations](https://coursera.cs.princeton.edu/algs4/assignments/boggle/faq.php)

> You will likely need to optimize some aspects of your program to pass all of the performance points (which are, intentionally, more challenging on this assignment). Here are a few ideas:
>
> - Make sure that you have implemented the critical backtracking optimization described above. This is, by far, the most important step—several orders of magnitude!
> - Think about whether you can implement the dictionary in a more efficient manner. Recall that the alphabet consists of only the 26 letters A through Z.
> - Exploit the fact that when you perform a prefix query operation, it is usually almost identical to the previous prefix query, except that it is one letter longer.
> - Consider a nonrecursive implementation of the prefix query operation.
> - Precompute the Boggle graph, i.e., the set of cubes adjacent to each cube. But don't necessarily use a heavyweight Graph object.

**Key points**:
- DFS
- Trie (26-way vs. TST)
- Prefix search (recursive vs. non-recursive)
- Reuse the trie search node to reduce duplicate search cost
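These key points fit together roughly as follows. This is a minimal sketch with assumed names: the real solution must also handle the two-letter "Qu" die and score words, which the sketch omits. The crucial detail is passing the trie node for the current prefix down the DFS instead of re-querying the prefix from the root.

```java
import java.util.HashSet;
import java.util.Set;

public class BoggleDfsSketch {
    private static final int R = 26;          // alphabet: 'A'..'Z'

    // Minimal 26-way trie node.
    static class Node {
        Node[] next = new Node[R];
        boolean isWord;
    }

    static Node add(Node x, String key, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.isWord = true; return x; }
        int c = key.charAt(d) - 'A';          // letter -> index 0..25
        x.next[c] = add(x.next[c], key, d + 1);
        return x;
    }

    static Set<String> allValidWords(char[][] board, Node root) {
        Set<String> found = new HashSet<>();
        boolean[][] marked = new boolean[board.length][board[0].length];
        for (int i = 0; i < board.length; i++)
            for (int j = 0; j < board[0].length; j++)
                dfs(board, marked, i, j, root, new StringBuilder(), found);
        return found;
    }

    // The critical backtracking optimization: carry the trie node for the
    // current prefix down the recursion instead of re-running a prefix
    // query from the root on every step.
    static void dfs(char[][] board, boolean[][] marked, int i, int j,
                    Node node, StringBuilder prefix, Set<String> found) {
        Node next = node.next[board[i][j] - 'A'];
        if (next == null) return;             // prune: no word extends this prefix
        marked[i][j] = true;
        prefix.append(board[i][j]);
        if (next.isWord && prefix.length() >= 3) found.add(prefix.toString());
        for (int di = -1; di <= 1; di++)
            for (int dj = -1; dj <= 1; dj++) {
                int ni = i + di, nj = j + dj;
                if (ni >= 0 && ni < board.length && nj >= 0 && nj < board[0].length
                        && !marked[ni][nj])
                    dfs(board, marked, ni, nj, next, prefix, found);
            }
        prefix.deleteCharAt(prefix.length() - 1);  // backtrack
        marked[i][j] = false;
    }

    public static void main(String[] args) {
        Node root = null;
        for (String w : new String[] { "CAT", "CATS", "SAT", "DOG" })
            root = add(root, w, 0);
        char[][] board = { { 'C', 'A' }, { 'T', 'S' } };
        System.out.println(allValidWords(board, root));
    }
}
```

Marking a die before recursing and unmarking it on the way out is what enforces the use-each-die-at-most-once rule.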

With a modified TST and non-recursive prefix search, I still only got 98/100. Maybe the recursive DFS can be optimized; I will find out later.

```
Test 2: timing getAllValidWords() for 5.0 seconds using dictionary-yawl.txt
(must be <= 2x reference solution)
- reference solution calls per second: 8541.72
- student solution calls per second: 4007.69
- reference / student ratio: 2.13

=> passed student <= 10000x reference
=> passed student <= 25x reference
=> passed student <= 10x reference
=> passed student <= 5x reference
=> FAILED student <= 2x reference

Total: 8/9 tests passed!
```

With a 26-way trie, I easily got 100/100.

Note: when using a 26-way trie, transform the letters into array indices in the range 0 to R-1, and remember to convert the indices back to letters when iterating over all keys of the trie; otherwise your words would be *eaten*, lol.
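A minimal sketch of that pitfall, with assumed names: letters are mapped to indices with `c - 'A'` on the way in, so they must be mapped back with `(char) ('A' + c)` when collecting keys.

```java
import java.util.ArrayList;
import java.util.List;

public class TrieKeysSketch {
    static final int R = 26;                    // alphabet: 'A'..'Z'

    static class Node {
        Node[] next = new Node[R];
        boolean isWord;
    }

    static Node add(Node x, String key, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.isWord = true; return x; }
        int c = key.charAt(d) - 'A';            // letter -> index in 0..R-1
        x.next[c] = add(x.next[c], key, d + 1);
        return x;
    }

    // Collect all keys in sorted order.
    static void collect(Node x, StringBuilder prefix, List<String> keys) {
        if (x == null) return;
        if (x.isWord) keys.add(prefix.toString());
        for (int c = 0; c < R; c++) {
            prefix.append((char) ('A' + c));    // index -> letter: the easy-to-forget step
            collect(x.next[c], prefix, keys);
            prefix.deleteCharAt(prefix.length() - 1);
        }
    }

    public static void main(String[] args) {
        Node root = null;
        for (String w : new String[] { "SEA", "SHELLS", "SHE" })
            root = add(root, w, 0);
        List<String> keys = new ArrayList<>();
        collect(root, new StringBuilder(), keys);
        System.out.println(keys);               // prints [SEA, SHE, SHELLS]
    }
}
```

Appending the raw index `c` instead of `(char) ('A' + c)` is exactly what produces the "eaten" words.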
```
Test 2: timing getAllValidWords() for 5.0 seconds using dictionary-yawl.txt
(must be <= 2x reference solution)
- reference solution calls per second: 8085.84
- student solution calls per second: 4446.44
- reference / student ratio: 1.82

=> passed student <= 10000x reference
=> passed student <= 25x reference
=> passed student <= 10x reference
=> passed student <= 5x reference
=> passed student <= 2x reference

Total: 9/9 tests passed!
```

part-II/week5/burrows/readme.md

Lines changed: 45 additions & 0 deletions
# Assignment Note

>[Implement the Burrows–Wheeler data compression algorithm](https://coursera.cs.princeton.edu/algs4/assignments/burrows/specification.php). This revolutionary algorithm outcompresses gzip and PKZIP, is relatively easy to implement, and is not protected by any patents. It forms the basis of the Unix compression utility bzip2.
>
>The Burrows–Wheeler data compression algorithm consists of three algorithmic components, which are applied in succession:
>- Burrows–Wheeler transform. Given a typical English text file, transform it into a text file in which sequences of the same character occur near each other many times.
>- Move-to-front encoding. Given a text file in which sequences of the same character occur near each other many times, convert it into a text file in which certain characters appear much more frequently than others.
>- Huffman compression. Given a text file in which certain characters appear much more frequently than others, compress it by encoding frequently occurring characters with short codewords and infrequently occurring characters with long codewords.
>
>Step 3 is the only one that compresses the message: it is particularly effective because Steps 1 and 2 produce a text file in which certain characters appear much more frequently than others. To expand a message, apply the inverse operations in reverse order: first apply the Huffman expansion, then the move-to-front decoding, and finally the inverse Burrows–Wheeler transform. Your task is to implement the **Burrows–Wheeler** and [**move-to-front**](https://en.wikipedia.org/wiki/Move-to-front_transform#:~:text=An%20important%20use,entropy%2Dencoding%20step.) components.

```
ASSESSMENT SUMMARY

Compilation: PASSED
API: PASSED

SpotBugs: PASSED
PMD: PASSED
Checkstyle: PASSED

Correctness: 73/73 tests passed
Memory: 10/10 tests passed
Timing: 163/163 tests passed

Aggregate score: 100.00%
[ Compilation: 5%, API: 5%, Style: 0%, Correctness: 60%, Timing: 10%, Memory: 20% ]
```

The difficult part of this assignment is to satisfy the performance requirement of `CircularSuffixArray`, which necessitates an efficient [suffix sorting algorithm](https://en.wikipedia.org/wiki/Suffix_array#:~:text=suffix%20trees.-,Suffix%20sorting%20algorithms%20can%20be%20used%20to%20compute%20the%20Burrows%E2%80%93Wheeler,.,-Suffix%20arrays%20can).

> Suffix sorting algorithms can be used to compute the [Burrows–Wheeler transform (BWT)](https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform). The BWT requires sorting of all cyclic permutations of a string. If this string ends in a special end-of-string character that is lexicographically smaller than all other characters (i.e., $), then the order of the sorted rotated BWT matrix corresponds to the order of suffixes in a suffix array. The BWT can therefore be computed in linear time by first constructing a suffix array of the text and then deducing the BWT string: `BWT[i] = S[A[i] - 1]`.
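A sketch of that formula for the circular-rotation variant used in the assignment (assumed names; a comparator-based library sort stands in here for the fast suffix sort the assignment actually requires): once the rotation start indices are sorted, each output character is the one just before the rotation, wrapping around.

```java
import java.util.Arrays;

public class BwtSketch {
    // Burrows-Wheeler transform via sorted circular rotations:
    // t[i] = s[(index[i] + n - 1) % n].
    static String transform(String s) {
        int n = s.length();
        Integer[] index = new Integer[n];
        for (int i = 0; i < n; i++) index[i] = i;
        // Compare rotations character by character, without materializing them.
        // NOTE: this O(n^2 log n) sort is only for clarity; it would fail the
        // assignment's timing tests on large inputs.
        Arrays.sort(index, (a, b) -> {
            for (int k = 0; k < n; k++) {
                char ca = s.charAt((a + k) % n), cb = s.charAt((b + k) % n);
                if (ca != cb) return ca - cb;
            }
            return 0;
        });
        StringBuilder t = new StringBuilder(n);
        for (int i = 0; i < n; i++)
            t.append(s.charAt((index[i] + n - 1) % n)); // char preceding each rotation
        return t.toString();
    }

    public static void main(String[] args) {
        System.out.println(transform("ABRACADABRA!")); // prints ARD!RCAAAABB
    }
}
```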

A good way to implement suffix sorting is to use [3-way string quicksort](https://github.com/lijqhs/algorithms-princeton/tree/main/part-II#651-3-way-string-quicksort-java-implementation).
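A self-contained sketch of that adaptation (assumed names): sort the rotation start indices directly, comparing characters at offset `d` modulo `n`, so no rotated strings are ever materialized.

```java
import java.util.Arrays;

public class CircularSuffixSortSketch {
    static String text;
    static int n;

    // d-th character of the rotation starting at 'suffix', wrapping around.
    static int charAt(int suffix, int d) {
        return text.charAt((suffix + d) % n);
    }

    // 3-way string (radix) quicksort on rotation indices.
    static void sort(int[] index, int lo, int hi, int d) {
        if (hi <= lo || d >= n) return;        // rotations have length exactly n
        int lt = lo, gt = hi;
        int v = charAt(index[lo], d);
        int i = lo + 1;
        while (i <= gt) {
            int t = charAt(index[i], d);
            if      (t < v) swap(index, lt++, i++);
            else if (t > v) swap(index, i, gt--);
            else            i++;
        }
        sort(index, lo, lt - 1, d);            // < pivot: same character position
        sort(index, lt, gt, d + 1);            // = pivot: advance to next character
        sort(index, gt + 1, hi, d);            // > pivot: same character position
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    static int[] sortedIndex(String s) {
        text = s; n = s.length();
        int[] index = new int[n];
        for (int i = 0; i < n; i++) index[i] = i;
        sort(index, 0, n - 1, 0);
        return index;
    }

    public static void main(String[] args) {
        // prints [11, 10, 7, 0, 3, 5, 8, 1, 4, 6, 9, 2]
        System.out.println(Arrays.toString(sortedIndex("ABRACADABRA!")));
    }
}
```

Recursing with `d + 1` only on the equal partition is what avoids re-comparing prefixes that are already known to match.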

For `MoveToFront`, I first used an `ArrayList` in `decode`, but the autograder showed that it did not reach the required performance (it failed one timing test). I then tried a fixed-size array, using `System.arraycopy` to move the characters, and it passed all tests. Someone else used a linked list, which was possibly also a good idea. A mentor [explained the difference](https://www.coursera.org/learn/algorithms-part2/discussions/forums/ujwo1LPrEeaDjQrM8JcKQg/threads/78xezPReEeaIjwovgVtlYg) in the discussion forum, which was helpful.

>Arrays use less memory than linked lists as a rule, because of node object and pointer overhead, and in situations with a lot of traversal, arrays are faster than lists.
>
>That's so because lists use dynamic memory, which is not sequential access and also uses a dynamic memory allocation for every node, while arrays use blocks of memory. The second allows for two things: sequential access and using the faster memory of the CPU.
>
>In combination with moving whole blocks of memory with `System.arraycopy`, there's no way to perform faster.
>
>With `System.arraycopy` you avoid the index logic as well. Even more, it works faster than moving the elements one by one. Just traverse the array to find the element you're looking for.
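A sketch of the fixed-size-array variant over extended ASCII (assumed names; the real assignment classes read and write binary streams, which is omitted here): `System.arraycopy` shifts the prefix of the character sequence in one block move before the accessed character is placed at the front.

```java
public class MtfSketch {
    static final int R = 256;                  // extended ASCII alphabet

    static byte[] encode(byte[] in) {
        char[] seq = new char[R];
        for (char c = 0; c < R; c++) seq[c] = c;   // initial ordering: 0..255
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            char c = (char) (in[i] & 0xFF);
            int idx = 0;
            while (seq[idx] != c) idx++;           // position of c in the sequence
            out[i] = (byte) idx;
            // Shift seq[0..idx-1] right in one block move, then put c in front.
            System.arraycopy(seq, 0, seq, 1, idx);
            seq[0] = c;
        }
        return out;
    }

    static byte[] decode(byte[] in) {
        char[] seq = new char[R];
        for (char c = 0; c < R; c++) seq[c] = c;
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            int idx = in[i] & 0xFF;                // index into the current sequence
            char c = seq[idx];
            out[i] = (byte) c;
            System.arraycopy(seq, 0, seq, 1, idx); // same move-to-front update
            seq[0] = c;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] enc = encode("ABRACADABRA!".getBytes());
        System.out.println(new String(decode(enc))); // prints ABRACADABRA!
    }
}
```

Encode and decode perform the identical move-to-front update, which is why the round trip is exact.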
