
Commit 8cf172d (1 parent: 3f6f565)

update notes Thu Jul 7 10:45:50 EDT 2022

3 files changed, 129 insertions(+), 1 deletion(-)


README.md

Lines changed: 0 additions & 1 deletion

```diff
@@ -10,5 +10,4 @@ The notes are taken from the *Algorithms* course on Coursera ([Part I](https://w
 Resources link:
 - [Algorithms, 4th Edition](https://algs4.cs.princeton.edu/home/)
 - [Lecture PPT](https://algs4.cs.princeton.edu/lectures/)
-- [Mathematics for Computer Science](https://ocw.mit.edu/courses/6-042j-mathematics-for-computer-science-fall-2010/) (MIT Open Course)
 - [Introduction to Algorithms](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-fall-2011/) (MIT Open Course)
```

part-II/week4/boggle/readme.md

Lines changed: 84 additions & 0 deletions
# Assignment Note

>Write a program to play the word game Boggle®.
>
> [The Boggle game](https://coursera.cs.princeton.edu/algs4/assignments/boggle/specification.php). Boggle is a word game designed by Allan Turoff and distributed by Hasbro. It involves a board made up of 16 cubic dice, where each die has a letter printed on each of its 6 sides. At the beginning of the game, the 16 dice are shaken and randomly distributed into a 4-by-4 tray, with only the top sides of the dice visible. The players compete to accumulate points by building valid words from the dice, according to these rules:
> - A valid word must be composed by following a sequence of adjacent dice—two dice are adjacent if they are horizontal, vertical, or diagonal neighbors.
> - A valid word can use each die at most once.
> - A valid word must contain at least 3 letters.
> - A valid word must be in the dictionary (which typically does not contain proper nouns).

See also: [Nifty Boggle](https://www-cs-faculty.stanford.edu/~zelenski/boggle/)

```
ASSESSMENT SUMMARY

Compilation: PASSED
API: PASSED

SpotBugs: PASSED
PMD: PASSED
Checkstyle: PASSED

Correctness: 13/13 tests passed
Memory: 3/3 tests passed
Timing: 9/9 tests passed

Aggregate score: 100.00%
[ Compilation: 5%, API: 5%, Style: 0%, Correctness: 60%, Timing: 10%, Memory: 20% ]
```

From FAQ: [Possible Optimizations](https://coursera.cs.princeton.edu/algs4/assignments/boggle/faq.php)

> You will likely need to optimize some aspects of your program to pass all of the performance points (which are, intentionally, more challenging on this assignment). Here are a few ideas:
>
> - Make sure that you have implemented the critical backtracking optimization described above. This is, by far, the most important step—several orders of magnitude!
> - Think about whether you can implement the dictionary in a more efficient manner. Recall that the alphabet consists of only the 26 letters A through Z.
> - Exploit the fact that when you perform a prefix query operation, it is usually almost identical to the previous prefix query, except that it is one letter longer.
> - Consider a nonrecursive implementation of the prefix query operation.
> - Precompute the Boggle graph, i.e., the set of cubes adjacent to each cube. But don't necessarily use a heavyweight Graph object.

**Key points**:
- DFS
- Trie (26-way vs. TST)
- Prefix search (recursive vs. non-recursive)
- Reuse the trie search node to reduce duplicate search cost
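These key points fit together roughly as follows. This is a minimal sketch with assumed names: the real solution must also handle the two-letter "Qu" die and score words, which the sketch omits. The crucial detail is passing the trie node for the current prefix down the DFS instead of re-querying the prefix from the root.

```java
import java.util.HashSet;
import java.util.Set;

public class BoggleDfsSketch {
    private static final int R = 26;          // alphabet: 'A'..'Z'

    // Minimal 26-way trie node.
    static class Node {
        Node[] next = new Node[R];
        boolean isWord;
    }

    static Node add(Node x, String key, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.isWord = true; return x; }
        int c = key.charAt(d) - 'A';          // letter -> index 0..25
        x.next[c] = add(x.next[c], key, d + 1);
        return x;
    }

    static Set<String> allValidWords(char[][] board, Node root) {
        Set<String> found = new HashSet<>();
        boolean[][] marked = new boolean[board.length][board[0].length];
        for (int i = 0; i < board.length; i++)
            for (int j = 0; j < board[0].length; j++)
                dfs(board, marked, i, j, root, new StringBuilder(), found);
        return found;
    }

    // The critical backtracking optimization: carry the trie node for the
    // current prefix down the recursion instead of re-running a prefix
    // query from the root on every step.
    static void dfs(char[][] board, boolean[][] marked, int i, int j,
                    Node node, StringBuilder prefix, Set<String> found) {
        Node next = node.next[board[i][j] - 'A'];
        if (next == null) return;             // prune: no word extends this prefix
        marked[i][j] = true;
        prefix.append(board[i][j]);
        if (next.isWord && prefix.length() >= 3) found.add(prefix.toString());
        for (int di = -1; di <= 1; di++)
            for (int dj = -1; dj <= 1; dj++) {
                int ni = i + di, nj = j + dj;
                if (ni >= 0 && ni < board.length && nj >= 0 && nj < board[0].length
                        && !marked[ni][nj])
                    dfs(board, marked, ni, nj, next, prefix, found);
            }
        prefix.deleteCharAt(prefix.length() - 1);  // backtrack
        marked[i][j] = false;
    }

    public static void main(String[] args) {
        Node root = null;
        for (String w : new String[] { "CAT", "CATS", "SAT", "DOG" })
            root = add(root, w, 0);
        char[][] board = { { 'C', 'A' }, { 'T', 'S' } };
        System.out.println(allValidWords(board, root));
    }
}
```

Marking a die before recursing and unmarking it on the way out is what enforces the use-each-die-at-most-once rule.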

With a modified TST and non-recursive prefix search, I still only got 98/100. Maybe the recursive DFS can be optimized; I will find out later.

```
Test 2: timing getAllValidWords() for 5.0 seconds using dictionary-yawl.txt
(must be <= 2x reference solution)
- reference solution calls per second: 8541.72
- student solution calls per second: 4007.69
- reference / student ratio: 2.13

=> passed student <= 10000x reference
=> passed student <= 25x reference
=> passed student <= 10x reference
=> passed student <= 5x reference
=> FAILED student <= 2x reference

Total: 8/9 tests passed!
```

With a 26-way trie, I easily got 100/100.

Note: when using a 26-way trie, transform the letters into array indices in the range 0 to R-1, and remember to convert the indices back to letters when iterating over all keys of the trie; otherwise your words would be *eaten*, lol.
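A minimal sketch of that pitfall, with assumed names: letters are mapped to indices with `c - 'A'` on the way in, so they must be mapped back with `(char) ('A' + c)` when collecting keys.

```java
import java.util.ArrayList;
import java.util.List;

public class TrieKeysSketch {
    static final int R = 26;                    // alphabet: 'A'..'Z'

    static class Node {
        Node[] next = new Node[R];
        boolean isWord;
    }

    static Node add(Node x, String key, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.isWord = true; return x; }
        int c = key.charAt(d) - 'A';            // letter -> index in 0..R-1
        x.next[c] = add(x.next[c], key, d + 1);
        return x;
    }

    // Collect all keys in sorted order.
    static void collect(Node x, StringBuilder prefix, List<String> keys) {
        if (x == null) return;
        if (x.isWord) keys.add(prefix.toString());
        for (int c = 0; c < R; c++) {
            prefix.append((char) ('A' + c));    // index -> letter: the easy-to-forget step
            collect(x.next[c], prefix, keys);
            prefix.deleteCharAt(prefix.length() - 1);
        }
    }

    public static void main(String[] args) {
        Node root = null;
        for (String w : new String[] { "SEA", "SHELLS", "SHE" })
            root = add(root, w, 0);
        List<String> keys = new ArrayList<>();
        collect(root, new StringBuilder(), keys);
        System.out.println(keys);               // prints [SEA, SHE, SHELLS]
    }
}
```

Appending the raw index `c` instead of `(char) ('A' + c)` is exactly what produces the "eaten" words.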
```
Test 2: timing getAllValidWords() for 5.0 seconds using dictionary-yawl.txt
(must be <= 2x reference solution)
- reference solution calls per second: 8085.84
- student solution calls per second: 4446.44
- reference / student ratio: 1.82

=> passed student <= 10000x reference
=> passed student <= 25x reference
=> passed student <= 10x reference
=> passed student <= 5x reference
=> passed student <= 2x reference

Total: 9/9 tests passed!
```

part-II/week5/burrows/readme.md

Lines changed: 45 additions & 0 deletions
# Assignment Note

>[Implement the Burrows–Wheeler data compression algorithm](https://coursera.cs.princeton.edu/algs4/assignments/burrows/specification.php). This revolutionary algorithm outcompresses gzip and PKZIP, is relatively easy to implement, and is not protected by any patents. It forms the basis of the Unix compression utility bzip2.
>
>The Burrows–Wheeler data compression algorithm consists of three algorithmic components, which are applied in succession:
>- Burrows–Wheeler transform. Given a typical English text file, transform it into a text file in which sequences of the same character occur near each other many times.
>- Move-to-front encoding. Given a text file in which sequences of the same character occur near each other many times, convert it into a text file in which certain characters appear much more frequently than others.
>- Huffman compression. Given a text file in which certain characters appear much more frequently than others, compress it by encoding frequently occurring characters with short codewords and infrequently occurring characters with long codewords.
>
>Step 3 is the only one that compresses the message: it is particularly effective because Steps 1 and 2 produce a text file in which certain characters appear much more frequently than others. To expand a message, apply the inverse operations in reverse order: first apply the Huffman expansion, then the move-to-front decoding, and finally the inverse Burrows–Wheeler transform. Your task is to implement the **Burrows–Wheeler** and [**move-to-front**](https://en.wikipedia.org/wiki/Move-to-front_transform#:~:text=An%20important%20use,entropy%2Dencoding%20step.) components.

```
ASSESSMENT SUMMARY

Compilation: PASSED
API: PASSED

SpotBugs: PASSED
PMD: PASSED
Checkstyle: PASSED

Correctness: 73/73 tests passed
Memory: 10/10 tests passed
Timing: 163/163 tests passed

Aggregate score: 100.00%
[ Compilation: 5%, API: 5%, Style: 0%, Correctness: 60%, Timing: 10%, Memory: 20% ]
```

The difficult part of this assignment is to satisfy the performance requirement of `CircularSuffixArray`, which necessitates an efficient [suffix sorting algorithm](https://en.wikipedia.org/wiki/Suffix_array#:~:text=suffix%20trees.-,Suffix%20sorting%20algorithms%20can%20be%20used%20to%20compute%20the%20Burrows%E2%80%93Wheeler,.,-Suffix%20arrays%20can).

> Suffix sorting algorithms can be used to compute the [Burrows–Wheeler transform (BWT)](https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform). The BWT requires sorting of all cyclic permutations of a string. If this string ends in a special end-of-string character that is lexicographically smaller than all other characters (i.e., $), then the order of the sorted rotated BWT matrix corresponds to the order of suffixes in a suffix array. The BWT can therefore be computed in linear time by first constructing a suffix array of the text and then deducing the BWT string: `BWT[i] = S[A[i] - 1]`.
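A sketch of that formula for the circular-rotation variant used in the assignment (assumed names; a comparator-based library sort stands in here for the fast suffix sort the assignment actually requires): once the rotation start indices are sorted, each output character is the one just before the rotation, wrapping around.

```java
import java.util.Arrays;

public class BwtSketch {
    // Burrows-Wheeler transform via sorted circular rotations:
    // t[i] = s[(index[i] + n - 1) % n].
    static String transform(String s) {
        int n = s.length();
        Integer[] index = new Integer[n];
        for (int i = 0; i < n; i++) index[i] = i;
        // Compare rotations character by character, without materializing them.
        // NOTE: this O(n^2 log n) sort is only for clarity; it would fail the
        // assignment's timing tests on large inputs.
        Arrays.sort(index, (a, b) -> {
            for (int k = 0; k < n; k++) {
                char ca = s.charAt((a + k) % n), cb = s.charAt((b + k) % n);
                if (ca != cb) return ca - cb;
            }
            return 0;
        });
        StringBuilder t = new StringBuilder(n);
        for (int i = 0; i < n; i++)
            t.append(s.charAt((index[i] + n - 1) % n)); // char preceding each rotation
        return t.toString();
    }

    public static void main(String[] args) {
        System.out.println(transform("ABRACADABRA!")); // prints ARD!RCAAAABB
    }
}
```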

A good way to implement suffix sorting is to use [3-way string quicksort](https://github.com/lijqhs/algorithms-princeton/tree/main/part-II#651-3-way-string-quicksort-java-implementation).
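A self-contained sketch of that adaptation (assumed names): sort the rotation start indices directly, comparing characters at offset `d` modulo `n`, so no rotated strings are ever materialized.

```java
import java.util.Arrays;

public class CircularSuffixSortSketch {
    static String text;
    static int n;

    // d-th character of the rotation starting at 'suffix', wrapping around.
    static int charAt(int suffix, int d) {
        return text.charAt((suffix + d) % n);
    }

    // 3-way string (radix) quicksort on rotation indices.
    static void sort(int[] index, int lo, int hi, int d) {
        if (hi <= lo || d >= n) return;        // rotations have length exactly n
        int lt = lo, gt = hi;
        int v = charAt(index[lo], d);
        int i = lo + 1;
        while (i <= gt) {
            int t = charAt(index[i], d);
            if      (t < v) swap(index, lt++, i++);
            else if (t > v) swap(index, i, gt--);
            else            i++;
        }
        sort(index, lo, lt - 1, d);            // < pivot: same character position
        sort(index, lt, gt, d + 1);            // = pivot: advance to next character
        sort(index, gt + 1, hi, d);            // > pivot: same character position
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    static int[] sortedIndex(String s) {
        text = s; n = s.length();
        int[] index = new int[n];
        for (int i = 0; i < n; i++) index[i] = i;
        sort(index, 0, n - 1, 0);
        return index;
    }

    public static void main(String[] args) {
        // prints [11, 10, 7, 0, 3, 5, 8, 1, 4, 6, 9, 2]
        System.out.println(Arrays.toString(sortedIndex("ABRACADABRA!")));
    }
}
```

Recursing with `d + 1` only on the equal partition is what avoids re-comparing prefixes that are already known to match.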

For `MoveToFront`, I first used an `ArrayList` in `decode`, but the autograder showed that it did not reach the required performance (it failed one timing test). I then tried a fixed-size array, using `System.arraycopy` to move the characters, and it passed all tests. Someone else used a linked list, which was possibly also a good idea. A mentor [explained the difference](https://www.coursera.org/learn/algorithms-part2/discussions/forums/ujwo1LPrEeaDjQrM8JcKQg/threads/78xezPReEeaIjwovgVtlYg) in the discussion forum, which was helpful.

>Arrays use less memory than linked lists as a rule, because of node object and pointer overhead, and in situations with a lot of traversal, arrays are faster than lists.
>
>That's so because lists use dynamic memory, which is not sequential access and also uses a dynamic memory allocation for every node, while arrays use blocks of memory. The second allows for two things: sequential access and using the faster memory of the CPU.
>
>In combination with moving whole blocks of memory with `System.arraycopy`, there's no way to perform faster.
>
>With `System.arraycopy` you avoid the index logic as well. Even more, it works faster than moving the elements one by one. Just traverse the array to find the element you're looking for.
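A sketch of the fixed-size-array variant over extended ASCII (assumed names; the real assignment classes read and write binary streams, which is omitted here): `System.arraycopy` shifts the prefix of the character sequence in one block move before the accessed character is placed at the front.

```java
public class MtfSketch {
    static final int R = 256;                  // extended ASCII alphabet

    static byte[] encode(byte[] in) {
        char[] seq = new char[R];
        for (char c = 0; c < R; c++) seq[c] = c;   // initial ordering: 0..255
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            char c = (char) (in[i] & 0xFF);
            int idx = 0;
            while (seq[idx] != c) idx++;           // position of c in the sequence
            out[i] = (byte) idx;
            // Shift seq[0..idx-1] right in one block move, then put c in front.
            System.arraycopy(seq, 0, seq, 1, idx);
            seq[0] = c;
        }
        return out;
    }

    static byte[] decode(byte[] in) {
        char[] seq = new char[R];
        for (char c = 0; c < R; c++) seq[c] = c;
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            int idx = in[i] & 0xFF;                // index into the current sequence
            char c = seq[idx];
            out[i] = (byte) c;
            System.arraycopy(seq, 0, seq, 1, idx); // same move-to-front update
            seq[0] = c;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] enc = encode("ABRACADABRA!".getBytes());
        System.out.println(new String(decode(enc))); // prints ABRACADABRA!
    }
}
```

Encode and decode perform the identical move-to-front update, which is why the round trip is exact.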
