|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "## Algorithm Analysis\n", |
| 8 | + "Algorithm is simply a procedure or formula for solving a problem. For comparing algorithms, the amount of space taken in memory or how much time is taken by each algorithm to run is considered." |
| 9 | + ] |
| 10 | + }, |
| 11 | + { |
| 12 | + "cell_type": "code", |
| 13 | + "execution_count": 21, |
| 14 | + "metadata": {}, |
| 15 | + "outputs": [ |
| 16 | + { |
| 17 | + "data": { |
| 18 | + "text/plain": [ |
| 19 | + "55" |
| 20 | + ] |
| 21 | + }, |
| 22 | + "execution_count": 21, |
| 23 | + "metadata": {}, |
| 24 | + "output_type": "execute_result" |
| 25 | + } |
| 26 | + ], |
| 27 | + "source": [ |
| 28 | + "def sum1(n):\n", |
| 29 | + " final_sum = 0\n", |
| 30 | + " for x in range(n+1):\n", |
| 31 | + " final_sum+=x\n", |
| 32 | + " return final_sum\n", |
| 33 | + "\n", |
| 34 | + "sum1(10)" |
| 35 | + ] |
| 36 | + }, |
| 37 | + { |
| 38 | + "cell_type": "code", |
| 39 | + "execution_count": 22, |
| 40 | + "metadata": {}, |
| 41 | + "outputs": [ |
| 42 | + { |
| 43 | + "data": { |
| 44 | + "text/plain": [ |
| 45 | + "55" |
| 46 | + ] |
| 47 | + }, |
| 48 | + "execution_count": 22, |
| 49 | + "metadata": {}, |
| 50 | + "output_type": "execute_result" |
| 51 | + } |
| 52 | + ], |
| 53 | + "source": [ |
| 54 | + "def sum2(n):\n", |
| 55 | + " return (n*(n+1))//2\n", |
| 56 | + "\n", |
| 57 | + "sum2(10)" |
| 58 | + ] |
| 59 | + }, |
| 60 | + { |
| 61 | + "cell_type": "code", |
| 62 | + "execution_count": 23, |
| 63 | + "metadata": {}, |
| 64 | + "outputs": [ |
| 65 | + { |
| 66 | + "name": "stdout", |
| 67 | + "output_type": "stream", |
| 68 | + "text": [ |
| 69 | + "4.47 µs ± 18.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n" |
| 70 | + ] |
| 71 | + } |
| 72 | + ], |
| 73 | + "source": [ |
| 74 | + "%timeit sum1(100)" |
| 75 | + ] |
| 76 | + }, |
| 77 | + { |
| 78 | + "cell_type": "code", |
| 79 | + "execution_count": 24, |
| 80 | + "metadata": {}, |
| 81 | + "outputs": [ |
| 82 | + { |
| 83 | + "name": "stdout", |
| 84 | + "output_type": "stream", |
| 85 | + "text": [ |
| 86 | + "140 ns ± 2.43 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" |
| 87 | + ] |
| 88 | + } |
| 89 | + ], |
| 90 | + "source": [ |
| 91 | + "%timeit sum2(100)" |
| 92 | + ] |
| 93 | + }, |
| 94 | + { |
| 95 | + "cell_type": "markdown", |
| 96 | + "metadata": {}, |
| 97 | + "source": [ |
| 98 | + "The two functions take different times to execute. This shows **Time complexity**. \n", |
| 99 | + "Big-O notation describes how quickly runtime will grow relative to the input as the input get arbitrarily large. It is a relative representation.\n", |
| 100 | + "* Remember, we want to compare how quickly runtime will grows, not compare exact runtimes, since those can vary depending on hardware.\n", |
| 101 | + "* Since we want to compare for a variety of input sizes, we are only concerned with runtime grow relative to the input. This is why we use n for notation.\n", |
| 102 | + "* As n gets arbitrarily large we only worry about terms that will grow the fastest as n gets large, to this point, Big-O analysis is also known as asymptotic analysis\n", |
| 103 | + "\n", |
| 104 | + "## Common Big-O functions\n", |
| 105 | + "**Big-O**|**Name**\n", |
| 106 | + ":-----:|:-----:\n", |
| 107 | + "1 |Constant\n", |
| 108 | + "log(n) |Logarithmic\n", |
| 109 | + "n |Linear\n", |
| 110 | + "nlog(n) |Log Linear\n", |
| 111 | + "n^2 |Quadratic\n", |
| 112 | + "n^3 |Cubic\n", |
| 113 | + "2^n |Exponential\n", |
| 114 | + "\n", |
| 115 | + "## Big-O examples\n", |
| 116 | + "Here is a real life example to understand the notations mentioned above. Take a phone book. It has businesses (the \"Yellow Pages\") which have unique names and people (the \"White Pages\") which may not have unique names. A phone number is assigned to at most one person or business. We will also assume that it takes constant time to flip to a specific page.\n", |
| 117 | + "* **O(1) (best case):** Given the page that a business's name is on and the business name, find the phone number.\n", |
| 118 | + "* **O(1) (average case):** Given the page that a person's name is on and their name, find the phone number.\n", |
| 119 | + "* **O(log n):** Given a person's name, find the phone number by picking a random point about halfway through the part of the book you haven't searched yet, then checking to see whether the person's name is at that point. Then repeat the process about halfway through the part of the book where the person's name lies. (This is a binary search for a person's name.)\n", |
| 120 | + "* **O(n):** Find all people whose phone numbers contain the digit \"5\".\n", |
| 121 | + "* **O(n):** Given a phone number, find the person or business with that number.\n", |
| 122 | + "* **O(n log n):** There was a mix-up at the printer's office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it's correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book.\n", |
| 123 | + "\n", |
| 124 | + "For the below examples, we're now at the printer's office. Phone books are waiting to be mailed to each resident or business, and there's a sticker on each phone book identifying where it should be mailed to. Every person or business gets one phone book.\n", |
| 125 | + "* **O(n log n):** We want to personalize the phone book, so we're going to find each person or business's name in their designated copy, then circle their name in the book and write a short thank-you note for their patronage.\n", |
| 126 | + "* **O(n2):** A mistake occurred at the office, and every entry in each of the phone books has an extra \"0\" at the end of the phone number. Take some white-out and remove each zero.\n", |
| 127 | + "* **O(n · n!):** We're ready to load the phonebooks onto the shipping dock. Unfortunately, the robot that was supposed to load the books has gone haywire: it's putting the books onto the truck in a random order! Even worse, it loads all the books onto the truck, then checks to see if they're in the right order, and if not, it unloads them and starts over. (This is the dreaded bogo sort.)\n", |
| 128 | + "* **O(n^n):** You fix the robot so that it's loading things correctly. The next day, one of your co-workers plays a prank on you and wires the loading dock robot to the automated printing systems. Every time the robot goes to load an original book, the factory printer makes a duplicate run of all the phonebooks! Fortunately, the robot's bug-detection systems are sophisticated enough that the robot doesn't try printing even more copies when it encounters a duplicate book for loading, but it still has to load every original and duplicate book that's been printed.\n", |
| 129 | + "\n", |
| 130 | + "The above illustration is taken from stackoverflow: https://stackoverflow.com/questions/2307283/what-does-olog-n-mean-exactly\n", |
| 131 | + "Additional reference for understanding O(nlogn): https://hackernoon.com/what-does-the-time-complexity-o-log-n-actually-mean-45f94bb5bfbf" |
| 132 | + ] |
| 133 | + }, |
| 134 | + { |
| 135 | + "cell_type": "code", |
| 136 | + "execution_count": 25, |
| 137 | + "metadata": {}, |
| 138 | + "outputs": [ |
| 139 | + { |
| 140 | + "name": "stdout", |
| 141 | + "output_type": "stream", |
| 142 | + "text": [ |
| 143 | + "1\n" |
| 144 | + ] |
| 145 | + } |
| 146 | + ], |
| 147 | + "source": [ |
| 148 | + "\"\"\"Irrespective of the list size, the time taken to print the first element will always be constant. Hence, \n", |
| 149 | + "the time complexity is O(1)\"\"\"\n", |
| 150 | + "\n", |
| 151 | + "lst = [1,2,3,4,5,6]\n", |
| 152 | + "\n", |
| 153 | + "def func_constant(values):\n", |
| 154 | + " print(values[0])\n", |
| 155 | + " \n", |
| 156 | + "func_constant(lst)" |
| 157 | + ] |
| 158 | + }, |
| 159 | + { |
| 160 | + "cell_type": "code", |
| 161 | + "execution_count": 26, |
| 162 | + "metadata": {}, |
| 163 | + "outputs": [ |
| 164 | + { |
| 165 | + "name": "stdout", |
| 166 | + "output_type": "stream", |
| 167 | + "text": [ |
| 168 | + "1\n", |
| 169 | + "2\n", |
| 170 | + "3\n", |
| 171 | + "4\n", |
| 172 | + "5\n", |
| 173 | + "6\n" |
| 174 | + ] |
| 175 | + } |
| 176 | + ], |
| 177 | + "source": [ |
| 178 | + "\"\"\"The time taken to print the entire list is proportional to the size of the list. Hence, \n", |
| 179 | + "the time complexity is O(n)\"\"\"\n", |
| 180 | + "\n", |
| 181 | + "def func_lin(lst):\n", |
| 182 | + " for val in lst:\n", |
| 183 | + " print(val)\n", |
| 184 | + "\n", |
| 185 | + "func_lin(lst)" |
| 186 | + ] |
| 187 | + }, |
| 188 | + { |
| 189 | + "cell_type": "code", |
| 190 | + "execution_count": 27, |
| 191 | + "metadata": {}, |
| 192 | + "outputs": [ |
| 193 | + { |
| 194 | + "name": "stdout", |
| 195 | + "output_type": "stream", |
| 196 | + "text": [ |
| 197 | + "1 1\n", |
| 198 | + "1 2\n", |
| 199 | + "1 3\n", |
| 200 | + "2 1\n", |
| 201 | + "2 2\n", |
| 202 | + "2 3\n", |
| 203 | + "3 1\n", |
| 204 | + "3 2\n", |
| 205 | + "3 3\n" |
| 206 | + ] |
| 207 | + } |
| 208 | + ], |
| 209 | + "source": [ |
| 210 | + "\"\"\"Tthe time taken to print the combination of each element with each element of the list. n operations are performed\n", |
| 211 | + "for each element of the list. Hence, the time complexity is O(n^2)\"\"\"\n", |
| 212 | + "\n", |
| 213 | + "def func_quad(lst):\n", |
| 214 | + " for item1 in lst:\n", |
| 215 | + " for item2 in lst:\n", |
| 216 | + " print(item1, item2)\n", |
| 217 | + "func_quad([1,2,3])" |
| 218 | + ] |
| 219 | + }, |
| 220 | + { |
| 221 | + "cell_type": "markdown", |
| 222 | + "metadata": {}, |
| 223 | + "source": [ |
| 224 | + "## Calculating Big-O\n", |
| 225 | + "Different sections of a code might have different time complexities. The one with the most significant term is considered. This is because, as the input grows larger, only the fastes growing term will matter most." |
| 226 | + ] |
| 227 | + }, |
| 228 | + { |
| 229 | + "cell_type": "code", |
| 230 | + "execution_count": 29, |
| 231 | + "metadata": {}, |
| 232 | + "outputs": [], |
| 233 | + "source": [ |
| 234 | + "\"\"\"O(n)+O(n) = O(2n) = O(2*n)\n", |
| 235 | + "drop constants. Hence, time complexity is O(n)\"\"\"\n", |
| 236 | + "def func_lin(lst):\n", |
| 237 | + " for val in lst:\n", |
| 238 | + " print(val)\n", |
| 239 | + " for val in lst:\n", |
| 240 | + " print(val)" |
| 241 | + ] |
| 242 | + }, |
| 243 | + { |
| 244 | + "cell_type": "code", |
| 245 | + "execution_count": 31, |
| 246 | + "metadata": {}, |
| 247 | + "outputs": [], |
| 248 | + "source": [ |
| 249 | + "\"\"\"O(1) + O(n/2) + O(10) will equate to O(n)\"\"\"\n", |
| 250 | + "def comp(lst):\n", |
| 251 | + " print(lst[0])\n", |
| 252 | + " \n", |
| 253 | + " midpoint = len(lst)/2\n", |
| 254 | + " \n", |
| 255 | + " for val in lst[:midpoint]:\n", |
| 256 | + " print(val)\n", |
| 257 | + " \n", |
| 258 | + " for x in range(10):\n", |
| 259 | + " print('number')" |
| 260 | + ] |
| 261 | + }, |
| 262 | + { |
| 263 | + "cell_type": "markdown", |
| 264 | + "metadata": {}, |
| 265 | + "source": [ |
| 266 | + "## Worst and Best Case\n", |
| 267 | + "A given algorithm can have different time comlexities based on the input. Take the below code, for example. If the first element in the list is a match, the time complesxity will be O(1). If the last element matches, the complexity will be O(n). The former is the **best case** and the latter is called the **worst case**. Usually, the worst case is taken into account for time complexity measurements." |
| 268 | + ] |
| 269 | + }, |
| 270 | + { |
| 271 | + "cell_type": "code", |
| 272 | + "execution_count": 32, |
| 273 | + "metadata": {}, |
| 274 | + "outputs": [], |
| 275 | + "source": [ |
| 276 | + "def matcher(lst,match):\n", |
| 277 | + " for item in lst:\n", |
| 278 | + " if item == match:\n", |
| 279 | + " return True\n", |
| 280 | + " return False" |
| 281 | + ] |
| 282 | + }, |
| 283 | + { |
| 284 | + "cell_type": "markdown", |
| 285 | + "metadata": {}, |
| 286 | + "source": [ |
| 287 | + "## Space Complexity\n", |
| 288 | + "Notation for space complexity is the same as time comlexity. Here the amount of memory used for execution is taken into account." |
| 289 | + ] |
| 290 | + }, |
| 291 | + { |
| 292 | + "cell_type": "code", |
| 293 | + "execution_count": 35, |
| 294 | + "metadata": {}, |
| 295 | + "outputs": [], |
| 296 | + "source": [ |
| 297 | + "\"\"\"Space for the string 'Hello World' is only required. Hence space complexity is O(1)\"\"\"\n", |
| 298 | + "\n", |
| 299 | + "def printer(n=10):\n", |
| 300 | + " for x in range(n):\n", |
| 301 | + " print('Hello World!')" |
| 302 | + ] |
| 303 | + }, |
| 304 | + { |
| 305 | + "cell_type": "code", |
| 306 | + "execution_count": 36, |
| 307 | + "metadata": {}, |
| 308 | + "outputs": [], |
| 309 | + "source": [ |
| 310 | + "\"\"\"Space for the lsit with n elements is required. Hence space complexity is O(n)\"\"\"\n", |
| 311 | + "\n", |
| 312 | + "def create_list(n):\n", |
| 313 | + " new_list = []\n", |
| 314 | + " for num in range(n):\n", |
| 315 | + " new_list.append('new')\n", |
| 316 | + " return new_list" |
| 317 | + ] |
| 318 | + } |
| 319 | + ], |
| 320 | + "metadata": { |
| 321 | + "kernelspec": { |
| 322 | + "display_name": "Python 3", |
| 323 | + "language": "python", |
| 324 | + "name": "python3" |
| 325 | + }, |
| 326 | + "language_info": { |
| 327 | + "codemirror_mode": { |
| 328 | + "name": "ipython", |
| 329 | + "version": 3 |
| 330 | + }, |
| 331 | + "file_extension": ".py", |
| 332 | + "mimetype": "text/x-python", |
| 333 | + "name": "python", |
| 334 | + "nbconvert_exporter": "python", |
| 335 | + "pygments_lexer": "ipython3", |
| 336 | + "version": "3.7.3" |
| 337 | + } |
| 338 | + }, |
| 339 | + "nbformat": 4, |
| 340 | + "nbformat_minor": 2 |
| 341 | +} |
0 commit comments