What Are The Top 10 Algorithms Every Software Engineer Should Know by Heart?



Carson McNeil, former Software Engineer - Human Sensing at Google (2013-2016)


Updated Mar 25 · Upvoted by Gideon Shavit, Professional Software Engineer for 25 years
I agree with many of the answerers. I was a Google engineer for three years, did a lot of projects, and was quickly promoted, so I think I'm a pretty solid software engineer. As I read this question to myself, I wondered: do I even know ten algorithms by heart, in full detail? I don't think so.

Understanding algorithms is an invaluable skill, but like many domains of knowledge, the skill is in a way of thinking, not in memorization. You want to be able to read an algorithm and immediately understand why it is the way it is. As for knowing exactly how a specific algorithm is implemented, that's what Wikipedia is for.

That being said, here are a few of the (fewer than 10) algorithms that I DO know by heart, because they're pretty fundamental:

1. Binary search. Binary search. Binary search. This is an incredibly important
   concept, and it pops up in everything from convex optimization to databases. If
   you need to find a particular value in a big set of things, and you can figure out
   which direction you need to go, use binary search for a MASSIVE speed-up.
2. Matrix operations. Now, this one only pops up for me because I'm more of a
   data scientist, but in general, it's important to remember that matrix
   multiplication is faster than you think it is. Brilliant people have come up with
   incredibly unintuitive algorithms to multiply matrices in
   O(n^2.something) instead of the O(n^3) you would expect. In practice, these
   algorithms aren't necessarily used, but libraries like BLAS do provide
   incredibly fast matrix multiplication all the same. Any time you can express a
   bunch of calculations in terms of matrix operations, you win.
3. Linear regression. Simple, beautiful, and useful.
4. Caching. Just use caching. Does that count as an algorithm? I don't know,
   but it is the cause of, and solution to, all of your problems.
5. The idea of a concurrent work queue.
6. Knowing when to throw hashtables at a problem.
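To make item 1 concrete, here is a minimal binary search sketch in Python; the function name and the comparison with the standard-library `bisect` module are my own choices, not part of the answer above:

```python
import bisect

def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent.

    Each probe halves the remaining range, so the loop runs O(log n)
    times instead of the O(n) steps a linear scan would need.
    """
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1  # target, if present, is in the right half
        else:
            hi = mid - 1  # target, if present, is in the left half
    return -1

data = [2, 3, 5, 7, 11, 13]
assert binary_search(data, 7) == 3
assert binary_search(data, 4) == -1
# The standard library ships the same idea as bisect:
assert bisect.bisect_left(data, 7) == 3
```

In practice you rarely hand-roll this; the point is recognizing when a problem has the "I can tell which direction to go" structure that makes it applicable.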
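Item 3 can also be sketched in a few lines. This is my own example of ordinary least squares for a one-dimensional fit, not code from the answer; the closed form uses the sample covariance over the sample variance:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, one feature.

    slope = cov(x, y) / var(x); intercept makes the line pass
    through the mean point (mean(x), mean(y)).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    intercept = my - slope * mx
    return slope, intercept

# Points on the exact line y = 2x + 1 recover slope 2, intercept 1.
assert fit_line([0, 1, 2, 3], [1, 3, 5, 7]) == (2.0, 1.0)
```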
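And a sketch of item 6, the classic move of trading a nested scan for a single pass plus a hash table. The two-sum problem here is my illustration, not one from the answer:

```python
def two_sum(nums, target):
    """Return indices (i, j), i < j, with nums[i] + nums[j] == target,
    or None if no such pair exists.

    A dict keyed on values seen so far turns the O(n^2) all-pairs scan
    into one O(n) pass: for each value, ask the hash table whether its
    complement has already appeared.
    """
    seen = {}  # value -> index where it was first seen
    for j, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return seen[complement], j
        seen[value] = j
    return None

assert two_sum([3, 8, 5, 11], 16) == (2, 3)
```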

Joe Zbiciak
7 upvotes · Upvoted by Gideon Shavit, Professional Software Engineer for 25 years
Caching can happen at multiple levels, too. If you program in a more functional direction, with side-effect-free/pure functions (that is, you have referential transparency), you can trivially cache those functions' results. Doing this at the function level is sometimes called memoization.

For certain classes of algorithm, it can turn deep searches into shallow lookups. I've found it tremendously useful over the years. I'm also surprised how many people are unaware of the technique.

Note: When I say "program in a more functional direction," I don't mean "program in Haskell or Scala." I just mean: avoid side-effects and mutable state wherever possible. I wrote a fairly complex constraint solver in C++ a couple of years ago, and I stuck to an "everything's const once created" paradigm that's popular in functional programming languages. That allowed me to apply a few tricks:
(1) I could replace a given object with an integer index into a table of objects of that type. Since each
object captured a particular portion of the constraint solution computation, and the same intermediate
values get created again and again, compressing this to a single instance saves a lot of space. (Similar
to the Flyweight design pattern.)

(2) Since new objects were created by combining existing objects with various operators, and that operation was side-effect free, I could easily memoize these computations. And since I had associated all the unique objects with integers, I could look up the memoized result in a hash table very quickly.

(3) Since all the various objects were remembered, I could always backtrack on a partial solution by
just remembering what set of objects I had at each step. Doing so was cheap since everything was
already captured in the unique objects table that mapped objects to integers.
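The "deep searches into shallow lookups" effect is easy to see with a toy example in Python. The recursive Fibonacci function below is mine, not Joe's constraint solver; `functools.lru_cache` is the standard-library way to memoize a pure function:

```python
from functools import lru_cache

call_count = 0  # counts how many times the function body actually runs

@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; exponential time without the cache."""
    global call_count
    call_count += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# With memoization each distinct argument is computed exactly once, so
# a call tree with roughly 2^30 nodes collapses to 31 body executions.
assert fib(30) == 832040
assert call_count == 31
```

This only works because `fib` is pure: same argument, same result, no side effects worth repeating, which is exactly the "functional direction" described above.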

1. Modularization. Break up software into understandable and testable
   modules.
2. Caching. All computer science is an exercise in caching.
3. Optimize what matters: in a CPU- or memory-bound program, only a tiny
part of the code determines the performance.
4. Use C for I/O-bound programs, including networking code, and for very little
else.
5. Never get married to a language, a framework, a library, or a tool. Dont be
a <tool name> programmer; instead, use what is appropriate.
