Avoid This Costly Mistake When Indexing A DataFrame
To: chidichekwas@yahoo.com
5/25/23, 10:08 AM Yahoo Mail - Avoid This Costly Mistake When Indexing A DataFrame
As shown above, selecting the column first is over 15 times faster than
slicing the row first. Why?
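To try the comparison yourself, here is a minimal timing sketch. The DataFrame, its column names (`a`, `b`, `c`), its size, and the row position are made up for illustration, and the exact speedup will vary with your machine, your pandas version, and the dtypes involved.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical mixed-dtype DataFrame, used only for this illustration.
n_rows = 100_000
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.integers(0, 100, size=n_rows),
    "b": rng.random(n_rows),
    "c": rng.choice(["x", "y", "z"], size=n_rows),
})

def column_first():
    # Select the column (already a Series), then pick one element by position.
    return df["b"].iloc[500]

def row_first():
    # Build a Series for the whole row, then pick one element by label.
    return df.iloc[500]["b"]

t_col = timeit.timeit(column_first, number=2000)
t_row = timeit.timeit(row_first, number=2000)

print(f"column first: {t_col:.4f} s")
print(f"row first:    {t_row:.4f} s")
```

Both functions return the same value. Only the order of the two indexing steps differs.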
But when you slice a row first, each row is retrieved by accessing
non-contiguous blocks of memory, which makes it slow.
Also, once all the elements of a row are gathered, Pandas converts them to a
Series, which adds further overhead.
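The conversion step is easy to see on a small example. The sketch below uses a made-up three-column DataFrame (the names `id`, `score`, and `label` are illustrative). Because the row spans columns of different dtypes, the Series that Pandas builds for it has to fall back to the generic `object` dtype.

```python
import pandas as pd

# Hypothetical DataFrame with three differently typed columns.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "score": [0.5, 0.7, 0.9],
    "label": ["a", "b", "c"],
})

# Slicing a row gathers one value from each column into a new Series.
row = df.iloc[0]

print(type(row))   # a pandas Series
print(row.dtype)   # object, since int, float and str share no narrower dtype
print(df.dtypes)   # each column keeps its own dtype
```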
Instead, when you select a column first, elements are retrieved by accessing
contiguous blocks of memory, which is way faster. Also, a column is
inherently a Pandas Series. Thus, there is no conversion overhead involved
like above.
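By contrast, a column already is a Series with a single dtype. A short sketch on the same kind of made-up DataFrame (column names are illustrative):

```python
import pandas as pd

# Hypothetical DataFrame with three differently typed columns.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "score": [0.5, 0.7, 0.9],
    "label": ["a", "b", "c"],
})

# Selecting a column returns a Series directly, with one homogeneous dtype.
col = df["score"]

print(type(col))    # a pandas Series
print(col.dtype)    # float64
print(col.iloc[0])  # element access on the column
```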
This makes selecting the column first faster than slicing a row first in indexing
operations.
If you are confused about what selecting, indexing, slicing, and filtering mean,
here’s what you should read next:
Read more
I like to explore, experiment and write about data science concepts and tools.
You can read my articles on Medium. Also, you can connect with me on
LinkedIn and Twitter.