In the first two parts of our series on NumPy optimization, we focused on speeding up code by replacing loops with vectorized code. We covered the basics of vectorization and broadcasting, then used them to optimize an implementation of the K-Means algorithm, making it 70x faster than the loop-based implementation.
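As a quick reminder of what that substitution looks like, here is a minimal sketch of the point-to-centroid distance step that a K-Means implementation needs, written once with loops and once with broadcasting. It is not the code from the earlier parts of the series: the function names, array shapes, and random data are assumptions made up for illustration.

```python
import numpy as np


def pairwise_sq_dists_loop(points, centroids):
    """Squared Euclidean distances, one (point, centroid) pair at a time."""
    n, k = points.shape[0], centroids.shape[0]
    out = np.empty((n, k))
    for i in range(n):
        for j in range(k):
            diff = points[i] - centroids[j]
            out[i, j] = np.dot(diff, diff)
    return out


def pairwise_sq_dists_vectorized(points, centroids):
    """Same result via broadcasting: (n, 1, d) - (1, k, d) -> (n, k, d)."""
    diff = points[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=-1)


# Hypothetical data: 200 two-dimensional points and 3 centroids.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
centroids = rng.normal(size=(3, 2))

loop_result = pairwise_sq_dists_loop(points, centroids)
vec_result = pairwise_sq_dists_vectorized(points, centroids)

# Index of the nearest centroid for each point.
labels = vec_result.argmin(axis=1)
```

Both functions return an array of shape `(n, k)`. The broadcast version builds an intermediate array of shape `(n, k, d)`, so it trades extra memory for the removal of the Python-level loops.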

This is a companion discussion topic for the original entry at https://blog.paperspace.com/numpy-optimization-internals-strides-reshape-transpose