NumPy: The Engine of Scientific Computing
NumPy (Numerical Python) is the foundational package for almost all scientific computing and machine learning in Python. At its core, it provides the ndarray (N-dimensional array) object: a tightly packed, fast, multi-dimensional array.
Why not just use standard Python lists?
1. Memory Efficiency: Python lists are arrays of pointers to scattered objects in memory. A NumPy array is a single block of contiguous memory containing elements of exactly the same data type.
2. Speed (Vectorization): Because the memory is contiguous and data types are fixed, NumPy can run array operations in optimized, compiled C code under the hood. A single call processes the whole array, so the per-element loop runs in C rather than in a slow Python for loop.
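To make the memory claim concrete, here is a minimal sketch that compares a list and an array holding the same 1,000 integers. It assumes CPython (where sys.getsizeof is available) and an explicit int64 dtype; the variable names are illustrative, and the list total varies by platform and Python version.

```python
import sys
import numpy as np

n = 1_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# Every ndarray element shares one dtype, so every element has the same size.
print(np_array.dtype)     # int64
print(np_array.itemsize)  # 8 (bytes per element)
print(np_array.nbytes)    # 8000 (bytes of raw data in one buffer)

# The list stores references; each int is a separate Python object.
list_container_bytes = sys.getsizeof(py_list)
list_element_bytes = sum(sys.getsizeof(x) for x in py_list)
print(list_container_bytes + list_element_bytes)  # platform dependent

# A freshly created array is laid out contiguously in memory.
print(np_array.flags["C_CONTIGUOUS"])  # True
```

The list figure here is only a rough estimate, since it counts the container plus each element object once.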
Core Concepts
1. Vectorization
Vectorization means the code contains no explicit looping or indexing. The loops still run, but “behind the scenes” in optimized, pre-compiled C code.
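As a small illustration, the sketch below computes the same result twice: once with an explicit Python loop and once as a single array expression. The formula (x squared plus one) and the variable names are arbitrary choices for the example.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Explicit loop: one Python-level iteration per element.
result_loop = np.empty_like(values)
for i in range(len(values)):
    result_loop[i] = values[i] ** 2 + 1.0

# Vectorized: the same arithmetic written as one array expression.
result_vec = values ** 2 + 1.0

print(result_vec.tolist())                       # [2.0, 5.0, 10.0, 17.0]
print(np.array_equal(result_loop, result_vec))   # True
```

On an array this small the speed difference is negligible; the gap tends to widen as the array grows.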
2. Broadcasting
Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Shapes are compared dimension by dimension, starting from the last one; two dimensions are compatible when they are equal or when one of them is 1. When the shapes are compatible, the smaller array is "broadcast" across the larger one. This allows operations between an array and a single number, or between a 2D matrix and a 1D vector.
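The following sketch shows the two cases mentioned above (array with scalar, 2D matrix with 1D vector), plus one shape mismatch and a common way to resolve it. The array contents are made up for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)

scaled = matrix * 2              # the scalar is applied to every element
shifted = matrix + row           # the row is applied to each row of matrix

print(scaled.tolist())   # [[2, 4, 6], [8, 10, 12]]
print(shifted.tolist())  # [[11, 22, 33], [14, 25, 36]]

# Shapes (2, 3) and (2,) do not line up on the last dimension (3 vs 2).
col = np.array([100, 200])       # shape (2,)
try:
    matrix + col
except ValueError as exc:
    print("incompatible shapes:", exc)

# Adding an axis gives shape (2, 1), which broadcasts against (2, 3).
by_column = matrix + col[:, np.newaxis]
print(by_column.tolist())  # [[101, 102, 103], [204, 205, 206]]
```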
3. Universal Functions (ufuncs)
A universal function is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features. Examples include np.add, np.exp, np.sqrt.
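A short sketch of these ufunc features, using the three functions named above. The input arrays are illustrative; the `out` argument and the `reduce` method are standard parts of the ufunc interface.

```python
import numpy as np

a = np.array([1, 4, 9])
b = np.array([10, 20, 30])

# np.add, np.exp and np.sqrt are all instances of np.ufunc.
print(isinstance(np.add, np.ufunc))  # True

total = np.add(a, b)   # element by element, equivalent to a + b
roots = np.sqrt(a)     # integer input is cast to a floating-point result

print(total.tolist())  # [11, 24, 39]
print(roots.tolist())  # [1.0, 2.0, 3.0]
print(roots.dtype)     # float64

# ufuncs can write into a preallocated array through the `out` argument.
out = np.empty(3, dtype=np.float64)
np.exp(np.zeros(3), out=out)
print(out.tolist())    # [1.0, 1.0, 1.0]

# Binary ufuncs also provide reductions along an axis.
print(np.add.reduce(b))  # 60
```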
4. Indexing and Slicing
NumPy extends Python's list slicing [start:stop:step] to multiple dimensions [row_start:row_stop, col_start:col_stop]. Basic slices return views onto the original data rather than copies, so querying and subsetting stay fast even on massive datasets.
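Here is a minimal sketch of multi-dimensional slicing on a small 3x4 array. The array and variable names are illustrative. It also shows that a basic slice is a view onto the original data, while a boolean mask produces a copy.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# grid holds:
#   row 0:  0  1  2  3
#   row 1:  4  5  6  7
#   row 2:  8  9 10 11

block = grid[0:2, 1:3]           # rows 0-1, columns 1-2
print(block.tolist())            # [[1, 2], [5, 6]]

every_other_col = grid[:, ::2]   # all rows, every second column
print(every_other_col.tolist())  # [[0, 2], [4, 6], [8, 10]]

# A basic slice is a view: writing through it changes the original array.
block[0, 0] = -1
print(grid[0, 1])                # -1

# Boolean masking selects matching elements and returns a new array.
large = grid[grid > 8]
print(large.tolist())            # [9, 10, 11]
```

If an independent subset is needed, calling .copy() on the slice avoids modifying the original by accident.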
How to execute the examples:
Go to the Examples/ folder and run the scripts using Python:
python NumPy_Vectorization.py
python NumPy_Broadcasting.py
python NumPy_MatrixMath.py