Mastering NumPy in Python: The Backbone of Scientific Computing


Chapter 4: Advanced Indexing, Masking, and Manipulation in NumPy

🔹 1. Introduction

Indexing and slicing are fundamental when working with arrays in NumPy. Basic indexing resembles indexing Python lists, but NumPy extends it with boolean masking, fancy indexing, and efficient reshaping. These techniques let you extract, modify, and analyze subsets of data concisely and efficiently.

In this chapter, we’ll explore:

  • Boolean indexing
  • Fancy (advanced) indexing
  • Slicing tips
  • Combining and splitting arrays
  • Copy vs. view behavior

Mastering these techniques is essential for efficient data manipulation, preprocessing, and modeling tasks in data science, machine learning, and beyond.


🔹 2. Basic Indexing Review

Before diving deep, let’s recall basic indexing:

import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

print(arr[1, 2])  # Output: 60

  • arr[i, j]: Row i, column j (both counted from zero)
  • Negative indexing counts from the end: arr[-1, -1] → 60
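As a quick illustration of the rules above, the sketch below reuses the same 2×3 array. It shows that a single integer index returns a whole row, and that arr[i][j] reaches the same element as arr[i, j] in two indexing steps. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

first_row = arr[0]        # a single index selects a whole row
last_item = arr[-1, -1]   # negative indices count from the end
chained = arr[1][2]       # same element as arr[1, 2], reached in two steps

print(first_row)   # [10 20 30]
print(last_item)   # 60
print(chained)     # 60
```

The arr[i, j] form is usually preferred because it performs a single indexing operation instead of creating an intermediate row first.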

🔹 3. Slicing Arrays

1D Slicing

arr = np.array([0, 1, 2, 3, 4, 5])
print(arr[1:4])     # [1 2 3] (the stop index is excluded)
print(arr[::-1])    # [5 4 3 2 1 0] (reversed array)

2D Slicing

arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(arr[:2, 1:])   # [[20 30] [50 60]]
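One detail that often causes shape surprises is the difference between indexing an axis with an integer and with a slice. The following sketch, using the same 3×3 array, assumes nothing beyond standard NumPy slicing; the variable names are made up for illustration.

```python
import numpy as np

arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

col_1d = arr[:, 1]           # integer index drops the axis -> shape (3,)
col_2d = arr[:, 1:2]         # slice keeps the axis -> shape (3, 1)
every_other_row = arr[::2]   # step of 2 selects rows 0 and 2

print(col_1d)            # [20 50 80]
print(col_2d.shape)      # (3, 1)
print(every_other_row)   # rows [10 20 30] and [70 80 90]
```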


🔹 4. Boolean Indexing (Masking)

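Boolean indexing selects elements based on a condition. Comparing an array with a value produces a boolean mask of the same shape, and indexing with that mask keeps only the elements where the mask is True: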

arr = np.array([5, 10, 15, 20])
mask = arr > 10   # [False False  True  True]
print(arr[mask])  # [15 20]

Or inline:

print(arr[arr < 15])  # [5 10]

Use Case: Filter Even Numbers

arr = np.arange(10)
even = arr[arr % 2 == 0]  # [0 2 4 6 8]
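Masks can also appear on the left-hand side of an assignment, which is a common way to clean data in place. The sketch below uses a hypothetical readings array (not from the original text) and replaces negative values with zero.

```python
import numpy as np

readings = np.array([3, -1, 7, -4, 5])   # hypothetical sample data

cleaned = readings.copy()     # work on a copy so the input stays intact
cleaned[cleaned < 0] = 0      # assign through the mask

negatives = np.count_nonzero(readings < 0)   # each True counts as 1

print(cleaned)     # [3 0 7 0 5]
print(negatives)   # 2
```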


🔹 5. Fancy Indexing

Fancy indexing allows you to access multiple indices at once using lists or arrays.

arr = np.array([100, 200, 300, 400, 500])
print(arr[[0, 2, 4]])  # [100 300 500]

You can also use it with 2D arrays:

arr2d = np.array([[1, 2], [3, 4], [5, 6]])
print(arr2d[[0, 2], [1, 0]])  # [2 5] (elements at (0, 1) and (2, 0))
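In the 2D case the two index lists are paired element by element, which is easy to confuse with selecting a block of rows and columns. The sketch below contrasts the two; np.ix_ is a standard NumPy helper that builds the "every combination" form, and the variable names are illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2], [3, 4], [5, 6]])

pairs = arr2d[[0, 2], [1, 0]]          # elements (0, 1) and (2, 0)
reordered = arr2d[[2, 0]]              # whole rows 2 and 0, in that order
grid = arr2d[np.ix_([0, 2], [1, 0])]   # rows 0 and 2 crossed with columns 1 and 0

print(pairs)       # [2 5]
print(reordered)   # [[5 6] [1 2]]
print(grid)        # [[2 1] [6 5]]
```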


🔹 6. Combining Boolean & Fancy Indexing

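You can combine several conditions with NumPy's element-wise logical operators:

arr = np.array([5, 10, 15, 20, 25])
print(arr[(arr > 10) & (arr < 25)])  # [15 20]

Note: Use &, |, ~ instead of and, or, not for arrays, and wrap each condition in parentheses, because & and | bind more tightly than comparison operators.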
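To round out the note above, here is a small sketch of the other two operators, of what happens when the Python keyword and is used on arrays, and of a boolean mask followed by fancy indexing. The variable names are illustrative.

```python
import numpy as np

arr = np.array([5, 10, 15, 20, 25])

outside = arr[(arr < 10) | (arr > 20)]   # element-wise OR
not_ten = arr[~(arr == 10)]              # element-wise NOT
picked = arr[arr > 10][[0, -1]]          # mask first, then fancy-index the result

print(outside)   # [ 5 25]
print(not_ten)   # [ 5 15 20 25]
print(picked)    # [15 25]

# The keyword `and` asks for a single truth value, which an array cannot give.
try:
    arr[(arr > 10) and (arr < 25)]
    keyword_failed = False
except ValueError:
    keyword_failed = True

print(keyword_failed)   # True
```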


🔹 7. Indexing with np.where()

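np.where() does two jobs. Called with only a condition, it returns the indices where the condition is True. Called as np.where(condition, x, y), it builds a new array that takes values from x where the condition is True and from y elsewhere.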

Get Indices

arr = np.array([1, 3, 7, 9])
indices = np.where(arr > 5)
print(indices)  # (array([2, 3]),)

Conditional Assignment

arr = np.array([10, 15, 20])
new_arr = np.where(arr > 12, 1, 0)
print(new_arr)  # [0 1 1]
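The tuple returned by the one-argument form matters more in two dimensions, where it holds one index array per axis. The sketch below uses a small made-up 2×2 array to show that, and also shows that the x and y arguments can be arrays rather than constants.

```python
import numpy as np

grid = np.array([[1, 8], [9, 2]])   # hypothetical sample data

rows, cols = np.where(grid > 5)     # one index array per axis
print(rows, cols)                   # [0 1] [1 0]
print(grid[rows, cols])             # [8 9]

capped = np.where(grid > 5, 5, grid)   # cap large values at 5, keep the rest
print(capped)                          # [[1 5] [5 2]]
```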


🔹 8. Copy vs. View: What’s the Difference?

View (shallow copy)

a = np.array([1, 2, 3])
b = a[1:]
b[0] = 100
print(a)  # [1 100 3] → View affects original

Copy (deep copy)

a = np.array([1, 2, 3])
b = a[1:].copy()
b[0] = 100
print(a)  # [1 2 3] → Original unchanged
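It helps to know which indexing styles produce views in the first place. As a rough check, np.shares_memory reports whether two arrays overlap in memory; the sketch below applies it to the three indexing styles covered in this chapter. Basic slicing yields a view, while fancy and boolean indexing yield copies.

```python
import numpy as np

a = np.arange(5)

sliced = a[1:4]        # basic slicing
fancy = a[[1, 2, 3]]   # fancy indexing
masked = a[a > 0]      # boolean indexing

print(np.shares_memory(a, sliced))   # True  -> view
print(np.shares_memory(a, fancy))    # False -> copy
print(np.shares_memory(a, masked))   # False -> copy
print(sliced.base is a)              # True  -> the view points back to a
```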


🔹 9. Array Manipulation: Stack, Split, Reshape

Reshape

arr = np.arange(6)           # [0 1 2 3 4 5]
arr2d = arr.reshape(2, 3)    # [[0 1 2] [3 4 5]]
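Two reshape details are worth a quick sketch: passing -1 lets NumPy infer one dimension from the total size, and for a contiguous array like this one the result is a view, so writes show up in the original. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(6)

auto = arr.reshape(3, -1)   # -1 is inferred as 2 because 6 / 3 = 2
print(auto.shape)           # (3, 2)

auto[0, 0] = 99             # reshape returned a view here, so arr changes too
print(arr)                  # [99  1  2  3  4  5]
```

Use arr.reshape(3, -1).copy() when an independent array is needed.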

Stacking Arrays

a = np.array([1, 2])
b = np.array([3, 4])

np.vstack((a, b))  # Vertical stack: [[1 2] [3 4]]
np.hstack((a, b))  # Horizontal stack: [1 2 3 4]
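For comparison, the sketch below prints the results of the two calls above next to two related standard functions, np.column_stack and np.concatenate. It is meant only to show the resulting shapes for 1D inputs.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 4])

v = np.vstack((a, b))         # each input becomes a row -> shape (2, 2)
h = np.hstack((a, b))         # inputs joined end to end -> shape (4,)
c = np.column_stack((a, b))   # each input becomes a column -> shape (2, 2)
j = np.concatenate((a, b))    # same result as hstack for 1D inputs

print(v)   # [[1 2] [3 4]]
print(h)   # [1 2 3 4]
print(c)   # [[1 3] [2 4]]
print(j)   # [1 2 3 4]
```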

Splitting Arrays

arr = np.array([[1, 2, 3], [4, 5, 6]])

np.hsplit(arr, 3)  # List of 3 arrays, one per column, each of shape (2, 1)
np.vsplit(arr, 2)  # List of 2 arrays, one per row, each of shape (1, 3)
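The split functions return Python lists of arrays, and they accept either a number of equal parts or a list of split positions. The sketch below shows both forms, plus np.array_split, which tolerates sizes that do not divide evenly. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

cols = np.hsplit(arr, 3)                    # three equal parts along columns
left, right = np.hsplit(arr, [1])           # split before column index 1
uneven = np.array_split(np.arange(5), 2)    # parts may differ in size

print(cols[0].shape)                        # (2, 1)
print(left.shape, right.shape)              # (2, 1) (2, 2)
print([part.tolist() for part in uneven])   # [[0, 1, 2], [3, 4]]
```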


🔹 10. Summary Table


Technique              Description
Boolean Indexing       Select based on condition
Fancy Indexing         Select by list of indices
np.where()             Conditional index/assignment
View vs Copy           Affects original vs. independent copy
Slicing                Extract rows/columns
reshape()              Change shape of array
hstack() / vstack()    Combine arrays horizontally/vertically
split()                Split arrays into parts


FAQs


1. What is NumPy used for?

NumPy is used for numerical computations, array operations, linear algebra, and data processing in Python.

2. How is NumPy different from regular Python lists?

NumPy arrays hold elements of a single data type in a compact block of memory, so they are typically faster and more memory-efficient than Python lists for numerical work, and they support vectorized operations.

3. What is an ndarray in NumPy?

It’s the core data structure in NumPy — an N-dimensional array that allows element-wise operations and advanced indexing.

4. Is NumPy part of the standard Python library?

No, it needs to be installed separately using pip install numpy.

5. What are broadcasting rules in NumPy?

Broadcasting allows NumPy to perform operations on arrays of different shapes by automatically expanding them to be compatible.

6. Can NumPy be used for linear algebra and matrix operations?

Yes, it provides comprehensive support for matrix multiplication, eigenvalues, singular value decomposition, and more.

7. Is NumPy suitable for big data or deep learning?

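NumPy is essential for preprocessing and fast in-memory array computations. Deep learning libraries like TensorFlow and PyTorch use their own tensor types with GPU support and automatic differentiation, but they follow NumPy-style array semantics and convert to and from NumPy arrays easily.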

8. Can I use NumPy with Pandas and Matplotlib?

Absolutely. Pandas is built on NumPy arrays, and Matplotlib accepts NumPy arrays directly for plotting.

9. Does NumPy support random number generation?

Yes, the numpy.random module offers distributions like normal, binomial, uniform, etc.

10. Is NumPy faster than Python loops?

Significantly. NumPy's vectorized operations are often 10x to 100x faster than equivalent for-loops in Python, although the exact speedup depends on the operation and the array size.
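One way to check the speed claim on your own machine is a small timing sketch like the one below. The helper functions loop_square and vectorized_square are hypothetical names introduced here, and the array size and repeat count are arbitrary choices. No timings are quoted because results vary by hardware and NumPy version.

```python
import timeit
import numpy as np

values = np.arange(100_000, dtype=np.float64)

def loop_square(data):
    """Square each element with an explicit Python loop."""
    out = np.empty_like(data)
    for i in range(len(data)):
        out[i] = data[i] * data[i]
    return out

def vectorized_square(data):
    """Square all elements with one vectorized expression."""
    return data * data

# Both approaches must agree before their timings are worth comparing.
same_result = np.array_equal(loop_square(values), vectorized_square(values))

loop_time = timeit.timeit(lambda: loop_square(values), number=3)
vec_time = timeit.timeit(lambda: vectorized_square(values), number=3)
print(f"same result: {same_result}")
print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```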