Essential NumPy Operations for Data Analysis

NumPy, short for Numerical Python, is a foundational library for scientific computing in Python. Many data analysis packages, including pandas, are built on top of NumPy.

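At its core, NumPy uses the ndarray (N-dimensional array) data structure. An ndarray looks similar to a Python list, but every element shares a single data type (dtype) and the values are stored in a compact memory buffer. Operations on the whole array run in compiled C code rather than in the Python interpreter, which makes large-scale mathematical operations much faster than the equivalent list-based loops.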
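As a small illustration of the uniform-type requirement, the sketch below (variable names are illustrative, not from the original notes) builds an array from a list that mixes integers and floats. NumPy picks one dtype that can represent every element.

```python
import numpy as np

# Mixed int/float input is stored under one common dtype (float64 here)
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)     # float64

# Every element occupies the same number of bytes
print(mixed.itemsize)  # 8
```

Because the integers are converted to floats, `mixed[0]` is `1.0`, not `1`.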

Installation and Import

Install NumPy using pip:

pip install numpy

Import the library with the conventional alias:

import numpy as np

Creating Arrays

Manual Array Creation

Use np.array() to create arrays from sequences:

# One-dimensional array
vector = np.array([4, 5, 6])

# Two-dimensional array
matrix = np.array([[1, 2, 3], [7, 8, 9]])

Specialized Array Creation Methods

Zero Arrays:

# 1D zero array
zero_vec = np.zeros(5)
print(zero_vec)  # Output: [0. 0. 0. 0. 0.]

# 2D zero array
zero_mat = np.zeros((2, 3))
print(zero_mat)
# Output:
# [[0. 0. 0.]
#  [0. 0. 0.]]

One Arrays:

# 1D array of ones
ones_vec = np.ones(4)
print(ones_vec)  # Output: [1. 1. 1. 1.]

# 2D array of ones
ones_mat = np.ones((3, 2))
print(ones_mat)
# Output:
# [[1. 1.]
#  [1. 1.]
#  [1. 1.]]

Empty Arrays:

# Uninitialized 1D array (contents are arbitrary until you assign values)
empty_arr = np.empty(3)

# Uninitialized 2D array
empty_mat = np.empty((2, 2))

Range Arrays:

# Array from 0 to 9
seq_arr = np.arange(10)
print(seq_arr)  # Output: [0 1 2 3 4 5 6 7 8 9]

# Array from 5 to 15 with step 3
step_arr = np.arange(5, 16, 3)
print(step_arr)  # Output: [ 5  8 11 14]

Random Arrays:

# 2x3 array of samples drawn from the standard normal distribution
rand_mat = np.random.randn(2, 3)
print(rand_mat)
# Output example:
# [[-0.234  1.456 -0.789]
#  [ 0.123 -0.456  0.890]]
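Random output changes on every run, which makes results hard to reproduce. One option, sketched here as a suggestion rather than something the original notes use, is NumPy's `Generator` interface with a fixed seed. The seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)
sample = rng.standard_normal((2, 3))
print(sample.shape)  # (2, 3)

# A second generator with the same seed yields identical values
repeat = np.random.default_rng(seed=42).standard_normal((2, 3))
print(np.array_equal(sample, repeat))  # True
```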

Accessing Array Elements

Numeric Indexing

Access elements using zero-based indexing:

arr = np.array([10, 20, 30, 40])
print(arr[2])  # Output: 30

mat = np.array([[1, 2, 3], [4, 5, 6]])
print(mat[1, 2])  # Output: 6 (row 1, column 2)

Slicing

Extract subarrays with slice notation:

arr = np.array([0, 1, 2, 3, 4, 5])
print(arr[2:5])  # Output: [2 3 4]

mat = np.array([[10, 20, 30], [40, 50, 60]])
print(mat[0:2, 1:3])
# Output:
# [[20 30]
#  [50 60]]
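One detail worth knowing about slices: a basic slice is a view onto the original data, not a copy. The sketch below (illustrative names) shows that writing through a slice changes the source array, and that `.copy()` avoids this.

```python
import numpy as np

base = np.array([0, 1, 2, 3, 4, 5])

# A slice is a view: it shares memory with base
window = base[2:5]
window[0] = 99
print(base)     # [ 0  1 99  3  4  5]

# An explicit copy is independent of base
independent = base[2:5].copy()
independent[0] = -1
print(base[2])  # 99
```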

Boolean Indexing

Select elements with a boolean mask of the same shape; positions marked True are kept:

values = np.array([5, 15, 25, 35])
mask = np.array([True, False, True, False])
print(values[mask])  # Output: [ 5 25]

Vectorization and Broadcasting

Vectorization allows element-wise operations without explicit loops:

arr = np.array([[1, 2], [3, 4]])
print(arr + 10)
# Output:
# [[11 12]
#  [13 14]]

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(arr1 * arr2)  # Output: [ 4 10 18]
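To make the "without explicit loops" point concrete, this sketch (illustrative names) computes the same element-wise product with a Python loop and with a vectorized expression, then checks that the results match.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Explicit loop: multiply pairs one at a time in Python
looped = np.array([x * y for x, y in zip(a, b)])

# Vectorized: one expression, evaluated in compiled code
vectorized = a * b

print(np.array_equal(looped, vectorized))  # True
```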

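Broadcasting lets NumPy combine arrays of different shapes. Shapes are compared from the trailing dimension backward, and two dimensions are compatible when they are equal or one of them is 1; the smaller array is then virtually repeated to match the larger one: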

vec = np.array([1, 2, 3])
mat = np.array([[10, 20, 30], [40, 50, 60]])
print(vec + mat)
# Output:
# [[11 22 33]
#  [41 52 63]]
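Broadcasting works when each pair of trailing dimensions is either equal or contains a 1. The sketch below (illustrative names) combines a column of shape (2, 1) with a row of shape (3,), then shows a pair of shapes that cannot be broadcast.

```python
import numpy as np

col = np.array([[100], [200]])  # shape (2, 1)
row = np.array([1, 2, 3])       # shape (3,)

# The resulting shape can be checked without doing the arithmetic
print(np.broadcast_shapes(col.shape, row.shape))  # (2, 3)

total = col + row
print(total)
# [[101 102 103]
#  [201 202 203]]

# Shapes (2,) and (3,) are incompatible: neither dimension is 1
try:
    np.array([1, 2]) + row
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```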

Common Array Methods and Properties

Array Attributes

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.ndim)    # Dimensions: 2
print(arr.shape)   # Shape: (2, 3)
print(arr.size)    # Total elements: 6
print(arr.dtype)   # Data type: int64 on most platforms (int32 on some, e.g. Windows with NumPy 1.x)

Statistical Operations

arr = np.array([10, 20, 30, 40])

print(arr.max())    # Maximum: 40
print(arr.min())    # Minimum: 10
print(arr.mean())   # Mean: 25.0
print(arr.sum())    # Sum: 100
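On a 2D array these methods reduce over every element by default. Passing `axis` restricts the reduction to columns or rows, as this short sketch with an illustrative `table` array shows.

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

print(table.sum())          # 21 (all elements)
print(table.sum(axis=0))    # [5 7 9] (down each column)
print(table.mean(axis=1))   # [2. 5.] (across each row)
```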

Sorting

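np.sort() returns a sorted copy and leaves the original untouched, while the ndarray.sort() method sorts the array in place:

arr = np.array([30, 10, 40, 20])

# NumPy sort (returns a new array; arr is unchanged)
sorted_np = np.sort(arr)
print(sorted_np)  # Output: [10 20 30 40]

# Python built-in sorted() returns a list; convert first so the items are plain ints
sorted_py = sorted(arr.tolist(), reverse=True)
print(sorted_py)  # Output: [40, 30, 20, 10]

# In-place sort (modifies arr itself and returns None)
arr.sort()
print(arr)  # Output: [10 20 30 40]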

Conditional Filtering

Combine boolean indexing with comparison operators:

arr = np.array([5, 15, 25, 35])

# Single condition
print(arr[arr > 20])  # Output: [25 35]

# Multiple conditions
print(arr[(arr > 10) & (arr < 30)])  # Output: [15 25]
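Conditions can also be combined with `|` (or) and inverted with `~` (not); each comparison needs its own parentheses because these operators bind more tightly than `<` and `>`. The sketch below uses an illustrative `readings` array and also shows `np.where`, which replaces non-matching values instead of dropping them.

```python
import numpy as np

readings = np.array([5, 15, 25, 35])

# Either condition may hold
print(readings[(readings < 10) | (readings > 30)])  # [ 5 35]

# Negate a condition
print(readings[~(readings > 20)])                   # [ 5 15]

# Keep the array's shape: values failing the test become 0
print(np.where(readings > 20, readings, 0))         # [ 0  0 25 35]
```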

Transposition

Reverse the order of an array's axes with the .T attribute; for a 2D array this swaps rows and columns:

mat = np.array([[1, 2, 3], [4, 5, 6]])
print(mat.T)
# Output:
# [[1 4]
#  [2 5]
#  [3 6]]
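A common surprise is that `.T` has no visible effect on a 1D array, since there is only one axis to reverse. The sketch below (illustrative names) shows this, along with one way to get a column: adding an axis with `np.newaxis`.

```python
import numpy as np

flat = np.array([1, 2, 3])

# Transposing a 1D array leaves the shape unchanged
print(flat.T.shape)   # (3,)

# Add a second axis to obtain a column of shape (3, 1)
column = flat[:, np.newaxis]
print(column.shape)   # (3, 1)
```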
