How to Find Rows with More Than n Positive Elements in Python Using NumPy?

Introduction

When working with numerical data in Python, it’s very common to apply row-wise conditions to arrays. One frequent task is identifying rows that contain more than a given number of positive values.

This kind of operation appears often in:

  • Scientific computing
  • Remote sensing and geophysical data analysis
  • Feature engineering for machine learning
  • Quality control and filtering of large datasets

In this article, we’ll walk through a clean and efficient NumPy-based solution, explain how it works step by step, and discuss a few useful variations.

Problem Statement

Given a 2D NumPy array, we want to:

Find the indices of rows that contain more than n elements greater than 0

Example Data

Let’s start with a simple example array:

import numpy as np

# Example array
arr = np.array([
    [0,  2,  3, -1],
    [4, -5,  6,  0],
    [-2, -3, -4, -5],
    [7,  8,  9, 10]
])

# Define threshold
n = 2

Visually, this array looks like:

Row index   Values
0           0, 2, 3, -1
1           4, -5, 6, 0
2           -2, -3, -4, -5
3           7, 8, 9, 10

Step 1: Count Positive Elements Per Row

NumPy allows us to apply boolean conditions directly to arrays.

The expression arr > 0 returns a boolean array of the same shape:

arr > 0

Output

[[False  True  True False]
 [ True False  True False]
 [False False False False]
 [ True  True  True  True]]

When booleans are summed, NumPy treats them as integers:

  • True is treated as 1
  • False is treated as 0

So we can count how many values are positive in each row using np.sum along axis=1:

count_positive = np.sum(arr > 0, axis=1)
print(count_positive)

Output:

[2 2 0 4]

Each value corresponds to the number of positive elements in a row.
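An equivalent way to obtain the same counts is np.count_nonzero, which also accepts an axis argument. The following is a minimal sketch that reuses the example array; the name count_positive_alt is introduced here only for illustration.

```python
import numpy as np

arr = np.array([
    [0,  2,  3, -1],
    [4, -5,  6,  0],
    [-2, -3, -4, -5],
    [7,  8,  9, 10]
])

# Count the True entries of the boolean mask along each row
count_positive_alt = np.count_nonzero(arr > 0, axis=1)
print(count_positive_alt)  # [2 2 0 4]
```

Both forms give the same result; which one reads better is largely a matter of taste.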

Step 2: Select Rows with More Than n Positive Values

Now that we have a per-row count, filtering becomes straightforward:

rows = np.where(count_positive > n)[0]

  • count_positive > n creates a boolean mask
  • np.where(...) with a single argument returns a tuple of index arrays (one per dimension) where the condition is True
  • [0] extracts the array of row indices from that tuple

Finally, print the result:

print("Rows with more than", n, "positive elements:", rows)

Output:

Rows with more than 2 positive elements: [3]

Only row 3 contains more than 2 positive values. Rows 0 and 1 each contain exactly 2, so the strict inequality excludes them.

Full Working Example

import numpy as np

arr = np.array([
    [0, 2, 3, -1],
    [4, -5, 6, 0],
    [-2, -3, -4, -5],
    [7, 8, 9, 10]
])

n = 2

count_positive = np.sum(arr > 0, axis=1)
rows = np.where(count_positive > n)[0]

print("Rows with more than", n, "positive elements:", rows)
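If the same filter is needed in several places, the two steps can be wrapped in a small helper. The sketch below is one possible way to do it; the function name rows_with_more_than and its threshold parameter are hypothetical and not part of NumPy.

```python
import numpy as np

def rows_with_more_than(arr, n, threshold=0):
    """Return indices of rows with more than n elements above threshold.

    Hypothetical helper written for this article, not a NumPy function.
    """
    arr = np.asarray(arr)
    if arr.ndim != 2:
        raise ValueError("expected a 2D array")
    # Per-row count of elements strictly greater than the threshold
    counts = np.count_nonzero(arr > threshold, axis=1)
    # Indices of rows whose count strictly exceeds n
    return np.flatnonzero(counts > n)

example = np.array([
    [0, 2, 3, -1],
    [4, -5, 6, 0],
    [-2, -3, -4, -5],
    [7, 8, 9, 10]
])

print(rows_with_more_than(example, 2))  # [3]
print(rows_with_more_than(example, 1))  # [0 1 3]
```

The dimensionality check is a design choice: with a 1D input, axis=1 would raise a less descriptive error.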

Common Variations

1. Get the Rows Themselves (Not Just Indices)

selected_rows = arr[count_positive > n]
print(selected_rows)

Output:

[[ 7  8  9 10]]

2. Use >= n Instead of > n

If you want rows with at least n positive values:

rows = np.where(count_positive >= n)[0]
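To make the difference between the two comparisons concrete, here is a short sketch on the example array. The names strict and at_least are used only for this illustration.

```python
import numpy as np

arr = np.array([
    [0, 2, 3, -1],
    [4, -5, 6, 0],
    [-2, -3, -4, -5],
    [7, 8, 9, 10]
])
n = 2

count_positive = np.sum(arr > 0, axis=1)  # [2 2 0 4]

strict = np.where(count_positive > n)[0]     # more than n
at_least = np.where(count_positive >= n)[0]  # n or more

print(strict)    # [3]
print(at_least)  # [0 1 3]
```

Rows 0 and 1 have exactly 2 positive values, so they appear only in the >= result.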

3. Count Values Above Any Threshold

You’re not limited to > 0. For example, values greater than 5:

count_above_5 = np.sum(arr > 5, axis=1)
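As a quick check of what this produces on the example array, the sketch below prints the per-row counts and then applies the same n filter. The name rows_above_5 is introduced here for illustration.

```python
import numpy as np

arr = np.array([
    [0, 2, 3, -1],
    [4, -5, 6, 0],
    [-2, -3, -4, -5],
    [7, 8, 9, 10]
])
n = 2

# Number of values strictly greater than 5 in each row
count_above_5 = np.sum(arr > 5, axis=1)
print(count_above_5)  # [0 1 0 4]

# Rows with more than n values above 5
rows_above_5 = np.flatnonzero(count_above_5 > n)
print(rows_above_5)  # [3]
```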

4. Boolean Mask Only (No np.where)

If you prefer to avoid the tuple returned by np.where, keep the boolean mask and let np.flatnonzero return the indices directly:

mask = count_positive > n
rows = np.flatnonzero(mask)
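For a 1D mask such as this one, np.flatnonzero(mask) and np.where(mask)[0] should return the same indices. A brief sketch on the example array, with variable names chosen only for the comparison:

```python
import numpy as np

arr = np.array([
    [0, 2, 3, -1],
    [4, -5, 6, 0],
    [-2, -3, -4, -5],
    [7, 8, 9, 10]
])
n = 2

mask = np.sum(arr > 0, axis=1) > n  # 1D boolean mask, one entry per row

rows_flat = np.flatnonzero(mask)
rows_where = np.where(mask)[0]

print(rows_flat)                              # [3]
print(np.array_equal(rows_flat, rows_where))  # True
```

The mask itself is also useful on its own, since it can index the array directly (arr[mask]).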

Why This Approach Is Efficient

  • Fully vectorized (no Python loops)
  • Scales well to large arrays (the main temporary is a boolean mask of the same shape, one byte per element)
  • Easy to read and modify
  • Leverages NumPy’s optimized C backend

This makes it ideal for large scientific or satellite datasets where performance matters.
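To see what the vectorized version replaces, the sketch below compares it against an explicit Python loop on random data and checks that both give the same row indices. The array size, the seed, and the helper name rows_loop are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 50))  # arbitrary test data
n = 25

def rows_loop(a, n):
    """Pure-Python reference: same logic, one element at a time."""
    result = []
    for i, row in enumerate(a):
        count = 0
        for value in row:
            if value > 0:
                count += 1
        if count > n:
            result.append(i)
    return np.array(result, dtype=np.intp)

vectorized = np.flatnonzero(np.sum(data > 0, axis=1) > n)
looped = rows_loop(data, n)

print(np.array_equal(vectorized, looped))  # True
```

Timing both versions with the standard timeit module is a simple way to measure the difference on your own data.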

Example of Use: Identification of Spurious VIIRS Fire Detections

This row-wise thresholding approach can be applied to operational satellite fire products to identify spurious detections associated with sensor or processing anomalies. Such anomalies were observed during the early phase of the Suomi NPP Visible Infrared Imaging Radiometer Suite (VIIRS) mission and are documented in “Active fires from the Suomi NPP Visible Infrared Imaging Radiometer Suite: Product status and first evaluation results”.

Early-mission artifacts included elevated false-alarm rates resulting from calibration uncertainties, detector striping, scan geometry effects, and residual bow-tie contamination. These effects can lead to fire detections characterized by multiple concurrent anomalies rather than a single isolated indicator.

Methodological Framework

Assume each fire detection (or aggregated fire pixel/cluster) is represented by a vector of diagnostic indicators derived from the VIIRS Active Fire product and ancillary information, for example:

  • Fire Radiative Power (FRP) outliers
  • Brightness temperature residuals
  • Confidence or quality flags
  • Temporal persistence or spatial consistency metrics

Spurious detections are expected to exhibit simultaneous exceedances across multiple diagnostics. This behavior can be exploited by counting the number of indicators exceeding predefined thresholds on a per-detection basis.
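The thresholding step described above can be sketched with NumPy broadcasting. Everything in this example is an assumption made for illustration: the diagnostic values, the per-column thresholds, and the variable names are invented and do not come from the VIIRS product.

```python
import numpy as np

# Hypothetical raw diagnostics: one row per detection, one column per indicator.
# All numbers below are invented for illustration only.
diagnostics = np.array([
    [60.0, 3.1, 0.7, 1.0],
    [85.0, 9.4, 0.9, 6.0],
    [ 5.0, 1.2, 0.1, 0.0],
])

# Hypothetical thresholds, one per column
thresholds = np.array([50.0, 5.0, 0.5, 4.0])

# Broadcasting compares every column against its own threshold
exceeds = diagnostics > thresholds

# Binary indicator matrix (1 = threshold exceeded)
features = exceeds.astype(int)
print(features)
# [[1 0 1 0]
#  [1 1 1 1]
#  [0 0 0 0]]

# Number of exceedances per detection
anomaly_count = exceeds.sum(axis=1)
print(anomaly_count)  # [2 4 0]
```

Because the thresholds array has one entry per column, a single comparison handles indicators with different units and scales.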

Illustrative Example

import numpy as np

# Rows represent individual VIIRS fire detections
# Columns represent thresholded diagnostic indicators (1 = threshold exceeded)

features = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1]
])

n = 2  # a detection is flagged when MORE than n indicators are anomalous

anomaly_count = np.sum(features > 0, axis=1)
suspect_rows = np.where(anomaly_count > n)[0]

print("Potential spurious detections:", suspect_rows)

This procedure flags detections exhibiting more than n anomalous indicators, which can then be subjected to further analysis or excluded from downstream processing.
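Excluding the flagged detections from downstream processing can be done with the inverted mask. The following is a minimal sketch on the same illustrative indicator matrix; the names suspect_mask and retained are introduced here and are not part of any fire product.

```python
import numpy as np

features = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1]
])
n = 2

anomaly_count = np.sum(features > 0, axis=1)

# True for detections with more than n anomalous indicators
suspect_mask = anomaly_count > n
suspect_rows = np.flatnonzero(suspect_mask)

# Keep only the detections that were not flagged
retained = features[~suspect_mask]

print(suspect_rows)    # [3]
print(retained.shape)  # (3, 4)
```

Keeping the boolean mask around, rather than only the indices, makes it easy to apply the same selection to any other array that has one row per detection.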

References
