🔍 Testing an ML Model ≠ Testing Traditional Code

Wednesday, July 2, 2025

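Testing a machine learning (ML) model differs from testing traditional software in two key ways:

The output is probabilistic, not deterministic: a model is expected to be right most of the time, not on every input, so tests check thresholds rather than exact answers.
The behavior is learned from patterns in the data, not only from hand-written logic.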

To test an ML model effectively, you need a multi-layered strategy combining functional, data-driven, and performance-based testing.

✅ 1. Unit Testing the ML Pipeline (Code-Level)

🔍 What to Test:

Data preprocessing methods (normalization, encoding)
Feature extraction logic
Model loading and inference function

💡 Example:

# Check that the normalization function scales every value into the range 0–1
assert all(0 <= v <= 1 for v in normalize(raw_values))
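
A fuller version of this check might look like the sketch below. The post does not show the real normalize function, so the min-max implementation here is an assumption made only so the tests have something to run against. The second test covers a constant column, an edge case that often causes a division by zero.

```python
import math

def normalize(values):
    """Min-max scale a list of numbers into [0, 1] (hypothetical stand-in)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid division by zero and map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def test_normalize_range():
    scaled = normalize([3.0, 7.5, -2.0, 10.0])
    # Every value must land inside the target range
    assert all(0.0 <= s <= 1.0 for s in scaled)
    # The smallest and largest inputs must map to the range endpoints
    assert math.isclose(min(scaled), 0.0)
    assert math.isclose(max(scaled), 1.0)

def test_normalize_constant_input():
    assert normalize([5.0, 5.0, 5.0]) == [0.0, 0.0, 0.0]
```

Functions named test_* like these are picked up automatically by pytest, one of the tools listed in the summary table.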

✅ 2. Input/Output Testing (Functional Testing)

🔍 What to Test:

Given a known input, does the model return an expected class/label/output?
Test with edge-case inputs (nulls, NaNs, unexpected formats)

💡 Example:

# Input: image of a cat
# Expected Output: "cat" label
assert model.predict(cat_image) == "cat"
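
The edge-case bullet can be tested in the same style. The sketch below is only an illustration: StubModel and safe_predict are hypothetical names, and the stub stands in for a real trained model. It shows one possible policy, which is to reject null, NaN, or empty input with a clear error instead of returning a silent wrong prediction.

```python
import math

class StubModel:
    """Stand-in for a trained classifier (assumption for this example)."""
    def predict(self, features):
        return "cat" if features[0] > 0.5 else "dog"

def safe_predict(model, features):
    """Validate input before inference so bad data fails loudly."""
    if features is None or len(features) == 0:
        raise ValueError("features must be a non-empty sequence")
    for f in features:
        if f is None or (isinstance(f, float) and math.isnan(f)):
            raise ValueError("features contain null or NaN values")
    return model.predict(features)

def raises_value_error(fn, *args):
    """Return True if calling fn(*args) raises ValueError."""
    try:
        fn(*args)
    except ValueError:
        return True
    return False

model = StubModel()
assert safe_predict(model, [0.9, 0.1]) == "cat"               # known input
assert raises_value_error(safe_predict, model, None)          # null input
assert raises_value_error(safe_predict, model, [])            # empty input
assert raises_value_error(safe_predict, model, [float("nan")])  # NaN value
```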

✅ 3. Accuracy / Precision / Recall Testing (Performance Testing)

🔍 What to Test:

Use a test dataset with known labels
Compare predictions vs ground truth
Track metrics: accuracy, precision, recall, F1-score

💡 Example:

from sklearn.metrics import accuracy_score
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)  # y_test: ground-truth labels for X_test
assert accuracy > 0.90
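
Accuracy alone can look good on imbalanced data, which is why precision, recall, and F1 are tracked as well. The sketch below computes them by hand for a binary problem to show what each number means. The labels are made up for illustration, and binary_metrics is a hypothetical helper. In practice sklearn.metrics offers precision_score, recall_score, and f1_score for the same purpose.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard each ratio against an empty denominator
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels: 3 positives, 5 negatives
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here the model gets 6 of 8 samples right, but it finds only 2 of the 3 positives, a gap that the accuracy figure hides.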

✅ 4. Regression Testing

🔍 What to Test:

When retraining or updating a model, ensure performance doesn’t degrade
Compare the new and old models' predictions on the same test set

💡 Example:

assert new_model_accuracy >= old_model_accuracy - 0.01
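
Two models can have similar accuracy and still disagree on many individual samples. The sketch below compares predictions sample by sample, counting cases the new model fixed and cases it broke. The function names and the data are hypothetical and serve only to illustrate the comparison.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def regression_report(y_true, old_pred, new_pred):
    """Compare two models' predictions on the same labeled test set."""
    rows = list(zip(y_true, old_pred, new_pred))
    return {
        "old_accuracy": accuracy(y_true, old_pred),
        "new_accuracy": accuracy(y_true, new_pred),
        # Samples the old model got wrong and the new model gets right
        "fixed": sum(1 for t, o, n in rows if o != t and n == t),
        # Samples the old model got right and the new model gets wrong
        "broken": sum(1 for t, o, n in rows if o == t and n != t),
    }

# Made-up predictions from an old and a new model
y_true   = [1, 0, 1, 1, 0, 1]
old_pred = [1, 0, 0, 1, 0, 0]
new_pred = [1, 0, 1, 1, 1, 1]

report = regression_report(y_true, old_pred, new_pred)
# Same tolerance as the assertion above: allow at most a 0.01 drop
assert report["new_accuracy"] >= report["old_accuracy"] - 0.01
print(report)
```

The "broken" count is worth reviewing even when overall accuracy improves, because each entry is a behavior change that users of the old model may notice.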

✅ 5. Robustness Testing

🔍 What to Test:

Slightly modified inputs (noise, adversarial examples)
Input format changes, missing values

💡 Example:

import numpy as np
assert np.array_equal(model.predict(perturbed_input), model.predict(original_input))
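
A single perturbed input says little, so one option is to measure how often predictions survive small random noise across many inputs. The sketch below assumes a toy threshold classifier in place of a real model. The names predict and stability_rate, the noise level, and the sample inputs are all hypothetical.

```python
import random

def predict(features):
    """Toy threshold classifier standing in for a real model (assumption)."""
    return 1 if sum(features) / len(features) > 0.5 else 0

def stability_rate(inputs, noise_std=0.01, seed=0):
    """Fraction of inputs whose prediction is unchanged after Gaussian noise."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    unchanged = 0
    for x in inputs:
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        if predict(noisy) == predict(x):
            unchanged += 1
    return unchanged / len(inputs)

# Inputs chosen well away from the 0.5 decision boundary
inputs = [[0.9, 0.8], [0.1, 0.2], [0.7, 0.95], [0.05, 0.3]]
rate = stability_rate(inputs)
assert rate >= 0.95
```

Fixing the random seed matters here: a robustness test that fails only on some runs is hard to debug.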

✅ 6. Bias & Fairness Testing

🔍 What to Test:

Check model behavior across different demographic groups
Compare metrics such as accuracy per group and flag gaps above an agreed threshold

💡 Example:

# Accuracy across gender groups
assert abs(male_accuracy - female_accuracy) < 0.05
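
The per-group accuracies in the assertion above have to be computed first. The sketch below shows one way to do that for any number of groups. The helper names, the group labels "a" and "b", and the data are all made up for illustration. Libraries such as Fairlearn and AIF360 provide richer group metrics.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group label to its accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}

def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)

# Made-up labels, predictions, and group membership
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max_accuracy_gap(per_group)
print(per_group, gap)
```

On this data the gap is far above the 0.05 threshold used in the assertion above, so a fairness test built on it would fail and prompt a closer look at group "b".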

✅ 7. Integration & Deployment Testing

🔍 What to Test:

Can the model be packaged and loaded in the target environment?
Can it process input in production format (e.g., JSON via REST API)?
Is the latency acceptable?
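
💡 Example:

The sketch below illustrates the last two questions without a running server. The handle_request function is a hypothetical stand-in for the endpoint logic behind a REST API, and the inference line inside it is a placeholder. The JSON field names are assumptions too. A real test would send the same payloads over HTTP.

```python
import json
import math
import time

def handle_request(body):
    """Hypothetical endpoint logic: takes a JSON string, returns a JSON string."""
    try:
        features = json.loads(body)["features"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return json.dumps({"error": "expected a JSON object with a 'features' list"})
    if not isinstance(features, list) or not all(
        isinstance(f, (int, float)) for f in features
    ):
        return json.dumps({"error": "'features' must be a list of numbers"})
    # Placeholder for real model inference (assumption for this example)
    label = "cat" if sum(features) > 1.0 else "dog"
    return json.dumps({"label": label})

def p95_latency_seconds(body, runs=200):
    """Call the handler repeatedly and return the 95th-percentile latency."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        handle_request(body)
        timings.append(time.perf_counter() - start)
    timings.sort()
    return timings[math.ceil(0.95 * len(timings)) - 1]

ok = json.loads(handle_request('{"features": [0.9, 0.4]}'))
bad = json.loads(handle_request('{"wrong_key": 1}'))
latency = p95_latency_seconds('{"features": [0.9, 0.4]}')

assert ok == {"label": "cat"}   # valid payload produces a label
assert "error" in bad           # malformed payload produces an error, not a crash
print(f"p95 latency: {latency * 1000:.3f} ms")
```

A percentile is used instead of the mean because a few slow requests can hide behind a good average. The acceptable limit depends on the product, so no threshold is asserted here.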

🧠 Summary of ML Model Testing Layers

Layer	What to Test	Tool Examples
Unit Testing	Preprocessing, feature logic	pytest, unittest
Functional Testing	Input → Output checks	assert, Golden datasets
Performance Metrics	Accuracy, Precision, F1	sklearn.metrics
Regression Testing	New vs Old Model consistency	Custom scripts
Robustness	Noisy, missing, edge inputs	Fuzzing, noise injection
Bias/Fairness	Group-wise comparison	AIF360, Fairlearn, custom code
Integration	API, latency, resource usage	Postman, API tests, load tests

The complete script for testing the model can be found here

Tech Unpacked – Research & Fundamentals with Nitin Sharma
