Outlier Analysis Worker
Processes outlier detection jobs to identify statistical outliers in spatial data.
Overview
The outlier analysis worker identifies features with values that are statistically unusual using z-score or MAD (Median Absolute Deviation) methods.
Job Type
outlier_analysis
Input Parameters
{
"dataset_id": 123,
"value_field": "income",
"method": "zscore",
"threshold": 2.0
}
Parameters
- dataset_id (required): Source dataset ID
- value_field (required): Numeric field to analyze
- method (optional): "zscore" or "mad" (default: "zscore")
- threshold (optional): Z-score threshold or MAD multiplier (default: 2.0)
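The parameter rules above can be expressed as a small validation step. This is a minimal sketch, not the worker's actual code: the function name `validate_params` is hypothetical, and the requirement that the threshold be positive is an assumption the source does not state.

```python
ALLOWED_METHODS = {"zscore", "mad"}


def validate_params(params: dict) -> dict:
    """Return a normalized copy of the job parameters, or raise ValueError.

    Hypothetical helper: defaults follow the documented values
    (method "zscore", threshold 2.0).
    """
    # Both required parameters must be present.
    for key in ("dataset_id", "value_field"):
        if key not in params:
            raise ValueError(f"missing required parameter: {key}")

    method = params.get("method", "zscore")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {sorted(ALLOWED_METHODS)}")

    threshold = float(params.get("threshold", 2.0))
    # Assumption: a zero or negative threshold is treated as invalid.
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    return {
        "dataset_id": int(params["dataset_id"]),
        "value_field": str(params["value_field"]),
        "method": method,
        "threshold": threshold,
    }
```

Calling `validate_params({"dataset_id": 123, "value_field": "income"})` fills in the documented defaults for `method` and `threshold`.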
Output
Creates a new dataset with outlier analysis results:
- Original features marked as outliers
- Outlier score (z-score or MAD score)
- Outlier flag
- Original attributes preserved
Methods
Z-Score Method
Calculates standardized z-scores:
- Mean and standard deviation calculated
- Z-score = (value - mean) / standard_deviation
- Features with |z-score| > threshold are outliers
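The z-score steps above can be sketched in a few lines of Python. This is illustrative only: the function name `zscore_outliers` is hypothetical, the population standard deviation is assumed because the source does not say whether the population or sample form is used, and scoring every value as 0 when the standard deviation is 0 is likewise an assumption.

```python
import math


def zscore_outliers(values, threshold=2.0):
    """Return (value, z_score, is_outlier) for each input value."""
    n = len(values)
    if n == 0:
        return []

    # First pass: mean and (population) standard deviation.
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

    # Second pass: score each value and compare |z| with the threshold.
    results = []
    for v in values:
        # Assumption: constant data (std == 0) yields a score of 0 for all values.
        z = (v - mean) / std if std > 0 else 0.0
        results.append((v, z, abs(z) > threshold))
    return results


incomes = [10, 12, 11, 13, 12, 100]
flagged = [v for v, z, is_outlier in zscore_outliers(incomes) if is_outlier]
```

With this sample, only the value 100 exceeds the default threshold of 2.0. Note that a single extreme value inflates both the mean and the standard deviation, which limits how large its own z-score can become in a small dataset.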
MAD Method
Uses Median Absolute Deviation:
- Median calculated
- MAD = median(|value - median|)
- Modified z-score = 0.6745 * (value - median) / MAD
- Features with |modified z-score| > threshold are outliers
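A comparable sketch for the MAD steps above, using the standard library's `statistics.median`. The function name `mad_outliers` is hypothetical, and returning a score of 0 when the MAD is 0 (which happens when more than half the values equal the median) is an assumption; the source does not describe that case.

```python
from statistics import median


def mad_outliers(values, threshold=2.0):
    """Return (value, modified_z_score, is_outlier) for each input value."""
    if not values:
        return []

    med = median(values)
    # MAD = median of the absolute deviations from the median.
    mad = median([abs(v - med) for v in values])

    results = []
    for v in values:
        # Assumption: a MAD of 0 yields a score of 0 rather than a division error.
        score = 0.6745 * (v - med) / mad if mad > 0 else 0.0
        results.append((v, score, abs(score) > threshold))
    return results


incomes = [10, 12, 11, 13, 12, 100]
flagged = [v for v, score, is_outlier in mad_outliers(incomes) if is_outlier]
```

For this sample the median is 12 and the MAD is 1, so the value 100 receives a modified z-score of about 59.4 and is the only value flagged.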
Example
# Enqueue an outlier analysis job via API
curl -X POST "https://example.com/api/analysis/outlier_run.php" \
-H "Content-Type: application/json" \
-d '{
"dataset_id": 123,
"value_field": "income",
"method": "zscore",
"threshold": 2.0
}'
Background Jobs
This analysis runs as a background job. The worker:
- Fetches queued outlier_analysis jobs
- Validates input parameters
- Calculates statistics (mean/std or median/MAD)
- Identifies outliers
- Creates output dataset
- Marks job as completed
Performance Considerations
- Processing time depends on dataset size
- Z-score method requires two passes (mean/std, then scoring)
- MAD method is more robust: the median and MAD are less affected by extreme values than the mean and standard deviation
- Consider filtering null values before analysis
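Filtering null values before analysis might look like the following sketch. It assumes each feature's attributes are available as a plain dictionary, which the source does not specify, and the helper name `numeric_values` is hypothetical. Skipping non-numeric strings and non-finite numbers in addition to nulls is also an assumption.

```python
import math


def numeric_values(features, value_field):
    """Yield (index, value) for features whose field holds a finite number."""
    for i, feature in enumerate(features):
        raw = feature.get(value_field)
        # Skip missing or null values (and booleans, which are not measurements).
        if raw is None or isinstance(raw, bool):
            continue
        try:
            value = float(raw)
        except (TypeError, ValueError):
            # Skip values that cannot be interpreted as numbers.
            continue
        # Skip NaN and infinities, which would distort the statistics.
        if math.isfinite(value):
            yield i, value


features = [{"income": 10}, {"income": None}, {"income": "12.5"}, {"income": "n/a"}, {}]
usable = list(numeric_values(features, "income"))
```

Keeping the original index alongside each value makes it possible to attach scores back to the right features after the statistics are computed.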