Cloud Masking: Why a Binary Choice Breaks a Continuous Problem
2025-08-09 · 8 min read · Cloud Masking · Cloud Shadows · Sentinel-2 · Sen2Cor

TL;DR: Clouds and haze form a continuum. Hard thresholds flip pixels off or on and create gaps or contamination. Good cloud masking uses probabilities, cloud type awareness, and geometry. Pair masking with multi-date strategies when you must keep a clear nominal date.
The spectrum problem
Real scenes rarely split into only cloud and no cloud. You will meet thin cirrus, low stratus and fog, broken cumulus fields, stratocumulus decks, bright snow and ice, salt flats, bright sand, sunglint on water, smoke and dust, and adjacency effects near cloud edges. Most modern detectors output a probability per pixel for cloud and shadow. The moment you select a threshold, you inherit a trade-off: coverage versus contamination.
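The trade-off is easy to see on a toy example. The sketch below assumes you have a per-pixel cloud probability and some reference labels to compare against; the values, the `sweep_thresholds` name, and the two metrics are illustrative assumptions, not part of any particular detector.

```python
def sweep_thresholds(cloud_prob, truth_cloud, thresholds):
    """For each threshold, report coverage and contamination.

    coverage      = fraction of all pixels kept (probability below threshold)
    contamination = fraction of kept pixels that are actually cloudy
    """
    n = len(cloud_prob)
    rows = []
    for t in thresholds:
        kept = [i for i in range(n) if cloud_prob[i] < t]
        coverage = len(kept) / n
        dirty = sum(1 for i in kept if truth_cloud[i])
        contamination = dirty / len(kept) if kept else 0.0
        rows.append((t, coverage, contamination))
    return rows


# Hypothetical cloud probabilities and reference labels for eight pixels
cloud_prob = [0.05, 0.10, 0.20, 0.35, 0.45, 0.55, 0.70, 0.90]
truth_cloud = [False, False, False, True, False, True, True, True]

results = sweep_thresholds(cloud_prob, truth_cloud, [0.3, 0.5, 0.8])
for t, coverage, contamination in results:
    print(f"threshold={t:.1f} coverage={coverage:.2f} contamination={contamination:.2f}")
```

Raising the threshold keeps more pixels, and in this toy data it also lets more cloudy pixels through. No single value gives both high coverage and low contamination.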
Cloud shadows are another source of ambiguity. They are low signal, not zero. Over dark water and evergreen forests, shadows are easily confused with natural variability unless you account for solar geometry, view angle, terrain, and cloud height.
Cloud types that break simple masks
Deep convective cumulus and cumulonimbus. Bright tops with strong SWIR absorption and sharp texture. Large shadows with long displacement that depends on solar zenith and cloud-top height. Parallax over high clouds can misplace shadow projections if height is wrong.
Stratocumulus decks. Textured sheets with semi-transparent breaks. Edge pixels and subpixel gaps cause adjacency brightening and partial transmission. Conservative dilation removes fringes but erases good pixels. Permissive thresholds leave bright edges that bias reflectance.
Thin cirrus and subvisible cirrus. High altitude ice clouds transmit light and reduce contrast while leaving apparent structure below. On Sentinel-2, the 1.38 µm cirrus band (Band 10) helps, but performance varies with column water vapor and surface altitude. Cirrus can survive strict blue or SWIR tests and still depress indices.
Low stratus and fog. Spectrally similar to bright surfaces, especially over snow, salt pans, and beaches. Thermal bands help on Landsat. Sentinel-2 has no thermal, so reliance shifts to blue, coastal aerosol, SWIR, and texture tests plus terrain context.
Broken cumulus over bright deserts. Bright sand and dry lake beds exceed the brightness thresholds that cloud tests rely on. Masks confuse bright land with cloud unless SWIR absorption and texture cues are strong. Dilation near edges can remove valid bright pixels.
Smoke and dust. Aerosol plumes vary from optically thin to opaque. They change blue and red reflectance, reduce contrast, and introduce wavelength-dependent scattering. Smoke plumes over dark water can be mistaken for thin cloud, while dust over bright land can pass as cloud edge.
Snow and ice. High albedo with distinct spectral behavior. NDSI separates snow from cloud, but its thresholds are scene-dependent, and terrain-cast shadows complicate both classes. In patchy snow the adjacency effect and BRDF changes cause mislabels.
Water with sunglint and foam. Specular reflection and wave foam create bright streaks that trip cloud tests. Glint strength depends on wind, solar geometry, and sensor view angle. Standard deglinting helps, but bright water near cloud edges is still risky.
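For the snow and ice case, NDSI is the normalized difference of green and SWIR (around 1.6 µm) reflectance. The sketch below uses hypothetical reflectance values, and the 0.4 cutoff is a common starting point rather than a universal constant. As noted above, thresholds are scene-dependent.

```python
def ndsi(green, swir1):
    """Normalized Difference Snow Index from green and SWIR (~1.6 µm) reflectance."""
    total = green + swir1
    if total == 0:
        return 0.0  # avoid division by zero on empty or no-data pixels
    return (green - swir1) / total


# Hypothetical reflectances: snow is bright in green and dark in SWIR,
# while thick cloud stays bright in both bands.
snow_value = ndsi(green=0.80, swir1=0.10)
cloud_value = ndsi(green=0.70, swir1=0.50)

NDSI_SNOW_CUTOFF = 0.4  # assumed starting value; tune per scene
print(f"snow NDSI={snow_value:.2f}, cloud NDSI={cloud_value:.2f}")
print("snow flagged:", snow_value > NDSI_SNOW_CUTOFF)
print("cloud flagged as snow:", cloud_value > NDSI_SNOW_CUTOFF)
```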
Cloud shadows in practice
Shadow detection is geometry first. Project each cloud object using solar azimuth and zenith, sensor geometry, and an estimate of cloud-top height. Over mountains you must include a DEM to avoid putting shadows on the wrong slope. Typical errors:
- Height underestimation: projects shadows too short and misses the true dark area.
- Height overestimation: projects shadows beyond the dark patch and flags valid pixels.
- Terrain and parallax: high relief and oblique view angles shift apparent cloud location and elongate shadows unpredictably.
Mitigations include bracketing the cloud-top height between a minimum and a maximum, searching along the solar vector between those limits, and spectral checks that look for low NIR and low green reflectance with surface texture still visible in SWIR. On water, look for low NIR with low red and low SWIR to avoid classifying turbid water as shadow.
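The height-bracket idea can be sketched with basic trigonometry. The example below assumes flat terrain and a nadir view, so it ignores the DEM and parallax effects discussed above. The function names and the 500 m to 4000 m bracket are illustrative assumptions.

```python
import math


def shadow_offset(cloud_height_m, sun_zenith_deg, sun_azimuth_deg):
    """Ground offset (east_m, north_m) from a cloud to its shadow.

    Assumes flat terrain and a nadir view (no DEM, no sensor parallax).
    Azimuth is measured clockwise from north; the shadow falls away from the sun.
    """
    distance = cloud_height_m * math.tan(math.radians(sun_zenith_deg))
    away = math.radians(sun_azimuth_deg + 180.0)
    return distance * math.sin(away), distance * math.cos(away)


def shadow_search_segment(height_min_m, height_max_m, sun_zenith_deg, sun_azimuth_deg):
    """End points of the line along the solar vector where the shadow may fall."""
    near = shadow_offset(height_min_m, sun_zenith_deg, sun_azimuth_deg)
    far = shadow_offset(height_max_m, sun_zenith_deg, sun_azimuth_deg)
    return near, far


# Hypothetical case: sun in the south-east (azimuth 135°), zenith 45°,
# cloud-top height bracketed between 500 m and 4000 m.
near, far = shadow_search_segment(500.0, 4000.0, 45.0, 135.0)
print(f"search from ({near[0]:.0f} m E, {near[1]:.0f} m N) "
      f"to ({far[0]:.0f} m E, {far[1]:.0f} m N)")
```

Dark pixels found along that segment are shadow candidates. The spectral checks then decide whether a candidate is shadow or just dark surface.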
Bright-surface and look-alike traps
- Deserts, beaches, gypsum, salt pans: high reflectance and low texture fool cloud tests. SWIR absorption and edge gradients help.
- Urban roofs and concrete: bright, spectrally flat patches resemble optically thick cloud in visible bands.
- Snow on conifer forest: mixed pixels toggle between snow and shadow across dates and create flicker in time series.
- Glint and whitecaps: glint masks reduce some false positives but do not fix bright water at cloud edges.
Algorithms and what they rely on
- Fmask and CFMask (Landsat): spectral tests including thermal, morphological dilation, and shadow projection. They work well when thermal data is available but remain scene-dependent.
- Sentinel-2 Sen2Cor SCL: scene classification with cloud, cirrus, shadow, snow, and vegetation classes. Performance depends on atmosphere and surface type.
- s2cloudless (probability): a gradient-boosted tree model (LightGBM) that outputs a per-pixel cloud probability from multi-band features. Requires a threshold and often some post-processing.
All three use a threshold at some point. The strictness of that threshold controls gaps versus contamination.
Thresholds in practice
Pipelines tend to pick one of three stances:
- Conservative threshold: removes thin cloud, edges, and much haze. Quality is high, coverage is low. Time series become gappy, especially in wet seasons.
- Permissive threshold: keeps coverage high but admits thin cloud and veiling haze. Indices drop, textures blur, and small changes disappear.
- Scene-adaptive threshold: adjusts to illumination, surface type, and atmosphere. Results are better, but tuning must be stable across sensors and seasons. Dilation to remove fringes increases gaps unless carefully limited.
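The cost of dilation is easy to quantify. The sketch below grows a binary cloud mask by a fixed pixel radius and reports how much clear area survives. It uses plain Python lists and a square neighbourhood to stay dependency-free; the helper names and the 7x7 scene are hypothetical.

```python
def dilate(mask, radius):
    """Grow True cells of a 2D boolean mask by `radius` pixels (square neighbourhood)."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Mark every cell within `radius` of this cloudy cell, clipped to the grid
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    out[rr][cc] = True
    return out


def cloudy_count(mask):
    """Number of masked (True) pixels."""
    return sum(sum(row) for row in mask)


def clear_fraction(mask):
    """Fraction of pixels left unmasked."""
    total = sum(len(row) for row in mask)
    return 1 - cloudy_count(mask) / total


# Hypothetical 7x7 scene with a single cloudy pixel in the centre
mask = [[False] * 7 for _ in range(7)]
mask[3][3] = True

for radius in (0, 1, 2):
    grown = dilate(mask, radius)
    print(f"radius={radius} masked={cloudy_count(grown)} clear={clear_fraction(grown):.2f}")
```

With a square neighbourhood, the masked area around an isolated cloud pixel grows as (2r + 1)², so even a small buffer removes many clear pixels in broken cloud fields.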
Haze and aerosols are not cloud
Urban haze and pollution. Fine-mode aerosols reduce contrast in blue and green and bias vegetation and water indices downward. Pixels are often still usable if you estimate aerosol optical thickness or use dehazing and harmonization.
Smoke. Spectrally complex with absorbing and non-absorbing phases. Over bright land, smoke can look like thin cloud. Over water, smoke reduces NIR and red but keeps some texture. Mislabeling smoke as cloud creates holes you do not want.
Dust. Coarse-mode scattering increases red and SWIR brightness and changes color ratios. Deserts with dust and thin cloud combined are the hardest cases for fixed thresholds.
The right question is not mask or keep. It is how much weight to give this pixel today, and whether you can supplement it with a nearby clean observation.
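One way to express "how much weight" is to blend nearby observations by their clear-sky confidence. The sketch below weights each observation by one minus its cloud probability. This is an illustrative scheme under simple assumptions, not a description of any specific product's method, and the reflectance values are made up.

```python
def weighted_composite(observations):
    """Blend reflectance from several dates, weighting each by (1 - cloud probability).

    observations: list of (reflectance, cloud_probability) for one pixel.
    Returns (value, total_weight); value is None when no usable weight remains.
    """
    total_weight = sum(1.0 - prob for _, prob in observations)
    if total_weight <= 0:
        return None, 0.0
    value = sum(refl * (1.0 - prob) for refl, prob in observations) / total_weight
    return value, total_weight


# Hypothetical pixel: one bright, probably cloudy observation and two cleaner ones
observations = [(0.30, 0.9), (0.12, 0.1), (0.14, 0.2)]
value, total_weight = weighted_composite(observations)
print(f"composite={value:.3f} total_weight={total_weight:.1f}")
```

The likely-cloudy observation still contributes, but only slightly. The total weight doubles as a per-pixel confidence that downstream users can threshold themselves.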
Temporal consistency and cross-date clues
Even simple temporal filters help. If a pixel is labeled cloud on a single date but is confidently clear on the dates immediately before and after, treat the label as suspect: favor the clear state and downweight the cloudy label. Edge pixels that toggle on and off across dates are a sign to reduce dilation radius or switch to adaptive dilation. Parallax and fast cloud motion cause misalignment, so co-register images carefully before comparing dates.
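A minimal version of such a filter, for one pixel's time series of cloud probabilities, might look like the sketch below. The thresholds and the damping factor are assumed values chosen for illustration. The filter only reduces the cloud probability; it never overrides it outright.

```python
def soften_isolated_cloud(probs, cloud_t=0.5, clear_t=0.2, damping=0.5):
    """Downweight cloud probabilities that are isolated in time.

    A date counts as isolated when its probability is at least `cloud_t`
    while both neighbouring dates are at or below `clear_t`.
    The first and last dates have only one neighbour and are left unchanged.
    """
    out = list(probs)
    for i in range(1, len(probs) - 1):
        isolated = (
            probs[i] >= cloud_t
            and probs[i - 1] <= clear_t
            and probs[i + 1] <= clear_t
        )
        if isolated:
            out[i] = probs[i] * damping
    return out


# Hypothetical series: one isolated spike (index 1) and one two-date cloudy spell (3 and 4)
series = [0.05, 0.60, 0.10, 0.80, 0.85, 0.10]
filtered = soften_isolated_cloud(series)
print(filtered)
```

The isolated spike is damped, while the two-date cloudy spell is left alone because each of its dates has a cloudy neighbour.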
When a binary mask still helps
Binary outputs are useful when you must hide obvious cloud and shadow in viewers, meet regulatory schemas that require a cloud bit, or create seasonal basemaps where long windows absorb gaps. If you publish a binary bit, keep the underlying probabilities available for analysts who need to revisit decisions.
ClearSKY in practice
We are satellite agnostic. We pull from multiple constellations, often hundreds of passes, and let the freshest observations speak louder. SAR carries structure through cloud, optical adds color and indices, and the output is tied to a single date.
For cloud seasons we prioritise same-day observations and then pull from nearby prior days when weather blocks optical coverage. SAR adds structure under cloud and reduces the need for aggressive thresholds. This approach keeps a clear nominal date while avoiding long seasonal windows.
Quick view: cloud type vs masking pitfalls
Cloud or condition | What breaks masks | Typical false positives | Useful cues to fix |
---|---|---|---|
Deep cumulus and cumulonimbus | Long displaced shadows, parallax, bright tops | Dark water or forest labeled as shadow | Shadow geometry with height brackets, SWIR absorption tests, DEM-aware projection |
Stratocumulus decks | Edge pixels, subpixel gaps, adjacency brightening | Bright sand or urban roofs at edges | Moderate dilation, texture and gradient checks, adaptive thresholds |
Thin cirrus | Partial transmission with low contrast | Haze or smoke over water | 1.38 µm cirrus band on S2, multi-date consistency, spectral slope tests |
Low stratus and fog | Spectral confusion with bright surfaces | Snow, salt flats, beaches | Thermal on Landsat, SWIR and coastal aerosol on S2, terrain context |
Smoke and dust | Variable spectra and texture | Thin cloud classification over land and water | Blue-to-SWIR ratios, temporal persistence, plume morphology |
Snow and ice | High albedo with complex terrain shadows | Cloud over mountains, cloud edges over snow | NDSI, terrain shadow modeling, BRDF-aware thresholds |
Water with sunglint | Specular streaks and whitecaps | Cloud over oceans and lakes | Glint modeling, NIR and SWIR checks, wind and geometry awareness |
Quick view: approaches
Approach | Coverage | Risk | Good for |
---|---|---|---|
Conservative mask | Low | Few artifacts, many gaps | Strict regulatory pipelines, small areas |
Permissive mask | High | Haze and edge contamination | Visual basemaps, rapid screening |
Scene-adaptive mask | Medium to high | Tuning drift across scenes | Regional programs, mixed surfaces |
Probability plus temporal filter | High | Extra bookkeeping: probabilities and per-date history must be stored | Operational monitoring, change detection |
Even with strong algorithms, most hard failures happen at cloud edges, over bright land and glint, and under thin cirrus. Recognising these patterns and treating cloud masking as a probabilistic decision rather than an on-off switch is the difference between a clean time series and a flickering one.