Repeatable (retry always fails) HTTP 500 errors for certain dates

I am running large numbers of WCS requests through the eo-learn library. I request LAI time-series data for land parcels of 0.1-10 hectares, over roughly 6-month time intervals.
Occasionally (ca. 1% of the time), the eo-learn workflow.execute() call fails with an HTTP error:

"DownloadFailedException: During execution of task S2L2AWCSInput: Failed to download from:
https://services.sentinel-hub.com/ogc/wcs/?SERVICE=wcs&BBOX=660977.0343447%2C5689713.093569177%2C661295.2431077849%2C5690002.2208702285&FORMAT=image%2Ftiff%3Bdepth%3D32f&CRS=EPSG%3A32631&RESX=10m&RESY=10m&COVERAGE=LAI&REQUEST=GetCoverage&VERSION=1.1.2&TIME=2018-06-05T10%3A45%3A24%2F2018-06-05T10%3A55%3A24&MAXCC=90.0&ShowLogo=False&Transparent=True
with HTTPError:
500 Server Error: Internal Server Error for url: https://services.sentinel-hub.com/ogc/wcs/?SERVICE=wcs&BBOX=660977.0343447%2C5689713.093569177%2C661295.2431077849%2C5690002.2208702285&FORMAT=image%2Ftiff%3Bdepth%3D32f&CRS=EPSG%3A32631&RESX=10m&RESY=10m&COVERAGE=LAI&REQUEST=GetCoverage&VERSION=1.1.2&TIME=2018-06-05T10%3A45%3A24%2F2018-06-05T10%3A55%3A24&MAXCC=90.0&ShowLogo=False&Transparent=True
Server response: 'Something went wrong!'"

This error always seems to relate to the request for one specific date in the range, and repeating the request always fails in the same way (i.e. it is not a temporary server error).
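To confirm the failure is persistent rather than transient, a small retry check like the following can be used (a minimal sketch; `is_persistent_failure` and the retry counts are illustrative, not part of eo-learn — `fetch` is any zero-argument callable that raises on HTTP error, e.g. a wrapper around `requests.get(url).raise_for_status()`):

```python
import time

def is_persistent_failure(fetch, attempts=3, delay=2.0):
    """Call `fetch` up to `attempts` times; return True only if every call raised.

    A transient 500 would succeed on some retry; the error described here
    fails on every attempt.
    """
    failures = 0
    for attempt in range(attempts):
        try:
            fetch()
            return False  # at least one attempt succeeded
        except Exception:
            failures += 1
            if attempt < attempts - 1:
                time.sleep(delay)  # back off before retrying
    return failures == attempts
```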

The following code recreates the error:
import datetime
from eolearn.core import LinearWorkflow, FeatureType, LoadFromDisk
from eolearn.io import S2L2AWCSInput, AddSen2CorClassificationFeature, S1IWWCSInput, S2L1CWCSInput
from sentinelhub import BBox, CRS

in_id = '#####'  # Sentinel Hub instance ID (redacted)

add_DAT = S2L2AWCSInput(
    layer='LAI',
    feature=(FeatureType.DATA, 'BANDS_S2'),  # save under name 'BANDS_S2'
    resx='10m',  # resolution x
    resy='10m',  # resolution y
    maxcc=0.9,  # maximum allowed cloud cover of original ESA tiles
    instance_id=in_id,
    time_difference=datetime.timedelta(minutes=5)
)

add_scl = AddSen2CorClassificationFeature('SCL', layer='SCL', instance_id=in_id)

workflow_S2 = LinearWorkflow(
    add_DAT,  # the L2A standard data import
    # add_S1,
    add_scl   # adds the Sen2Cor scene classification (SCL) layer
)

extra_param = {
    add_DAT: {
        'bbox': BBox(((660977.0343447, 5689713.093569177),
                      (661295.2431077849, 5690002.2208702285)), crs=CRS(32631)),
        'time_interval': ('2018-05-01', '2018-10-15')
    }
}

result = workflow_S2.execute(extra_param)

I have noticed that the sentinelhub library internally creates a list of requests, one for each available image in the time interval, then runs each of these requests. For example, it creates these two requests, the first of which works and the second of which doesn't:
https://services.sentinel-hub.com/ogc/wcs/?SERVICE=wcs&BBOX=660977.0343447%2C5689713.093569177%2C661295.2431077849%2C5690002.2208702285&FORMAT=image%2Ftiff%3Bdepth%3D32f&CRS=EPSG%3A32631&RESX=10m&RESY=10m&COVERAGE=LAI&REQUEST=GetCoverage&VERSION=1.1.2&TIME=2018-05-31T10%3A47%3A56%2F2018-05-31T10%3A57%3A56&MAXCC=90.0&ShowLogo=False&Transparent=True
https://services.sentinel-hub.com/ogc/wcs/?SERVICE=wcs&BBOX=660977.0343447%2C5689713.093569177%2C661295.2431077849%2C5690002.2208702285&FORMAT=image%2Ftiff%3Bdepth%3D32f&CRS=EPSG%3A32631&RESX=10m&RESY=10m&COVERAGE=LAI&REQUEST=GetCoverage&VERSION=1.1.2&TIME=2018-06-05T10%3A45%3A24%2F2018-06-05T10%3A55%3A24&MAXCC=90.0&ShowLogo=False&Transparent=True
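Since each per-image request encodes its acquisition window in the TIME query parameter, the failing date can be recovered from the URL with the standard library alone (a sketch, assuming the URL format shown above; `failing_time_window` is an illustrative helper, not part of sentinelhub):

```python
from urllib.parse import urlparse, parse_qs

def failing_time_window(url):
    """Extract the (start, end) acquisition window from a WCS GetCoverage URL."""
    query = parse_qs(urlparse(url).query)  # decodes the percent-encoding
    start, end = query["TIME"][0].split("/")
    return start, end

url = ("https://services.sentinel-hub.com/ogc/wcs/?SERVICE=wcs"
       "&TIME=2018-06-05T10%3A45%3A24%2F2018-06-05T10%3A55%3A24&MAXCC=90.0")
start, end = failing_time_window(url)
# start is "2018-06-05T10:45:24", end is "2018-06-05T10:55:24"
```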

Because the failure of a request for a single date raises an error inside the sentinelhub library, which in turn causes the entire eo-learn time-series request to fail, I cannot simply catch the problem with exception handling in my code (which I already have for other reasons). The only option would be to extract the date of the failed sub-request and build two EO-patches covering the dates on either side of it, to be merged later, which seems excessive.
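For reference, the split-and-merge fallback would look roughly like this: given the failing acquisition date, build two sub-intervals that exclude it (a sketch; `split_interval` and the one-day margin are illustrative, not part of eo-learn):

```python
import datetime

def split_interval(time_interval, failing_date, margin_days=1):
    """Split a (start, end) interval into two sub-intervals that skip `failing_date`.

    Dates are 'YYYY-MM-DD' strings, matching eo-learn's time_interval parameter.
    Each sub-interval could then feed its own EOPatch request, merged afterwards.
    """
    fmt = "%Y-%m-%d"
    fail = datetime.datetime.strptime(failing_date, fmt)
    margin = datetime.timedelta(days=margin_days)
    before = (time_interval[0], (fail - margin).strftime(fmt))
    after = ((fail + margin).strftime(fmt), time_interval[1])
    return before, after

before, after = split_interval(("2018-05-01", "2018-10-15"), "2018-06-05")
# before is ("2018-05-01", "2018-06-04"), after is ("2018-06-06", "2018-10-15")
```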

Could you please look into this and let me know whether there is anything I can do on my side?

Cheers

Sam

This is a new error on the side of the Sentinel Hub service…
We will investigate it and come back to you.

Just playing around a bit myself: the request in my example works if I change the cloud coverage parameter from 90 to 100. Could there be an inconsistency in how the cloud coverage is handled?
That is, when the request URL list is built, it sometimes includes requests for scenes that are actually cloudier than the given threshold, and when those requests are then made, they fail because the cloud threshold is not met.

As a workaround, I just tested my eo-learn script with maxcc=1.0, and the entire time-series request now works, at the cost of downloading even completely cloudy scenes.
I will use this workaround in my larger workflow for now, but it would be better to solve the underlying issue so I can use the maximum cloud cover parameter again…
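For anyone comparing the script against the request URLs: eo-learn's maxcc is a 0-1 fraction, while the URL carries MAXCC as a percentage. This mapping is an inference from the values shown above (maxcc=0.9 in the code vs MAXCC=90.0 in the URL), not taken from the library's source:

```python
def maxcc_url_value(maxcc):
    """Map a maxcc fraction (0-1) to the MAXCC percentage string seen in the WCS URL."""
    return f"{maxcc * 100:.1f}"

maxcc_url_value(0.9)  # the failing setting -> "90.0"
maxcc_url_value(1.0)  # the workaround: accept fully cloudy scenes -> "100.0"
```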

Cheers
Sam

Ah, so that workaround fixed the original example, but it doesn't solve the problem in general.
The following example still fails:
https://services.sentinel-hub.com/ogc/wcs/#####?SERVICE=wcs&BBOX=567124.6659711839%2C5719245.849226062%2C567328.7468511594%2C5719707.203095406&FORMAT=image%2Ftiff%3Bdepth%3D32f&CRS=EPSG%3A32631&RESX=10m&RESY=10m&COVERAGE=LAI&REQUEST=GetCoverage&VERSION=1.1.2&TIME=2018-06-13T10%3A52%3A37%2F2018-06-13T11%3A02%3A37&MAXCC=100.0&ShowLogo=False&Transparent=True

We have deployed a fix so these things should no longer happen.
I would appreciate if you give it a try.

I've just started reprocessing the failed requests: no problems after 20, so it seems to be working (with the cloud cover threshold back to 90%).

Thanks very much