Statistical API 504 Gateway Time-out

vitsyrovat · March 8, 2022, 3:42pm

Hi guys,
I am trying to use Statistical API to get a time series of index values for a single polygon. I need to get values for all available dates since 2018/01/01 till the most recent.

Sometimes it works, but usually no data are returned and the request times out after 5 minutes.
When trying the same for a shorter period (one month), data are usually returned.

I am using the sentinelhub SDK, but I see the same behavior on the Request Builder page.

Is Statistical API suitable for such a “long” time-series?
Or is there a workaround to get the long time-series (all days over several years)?

Many thanks for any advice!
Vit

Here is a sample request from the request builder:

curl -X POST https://services.sentinel-hub.com/api/v1/statistics \
 -H 'Content-Type: application/json' \
 -H 'Authorization: Bearer <token>' \
 -d '{
  "input": {
    "bounds": {
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              12.976141,
              42.203122
            ],
            [
              12.976484,
              42.201087
            ],
            [
              12.978758,
              42.199975
            ],
            [
              12.980474,
              42.200992
            ],
            [
              12.978715,
              42.202804
            ],
            [
              12.978844,
              42.203535
            ],
            [
              12.976141,
              42.203122
            ]
          ]
        ]
      }
    },
    "data": [
      {
        "dataFilter": {},
        "type": "sentinel-2-l2a"
      }
    ]
  },
  "aggregation": {
    "timeRange": {
      "from": "2018-01-01T00:00:00Z",
      "to": "2022-03-08T23:59:59Z"
    },
    "aggregationInterval": {
      "of": "P1D"
    },
    "width": 10,
    "height": 11.091,
    "evalscript": "//VERSION=3\nfunction setup() {\n  return {\n    input: [{\n      bands: [\n        \"B04\",\n        \"B08\",\n        \"SCL\",\n        \"dataMask\"\n      ]\n    }],\n    output: [\n      {\n        id: \"data\",\n        bands: 3\n      },\n      {\n        id: \"scl\",\n        sampleType: \"INT8\",\n        bands: 1\n      },\n      {\n        id: \"dataMask\",\n        bands: 1\n      }]\n  };\n}\n\nfunction evaluatePixel(samples) {\n    let index = (samples.B08 - samples.B04) / (samples.B08+samples.B04);\n    return {\n        data: [index, samples.B08, samples.B04],\n        dataMask: [samples.dataMask],\n        scl: [samples.SCL]\n    };\n}\n"
  },
  "calculations": {
    "default": {}
  }
}'

batic · March 8, 2022, 3:45pm

It would be best to split this into chunks of about a year’s worth of data. In principle such long time-series should work, but it depends on the area (size of the geometry).

Best,
Matej

vitsyrovat · March 8, 2022, 4:10pm

Dear Matej, Thanks for your quick reply!

The area of our polygons is usually a few ha, but may be up to several hundred. However, I also tested really small ones (less than 0.1 ha), and they time out as well.

Is there a rule of thumb for setting the intervals (a year or shorter), perhaps depending on the polygon area, when splitting into chunks?

All the best!
Vit

batic · March 9, 2022, 7:22am

Hi @vitsyrovat

The timeouts can happen because too much data has to be read in order to fulfil your request. That can happen for several reasons:

- too many observations in the time range (could be due to too long a time range, too dense a time series /e.g. daily observations/, being on the intersection of several S-2 tiles, …)
- too many bands requested
- too large an area requested

The last two would just “tip the scale” if the requested time range is already at the limit. So splitting into 1-year chunks is good practice in any case. Beyond that, unfortunately, it is at the moment a bit of a “try and see if it works” even for us.

vitsyrovat · March 9, 2022, 7:49am

Hi Matej,

Many thanks!

All the best,
Vit

vitsyrovat · March 29, 2022, 5:45pm

Following on the discussion above.
We implemented splitting the period into chunks of 180 days and tested it. A Statistical API request is sent asynchronously for each chunk.
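For readers who want to reproduce the chunking, here is a minimal sketch. It assumes inclusive day boundaries and the `from`/`to` timestamp format of the sample request earlier in the thread. The names `split_time_range` and `to_time_range` are made up for illustration and are not part of the sentinelhub SDK.

```python
from datetime import date, timedelta


def split_time_range(start, end, chunk_days=180):
    """Split [start, end] into consecutive inclusive chunks of at most chunk_days days."""
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        # The last chunk is cut short so that it never runs past `end`.
        chunk_end = min(chunk_start + timedelta(days=chunk_days - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return chunks


def to_time_range(chunk):
    """Format one chunk like the "timeRange" object of the sample request (assumed format)."""
    first, last = chunk
    return {
        "from": f"{first.isoformat()}T00:00:00Z",
        "to": f"{last.isoformat()}T23:59:59Z",
    }


chunks = split_time_range(date(2018, 1, 1), date(2022, 4, 3))
time_ranges = [to_time_range(chunk) for chunk in chunks]
```

Each entry of `time_ranges` would replace `aggregation.timeRange` in an otherwise unchanged request body, and the per-chunk responses then need to be concatenated on the client side.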
Still, some of the requests time out, and compared to FIS (from which we want to migrate due to its deprecation) the requests are very slow. They take from 80 to more than one thousand seconds, while an FIS request for the same data takes around 6 seconds.
We might split the period into even smaller chunks, but then we would use up quite a lot of requests and also generate much more traffic.
We are requesting data for agricultural fields usually around 10-20 ha, so large area should not be the reason for the slowness.
Does anyone have any thoughts on this?

primoz · April 1, 2022, 7:19am

Hi Vit,
Could you provide some examples of the larger timed-out requests (you can send them directly to me via https://zerobin.net/ or similar) so we can investigate. We’re constantly trying to improve our services so any such reports are very helpful.
If you make the chunks smaller you’ll certainly get shorter and more deterministic execution times and the traffic (in your direction) wouldn’t increase that much.

vitsyrovat · April 3, 2022, 10:36am

Hi @primoz ,
It seems to me that the problem arises when we send several (many) requests to Statistical API at a time.
When our users register a new polygon, we request cloud cover for all available dates since 2018 from Sentinel Hub.
It is a common case that users register tens of polygons at a time. Then our workers pick the polygons and process them in parallel, sending requests to Sentinel Hub to get the time-series.

As a single request for the whole time period tends to time out, we tried to split it into 180-day chunks, which leads to 9 requests. A single request for a 180-day period usually takes a few seconds. All 9 requests processed in series then take about a minute. To speed this up, we send all 9 requests in parallel, and here it seems that the Statistical API does not scale well. The first request returns fast, but the response time of the remaining ones grows quickly and some may time out.
Waiting a second before sending the next request helps, but only when a single polygon is processed at a time, and I don’t think it is the right way to solve the issue. If there are 30 polygons registered by our users, we send 9*30 requests to Statistical API in a very short time.
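One client-side way to combine the two approaches (parallel, but staggered) is to cap the number of requests in flight and space out their start times. The sketch below uses only the Python standard library. `run_throttled` and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for the real Statistical API call, so this illustrates the pattern rather than a verified fix for the timeouts.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(fetch, chunks, max_workers=3, min_spacing=1.0):
    """Run fetch(chunk) for every chunk with at most max_workers calls in flight,
    starting a new submission no sooner than min_spacing seconds after the previous one."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for i, chunk in enumerate(chunks):
            if i > 0:
                time.sleep(min_spacing)  # stagger submissions instead of bursting
            futures[pool.submit(fetch, chunk)] = chunk
        for future, chunk in futures.items():
            # result() re-raises any exception that fetch raised for this chunk.
            results[chunk] = future.result()
    return results


def fake_fetch(chunk):
    # Stand-in for the real request (hypothetical); returns a label instead of statistics.
    return f"stats for {chunk[0]}..{chunk[1]}"


demo_chunks = [("2018-01-01", "2018-06-29"), ("2018-06-30", "2018-12-26")]
results = run_throttled(fake_fetch, demo_chunks, min_spacing=0.01)
```

This only limits a single process. With several workers handling different polygons at once, the limit would have to be shared between them (for example through a common queue), which this sketch does not cover.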

Here are some examples of request timings. The polygon size is about 0.2 ha.

Requests sent in parallel:
(‘2018-01-01’, ‘2018-06-29’): start at 2022-04-03T08:57:18.147204Z
(‘2018-06-30’, ‘2018-12-26’): start at 2022-04-03T08:57:18.157782Z
(‘2020-12-16’, ‘2021-06-13’): start at 2022-04-03T08:57:18.160281Z
(‘2018-12-27’, ‘2019-06-24’): start at 2022-04-03T08:57:18.162932Z
(‘2020-06-19’, ‘2020-12-15’): start at 2022-04-03T08:57:18.164990Z
(‘2019-12-22’, ‘2020-06-18’): start at 2022-04-03T08:57:18.165364Z
(‘2019-06-25’, ‘2019-12-21’): start at 2022-04-03T08:57:18.166185Z
(‘2021-06-14’, ‘2021-12-10’): start at 2022-04-03T08:57:18.177563Z
(‘2021-12-11’, ‘2022-04-03’): start at 2022-04-03T08:57:18.180407Z
(‘2020-06-19’, ‘2020-12-15’): finished after 4 seconds
(‘2021-12-11’, ‘2022-04-03’): finished after 6 seconds
(‘2018-06-30’, ‘2018-12-26’): finished after 130 seconds
(‘2018-01-01’, ‘2018-06-29’): finished after 134 seconds
(‘2019-06-25’, ‘2019-12-21’): finished after 136 seconds
(‘2019-12-22’, ‘2020-06-18’): finished after 264 seconds
(‘2020-12-16’, ‘2021-06-13’): finished after 430 seconds
(‘2021-06-14’, ‘2021-12-10’): finished after 433 seconds
(‘2018-12-27’, ‘2019-06-24’): finished after 435 seconds

Sleeping 1 sec between requests:
(‘2018-01-01’, ‘2018-06-29’): start at 2022-04-03T09:09:08.377943Z
(‘2018-06-30’, ‘2018-12-26’): start at 2022-04-03T09:09:09.224317Z
(‘2018-12-27’, ‘2019-06-24’): start at 2022-04-03T09:09:10.222963Z
(‘2019-06-25’, ‘2019-12-21’): start at 2022-04-03T09:09:11.225285Z
(‘2019-12-22’, ‘2020-06-18’): start at 2022-04-03T09:09:12.225422Z
(‘2018-01-01’, ‘2018-06-29’): finished after 3 seconds
(‘2020-06-19’, ‘2020-12-15’): start at 2022-04-03T09:09:13.227838Z
(‘2020-12-16’, ‘2021-06-13’): start at 2022-04-03T09:09:14.229192Z
(‘2019-12-22’, ‘2020-06-18’): finished after 2 seconds
(‘2021-06-14’, ‘2021-12-10’): start at 2022-04-03T09:09:15.232453Z
(‘2021-12-11’, ‘2022-04-03’): start at 2022-04-03T09:09:16.235208Z
(‘2020-06-19’, ‘2020-12-15’): finished after 3 seconds
(‘2020-12-16’, ‘2021-06-13’): finished after 2 seconds
(‘2021-06-14’, ‘2021-12-10’): finished after 2 seconds
(‘2021-12-11’, ‘2022-04-03’): finished after 2 seconds
(‘2018-06-30’, ‘2018-12-26’): finished after 10 seconds
(‘2019-06-25’, ‘2019-12-21’): finished after 9 seconds
(‘2018-12-27’, ‘2019-06-24’): finished after 13 seconds

If you need more info, I will be glad to provide it. Thank you!
Vit

vitsyrovat · April 11, 2022, 7:48am

@primoz @batic Any update on this?

primoz · April 11, 2022, 9:29pm

@vitsyrovat Could you please provide the UID of the account you were/are using for the stat requests via direct message so we can investigate things properly.

vitsyrovat · April 12, 2022, 8:42am

@primoz Please let me know if the UID reached you. I could not find out how to send a direct message here. I have a vitsyrovat account at gmail.

primoz · April 14, 2022, 9:37am

I did some investigation. What you were experiencing is a known usability issue that happens when we receive bursts of requests from multiple users (at the same time) in periods when the whole API is overall very idle. Since the infrastructure needs to scale by some orders of magnitude (relatively), there’s a delay that the end user experiences. We already have plans in our roadmap to mitigate this issue and improve the overall user experience.
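Until such a mitigation is available, a client can try to ride out the scale-up delay by retrying timed-out chunks with an increasing pause. The following is a minimal sketch of exponential backoff. It is an assumption on the reader's side, not something recommended in this thread. `GatewayTimeout` is a hypothetical exception that the caller's own HTTP layer would raise on a 504 response, and `call_with_backoff` is likewise an illustrative name.

```python
import time


class GatewayTimeout(Exception):
    """Hypothetical error raised by the caller's HTTP layer on a 504 response."""


def call_with_backoff(fetch, chunk, retries=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(chunk); on GatewayTimeout wait base_delay * 2**attempt seconds and retry.

    After `retries` failed retries the last GatewayTimeout is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(chunk)
        except GatewayTimeout:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the waiting can be replaced in tests. Retried requests may still count towards the account's request quota, so the retry count is worth keeping small.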

vitsyrovat · April 14, 2022, 10:39am

@primoz Thank you for your investigation. When approximately can we expect the better experience?