Issue with downloading large datasets using Python SDK

Hello

I am currently working with the Sentinel Hub Python SDK to download EO data for a large area and a long time range. However, I keep running into timeouts and incomplete downloads when the datasets are too large. :upside_down_face:

Has anyone encountered similar issues when handling large requests, and are there any recommended best practices or workarounds for managing large dataset downloads effectively?
I have gone through the documentation at https://sentinelhub-py.readthedocs.io/ but still need help.

I’ve tried adjusting the max_threads parameter but still face the same issue.

Any advice would be greatly appreciated!

Thanks! :slightly_smiling_face:

Hi Sotafo,

For large areas and time spans there are several avenues you could explore, as the Process API in the Python SDK is not the best tool beyond certain size thresholds:

  • For areas that are only somewhat too large, you can look at the large area utilities: "the sentinelhub package implements utilities for splitting areas into smaller bounding boxes." (A minimal splitter sketch follows this list.)

  • A second option is to use our eolearn package, which is designed "to seamlessly access and process spatio-temporal image sequences". Although it is geared toward ML applications, it is quite good at splitting large areas into tiles and handling long time series (see the second sketch below).

  • The third option is the most suitable for large areas/time series: the Batch Processing API, which you can also leverage with the Python SDK (a batch sketch is included below as well). "Sentinel Hub Batch Processing takes the geometry of a large area and divides it according to a specified tile grid. Next, it executes processing requests for each tile in the grid and stores results to a given location at AWS S3 storage. All this is efficiently executed on the server-side. Because of the optimized performance, it is significantly faster than running the same process locally."
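To make the first option concrete, here is a minimal sketch of the large area utility using BBoxSplitter from the sentinelhub package; the bounding box coordinates and the 4x4 grid are made-up values for illustration:

```python
from sentinelhub import CRS, BBox, BBoxSplitter

# Hypothetical area of interest in WGS84 (lon/lat) -- replace with your own
aoi = BBox(bbox=(12.0, 41.5, 14.5, 43.0), crs=CRS.WGS84)

# Split the AOI into a 4x4 grid of smaller bounding boxes
splitter = BBoxSplitter([aoi.geometry], CRS.WGS84, split_shape=(4, 4))

for bbox in splitter.get_bbox_list():
    # Issue one (much smaller) Process API request per sub-box here
    print(bbox)
```

Each sub-box stays well within the Process API size limits, so the individual requests are far less likely to time out.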
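For the eolearn route, a rough sketch of pulling one tile's time series into an EOPatch with SentinelHubInputTask could look like the following; the bands, resolution and dates are placeholder choices, and in practice you would run the task over every bounding box produced by a splitter:

```python
from datetime import timedelta

from eolearn.core import FeatureType
from eolearn.io import SentinelHubInputTask
from sentinelhub import CRS, BBox, DataCollection

# Task that fetches a Sentinel-2 L2A time series into an EOPatch
input_task = SentinelHubInputTask(
    data_collection=DataCollection.SENTINEL2_L2A,
    bands=["B02", "B03", "B04"],
    bands_feature=(FeatureType.DATA, "BANDS"),
    resolution=10,
    time_difference=timedelta(hours=2),
)

# One small tile (e.g. from the splitter above) and a time interval
tile = BBox(bbox=(12.0, 41.5, 12.6, 42.0), crs=CRS.WGS84)
eopatch = input_task.execute(bbox=tile, time_interval=("2023-01-01", "2023-06-30"))
print(eopatch)
```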
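And for the batch option, below is a hedged sketch of creating a batch request through the Python SDK. The evalscript, geometry, grid id and S3 bucket path are placeholders, and the SentinelHubBatch interface has changed between sentinelhub-py releases, so treat this as orientation rather than copy-paste code:

```python
from shapely.geometry import Polygon

from sentinelhub import (
    CRS, DataCollection, Geometry, MimeType,
    SentinelHubBatch, SentinelHubRequest, SHConfig,
)

config = SHConfig()  # assumes your Sentinel Hub credentials are set up

# Placeholder evalscript returning three Sentinel-2 bands
evalscript = """
//VERSION=3
function setup() {
    return { input: ["B02", "B03", "B04"], output: { bands: 3 } };
}
function evaluatePixel(sample) {
    return [sample.B04, sample.B03, sample.B02];
}
"""

# The full (large) AOI -- placeholder polygon in WGS84
aoi = Geometry(
    Polygon([(12.0, 41.5), (14.5, 41.5), (14.5, 43.0), (12.0, 43.0)]),
    crs=CRS.WGS84,
)

# Note: no size/resolution here -- the tiling grid determines that
process_request = SentinelHubRequest(
    evalscript=evalscript,
    input_data=[
        SentinelHubRequest.input_data(
            data_collection=DataCollection.SENTINEL2_L2A,
            time_interval=("2023-01-01", "2023-06-30"),
        )
    ],
    responses=[SentinelHubRequest.output_response("default", MimeType.TIFF)],
    geometry=aoi,
    config=config,
)

batch_client = SentinelHubBatch(config=config)

# The server splits the AOI along the chosen tiling grid and writes to S3
batch_request = batch_client.create(
    process_request,
    tiling_grid=SentinelHubBatch.tiling_grid(grid_id=1, resolution=10),
    output=SentinelHubBatch.output(default_tile_path="s3://my-bucket/results/"),
    description="large AOI download",
)

# Optionally run batch_client.start_analysis(batch_request) first to inspect the tiles
batch_client.start_job(batch_request)
```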

These tools should help you scale up your workflow efficiently!