Help with lowering PUs for Statistical API

Hello,

I am investigating the Statistical API to pull data for specific bands. The only values I need are the mean reflectance values for each band. I have been able to pull data from the API successfully, but the requests consume more processing units (PUs) than I expected. Can someone look at my API request and help me optimize it to lower the processing units? The code for my request is below:

from sentinelhub import SentinelHubStatistical, DataCollection, CRS, BBox, bbox_to_dimensions, \
    Geometry, SHConfig, parse_time, parse_time_interval, SentinelHubStatisticalDownloadClient
'''

Example Polygon in fields_gpd:
POLYGON ((-98.3142984128739 46.7284272171651, -98.313747 46.727692, -98.3121418637797 46.727692, -98.312192 46.728987, -98.3142984128739 46.7284272171651))

'''

# Sentinel Hub Statistical API request
# (config is an SHConfig instance with my credentials and fields_gpd is a
# GeoDataFrame of field polygons; both are defined elsewhere)
evalscript = """
//VERSION=3
function setup() {
    return {
        input: [{
            bands: ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B09", "B11", "B12", "CLM", "CLP", "dataMask"],
            units: "DN"
        }],
        output: [
            {
                id: "bands",
                bands: ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B09", "B11", "B12"],
                sampleType: "UINT16"
            },
            {
                id: "masks",
                bands: ["CLM"],
                sampleType: "UINT16"
            },
            {
                id: "indices",
                bands: ["CLP"],
                sampleType: "UINT16"
            },
            {
                id: "dataMask",
                bands: 1
            }
        ]
    };
}

function evaluatePixel(samples) {
    // cloud probability normalized to the interval [0, 1]
    let CLP = samples.CLP / 255.0;

    // mask out cloudy pixels
    let combinedMask = samples.dataMask;
    if (samples.CLM > 0) {
        combinedMask = 0;
    }

    const f = 5000;
    return {
        bands: [samples.B01, samples.B02, samples.B03, samples.B04, samples.B05, samples.B06,
                samples.B07, samples.B08, samples.B8A, samples.B09, samples.B11, samples.B12],
        masks: [samples.CLM],
        indices: [toUINT(CLP, f)],
        dataMask: [combinedMask]
    };
}

function toUINT(product, constant) {
    // Clamp the value to [-1, 10] and convert it to a UINT16
    // value that can be converted back to float later.
    if (product < -1) {
        product = -1;
    } else if (product > 10) {
        product = 10;
    }
    return Math.round(product * constant) + constant;
}
"""

ndvi_requests = []

# see the example polygon above for the geometry values in fields_gpd
for field in fields_gpd.geometry.values:

    request = SentinelHubStatistical(
        aggregation=SentinelHubStatistical.aggregation(
            evalscript=evalscript,
            time_interval=('2022-09-01', '2022-09-30'),
            aggregation_interval='P1D',
        ),
        input_data=[
            SentinelHubStatistical.input_data(DataCollection.SENTINEL2_L2A)
        ],
        geometry=Geometry(field, crs=CRS.WGS84),
        config=config,
    )
    ndvi_requests.append(request)

download_requests = [band_request.download_list[0] for band_request in ndvi_requests]
client = SentinelHubStatisticalDownloadClient(config=config)
band_stats = client.download(download_requests)
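
For reference, a minimal sketch of reading the per-band mean values out of band_stats is below. The key names assume the standard Statistical API response layout (data -> interval -> outputs -> output id -> bands -> band -> stats), so adjust them if your output or band names differ:

# Sketch only: extract the mean reflectance per band for each field and interval.
# Key names assume the standard Statistical API response structure.
for field_stats in band_stats:
    for interval in field_stats["data"]:
        date = interval["interval"]["from"]
        # skip intervals that returned an error instead of statistics
        if "outputs" not in interval:
            continue
        band_means = {
            band: values["stats"]["mean"]
            for band, values in interval["outputs"]["bands"]["bands"].items()
        }
        print(date, band_means)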

Hi! Thanks for the question. Looking at your API request, there is not much more you can do to optimise it. A couple of suggestions that could help:

  • Optimise the sampleType that you use for the cloud mask and cloud probability layers. They are natively stored in UINT8 format, so UINT16 is not required (see the evalscript sketch after the code snippet below). To find out more about the different sampleTypes, I would recommend reading @maxim.lamare's Medium post from earlier this year.
  • I would also advise adding a cloud cover filter when requesting over longer time periods, as cloudy data is probably not useful to you! You can filter out excessively cloudy scenes by using the following in your input arguments:
input_data=[
    SentinelHubStatistical.input_data(
        DataCollection.SENTINEL2_L2A,
        other_args={"dataFilter": {"maxCloudCoverage": 40}},
    )
],
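
For the first suggestion, here is a minimal sketch of how the evalscript could declare the cloud outputs as UINT8 (the evalscript_uint8 name is only illustrative). Note that the toUINT-scaled value would overflow UINT8, so this sketch returns the raw 0-255 CLP value instead; divide by 255 after download if you need it in [0, 1]:

# Illustrative sketch only: cloud outputs stored as UINT8, raw CLP returned so it fits
evalscript_uint8 = """
//VERSION=3
function setup() {
    return {
        input: [{
            bands: ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08",
                    "B8A", "B09", "B11", "B12", "CLM", "CLP", "dataMask"],
            units: "DN"
        }],
        output: [
            {
                id: "bands",
                bands: ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08",
                        "B8A", "B09", "B11", "B12"],
                sampleType: "UINT16"
            },
            // CLM is 0/1 and CLP is 0-255, so UINT8 is sufficient for both
            { id: "masks", bands: ["CLM"], sampleType: "UINT8" },
            { id: "indices", bands: ["CLP"], sampleType: "UINT8" },
            { id: "dataMask", bands: 1 }
        ]
    };
}

function evaluatePixel(samples) {
    // mask out cloudy pixels
    let combinedMask = samples.dataMask;
    if (samples.CLM > 0) {
        combinedMask = 0;
    }
    return {
        bands: [samples.B01, samples.B02, samples.B03, samples.B04, samples.B05, samples.B06,
                samples.B07, samples.B08, samples.B8A, samples.B09, samples.B11, samples.B12],
        masks: [samples.CLM],
        // raw 0-255 cloud probability; divide by 255 after download for [0, 1]
        indices: [samples.CLP],
        dataMask: [combinedMask]
    };
}
"""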

Lastly, if you are requesting data over large areas or many fields, we would recommend looking into the Batch Statistical API, which has been designed for exactly this purpose.

I hope that this information is useful to you, but in case you have any further questions, don’t hesitate to reach out!
