Specifying constellation for HLS imagery

I have an eo-learn Python program to download Harmonized Landsat Sentinel (HLS) imagery. I would like to be able to download HLS images derived from Landsat and Sentinel images when they were both acquired on the same day. The current program only downloads one. The approach I’m testing is to do two downloads, one with “constellation” set to LANDSAT and then again with it set to SENTINEL. I think I should be able to do that using an input.data.dataFilter object of the process API, but I don’t know how to do that using eo-learn. I tried this in my evalscript and specifying “constellation” as “LANDSAT” and “SENTINEL” returned the same images:

function setup() {
      return {
        input: [{
          bands: ["CoastalAerosol", "Blue", "Green", "Red", "NIR_Narrow", "SWIR1", "SWIR2", "Cirrus",\
                  "QA", "VAA", "VZA", "SAA", "SZA", "dataMask"],
          units: "DN",
          dataFilter: {  // Add this to specify the data source filter
              "constellation": "LANDSAT"
          }
        }],
        output: [{
            id: "ms_data",
            bands: 14,
            sampleType: SampleType.UINT16
        }, {
            id: "indices",
            bands: 1,
            sampleType: SampleType.INT16
        }]
      }
    }```

Hi nedhorning,
It’s important to note that the datafilter should not be passed directly in the Evalscript. For optimal usage of EoLearn, I recommend replicating some of these examples from the sentinelhub-py library Sentinel Hub Process API — Sentinel Hub 3.10.1 documentation applied to your specific use case. EoLearn is built on top of sentinelhub-py, so experimenting with its examples will help you get the most out of it.

Thank you, @adrian.dipaolo. My current program follows an eolearn workflow using SentinelHubEvalscriptTask(), SaveTask(), OutputTask(), linearly_connect(), EOWorkflow(), and workflow.execute(). My understanding is that SentinelHubEvalscriptTask() provides high-level access to the Sentinel Hub Process API.

I looked into the Sentinel Hub Process API and can see how I could use dataFilter to set “constellation”: “LANDSAT”. Unfortunately, I have not figured out how to integrate functions from the Sentinel Hub Process API into my eolearn program. I expect I’m missing an obvious and simple point, but if you can provide any additional information or examples that combine Sentinel Hub Process API functions in an eolearn workflow, I’d appreciate it.

Please, if you have that workflow you should check the documentation of SentinelHubEvalscriptTask() class eolearn.io.sentinelhub_process — eo-learn 1.5.3 documentation, and check how to input the collection parameter for HLS. The bands available are in the documentation. You can follow some example present in the documentation with your required collection: Introduction — eo-learn 1.5.3 documentation

For example you can add the following line to the EO Learn task, which would return only Sentinel-2 data:

aux_request_args={"dataFilter": {"constellation": "SENTINEL"}}

Thank you @adrian.dipaolo. I appreciate your help. I looked at the documentation and examples you recommend, and from what I can tell, I need to use aux_request_args() with a dict as an argument. I tried that and still got the same result when selecting LANDSAT or SENTINEL. This is the code I’m using:

hls_aux_args = dict(
    dataFilter = {'constellation': 'LANDSAT'}
)
# this will add the indices: ndvi, gci
add_indices = SentinelHubEvalscriptTask(
    features=[(FeatureType.DATA, "indices")],
    evalscript=bands_evalscript,
    data_collection=DataCollection.HARMONIZED_LANDSAT_SENTINEL,
    resolution=resolution,
    time_difference=time_difference,
    aux_request_args=hls_aux_args,
    config=config,
    max_threads=3
)

I’m bumping this thread with the hope of getting some additional guidance. I have not gotten my script to return only Landsat or Sentinel imagery from the HLS archive. The script always returns both. I didn’t find an example similar to what I am trying to do, so I modeled my code in the previous post after this example. In the HLS documentation, it appears I should be able to set the satellite constellation using input.data.dataFilter object, “constellation”, using the process API. The two options for “constellation” are “LANDSAT” and “SENTINEL”. My assumption is that I can set those variables using the “aux_request_args” argument in SentinelHubEvalscriptTask(). Is that assumption correct? If not, can someone suggest how I can specify Landsat or Sentinel when retrieving HLS imagery?

Hi Ned,

You should have no problems with the example that was given above, I notice in your code you added this within a dict function that you do not need to do.

aux_request_args={"dataFilter": {"constellation": "SENTINEL"}}

If you are still having issues, please reach out.

Thank you @william.ray. I modified “aux_request_args” as you suggested and am still getting the same results with “constellation”: “SENTINEL” and “constellation”: “LANDSAT”. Can you think of anything else that might cause this behavior?

add_indices = SentinelHubEvalscriptTask(
    features=[(FeatureType.DATA, "indices"), (FeatureType.DATA, "ms_data")],
    evalscript=bands_evalscript,
    data_collection=DataCollection.HARMONIZED_LANDSAT_SENTINEL,
    resolution=resolution,
    time_difference=time_difference,
    aux_request_args={"dataFilter": {"constellation": "LANDSAT"}},
    config=config,
    max_threads=3
)

Hi Ned,

Can you confirm what mosaicking method you are using? It only works if mosaicking is SIMPLE or ORBIT. If the mosaicking is TILE the dataFilter will not work.

Thanks,

Will

I didn’t define “mosaicking,” so I assume it defaulted to “SIMPLE”. I just added mosaicking: Mosaicking.ORBIT, to my evalscript, and the problem persists. Here is the evalscript I’m using:

bands_evalscript = """
    //VERSION=3

    function setup() {
      return {
        input: [{
          bands: ["CoastalAerosol", "Blue", "Green", "Red", "NIR_Narrow", "SWIR1", "SWIR2", "Cirrus",\
                  "QA", "VAA", "VZA", "SAA", "SZA", "dataMask"],
          units: "DN",
        }],
        mosaicking: Mosaicking.ORBIT,
        output: [{
            id: "ms_data",
            bands: 14,
            sampleType: SampleType.UINT16
        }, {
            id: "indices",
            bands: 1,
            sampleType: SampleType.INT16
        }]
      }
    }

    function evaluatePixel(sample) {
        bandsDN = [ sample.CoastalAerosol, sample.Blue, sample.Green, sample.Red, sample.NIR_Narrow,
        sample.SWIR1, sample.SWIR2, sample.Cirrus, sample.QA, sample.VAA, sample.VZA, sample.SAA,\
        sample.SZA, sample.dataMask];
        let gci = Math.round(((sample.NIR_Narrow/sample.Green) - 1.0) * 1000)
        return {
            ms_data: bandsDN,
            indices: [gci]
        }
    }
"""

Hi Ned,

Testing your code, it does not appear as if it is working unexpectedly. Can you please try the folowing as your evalscript:

//VERSION=3

  function setup() {
    return {
      input: [{
        bands: [
          "ThermalInfrared1", "dataMask"
        ]
      }],
      output: {
        bands: 2,
        sampleType: "FLOAT32"
      }
    }
  }

  function evaluatePixel(samples) {
    return [samples.ThermalInfrared1, samples.dataMask]
  }

This band is only available through the Landsat constellation, therefore, if you try a time period where there is no Landsat availability, you should be able to give yourself peace of mind that the filter is working.

Thank you, @william.ray. I ran it again with your evalscrip. When I run it with the date range 2023-05-07- 2023-5-10 with “constellation”: “LANDSAT” I have two images returned and when using “SENTINEL” no images are returned. The problem is that for the time period I used, there were no Landsat images. I shouldn’t have any images returned when I specify “LANDSAT” but instead I get two Sentinel images. It appears as if when all the “bands” in the evalscript exist for both Landsat and Sentinel then the output results will be the same regardless if “LANDSAT” or “SENTINEL” are requested. If, as in your evalscrip, one or more bands do not exist in the specified collection, then no images are returned for that collection. Below is the script I’m using. Here are the image dates and the associated satellite (S30 or L30) in the date range specified in the script. I verified that these images exist in the Sentinel Hub archive and the NASA archive.

HLS.S30.T18TXQ.2023130T154819.v2.0
HLS.S30.T18TXQ.2023127T153819.v2.0
HLS.S30.T18TXQ.2023125T154811.v2.0
HLS.L30.T18TXQ.2023125T153155.v2.0
HLS.S30.T18TXQ.2023122T153811.v2.0

from sentinelhub import CRS, BBox, DataCollection, SHConfig
from eolearn.core import EOWorkflow, FeatureType, LoadTask, OutputTask, SaveTask, linearly_connect_tasks, MergeFeatureTask
from eolearn.io import SentinelHubDemTask, SentinelHubEvalscriptTask, SentinelHubInputTask

CLIENT_ID = ""
CLIENT_SECRET = ""
config = SHConfig()
if CLIENT_ID and CLIENT_SECRET:
    config.sh_client_id = CLIENT_ID
    config.sh_client_secret = CLIENT_SECRET

if config.sh_client_id == "" or config.sh_client_secret == "" or config.instance_id == "":
    print("Warning! To use Sentinel Hub services, please provide the credentials (client ID and client secret).")

bands_evalscript = """
  //VERSION=3

  function setup() {
    return {
      input: [{
        bands: [
          "ThermalInfrared1", "dataMask"
        ]
      }],
      output: {
        id: "ms_data",
        bands: 2,
        sampleType: "FLOAT32"
      }
    }
  }

  function evaluatePixel(samples) {
    return [samples.ThermalInfrared1, samples.dataMask]
  }
"""

eopatch_path = "/home/nedhorning/Abe/TestDataForJohn/EOLearn_Testing/"
epsg = "EPSG:32618"
time_interval = ("2023-05-01", "2023-5-10")  
resolution = 30

add_indices = SentinelHubEvalscriptTask(
    features=[(FeatureType.DATA, "ms_data")],
    evalscript=bands_evalscript,
    data_collection=DataCollection.HARMONIZED_LANDSAT_SENTINEL,
    resolution=resolution,
    aux_request_args={"dataFilter": {"constellation": "LANDSAT"}},
    config=config,
    max_threads=3
)

roi_bbox = BBox([656839, 4981014, 657406, 4981528], crs=CRS(epsg))
    
save = SaveTask(eopatch_path + "numpyHLS_Landsat/", overwrite_permission=2, compress_level=0) 
workflow_nodes = linearly_connect_tasks(add_indices, save)
workflow = EOWorkflow(workflow_nodes)

result = workflow.execute(
    {
        workflow_nodes[0]: {"bbox": roi_bbox, "time_interval": time_interval},
        workflow_nodes[-1]: {"eopatch_folder": 'test'},
    }
)
    




Hi @nedhorning,

jumping in to help with this. Let me see first if I understood the discussion correctly:

  1. you are interested in HLS data
  2. you wish to download separate S30 and L30 products from the HLS data, filtering to only those which were taken on the same date
  3. you wish to achieve this using the constellation parameter in the dataFilter part of the API
  4. if you request a S2 specific band in a time interval where there are no Sentinel-2 observations, you don’t get any data (as expected)
  5. same for LS as above
  6. if you request a band which is available in both collections, you get the same amount of output images, regardless of the constellation and regardless of the fact if the specific source data observations are missing during a given time interval

Everything is OK except for part 6, where you would expect to get no images if requesting a collection for which there are no observations, right?

Did you check the output of these images? It seems to me that if I specify a common band, I get some images which are basically no-data:

So, the behavior is such by design, but the datafilter controls the dataMask of the output. The output itself is expected and provides information that can be used to filter the data to the dates with both sources, meaning that you could use this dataMask to filter out dates with no data.

Example:

  • you download two eopatches, one with collection LANDSAT to eop_ls and one with collection SENTINEL to eop_s2

then you can filter them appropriately:

def no_data_coverage(no_data):
    return np.count_nonzero(no_data == 0) / np.prod(no_data.shape)

timestamps = eop_s2.get_timestamps()  # eop_ls has same timestamps
s2_timestamps = [timestamps[tdx] for tdx, img in enumerate(eop_s2.data['ms_data']) if no_data_coverage(img[1]) == 1]  # keep only those where no_data coverage is full
ls_timestamps = [timestamps[tdx] for tdx, img in enumerate(eop_ls.data['ms_data']) if no_data_coverage(img[1]) == 1]  # same here

Now, to check the dates which fall on the same day, you need to check the day part of the timestamp:

s2_dates = {ts.date() for ts in s2_timestamps}
ls_dates = {ts.date() for ts in ls_timestamps}
common_dates = sorted(ls_dates & s2_dates)

Hope this helps!

Cheers,
Matic

It should be possible to also do all of this in the same request/evalscript, but it would require you to loop over the scenes in the requested time interval and filtering them using the scene metadata and filter based on the dataPath info of the scenes/tiles objects

Thanks so much, @matic.lubej, for taking the time to give a detailed explanation. Your understanding of my interpretation and expectations was spot on. I will try using dataPath to filter eopatches.