Processing API bug: 1-pixel offset when targeting SentinelHub instead of the Copernicus Data Space

I am comparing the results of targeting the standard SentinelHub service versus the Copernicus Data Space (CDS). I manually downloaded an S2 product from the Copernicus Hub and compared it with the results returned by the Processing API from each of these endpoints.

What I am observing is a small offset (1 pixel) when using the SentinelHub endpoint. This is strange because, when using the CDS (with the same code), the result matches the original.

# Imports

import rasterio as rio
import geopandas as gpd
import math
from shapely.affinity import translate
from sentinelhub import (
    BBox,
    DataCollection,
    SentinelHubRequest,
    SHConfig,
    bbox_to_dimensions,
    MimeType,
    MosaickingOrder,
)
from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session

# Aux. Functions

def round_to_multiple(number, multiple, direction="up"):
    """Round a number to the nearest multiple in the specified direction."""
    func = math.ceil if direction == "up" else math.floor
    return multiple * func(number / multiple)

def align_shape_to_spec_meters(shape, align_meters=10):
    # Extract the geometry and its bounds
    geom = shape.iloc[0].geometry
    min_x, min_y, max_x, max_y = geom.bounds

    # Calculate translations for upper-left alignment
    dx_ul = round_to_multiple(min_x, align_meters, "down") - min_x
    dy_ul = round_to_multiple(max_y, align_meters, "up") - max_y
    geom_ul = translate(geom, xoff=dx_ul, yoff=dy_ul)

    # Calculate translations for down-right alignment
    dx_dr = round_to_multiple(max_x, align_meters, "up") - max_x
    dy_dr = round_to_multiple(min_y, align_meters, "down") - min_y
    geom_dr = translate(geom, xoff=dx_dr, yoff=dy_dr)

    # Create a union of the original and both translated geometries
    union_geom = geom.union(geom_ul).union(geom_dr)
    return gpd.GeoDataFrame({'geometry': [union_geom]}, crs=shape.crs)
	
# SHConfig

endpoint = 'SentinelHub' # Compare SentinelHub vs. CDS

config = SHConfig()

if endpoint == 'SentinelHub':
    
    # SentinelHUB Sinergise
    config.sh_base_url = 'https://services.sentinel-hub.com'
    config.sh_token_url = 'https://services.sentinel-hub.com/oauth/token'
    config.sh_client_id= 'XXXXX-XXXXX-XXXXX-XXXXX'
    config.sh_client_secret= 'XXXXXXXXXXXXXXXXX'

else:
    # Copernicus Dataspace
    config.sh_base_url = 'https://sh.dataspace.copernicus.eu'
    config.sh_token_url = 'https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token'
    config.sh_client_id= 'XXXXX-XXXXX-XXXXX-XXXXX'
    config.sh_client_secret= 'XXXXXXXXXXXXXXXXX'

config.save()

# Fetch an OAuth token using the credentials stored in the config above
client = BackendApplicationClient(client_id=config.sh_client_id)
oauth = OAuth2Session(client=client)

token = oauth.fetch_token(token_url=config.sh_token_url, client_secret=config.sh_client_secret)

# Processing API

query_gdf = gpd.read_file('bbox.geojson')
query_gdf = align_shape_to_spec_meters(query_gdf)
query_sh_bbox = BBox(tuple(*query_gdf.bounds.values.tolist()), crs=f'EPSG:{query_gdf.crs.to_epsg()}')
bbox_dimensions = bbox_to_dimensions(query_sh_bbox, 10)

eval_script = """
    //VERSION=3
    function setup() {
        return {
            input: [{
                bands: ["B04", "B03", "B02", "B08"],
                units: "DN"
            }],
            output: {
                bands: 4,
                sampleType: "UINT16"
            }
        };
    }
    function evaluatePixel(sample) {
        return [sample.B04, sample.B03, sample.B02, sample.B08];
    }
"""

request = SentinelHubRequest(
    evalscript=eval_script,
    input_data=[
        SentinelHubRequest.input_data(
            data_collection = DataCollection.SENTINEL2_L2A if endpoint == 'SentinelHub' else DataCollection.SENTINEL2_L2A.define_from("s2l2a-cds", service_url=config.sh_base_url),
            time_interval = ("2021-01-01", "2021-01-03"),
            mosaicking_order = MosaickingOrder.LEAST_CC,
            other_args = {"harmonizeValues":False},  
        ),  
    ],
    responses=[SentinelHubRequest.output_response("default", MimeType.TIFF)],
    bbox = query_sh_bbox,
    size = bbox_dimensions,
    config=config,
)

x = request.get_data()[0]

h, w, c = x.shape 

tfm = rio.transform.from_bounds(*query_sh_bbox.geometry.bounds, w, h)

meta = {
    'driver': 'GTiff',
    'dtype': 'uint16',
    'nodata': 0.0,
    'width': w,
    'height': h,
    'count': c,
    'crs': f'epsg:{query_sh_bbox.crs.epsg}',
    'transform': tfm   
}

with rio.open(f'output_{endpoint}.tif', 'w', **meta) as dst: 
    dst.write(x.transpose(2, 0, 1))

Download the data from https://next.itracasa.es/s/4SX7CZXNCqbztjP using the password “sentinelhubforum”. The zip file contains the bounding box (bbox.geojson), the product manually downloaded from the Copernicus Hub and masked with that bbox (copernicus_hub.tif), and the outputs of this code (output_SentinelHub.tif and output_CDS.tif).
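For anyone who wants to quantify the shift rather than inspect it visually, here is a minimal sketch. It assumes both rasters cover the same extent with the same shape and have been loaded as 2D arrays (for example, one band read with rasterio). The function name estimate_pixel_offset and the synthetic arrays are only illustrative; they are not part of the code above.

```python
import numpy as np

def estimate_pixel_offset(reference, candidate, max_shift=3):
    """Return the (row, col) shift of `candidate` relative to `reference`
    that minimises the mean absolute difference over the overlapping area."""
    best_shift, best_score = (0, 0), float("inf")
    rows, cols = reference.shape
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            # Overlap in which reference[r, c] is compared with candidate[r + dr, c + dc]
            ref = reference[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
            cand = candidate[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
            score = np.mean(np.abs(ref.astype(np.float64) - cand.astype(np.float64)))
            if score < best_score:
                best_shift, best_score = (dr, dc), score
    return best_shift, best_score

# Synthetic check: shift a random uint16 image down by one row
rng = np.random.default_rng(0)
reference = rng.integers(0, 10000, size=(50, 60)).astype(np.uint16)
candidate = np.roll(reference, 1, axis=0)

shift, score = estimate_pixel_offset(reference, candidate)
print(shift, score)
```

With the real files, reference and candidate would be the same band read from copernicus_hub.tif and output_SentinelHub.tif. A result of (0, 0) would mean no integer-pixel offset within the search window.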

Hi,

Sentinel Hub APIs will process the data for the input area. This means there will likely always be an offset between the APIs output and the original data.

If you’d like to have pixels perfectly aligned, you need to make sure the extent of your input area is aligned to the Sentinel-2 grids and is the exact same resolution (this will vary from band to band).

If you are still concerned then I suggest you post your question on the CDSE Community Forum, which is more relevant to your question.

I ensure that the extent of my input area is aligned with the Sentinel-2 grids (this is precisely what the line query_gdf = align_shape_to_spec_meters(query_gdf) achieves). Additionally, I am only considering the 10m resolution bands.

I do not think I should post this question on the CDSE Forum, as the issue lies with SentinelHub. The CDS endpoint provides me with the exact same data that I can download manually from the Copernicus hub.

Edit: I have observed that you haven’t downloaded the data necessary to reproduce the error. Therefore, rather than providing a quick reply just to be the first responder, I would kindly encourage you to invest a few minutes reading the post I have written (on which I have spent considerable time). Please take the time to understand and reproduce the error. Only then will you realize that your answer does not address the question I have posed.

Could you then please provide some fully replicable code? This can be a curl request with an AOI you have already predefined; you can generate it in the Request Builder app if you wish.

For security reasons, I will not be downloading files from an unknown source, so reproducing your request is the easiest way to do this.

The fully replicable code is in the issue description.

It is a safe source, since it is a self-hosted cloud collaboration platform within our company. However, if you still do not want to download files from there, here is the bounding box.

{
"type": "FeatureCollection",
"name": "bbox",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::32630" } },
"features": [
{ "type": "Feature", "properties": { }, "geometry": { "type": "Polygon", "coordinates": [ [ [ 773198.315625000162981, 4295511.257812498137355 ], [ 773198.315625000162981, 4299744.562499998137355 ], [ 777629.775000000256114, 4299744.562499998137355 ], [ 777629.775000000256114, 4295511.257812498137355 ], [ 773198.315625000162981, 4295511.257812498137355 ] ] ] } }
]
}
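For reference, the alignment step can be checked on this bounding box without any geospatial libraries. The sketch below snaps the bounds outward to multiples of 10 m, which should give the same bounds as align_shape_to_spec_meters, and derives the 10 m raster size. The helper names snap_bounds_outward and is_grid_aligned are illustrative, and the coordinates are the polygon corners above with the trailing floating-point noise dropped.

```python
import math

def snap_bounds_outward(bounds, grid=10):
    """Expand (min_x, min_y, max_x, max_y) outward to the nearest multiples of `grid`."""
    min_x, min_y, max_x, max_y = bounds
    return (
        grid * math.floor(min_x / grid),
        grid * math.floor(min_y / grid),
        grid * math.ceil(max_x / grid),
        grid * math.ceil(max_y / grid),
    )

def is_grid_aligned(bounds, grid=10, tol=1e-6):
    """Return True when every bound is a multiple of `grid` (within `tol`)."""
    return all(abs(v - grid * round(v / grid)) <= tol for v in bounds)

# Bounds of the AOI polygon (EPSG:32630, metres)
raw = (773198.315625, 4295511.2578125, 777629.775, 4299744.5625)
snapped = snap_bounds_outward(raw)

# Raster size at 10 m resolution for the snapped extent
width = round((snapped[2] - snapped[0]) / 10)
height = round((snapped[3] - snapped[1]) / 10)

print(snapped, width, height)
print(is_grid_aligned(raw), is_grid_aligned(snapped))
```

If the requested extent passes this check and the output still differs from the manually downloaded product, the difference is unlikely to come from the request geometry.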

Thank you, having the AOI was really helpful. I was able to replicate your pixel offset running ‘identical’ requests using the Sentinel Hub and CDSE endpoints and there is a shift.

We have looked into this, and it is most likely due to the reprocessing of Sentinel-2 data following the processing baseline changes made since 2021, when this image was acquired and processed. You can read more on these here.

The difference is due to data retrieved from CDSE being fully reprocessed (Sentinel-2 Collection-1 Products Availability), whereas in Sentinel Hub some of the older data has still not been reprocessed.

The differences are minimal, but some exist, such as the pixel offset you have highlighted. The most recent data will be identical from both Sentinel Hub and CDSE. For example, I requested Sentinel-2 L2A data for the AOI you shared for January 2025, and both endpoints returned the exact same image with no offset.

We are planning to update all remaining archive data, but we rely on our cloud infrastructure provider for this. Currently, there is no timeline for when this will be completed.

Thanks again for highlighting this, I hope this answer helps you out!

Thank you, William! Although it makes total sense, this 10-meter pixel offset has a negative effect on subsequent analysis, especially when dealing with time series. Looking forward to the reprocessing!