Data Errors: 32bit Float Values Greater Than 1

MMM · June 11, 2020, 1:53am

According to this link, 32bit float values should be between 0-1 (https://www.sentinel-hub.com/faq/how-are-values-calculated-within-sentinel-hub-and-how-are-they-returned-output/)

However, this is not the case in Patch 2. Patch 2 lies within Patch 1, so if values greater than 1 were normal behaviour as per this link (Sentinel 2 L2A band data not in range 0-1), then Patch 1 should also contain values greater than 1.

Questions:
1) Please advise: are values greater than 1 correct?
2) Is the first link wrong and the second link correct?
3) If the second link is correct and values greater than 1 are acceptable for 32-bit floats, does the same hold for the other formats, i.e. can 8-bit values exceed 255 and 16-bit values exceed 65535?

Many thanks,
M

from eolearn.core.eoworkflow import LinearWorkflow, Dependency
from eolearn.core.eodata import FeatureType
from sentinelhub import BBox, CRS
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import datetime
from eolearn.core import SaveToDisk, LoadFromDisk
from eolearn.io import S2L1CWCSInput, AddSen2CorClassificationFeature, DEMWCSInput, S2L2AWCSInput, L8L1CWCSInput

# Patch 1: download the Sentinel-2 L1C bands at 10 m and save the EOPatch to disk
layer = 'BANDS-S2-L1C'
save = SaveToDisk('io_example_1', overwrite_permission=2, compress_level=1)
input_task = S2L1CWCSInput(layer=layer,
                           resx='10m', resy='10m',
                           time_difference=datetime.timedelta(hours=2))

time_i = ['2020-01-05', '2020-01-09']

bb_1 = BBox(bbox=[-0.421, 11.008, -0.420, 11.0079], crs=CRS.WGS84)

workflow = LinearWorkflow(input_task, save)

result_1 = workflow.execute({input_task: {'bbox': bb_1, 'time_interval': time_i},
                             save: {'eopatch_folder': 'eopatch'}})

# Flatten the 4D array to one row of band values per pixel and date
eopatch_1 = result_1[save]
image_1 = eopatch_1.data['BANDS-S2-L1C']
d, w, h, bands = image_1.shape
img_2D_1 = image_1.reshape(d * w * h, bands)
img_2D_1.shape
print(img_2D_1)

[[0.1839 0.1718 0.1652 0.1949 0.2039 0.2239 0.2531 0.2471 0.2849 0.0737
0.0019 0.3589 0.2426]
[0.1839 0.1702 0.1656 0.1942 0.2039 0.2239 0.2531 0.2472 0.2849 0.0737
0.0019 0.3589 0.2426]
[0.1839 0.1677 0.1622 0.1902 0.1964 0.2146 0.2404 0.2387 0.2748 0.0737
0.0019 0.3499 0.2275]
[0.1839 0.168 0.16 0.1872 0.1964 0.2146 0.2404 0.2363 0.2748 0.0737
0.0019 0.3499 0.2275]
[0.1827 0.1671 0.1598 0.1871 0.1949 0.2138 0.2375 0.2345 0.2683 0.0716
0.0017 0.3508 0.2344]
[0.1827 0.1674 0.161 0.1879 0.1949 0.2138 0.2375 0.2364 0.2683 0.0716
0.0017 0.3508 0.2344]
[0.1827 0.1669 0.1616 0.1879 0.1948 0.2138 0.2413 0.2366 0.2698 0.0716
0.0017 0.351 0.2336]
[0.1827 0.1673 0.1614 0.186 0.1948 0.2138 0.2413 0.234 0.2698 0.0716
0.0017 0.351 0.2336]
[0.1827 0.1673 0.1611 0.1863 0.1926 0.21 0.2367 0.2335 0.2662 0.0716
0.0017 0.3462 0.2318]
[0.1827 0.1663 0.1597 0.1858 0.1926 0.21 0.2367 0.2331 0.2662 0.0716
0.0017 0.3462 0.2318]
[0.1833 0.1648 0.1566 0.1785 0.1896 0.2074 0.234 0.2278 0.2654 0.0717
0.0018 0.3387 0.2198]]

# Patch 2: same layer, resolution and dates, smaller bounding box
layer = 'BANDS-S2-L1C'
save = SaveToDisk('io_example_2', overwrite_permission=2, compress_level=1)
input_task = S2L1CWCSInput(layer=layer,
                           resx='10m', resy='10m',
                           time_difference=datetime.timedelta(hours=2))

time_i = ['2020-01-05', '2020-01-09']

bb_2 = BBox(bbox=[-0.4213, 11.0086, -0.4211, 11.0084], crs=CRS.WGS84)

workflow = LinearWorkflow(input_task, save)

result_2 = workflow.execute({input_task: {'bbox': bb_2, 'time_interval': time_i},
                             save: {'eopatch_folder': 'eopatch'}})

# Flatten the 4D array to one row of band values per pixel and date
eopatch_2 = result_2[save]
image_2 = eopatch_2.data['BANDS-S2-L1C']
d, w, h, bands = image_2.shape
img_2D_2 = image_2.reshape(d * w * h, bands)
img_2D_2.shape

print(img_2D_2)

[[0.1839 0.1721 0.1671 0.1998 1.0769 1.086 1.0973 0.2519 1.1153 0.0761
0.0019 1.1439 1.0934]
[0.1837 0.173 0.1727 0.2033 1.1589 1.1668 1.1765 0.2545 1.1925 0.0745
0.0019 1.2284 1.1927]
[0.1839 0.1713 0.1653 0.1973 1.0769 1.086 1.0973 0.25 1.1153 0.0761
0.0019 1.1439 1.0934]
[0.1837 0.1736 0.1728 0.2024 1.1589 1.1668 1.1765 0.2534 1.1925 0.0745
0.0019 1.2284 1.1927]]
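
To see which bands are affected, the first row of each printout above can be compared directly. The sketch below is only an illustration: the band names assume the usual Sentinel-2 L1C order (B01 to B12, with B8A after B08), which this thread does not confirm for the layer.

```python
import numpy as np

# First row of each printout, copied from the output above.
patch_1 = np.array([0.1839, 0.1718, 0.1652, 0.1949, 0.2039, 0.2239, 0.2531,
                    0.2471, 0.2849, 0.0737, 0.0019, 0.3589, 0.2426])
patch_2 = np.array([0.1839, 0.1721, 0.1671, 0.1998, 1.0769, 1.086, 1.0973,
                    0.2519, 1.1153, 0.0761, 0.0019, 1.1439, 1.0934])

# ASSUMPTION: the layer returns bands in the standard Sentinel-2 L1C order.
BAND_NAMES = ["B01", "B02", "B03", "B04", "B05", "B06", "B07",
              "B08", "B8A", "B09", "B10", "B11", "B12"]


def bands_above_one(pixel):
    """Return the names of the bands whose value exceeds 1.0."""
    return [BAND_NAMES[i] for i in np.flatnonzero(pixel > 1.0)]


print(bands_above_one(patch_1))
print(bands_above_one(patch_2))
```

The affected positions are the same in all four Patch 2 rows. Under the band-order assumption they would correspond to the 20 m bands, which may be a useful detail when reporting the issue.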

MMM · June 11, 2020, 11:44pm

Some more:-)…

I downloaded Patch 2 in 16 bit. According to this link (https://www.sentinel-hub.com/faq/how-are-values-calculated-within-sentinel-hub-and-how-are-they-returned-output/)

“reflectance values where reflectance = 1 is at a pixel value of 65535 (format=image/tiff;depth=16)”

A reflectance value of 1 should equal 65535. But when comparing Patch 2 (32 bit) to Patch 2 (16 bit), the last value is 1.0934 and 10934 respectively (and not >65535). So something is wrong.

First Row Patch 2 (32 bit): [[0.1839 0.1721 0.1671 0.1998 1.0769 1.086 1.0973 0.2519 1.1153 0.0761
0.0019 1.1439 1.0934]
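
The mismatch can be made concrete with a little arithmetic. The sketch below compares the mapping quoted from the FAQ (1 -> 65535) with a scale factor of 10000. That factor is an assumption inferred from the two numbers quoted above, not something the FAQ states.

```python
reflectance = 1.0934     # last value of the first row, 32-bit float output
observed_16bit = 10934   # value reported for the same position in 16 bit

UINT16_MAX = 65535

# Mapping quoted from the FAQ: reflectance 1 -> 65535.
faq_value = reflectance * UINT16_MAX
print(faq_value > UINT16_MAX)   # would not fit into an unsigned 16-bit integer

# ASSUMPTION: the service scales reflectance by 10000 instead.
scaled_value = round(reflectance * 10000)
print(scaled_value == observed_16bit)
```

Under the FAQ mapping the value could not be represented in 16 bits at all, whereas a factor of 10000 reproduces the observed number.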

MMM · June 12, 2020, 4:58pm

Trying in 8-bit for Patch 2 gives negative values in some cases.

Again according to this link (https://www.sentinel-hub.com/faq/how-are-values-calculated-within-sentinel-hub-and-how-are-they-returned-output/), values should be between 0 and 255…not negative;-(

“if format is 8-bit, it is 0-255; 0 -> 0 and 1 -> 255”

[[ 47 -71 -121 -50 17 108 -35 -41 -111 -7 19 -81 -74]
[ 45 -62 -65 -15 69 -108 -11 -15 -107 -23 19 -4 -105]
[ 47 -79 117 -75 17 108 -35 -60 -111 -7 19 -81 -74]
[ 45 -56 -64 -24 69 -108 -11 -26 -107 -23 19 -4 -105]]
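
One reading that reproduces these numbers, offered as an unconfirmed hypothesis rather than a statement about the service: the reflectance is scaled by 10000 (the factor suggested by 1.0934 vs 10934 above), only the low byte of that integer is kept, and the byte is then interpreted as a signed 8-bit integer. The sketch checks this against the first 32-bit row of Patch 2 posted earlier.

```python
import numpy as np

# First row of Patch 2 as 32-bit floats, and the first 8-bit row from this post.
row_32bit = np.array([0.1839, 0.1721, 0.1671, 0.1998, 1.0769, 1.086, 1.0973,
                      0.2519, 1.1153, 0.0761, 0.0019, 1.1439, 1.0934])
row_8bit = np.array([47, -71, -121, -50, 17, 108, -35,
                     -41, -111, -7, 19, -81, -74], dtype=np.int8)


def low_byte_as_int8(reflectance, scale=10000):
    """HYPOTHESIS: scale to an integer, keep the low byte, read it as signed."""
    dn = np.rint(reflectance * scale).astype(np.int64)
    low_byte = (dn % 256).astype(np.uint8)
    return low_byte.view(np.int8)


reconstructed = low_byte_as_int8(row_32bit)
print(reconstructed)
print(np.array_equal(reconstructed, row_8bit))
```

For this row the reconstruction matches all 13 posted values. If the hypothesis holds, the 8-bit output is a truncation of the scaled integer rather than a 0-255 rescaling, and the negative sign comes from reading the bytes as signed.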

gmilcinski · June 12, 2020, 5:11pm

Hi @mmm,
thanks for all these debug inputs.
Can you perhaps check whether these negative values come directly from the service, or whether they arise somewhere later in the patch generation process?
Thanks,
Grega

MMM · June 12, 2020, 7:55pm

I originally found these errors when using the ‘old’ service for the same bounding box locations:

from sentinelhub import (FisRequest, BBox, Geometry, CRS, OsmSplitter, WcsRequest, WmsRequest,
                        MimeType, CustomUrlParam, DataSource, HistogramType)
from sentinelhub.time_utils import iso_to_datetime
import numpy as np
import itertools
import pandas as pd
#import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import cm
from shapely.geometry import Point, box, shape, Polygon, MultiPolygon

wcs_bands = WcsRequest(layer='BANDS-S2-L1C',
                     bbox=mbx,
                     time=time_interval,
                     data_folder='./aaa/train/water',
                     resx='10m', resy='10m',
                     image_format=MimeType.TIFF_d32f,
                     instance_id=INSTANCE_ID)
wbox_f1 = wcs_bands.save_data()

I then tried downloading the same bounding boxes using EO-Learn - and verified the same issue (hence the week delay between my original email to support and my subsequent posting to the Technical Forum).

Both services give incorrect data.

MMM · June 12, 2020, 8:00pm

Forgot to mention;-)

I also spent time this week reviewing some datasets that I had downloaded in December - those datasets also have the same issue. So this problem seems to be at least 6 months old.

MMM · June 12, 2020, 8:03pm

Could you please verify whether this link is correct about the ranges the values should fall in, i.e. 0-1, 0-65535 or 0-255 (for 32, 16 and 8 bit respectively)?

https://www.sentinel-hub.com/faq/how-are-values-calculated-within-sentinel-hub-and-how-are-they-returned-output/

MMM · June 12, 2020, 8:05pm

This response says values greater than 1 are valid for 32-bit?

https://shforum.sinergise.com/t/sentinel-2-l2a-band-data-not-in-range-0-1/1656

Which makes it impossible for me to work out which data results are wrong or ok.

MMM · June 12, 2020, 8:12pm

I had also tried bounding boxes of different sizes: smaller than 10m and much larger. I thought it might be a mosaicking issue. Again, results appear erroneous in both small and large boxes.

gmilcinski · June 12, 2020, 8:29pm

I think, though I am not entirely sure, that the 0-1 rule does not apply if you use 32f, because (this I am sure of) you can get negative values as well.
And yes, the reflectance of L2A can be more than 1. It happens rarely and it is not clear why this happens (we are just using the data), but it does happen.

Anyway, precisely because of all this confusion we moved to EVALSCRIPT 3 and the process API, where there is no “magic” happening anymore: whatever the data are, they go through unchanged.
See this blog post:
https://shforum.sinergise.com/t/impacts-of-the-migration-to-evalscript-v3/2090

It might be that when going through this process the FAQ you are referring to is no longer fully correct. I am pretty certain that things that are no longer correct are not super important (corner cases) but we will look into it and revise it.

Anyway, I strongly suggest you use the process API and EVALSCRIPT V3, and you will get exactly the results you are expecting.
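
For reference, a Version 3 evalscript that passes all 13 L1C bands through as 32-bit floats might look like the sketch below. It is an illustration under assumptions, not code from this thread: the band names, the `REFLECTANCE` units and the `FLOAT32` sample type are based on the public evalscript V3 documentation and should be checked against it, and `build_evalscript` is a hypothetical helper. Sending the script through the process API is left out because it needs account credentials.

```python
# ASSUMPTION: standard Sentinel-2 L1C band names.
BANDS = ["B01", "B02", "B03", "B04", "B05", "B06", "B07",
         "B08", "B8A", "B09", "B10", "B11", "B12"]


def build_evalscript(bands):
    """Hypothetical helper: build a V3 evalscript returning `bands` as FLOAT32."""
    band_list = ", ".join(f'"{b}"' for b in bands)
    outputs = ", ".join(f"sample.{b}" for b in bands)
    # Doubled braces are literal braces inside the f-string.
    return f"""//VERSION=3
function setup() {{
  return {{
    input: [{{ bands: [{band_list}], units: "REFLECTANCE" }}],
    output: {{ bands: {len(bands)}, sampleType: "FLOAT32" }}
  }};
}}

function evaluatePixel(sample) {{
  return [{outputs}];
}}
"""


evalscript = build_evalscript(BANDS)
print(evalscript)
```

Because the output sample type is set explicitly, no implicit rescaling to an 8-bit or 16-bit range is requested by this script.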

MMM · June 12, 2020, 10:12pm

Dear Grega,

Many thanks for the quick response;-)

I will try the Process API and Evalscript 3 on these bounding boxes.

The FAQ link you provided is the same as the one I referenced in my previous posts. This is what led to my uncertainty about 32-bit floats, i.e. result values greater than 1, since the page does not mention that values can occasionally exceed 1. (I would humbly suggest this page is updated with this info;-)

Your clarification that this is a result of the underlying data confirms your colleague’s answer/post.

This is very helpful. I assumed (and probably others have too) that since values would lie between 0 and 1 (as per the FAQ page), data normalisation across bands is not required (important for ML/AI training). This assumption is not correct. The easiest solution for me is to cap any band at 1 and set negative results to 0 (particularly since these seem to be corner cases).
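
A minimal sketch of that capping step with NumPy. The sample pixel is made up for illustration: the first three values are taken from Patch 2, the negative one is hypothetical.

```python
import numpy as np


def cap_reflectance(bands, lower=0.0, upper=1.0):
    """Clamp reflectance values to [lower, upper]; returns a new array."""
    return np.clip(bands, lower, upper)


pixel = np.array([0.1839, 1.0769, 1.1439, -0.002], dtype=np.float32)
capped = cap_reflectance(pixel)
print(capped)
```

Clipping discards how far a value lay outside the range, so it may be worth counting the affected pixels before applying it.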

16 bit and 8 bit results still seem peculiar, as any 32 bit results that are greater than 1, should also then be greater than 65535 and 255 respectively(?)

I will try the Process API and Evalscript 3;-)

Best,
M

gmilcinski · June 14, 2020, 6:16am

I see that I linked the wrong FAQ post in my previous response; thank you for noticing. It is now updated to the correct one.
As mentioned, in V3 the data come through as they come from the sensor, unless you do something with them in the EVALSCRIPT. Exactly as it should have been.