Does extrapolation in interpolation eo-learn classes work properly?

Hi everyone,

I’m using the eo-learn Python package. I have an image time series (shape t x m x n x b) in which I need to interpolate 29 missing dates. These dates occur everywhere in the time series: at the left edge, at the right edge, and in the middle. Several of them are consecutive, especially at the beginning of the time series.

The issue is that the interpolated images at the left edge of the time series (and only there) are full of NaN values, even though I explicitly pass the extrapolation options accepted by scipy.interpolate.UnivariateSpline (used by SplineInterpolation) and scipy.interpolate.interp1d (used by CubicInterpolation). The default behaviour of numpy.interp (used by LinearInterpolation) does not help either.
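For reference, here is a minimal sketch of what these SciPy and NumPy options do on their own, outside eo-learn. The 1-D series and the names t_obs, y_obs and t_new are made up for illustration. It only shows the behaviour of the underlying functions, not what the eo-learn classes do with them.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical 1-D series: valid observations exist only on days 3..7.
t_obs = np.array([3.0, 4.0, 5.0, 6.0, 7.0])
y_obs = 2.0 * t_obs + 1.0  # a straight line, so results are easy to check

# Two dates before the first observation, one inside, one after the last.
t_new = np.array([0.0, 1.0, 5.5, 9.0])

# interp1d extrapolates beyond the data when fill_value="extrapolate".
cubic = interp1d(t_obs, y_obs, kind="cubic",
                 bounds_error=False, fill_value="extrapolate")
y_cubic = cubic(t_new)

# np.interp never extrapolates: by default it repeats the first and last
# observed values outside the observed range.
y_linear = np.interp(t_new, t_obs, y_obs)

print(y_cubic)
print(y_linear)
```

On their own, neither call returns NaN outside the observed range, so the NaN values at the edges would have to come from how the eo-learn tasks handle dates outside the valid observations.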

In this thread an answer from Matic (@matic.lubej) gives more insight on the extrapolation issue.
I also took a glance at the source code here on lines 324-325.
All of this leads me to believe that extrapolation over many consecutive dates is not possible with eo-learn.

So is extrapolation still an issue, or am I missing something?
Below I provide an example of my code:

# compute valid data mask
s3_eop_cp.mask['IS_VALID_S3'] = ~np.isnan(s3_eop_cp.data['S3'])

# Compute full timeseries dates as datestrings
resampled_range = (days_to_datetimes(2263)[0].strftime('%Y-%m-%d'), days_to_datetimes(2355)[0].strftime('%Y-%m-%d'), 1)

# Compute interpolation function
cubic_interp = CubicInterpolation(
    feature=(FeatureType.DATA, 'S3', 'S3_interp'),
    mask_feature=(FeatureType.MASK, 'IS_VALID_S3'),
    copy_features=[(FeatureType.DATA, 'S3'), (FeatureType.MASK, 'IS_VALID_S3')],
    resample_range=resampled_range,
    interpolate_pixel_wise=False,
    # extra keyword arguments intended for scipy.interpolate.interp1d
    fill_value='extrapolate',
    bounds_error=False,
)

# Apply interpolation
s3_eop_cp = cubic_interp.execute(s3_eop_cp)

Hi @kvlachos.geo

We made a “design decision” not to allow extrapolation to the dates before the first (non-nan) and after the last (non-nan) observations.

The decision comes from the fact that we can always request more satellite data before/after, and if we cannot, then we do not know how to do a proper extrapolation. In the simplest case we would just be copying the first non-nan values to extrapolate to dates before the first observation, or copying the last non-nan values to extrapolate to dates after the last one (both of which can be achieved rather simply with numpy…)
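A minimal sketch of that kind of edge filling with NumPy, assuming a t x m x n x b array in which missing values are NaN. The function name fill_edges is hypothetical and not part of eo-learn. It copies the first valid value backwards and the last valid value forwards per pixel and band, and leaves gaps between valid observations untouched.

```python
import numpy as np

def fill_edges(data):
    """Fill leading/trailing NaNs along axis 0 with the nearest valid value.

    Hypothetical helper, not an eo-learn function. NaNs between two valid
    observations are left as they are.
    """
    data = np.asarray(data, dtype=float)
    valid = ~np.isnan(data)
    n_t = data.shape[0]
    any_valid = valid.any(axis=0)  # pixels with at least one observation

    # Index of the first and last valid time step for every pixel/band.
    first = valid.argmax(axis=0)
    last = n_t - 1 - valid[::-1].argmax(axis=0)

    # Values at those indices, kept with a leading time axis of length 1.
    first_val = np.take_along_axis(data, first[np.newaxis], axis=0)
    last_val = np.take_along_axis(data, last[np.newaxis], axis=0)

    # Time index broadcastable against the full array.
    t_idx = np.arange(n_t).reshape((n_t,) + (1,) * (data.ndim - 1))

    out = np.where((t_idx < first) & any_valid, first_val, data)
    out = np.where((t_idx > last) & any_valid, last_val, out)
    return out

# Example: one pixel, one band, six dates.
series = np.array([np.nan, np.nan, 1.0, np.nan, 3.0, np.nan]).reshape(6, 1, 1, 1)
filled = fill_edges(series)
print(filled.ravel())
```

Pixels with no valid observation at all stay NaN, since there is nothing to copy.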

On the other hand, interpolation to several consecutive dates that lie between two (valid) observations should work with any of the interpolation functions.

Best,
Matej

Hi Matej,

Thank you for your fast response!
I see; what you mention is surely a valid reason not to implement any extrapolation approach, I agree.

I will try to reframe my problem then.
Thanks again! :slightly_smiling_face:

Best Regards,
Kostas