Equal Point Sampling

Hello forum@sentinel hub,
I have been trying to build a custom version of the example in the land-cover-classification-with-eo-learn tutorials.
Unfortunately, in my classification PointSamplingTask only finds 3 feature classes out of 10 in total. As I understand it, the example samples 40.000 pixels out of 1.000.000. As an experiment I set PointSamplingTask to 800.000 pixels; it did find more feature classes, but it caused the GBM machine learning algorithm to overfit.

Therefore my question: is there a way to set a minimum number of pixels (for example 20.000) for every unique feature class in PointSamplingTask?

Thanks a lot in advance,
Thijs

Hi Thijs,

Can you be more specific about what changes you made to the example? Could it be that you're sampling from a single EOPatch in which 3 classes cover most of the area (so that some classes represent only a very small fraction of the patch)?

Otherwise, PointSamplingTask does have an option to sample the same number of pixels for each class. The parameter is called even_sampling and is described here. But beware: if, for example, only 1000 pixels belong to class A and you request a minimum of 20000 pixels per class, then every pixel from class A will be sampled 20 times on average. This will limit the model's ability to generalize…
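To make the arithmetic above concrete, here is a minimal sketch of even per-class sampling with replacement. It uses only the Python standard library; the function `sample_evenly` and the toy labels are hypothetical and are not taken from eo-learn, whose PointRasterSampler may work differently internally.

```python
import random


def sample_evenly(labels, n_per_class, seed=0):
    """Draw n_per_class pixel indices for every label, with replacement."""
    rng = random.Random(seed)
    indices_by_class = {}
    for idx, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(idx)
    # Sampling with replacement: a small class simply repeats its pixels.
    return {
        label: rng.choices(indices, k=n_per_class)
        for label, indices in indices_by_class.items()
    }


# Toy patch: class "A" has 1000 pixels, class "B" has 50000.
labels = ["A"] * 1000 + ["B"] * 50000
samples = sample_evenly(labels, n_per_class=20000)

# Every class ends up with the same number of samples ...
print({label: len(drawn) for label, drawn in samples.items()})
# ... but class "A" has to reuse each of its pixels many times on average.
average_draws = len(samples["A"]) / labels.count("A")
print(average_draws)  # 20.0
```

The repeated pixels carry no new information, which is why heavy oversampling of a tiny class tends to hurt generalization.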

Hey Anze,
Yes, you are correct: I am sampling from a single EOPatch, which is mostly covered by 3 different classes, as you said.

I'm not sure how to use the even_sampling parameter in this case; could you give an example? Should I use PointRasterSampler instead of PointSamplingTask, or is there a way to pass even_sampling to PointSamplingTask?

Thanks for your help.

I believe you can simply set even_sampling=True in the constructor of PointSamplingTask. It will be passed to PointRasterSampler.

I have the feeling that I am missing something. So first I define the PointSamplingTask as:

spatial_sampling = PointSamplingTask(
    n_samples=n_samples,
    ref_mask_feature='LULC_ERODED',
    ref_labels=ref_labels,
    sample_features=[  # tag fields to sample
        (FeatureType.DATA, 'FEATURES'),
        (FeatureType.MASK_TIMELESS, 'LULC_ERODED')
    ],
    even_sampling=True
)

How can I then call the PointRasterSampler? With:
PointRasterSampler(spatial_sampling, even_sampling=True)?

Perhaps I'm misunderstanding your question. If that's the case, please let me know.

One of the arguments of PointSamplingTask is **sampling_params, which according to the documentation can be: Any other parameter used by PointRasterSampler class. In its execute method, PointSamplingTask creates an instance of PointRasterSampler and passes **sampling_params to it, so you never need to call PointRasterSampler yourself.
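The forwarding pattern described above can be sketched with two stand-in classes. `ToySampler` and `ToySamplingTask` are hypothetical names used only for illustration; they are not the eo-learn classes and their signatures are assumptions.

```python
class ToySampler:
    """Hypothetical stand-in for the inner sampler class."""

    def __init__(self, n_samples, even_sampling=False):
        self.n_samples = n_samples
        self.even_sampling = even_sampling


class ToySamplingTask:
    """Hypothetical stand-in for the task that wraps the sampler."""

    def __init__(self, n_samples, **sampling_params):
        self.n_samples = n_samples
        # Extra keyword arguments are stored untouched ...
        self.sampling_params = sampling_params

    def execute(self):
        # ... and forwarded to the sampler when the task runs.
        return ToySampler(self.n_samples, **self.sampling_params)


task = ToySamplingTask(100, even_sampling=True)
sampler = task.execute()
print(sampler.even_sampling)  # True
```

In other words, the keyword is given once to the task constructor and reaches the sampler without any further call from the user.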

Dear Anze,
Thanks for your help. It did indeed sample my pixels evenly across the available classes, but it brought me to a different issue, answered in this post: Performing ErosionTask on MASK_TIMELESS.LULC with very few valid pixels

Since I have very few pixels for certain classes, they got eroded away in the erosion step, even though the erosion step is set to 1 pixel by default.

This leads to my next issue: I tried to erode only the major classes in my patch, but that resulted in a map containing only those 3 major classes. In the reply to William you mentioned giving different erosion strengths to different classes. How do I do this? It is not clear to me from the documentation.

Thanks again for the ongoing help,
Thijs

Hi Thijs,

Let's assume that you wish to erode a timeless mask feature named "TRUTH", which has 4 classes labeled 1, 2, 3, 4, with 0 as the no-data value. To erode classes 1 and 2 with a disk of radius 3, erode class 3 with a disk of radius 1, and keep class 4 as is, you could for example write:

erode_12 = ErosionTask((FeatureType.MASK_TIMELESS, 'TRUTH'), disk_radius=3, erode_labels=[1, 2])
erode_3 = ErosionTask((FeatureType.MASK_TIMELESS, 'TRUTH'), disk_radius=1, erode_labels=[3])

In summary, for each morphological operation (small/large erosion) you need to create a separate task, each of which acts only on the subset of classes/labels specified via the erode_labels argument.

Best regards,

Anze

Hi Anze,
This procedure is clear to me. However, the second task overwrites the mask produced by erode_12 in step one. Let's take the example classes from the tutorial:
erode_1 = ErosionTask((FeatureType.MASK_TIMELESS, 'LULC', 'LULC_ERODED'), disk_radius=3, erode_labels=[0, 1, 2, 9])
erode_2 = ErosionTask((FeatureType.MASK_TIMELESS, 'LULC', 'LULC_ERODED'), disk_radius=1, erode_labels=[5, 8])

With erode_1 I erode the classes that cover most of my patch, and with erode_2 the classes with smaller coverage. Unfortunately, I end up with a LULC_ERODED mask that contains only the results from erode_2.

I have also tried working only in the 'LULC' timeless mask feature, as in your example where you only have the 'TRUTH' timeless mask feature, but this just overwrote my LULC feature with the same result.

Why did I lose my erode_labels=[0, 1, 2, 9] in the result?

Regards,
Thijs

Hi Thijs,

OK, now I understand the problem. It looks like the ErosionTask keeps only labels that are being eroded. The rest are set to no_data_value. IMO, this is a bug and not a feature.
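The effect reported above can be reproduced with a toy mask. This is only a sketch of the described behaviour, not the ErosionTask code: the helper `keep_only` is hypothetical, and the erosion itself is left out so that only the "everything else becomes no-data" step is visible.

```python
NO_DATA = 0


def keep_only(mask, labels, no_data=NO_DATA):
    """Mimic the reported behaviour: labels outside `labels` become no-data."""
    return [[value if value in labels else no_data for value in row] for row in mask]


patch = {"LULC": [[1, 1, 5],
                  [2, 9, 8]]}

# Both steps read 'LULC' and write 'LULC_ERODED',
# so the second result replaces the first one.
patch["LULC_ERODED"] = keep_only(patch["LULC"], {1, 2, 9})
patch["LULC_ERODED"] = keep_only(patch["LULC"], {5, 8})

print(patch["LULC_ERODED"])  # [[0, 0, 5], [0, 0, 8]]
```

Chaining the steps on a single feature does not help either under this behaviour: after the first step only labels 1, 2 and 9 are left, so the second step finds no pixels with labels 5 or 8.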

For the time being you can use this custom task (until the ErosionTask is fixed):

import numpy as np
from eolearn.core import EOTask
from eolearn.ml_tools import MorphologicalOperations


class ClassFilterTask(EOTask):
    """
    Run class specific morphological operation.
    """
    def __init__(self, lulc_feature, lulc_values, morph_operation, struct_elem=None):
        self.lulc_feature_type, self.lulc_feature_name = next(iter(self._parse_features(lulc_feature)))
        self.lulc_values = lulc_values

        # Accept either a MorphologicalOperations member or a plain callable.
        if isinstance(morph_operation, MorphologicalOperations):
            self.morph_operation = MorphologicalOperations.get_operation(morph_operation)
        else:
            self.morph_operation = morph_operation
        self.struct_elem = struct_elem

    def execute(self, eopatch):
        lulc = eopatch[self.lulc_feature_type][self.lulc_feature_name].copy()

        for lulc_value in self.lulc_values:
            # Apply the operation to the binary mask of this class only.
            lulc_mod = self.morph_operation((lulc == lulc_value).squeeze(), self.struct_elem) * lulc_value
            lulc_mod = lulc_mod[..., np.newaxis]
            # Update only pixels of this class; all other classes stay untouched.
            lulc[lulc == lulc_value] = lulc_mod[lulc == lulc_value]

        eopatch.add_feature(self.lulc_feature_type, self.lulc_feature_name, lulc)

        return eopatch

which can be then used as

from eolearn.core import FeatureType
from eolearn.ml_tools import MorphologicalOperations, MorphologicalStructFactory

erode_1 = ClassFilterTask((FeatureType.MASK_TIMELESS, 'LULC'), [1, 2, 9],
                          MorphologicalOperations.EROSION, struct_elem=MorphologicalStructFactory.get_disk(3))
erode_2 = ClassFilterTask((FeatureType.MASK_TIMELESS, 'LULC'), [5, 8],
                          MorphologicalOperations.EROSION, struct_elem=MorphologicalStructFactory.get_disk(1))

Please note that you shouldn't include the no_data_value (0) among the labels to be eroded.
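For readers who want to see the idea without eo-learn installed, here is a pure-Python sketch of class-specific erosion on a small grid. It is an illustration under assumptions, not the task above: `erode_class` is a hypothetical helper, it uses a square window instead of a disk, and it treats everything outside the grid as background.

```python
def erode_class(mask, label, radius=1, no_data=0):
    """Erode one label with a (2*radius+1) square window; other labels stay as they are."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != label:
                continue
            # A pixel survives only if its whole neighbourhood carries the same label.
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if not inside or mask[rr][cc] != label:
                        out[r][c] = no_data
    return out


# Class 1 covers most of the toy patch, class 5 is a thin strip on the right.
lulc = [[1, 1, 1, 1, 5] for _ in range(5)]

eroded = erode_class(lulc, label=1, radius=1)
for row in eroded:
    print(row)
```

Class 1 shrinks to its interior, while the one-pixel-wide strip of class 5 is kept because it was never selected for erosion. Applying the helper again with another label and radius gives the per-class erosion strengths discussed in this thread.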

I hope this will work for you.

Dear Anze,
This describes exactly the fix I needed for my issue. Thanks for the outstanding support!
Regards,
Thijs
