Question about NO DATA labels in eopatches

TLDR: What causes pixels in the examples to be labeled NO_DATA, and how are these pixels used? If they are not used, how are they excluded from the data without some sort of resizing?

Hello, I have another question that has been troubling me. Specifically, in the SI_LULUC_pipeline example and also in the eoflow example, one of the labels is NO_DATA, and I am wondering where it comes from. What causes these pixels to be labeled as no data? Are they used for prediction, or are they simply ignored in some way? How do the NO_DATA pixels differ from NaNs in these examples? Sorry, I have just been confused about this.

Hi @ncouch

The NO_DATA label is assigned to pixels for which we don’t have any reference data. This could happen if your reference data is sparse (doesn’t cover whole AOI), or if you are eroding your pixels to have more clean data (e.g. removing pixels from border of the forest to reduce mixing of forest class with something else).
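To illustrate the erosion case: eroding each class mask drops the border pixels of every class back to the NO_DATA label. A minimal sketch with `scipy` (the label values and map below are made up for illustration; if I remember correctly, eo-learn also ships a dedicated erosion task in `eolearn.geometry` that does this on whole eopatches):

```python
import numpy as np
from scipy.ndimage import binary_erosion

# Toy reference map (values made up): 0 = NO_DATA, 1 and 2 are land-cover classes
labels = np.array([
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2],
    [1, 1, 2, 2, 2],
], dtype=np.uint8)

eroded = np.zeros_like(labels)  # start from "everything is NO_DATA"
for cls in np.unique(labels):
    if cls == 0:
        continue  # NO_DATA itself is not a class we keep
    # Erode the binary mask of this class; pixels on a class border
    # (where classes mix) fail the erosion and stay NO_DATA
    eroded[binary_erosion(labels == cls)] = cls

print((eroded == 0).sum() > (labels == 0).sum())  # True: more NO_DATA after erosion
```

After erosion only the "core" pixels of each class keep their label, which reduces the chance of training on mixed pixels at class boundaries.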

Later, before training, pixels with such labels are removed, as we do not want the model to learn to classify pixels into the NO_DATA class. See for instance line 23 of the eo-learn LULC example (model construction and training):

# Remove points with no reference from training (so we don't train to recognize "no data")
mask_train = labels_train == 0
features_train = features_train[~mask_train]
labels_train = labels_train[~mask_train]

# Remove points with no reference from test (so we don't validate on "no data", which doesn't make sense)
mask_test = labels_test == 0
features_test = features_test[~mask_test]
labels_test = labels_test[~mask_test]

As you can see in the confusion matrix plots, the NO_DATA class does not appear at all.

Hope this answers your questions. All the best!

Hello! Thank you for answering my question, this does help me. I do have two further questions based on this answer.

  1. So we are removing the pixels that are labeled as no data; how is this done without changing the patch size? This may seem trivial, I am just confused about it.
  2. Would this approach of removing the NO_DATA pixels work for all the eo examples? I am working on two: the LULC example with LightGBM and the eoflow example using a TFCN.

Hi @ncouch,

I’ll try to answer, although I might be misinterpreting your question…

Think of the NO_DATA pixels as pixels without any label/class attached to them. The pixels are still there (they have satellite data, masks, etc.), but since no class/label is assigned to them, in pixel-based approaches they are not used for training (as shown in the code above) and are thus never predicted.
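To make the "patch size" point concrete: in the pixel-based workflow the patch is first flattened into a plain list of samples, and rows are dropped from that list, so no spatial resizing of the eopatch is ever involved. A rough sketch with made-up shapes and random data:

```python
import numpy as np

# Made-up patch: 100x100 pixels, 10 features per pixel; labels 0..3 with 0 = NO_DATA
rng = np.random.default_rng(42)
features = rng.random((100, 100, 10))
labels = rng.integers(0, 4, size=(100, 100))

# Flatten the spatial dimensions: each pixel becomes one independent sample row
features_flat = features.reshape(-1, features.shape[-1])  # shape (10000, 10)
labels_flat = labels.reshape(-1)                          # shape (10000,)

# Dropping NO_DATA rows only shortens this sample list;
# the eopatch and its arrays keep their original size
mask = labels_flat == 0
features_train = features_flat[~mask]
labels_train = labels_flat[~mask]

print(features_train.shape[0] + mask.sum() == 10000)  # True: nothing resized, just filtered
```

The model never sees the 2D layout at all, so "removing pixels" is just boolean indexing on a 1D sample list.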

In the case of convolutional networks using spatial convolutions (i.e. predicting on patchlets/images instead of single pixels), one would typically either train only on patchlets that contain no NO_DATA pixels, or train with NO_DATA as a “background” class and then ignore that class during prediction (e.g. take the per-pixel prediction as the most likely of the remaining classes).
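The convolutional variant can also be handled on the loss side: give NO_DATA pixels zero weight so they contribute no gradient. A plain-NumPy sketch of such a masked cross-entropy (the function name and shapes are my own for illustration, not taken from eoflow):

```python
import numpy as np

def masked_cross_entropy(probs, labels, no_data=0):
    """Mean cross-entropy over labelled pixels only.

    probs:  (H, W, C) per-pixel softmax outputs of the network
    labels: (H, W) integer class map, where `no_data` marks unlabelled pixels
    NO_DATA pixels are excluded, so they contribute no gradient signal.
    """
    valid = labels != no_data
    # Probability the network assigned to the true class of each pixel
    p_true = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return -np.log(p_true[valid] + 1e-12).mean()

# Uniform predictions over 3 classes -> loss is ln(3) on the labelled pixels
probs = np.full((2, 2, 3), 1 / 3)
labels = np.array([[0, 1], [2, 1]])  # one NO_DATA pixel, three labelled ones
print(np.isclose(masked_cross_entropy(probs, labels), np.log(3)))  # True
```

In a real TensorFlow/PyTorch training loop the same effect is usually achieved by passing a per-pixel sample-weight map (0 for NO_DATA, 1 elsewhere) to the loss function.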