Calculate mean and median daily NDVI using Processing API

Hello everyone,

I’m trying to calculate the mean and the median of daily NDVI using the Processing API (I’m aware that there’s the Statistical one, but would like to proceed with just one request, since I’m also fetching weekly max NDVI with this script - not included). The problem I’m facing is that the calcDailyNDVI function only returns arrays filled with 0.

//VERSION=3
function setup() {
    return {
        input: [{bands: ["B03", "B04", "B08", "SCL", "CLD", "dataMask"], units: "DN"}],
        output: [
            {id: "meanNDVI", bands: 1, sampleType: "UINT16"},
            {id: "NDVI", bands: 1, sampleType: "UINT16"}],
        mosaicking: "ORBIT"
    };
}

const MIN_INT16 = -32768
const OFFSET = 32767
const FACTOR = 10000
let sampleDates = []

function calcNDVI(sample) {
    // Invalid data.
    if (!isValid(sample)) {
        return MIN_INT16
    }

    // Cloud detection.
    if (isCloud(sample)) {
        return -2
    }

    // Water detection.
    if (isWater(sample)) {
        return 3
    }
    if (isWaterCandidate(sample)) {
        return 2
    }

    return index(sample.B08, sample.B04)
}


function preProcessScenes (collections) {
    // Sort scenes by dateFrom in ascending order
    collections.scenes.orbits.sort(function (s1, s2) {
        const date1 = new Date(s1.dateFrom)
        const date2 = new Date(s2.dateFrom)
        return date1 - date2
    })
    return collections
}

// Function to calculate daily NDVI
function calcDailyNDVI(samples, scenes) {
    let dailyNDVI = {};

    samples.forEach((sample, index) => {
        const date = scenes[index].date.toISOString().split('T')[0];
        const ndvi = calcNDVI(sample);
        
        if (!dailyNDVI[date]) {
            dailyNDVI[date] = [];
        }
        dailyNDVI[date].push(ndvi);
    });

    return dailyNDVI;
}


function updateOutput(outputs, collections) {
    let uniqueDailyDates = new Set()

    collections.scenes.forEach(scene => {
        uniqueDailyDates.add(scene.date.toISOString().split('T')[0])
    })


    sampleDates = Array.from(uniqueDailyDates).sort()
    outputs.medianNDVI.bands = Math.max(1, sampleDates.length)
    outputs.meanNDVI.bands = Math.max(1, sampleDates.length)
}


function updateOutputMetadata(scenes, inputMetadata, outputMetadata){
    outputMetadata.userData = {
        "dates":  [...sampleDates]
    }
}


function evaluatePixel(samples, scenes) {
    let dailyNDVI = calcDailyNDVI(samples, scenes)

    return {
        medianNDVI: sampleDates.map(x => {
            const ndviValues = dailyNDVI[x];
            return ndviValues ? calculateMedian(ndviValues) * FACTOR + OFFSET : 0;
        }),
        meanNDVI: sampleDates.map(x => {
            const ndviValues = dailyNDVI[x];
            return ndviValues ? calculateMean(ndviValues) * FACTOR + OFFSET : 0;
        })
    }
}


function calculateMedian(values) {
    if (values.length === 0) {
        return undefined; 
    }
    
    values.sort((a, b) => a - b);

    const n = values.length;
    if (n % 2 !== 0) {
        return values[Math.floor(n / 2)];
    } else {
        const mid = n / 2;
        return (values[mid - 1] + values[mid]) / 2;
    }
}


function calculateMean(values) {
    if (values.length === 0) {
        return undefined; 
    }
    const sum = values.reduce((acc, curr) => acc + curr, 0);
    return sum / values.length;
}

Thanks a lot in advance for any help!

Hello,

unfortunately Process API does not work how you are planning it to work. evaluatePixel only ever works on one pixel at a time.

So if you have a look at the dailyNDVI variable it looks like this:

{\"2024-01-10\":[0.09416628387689481],\"2024-01-15\":[-0.005043013942450311],\"2024-01-20\":[0.07482993197278912],\"2024-01-25\":[0.1424629182476716],\"2024-01-30\":[0.7732629012596506],\"2024-02-04\":[0.7909037212049616]}

so always only one value per day. And there’s no real way around that. In your case I would suggest to use the statistical API since It does exactly what you need here.

Hello,

Thank you very much for you reply.
Have you gotten these variables with the Processing API or the Statistical one? I’m just wondering because my code only returns an array filled with 0s. If it’d would return one value per day (mean and median), it’d also be ok, i.e. if I get the result you posted with my code it’d be sufficient.

What I’m wondering is, why would dailyNDVI only return one value instead of a tiff file? I use the same code for weekly max NDVI and it works correctly.

So in the end what is it you want to achieve? From the code I thought you want to get the median and mean over your entire area of interest for every day in the time series.

dailyNDVI is reinitialized every time that calcDailyNDVI() is called, which is for every single pixel. The output I posted thus corresponds to all the NDVI values for the first pixel in your area of interest.

To reiterate: evaluatePixel(sample) is called as many times as there are pixels in your request. It never returns the full image, it will only ever return one pixel at a time. This is why your approach doesn’t work in this case. The statistical API is what should be used in cases like this.

I see. What I’d like to achieve is: get daily NDVI tiles by calling calcDailyNDVI in evaluatePixel, then calculate for each of these tiles their mean and median and have one value per day, instead of an array. But maybe I didn’t understand then how evaluatePixel actually works.

Maybe an easier approach would be to get the daily NDVI tiles available in a time range and then calculate the mean and median outside the eval script, i.e. :

function evaluatePixel(samples, scenes) {
    let dailyNDVI = calcDailyNDVI(samples, scenes)

    return {
        dailyNDVI: sampleDates.map(x => dailyNDVI[x]).map(y => y !== MIN_INT16 ? y * FACTOR + OFFSET : 0)
    };

But still here I get an array of 0s.

array([[[0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        ...,
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]],

       [[0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
.....

So that’s why I’m wondering if you used my code to get these first pixel values or not.

This really isn’t how Process API works. Doing it like that might work but it really isn’t intended.

The best approach is to do two requests. One for the long term mean and another one for the spatial mean and median using the Statistical API.

Using the statistical API to get your desired values is much more straight forward than what you are trying to do here. It really is much more convenient, even if it is one more request.

You can see some examples here: Examples of Statistical API

Edit: To get the median with a statistical API request you need to request the 50th percentile value. See here: Statistical API

Thanks for the reply. I understand your point, but given that I need to process a large load of requests, this would mean 2x the number of requests I’d have to make, i.e. one Processing + one Statistical.

So, instead I’d just like to get the daily NDVI tiles if possible (without calculating the mean and median in the evaluation script), for this I believe the Processing API would be the most convenient one. And still I cannot get this right.

Yes, that works. In that case what is tripping you up is likely that samplesDates in evaluatePixel is not defined.

But also keep in mind that doing it in this way you need to transmit a lot more data per request. The number of requests might be lower but the amount of transmitted data is much higher than with one process API call and one statistical API call.

I see, I thought I had defined it in updateOutput accordingly, it even gets printed correctly in the userdata metadata. So I guess the dates I’m passing to the dict dailyNDVI are not the same… even though they were defined in the same way i.e. using either “.date.toISOString().split(‘T’)[0]” or JSON.stringify (not in the code).

As I’m fairly a beginner to using evaluation scripts, one thing I still haven’t understood after reading the docs, is if the samples are already sorted by date. Is this the case?

Yes, samples are already sorted by date. The default is for the samples to be sorted by most recent.