Sentinel 1 productInfo.json and metadata.xml in product root

Hey Sentinel folks,

I’m starting to catalog the Sentinel 1 products and noticed that productInfo.json is an empty file and metadata.xml is not present in the product root dir. We used those 2 files to gather metadata and populate our catalog. We also used the product GML files for geometry info.

For example, this product path has an empty productInfo.json and no metadata.xml file. What would be the best files to parse to grab the metadata needed for cataloging?

s3://sentinel-s1-l1c/GRD/2018/11/15/IW/DV/S1B_IW_GRDH_1SDV_20181115T170549_20181115T170614_013618_019365_8C97

I do not think we have metadata.xml in Sentinel-1 files (in S-2 it is provided by ESA).
This specific productInfo.json seems to be erroneous. We will investigate, but if it is an isolated or rare case, I cannot promise we will treat it as a high priority.
You can always get missing data from the OpenHub.

I’ve noticed that all the productInfo.json files are empty in every product directory I evaluated.

I don’t know much about OpenHub. Does it contain Sentinel 1 metadata like what I was able to get from the Sentinel 2 metadata.xml and productInfo.json?

I checked a few random products and they all had productInfo.json.
OpenHub is the source of these data (http://scihub.copernicus.eu/) and no, it does not contain metadata.xml. If it did, the file would be on AWS as well.

@gmilcinski Thanks for checking. Did the productInfo.json files contain data? When I checked, the file was empty (0 bytes).

Well, this one for example has data
https://roda.sentinel-hub.com/sentinel-s1-l1c/GRD/2018/11/15/EW/DH/S1A_EW_GRDM_1SDH_20181115T014318_20181115T014423_024592_02B346_9BA2/productInfo.json

@gmilcinski Okay yes I can see the metadata now in that productInfo.json. I attached an SQS queue to your SNS topic and the first 50 or so products all had missing metadata so I assumed they all did. Just a heads up
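
For anyone consuming the SNS notifications the same way, here is a minimal Python sketch of a guard that treats an empty productInfo.json as "not ready yet" instead of failing. The function name parse_product_info is hypothetical, fetching the bytes from S3 is left out, and the JSON in the usage lines is a placeholder, not the real productInfo.json schema.

```python
import json


def parse_product_info(raw_bytes):
    """Return the parsed productInfo.json content, or None if it is not usable yet."""
    # A zero-byte (or whitespace-only) file means the metadata has not been written.
    if not raw_bytes or not raw_bytes.strip():
        return None
    try:
        return json.loads(raw_bytes)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return None


# Usage with placeholder content (not the real schema):
empty_result = parse_product_info(b"")
filled_result = parse_product_info(b'{"id": "example"}')
```

A consumer could re-queue the message when the function returns None and try again later.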

@gmilcinski I just listed the Sentinel 1 bucket and it seems the productInfo.json file is populated some time after it is first written to S3, roughly 24 hours later. Once populated, the file is typically about 3 KB. For the 14th of November, for example, the sizes are very consistent.

command I’m using to list:

aws s3 ls --summarize --recursive --request-payer --human-readable s3://sentinel-s1-l1c/GRD/2018/11/15/ | grep productInfo.json
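
The output of that command can also be filtered for zero-byte files with a few lines of Python. This is only a sketch: it assumes the --human-readable layout of "date time size unit key", the function name find_empty_product_info is hypothetical, and the timestamps and sizes in the sample lines are made up for illustration (only the two keys come from this thread).

```python
def find_empty_product_info(listing_lines):
    """Return the keys of zero-byte productInfo.json objects in `aws s3 ls` output."""
    empty_keys = []
    for line in listing_lines:
        # Expected layout: date, time, size, unit, key (keys here contain no spaces).
        parts = line.split(None, 4)
        if len(parts) != 5:
            continue  # skips blank lines and the --summarize totals
        _date, _time, size, _unit, key = parts
        key = key.strip()
        if key.endswith("productInfo.json") and float(size) == 0:
            empty_keys.append(key)
    return empty_keys


# Illustrative listing lines; timestamps and sizes are invented.
sample = [
    "2018-11-15 18:00:00    0 Bytes GRD/2018/11/15/IW/DV/S1B_IW_GRDH_1SDV_20181115T170549_20181115T170614_013618_019365_8C97/productInfo.json",
    "2018-11-15 03:00:00    3.0 KiB GRD/2018/11/15/EW/DH/S1A_EW_GRDM_1SDH_20181115T014318_20181115T014423_024592_02B346_9BA2/productInfo.json",
]
empty = find_empty_product_info(sample)
```

In practice the lines would come from the saved output of the listing command instead of a hard-coded list.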

Note that we have identified the problem and hopefully solved it. These issues should not appear in the future.
As there might have been some other corrupted files, we have temporarily removed S1 GRD products from 15th to 24th November and are downloading them once again. It might take a few weeks to fully recover the backlog.

The problem has been fixed and the missing products have been re-ingested.
We have also ingested all the data back to 1/1/17. Unfortunately there are no older data available in the Hubs, so this will stay the case for the foreseeable future.