franzpoeschel (Contributor) commented May 17, 2022

Not relevant for next release

  • Until now: Simulations specify their metadata in-code via API calls
  • With this PR: In some workflows (e.g. experiments) there is no omniscient simulation. Instead, the experimenters enter metadata via configuration files, and API calls are not a good workflow for that.

Idea: We already have a JSON backend, so we can use an openPMD-conforming JSON dataset that defines only metadata. The configuration file is then just another openPMD dataset.
Then, add some functionality to initialize an empty Series from such a metadata file.

TODO:

https://github.com/franzpoeschel/openPMD-api/compare/topic-json-short-modes..topic-json-template

@franzpoeschel franzpoeschel added backend: JSON api: new additions to the API labels May 17, 2022
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch from f10fc90 to c63c06a Compare May 18, 2022 12:26
franzpoeschel (Contributor, Author) commented May 18, 2022

An openPMD dataset in TOML:

[platform_byte_widths]
USHORT = 2
ULONG = 8
BOOL = 1
CLONG_DOUBLE = 32
LONGLONG = 8
CFLOAT = 8
CHAR = 1
DOUBLE = 8
CDOUBLE = 16
SHORT = 2
UCHAR = 1
FLOAT = 4
INT = 4
ULONGLONG = 8
UINT = 4
LONG = 8
LONG_DOUBLE = 16

[data]

[data.0]

[data.0.meshes]

[data.0.meshes.E]

[data.0.meshes.E.x]
datatype = "FLOAT"
data = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

[data.0.meshes.E.x.attributes]

[data.0.meshes.E.x.attributes.unitSI]
value = 1.0
datatype = "DOUBLE"

[data.0.meshes.E.x.attributes.position]
value = [0.0]
datatype = "VEC_DOUBLE"

[data.0.meshes.E.attributes]

[data.0.meshes.E.attributes.timeOffset]
value = 0.0
datatype = "FLOAT"

[data.0.meshes.E.attributes.gridUnitSI]
value = 1.0
datatype = "DOUBLE"

[data.0.meshes.E.attributes.gridSpacing]
value = [1.0]
datatype = "VEC_DOUBLE"

[data.0.meshes.E.attributes.gridGlobalOffset]
value = [0.0]
datatype = "VEC_DOUBLE"

[data.0.meshes.E.attributes.unitDimension]
value = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
datatype = "ARR_DBL_7"

[data.0.meshes.E.attributes.geometry]
value = "cartesian"
datatype = "STRING"

[data.0.meshes.E.attributes.dataOrder]
value = "C"
datatype = "STRING"

[data.0.meshes.E.attributes.axisLabels]
value = ["x"]
datatype = "VEC_STRING"

[data.0.attributes]

[data.0.attributes.timeUnitSI]
value = 1.0
datatype = "DOUBLE"

[data.0.attributes.time]
value = 0.0
datatype = "DOUBLE"

[data.0.attributes.dt]
value = 1.0
datatype = "DOUBLE"

[attributes]

[attributes.softwareVersion]
value = "0.15.0-dev"
datatype = "STRING"

[attributes.software]
value = "openPMD-api"
datatype = "STRING"

[attributes.openPMDextension]
value = 0
datatype = "UINT"

[attributes.meshesPath]
value = "meshes/"
datatype = "STRING"

[attributes.iterationFormat]
value = "many_iterations_%T"
datatype = "STRING"

[attributes.iterationEncoding]
value = "fileBased"
datatype = "STRING"

[attributes.openPMD]
value = "1.1.0"
datatype = "STRING"

[attributes.date]
value = "2022-05-18 12:20:23 +0000"
datatype = "STRING"

[attributes.basePath]
value = "/data/%T/"
datatype = "STRING"
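
The listing above stores every attribute as a value/datatype pair. One commit in this PR is titled "Write/read shorthand attributes without explicit datatype"; as a rough illustration of how the two forms relate, here is a minimal Python sketch. The helper `to_shorthand` is hypothetical and not part of openPMD-api, and it works on plain dictionaries as produced by a JSON or TOML parser.

```python
import json


def to_shorthand(node):
    """Recursively replace {"value": v, "datatype": t} entries by the bare value v.

    Hypothetical helper for illustration only, not an openPMD-api function.
    """
    if isinstance(node, dict):
        # An attribute in the verbose form has exactly these two keys.
        if set(node) == {"value", "datatype"}:
            return node["value"]
        return {key: to_shorthand(child) for key, child in node.items()}
    return node


# A few attributes copied from the verbose listing above.
verbose = {
    "attributes": {
        "openPMD": {"value": "1.1.0", "datatype": "STRING"},
        "openPMDextension": {"value": 0, "datatype": "UINT"},
        "meshesPath": {"value": "meshes/", "datatype": "STRING"},
    }
}

short = to_shorthand(verbose)
print(json.dumps(short, indent=2))
```

Note that this conversion discards the datatype (for example, UINT for openPMDextension), so a reader of the shorthand form has to recover it some other way.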

@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 3 times, most recently from 0d475a5 to 1a23a03 Compare May 19, 2022 11:54
franzpoeschel (Contributor, Author) commented May 19, 2022

This is now a simplified TOML openPMD template, created by setting the JSON parameter {"json": {"mode": "template"}}:

[data]

[data.meshes]

[data.meshes.temperature]
extent = [5, 5]
datatype = "FLOAT"

[data.meshes.temperature.attributes]
timeOffset = 0.0
# Explicit datatype can still be used if needed
unitSI = {"value" = 1.0, "datatype" = "FLOAT"}
position = [0.0]
gridUnitSI = 1.0
gridSpacing = [1.0]
gridGlobalOffset = [0.0]
unitDimension = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
geometry = "cartesian"
dataOrder = "C"
axisLabels = ["x"]

[data.attributes]
timeUnitSI = 1.0
snapshot = 0
time = 0.0
dt = 1.0

[attributes]
softwareVersion = "0.15.0-dev"
software = "openPMD-api"
openPMDextension = 0
meshesPath = "meshes/"
iterationFormat = "/data"
iterationEncoding = "variableBased"
openPMD = "1.1.0"
date = "2022-05-19 11:55:07 +0000"
basePath = "/data"

Differences from regular JSON/TOML openPMD datasets:

  1. The platform byte width table is missing.
  2. Attributes don't explicitly store their datatypes. Instead, datatypes are restored dynamically (and somewhat heuristically) from the stored values.
  3. No actual datasets can be written. Instead, only the extent (alongside the datatype) is stored.
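
To make point 2 more concrete, here is a minimal Python sketch of what such a heuristic could look like. This is an assumption-laden illustration, not the logic used in this PR: the function `infer_datatype` is hypothetical, and the choice of LONG for integers and DOUBLE for floats is made up for the example. Only the datatype names themselves appear in the listings above.

```python
def infer_datatype(value):
    """Guess an openPMD datatype name for a shorthand attribute value.

    Hypothetical helper; the mapping below is an assumption for illustration.
    """
    # The explicit form {"value": ..., "datatype": ...} always wins.
    if isinstance(value, dict) and set(value) == {"value", "datatype"}:
        return value["datatype"]
    # bool must be tested before int, since bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOL"
    if isinstance(value, int):
        return "LONG"  # assumption: any signed integer type would do here
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, list) and value:
        inner = {infer_datatype(item) for item in value}
        if inner == {"STRING"}:
            return "VEC_STRING"
        if inner <= {"LONG", "DOUBLE"}:
            # Mixed integers and floats are widened to floats.
            return "VEC_DOUBLE" if "DOUBLE" in inner else "VEC_LONG"
    raise ValueError(f"cannot infer a datatype for {value!r}")


print(infer_datatype("cartesian"))
print(infer_datatype([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
print(infer_datatype({"value": 1.0, "datatype": "FLOAT"}))
```

A guess like this cannot reproduce everything: the regular listing stores unitDimension as ARR_DBL_7, whereas the sketch would report VEC_DOUBLE, so the reading side presumably still needs to convert to the type it expects.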

Template mode is also available in JSON:

{
  "attributes": {
    "basePath": "/data",
    "date": "2022-05-19 12:00:09 +0000",
    "iterationEncoding": "variableBased",
    "iterationFormat": "/data",
    "meshesPath": "meshes/",
    "openPMD": "1.1.0",
    "openPMDextension": 0,
    "software": "openPMD-api",
    "softwareVersion": "0.15.0-dev"
  },
  "data": {
    "attributes": {
      "dt": 1,
      "snapshot": 0,
      "time": 0,
      "timeUnitSI": 1
    },
    "meshes": {
      "temperature": {
        "attributes": {
          "axisLabels": [
            "x"
          ],
          "dataOrder": "C",
          "geometry": "cartesian",
          "gridGlobalOffset": [
            0
          ],
          "gridSpacing": [
            1
          ],
          "gridUnitSI": 1,
          "position": [
            0
          ],
          "timeOffset": 0,
          "unitDimension": [
            0,
            0,
            0,
            0,
            0,
            0,
            0
          ],
          "unitSI": 1
        },
        "datatype": "FLOAT",
        "extent": [
          5,
          5
        ]
      }
    }
  }
}
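
As a rough sketch of how a consumer might use such a template, the following Python snippet walks a JSON template, collects every declared dataset, and builds a zero-filled placeholder for it (one commit in this PR is titled "Template mode: Fill with zero upon read"). The functions `declared_datasets` and `zeros` are hypothetical helpers written for this example; they only use the standard library and are not openPMD-api calls.

```python
import json

# Shortened version of the JSON template shown above.
TEMPLATE = """
{
  "data": {
    "attributes": {"dt": 1, "time": 0},
    "meshes": {
      "temperature": {
        "attributes": {"unitSI": 1},
        "datatype": "FLOAT",
        "extent": [5, 5]
      }
    }
  }
}
"""


def declared_datasets(node, path=""):
    """Yield (path, datatype, extent) for every group that declares a dataset."""
    if not isinstance(node, dict):
        return
    # Require a list here: a group may also contain a record named "extent".
    if isinstance(node.get("extent"), list) and "datatype" in node:
        yield path or "/", node["datatype"], node["extent"]
    for key, child in node.items():
        if key != "attributes":
            yield from declared_datasets(child, f"{path}/{key}")


def zeros(extent):
    """Build a zero-filled nested list with the given extent."""
    if not extent:
        return 0
    return [zeros(extent[1:]) for _ in range(extent[0])]


datasets = list(declared_datasets(json.loads(TEMPLATE)))
for path, datatype, extent in datasets:
    placeholder = zeros(extent)
    print(path, datatype, extent, len(placeholder))
```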

franzpoeschel (Contributor, Author) commented

Longer example:

[data]

[data.particles]

[data.particles.e]

[data.particles.e.positionOffset]

[data.particles.e.positionOffset.z]

[data.particles.e.positionOffset.z.attributes]
value = 3.14
unitSI = 1.0
shape = [5, 5]

[data.particles.e.positionOffset.y]

[data.particles.e.positionOffset.y.attributes]
value = 3.14
unitSI = 1.0
shape = [5, 5]

[data.particles.e.positionOffset.x]

[data.particles.e.positionOffset.x.attributes]
value = 3.14
unitSI = 1.0
shape = [5, 5]

[data.particles.e.positionOffset.attributes]
unitDimension = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
timeOffset = 0.0

[data.particles.e.position]

[data.particles.e.position.z]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.position.z.attributes]
unitSI = 1.0

[data.particles.e.position.y]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.position.y.attributes]
unitSI = 1.0

[data.particles.e.position.x]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.position.x.attributes]
unitSI = 1.0

[data.particles.e.position.attributes]
unitDimension = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
timeOffset = 0.0

[data.particles.e.particlePatches]

[data.particles.e.particlePatches.numParticlesOffset]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.numParticlesOffset.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.numParticles]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.numParticles.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.offset]

[data.particles.e.particlePatches.offset.z]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.offset.z.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.offset.y]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.offset.y.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.offset.x]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.offset.x.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.offset.attributes]
unitDimension = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

[data.particles.e.particlePatches.extent]

[data.particles.e.particlePatches.extent.z]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.extent.z.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.extent.y]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.extent.y.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.extent.x]
extent = [5, 5]
datatype = "FLOAT"

[data.particles.e.particlePatches.extent.x.attributes]
unitSI = 1.0

[data.particles.e.particlePatches.extent.attributes]
unitDimension = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

[data.meshes]

[data.meshes.temperature]
extent = [5, 5]
datatype = "FLOAT"

[data.meshes.temperature.attributes]
timeOffset = 0.0
unitSI = 1.0
position = [0.0]
gridUnitSI = 1.0
gridSpacing = [1.0]
gridGlobalOffset = [0.0]
unitDimension = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
geometry = "cartesian"
dataOrder = "C"
axisLabels = ["x"]

[data.meshes.E]

[data.meshes.E.z]
extent = [5, 5]
datatype = "FLOAT"

[data.meshes.E.z.attributes]
unitSI = 1.0
position = [0.0]

[data.meshes.E.y]
extent = [5, 5]
datatype = "FLOAT"

[data.meshes.E.y.attributes]
unitSI = 1.0
position = [0.0]

[data.meshes.E.x]
extent = [5, 5]
datatype = "FLOAT"

[data.meshes.E.x.attributes]
unitSI = 1.0
position = [0.0]

[data.meshes.E.attributes]
timeOffset = 0.0
gridUnitSI = 1.0
gridSpacing = [1.0]
gridGlobalOffset = [0.0]
unitDimension = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
geometry = "cartesian"
dataOrder = "C"
axisLabels = ["x"]

[data.attributes]
timeUnitSI = 1.0
snapshot = 0
time = 0.0
dt = 1.0

[attributes]
softwareVersion = "0.15.0-dev"
particlesPath = "particles/"
software = "openPMD-api"
openPMDextension = 0
meshesPath = "meshes/"
iterationFormat = "/data"
iterationEncoding = "variableBased"
openPMD = "1.1.0"
date = "2022-05-19 15:26:37 +0000"
basePath = "/data"
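
The longer example contains two kinds of record components: the positionOffset components carry `value` and `shape` attributes (constant components in the openPMD standard), while the others declare `extent` and `datatype`. A minimal Python sketch for telling them apart follows; `component_kind` is a hypothetical helper for illustration and assumes the template has already been parsed into dictionaries.

```python
def component_kind(component):
    """Classify a parsed template node as 'constant', 'dataset' or 'group'.

    Hypothetical helper for illustration, not an openPMD-api function.
    """
    attributes = component.get("attributes", {})
    # Constant components carry their single value and logical shape as attributes.
    if "value" in attributes and "shape" in attributes:
        return "constant"
    # Declared datasets carry an extent list and a datatype next to the attributes.
    if isinstance(component.get("extent"), list) and "datatype" in component:
        return "dataset"
    return "group"


# Nodes copied from the longer example above.
position_offset_z = {"attributes": {"value": 3.14, "unitSI": 1.0, "shape": [5, 5]}}
position_z = {"extent": [5, 5], "datatype": "FLOAT", "attributes": {"unitSI": 1.0}}
position = {"z": position_z, "attributes": {"timeOffset": 0.0}}

for name, node in [
    ("positionOffset/z", position_offset_z),
    ("position/z", position_z),
    ("position", position),
]:
    print(name, component_kind(node))
```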

@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 5 times, most recently from 475be7b to ee8bdf1 Compare May 23, 2022 11:14
@franzpoeschel franzpoeschel changed the title Use JSON/TOML template for defining openPMD metadata in a config file [WIP] Use JSON/TOML template for defining openPMD metadata in a config file May 23, 2022
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 3 times, most recently from d32fff3 to 376bc2a Compare July 5, 2022 09:23
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from 3ee509d to a662865 Compare July 21, 2022 16:35
franzpoeschel (Contributor, Author) commented

Notes for myself on the recent reordering of commits:

5 3ee509de (HEAD -> topic-json-template, origin/topic-json-template) Properly deal with undefined datasets
2 06da2d58 Make JSON and TOML look like two different backends
5 960ab21a Initialize Dataset definitions from template
5 b88bae67 Initialize Series attributes from template
3 6302a33c Fix NVHPC Toml11 open mode
2 d825008b Fix precision-losing type conversion
4 da960a23 Enable .toml tests in generic tests
4 0398b86f Extend example
3 7332996e Windows compatibility
x 85527799 Add and use Attribute::getOptional<T>()
1 64cde966 Template mode: Fill with zero upon read
1 fa483843 Write/read shorthand attributes without explicit datatype
3 bd8da013 CI fixes
1 d802d2ac Don't write platform datatype size table in template mode
2 cba71f7f Use .toml as filename extension
2 b019a7d1 TOML as alternative backend for JSON backend
1 4b25de8c Select template mode via JSON param
1 8ef4753f Add template mode to JSON backend

@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from 1db63f6 to 55b72f8 Compare July 29, 2022 09:00
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from b07a2a8 to a41c2c6 Compare August 17, 2022 09:21
// throw error::WrongAPIUsage(
// "[RecordComponent] Must set specific datatype (Use "
// "resetDataset call).");
// }
franzpoeschel (Contributor, Author) commented

Note that this check was inactive up to now, since RecordComponentData::RecordComponentData initialized that field with Datatype::CHAR. Using an optional would make these things more obvious and avoid such pitfalls.
That is to be done in a different PR, though.

franzpoeschel (Contributor, Author) commented

#1316 now uses std::optional
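
The pitfall described in this thread can be sketched in Python, with `Optional` standing in for std::optional. The classes and the function below are hypothetical stand-ins written for this example and do not mirror the actual openPMD-api types: a field that silently defaults to CHAR can never look "unset", whereas an optional field makes the missing value explicit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComponentWithDefault:
    # The field starts out as CHAR, so "never set" looks the same as "set to CHAR".
    datatype: str = "CHAR"


@dataclass
class ComponentWithOptional:
    # None explicitly means that no datatype has been chosen yet.
    datatype: Optional[str] = None


def require_datatype(component):
    """Return the datatype, or raise if none has been set."""
    if component.datatype is None:
        raise ValueError("Must set specific datatype (use resetDataset call).")
    return component.datatype


# The check can never fire for the defaulted field ...
print(require_datatype(ComponentWithDefault()))

# ... but it does fire for the optional one.
try:
    require_datatype(ComponentWithOptional())
except ValueError as error:
    print("error:", error)
```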

@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from 2f6b7d1 to 68dc5e5 Compare March 26, 2024 11:49
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch from 68dc5e5 to fc1578c Compare May 14, 2024 14:30
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch from fc1578c to 2249002 Compare May 30, 2024 08:17
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch from 2249002 to b11d83c Compare June 7, 2024 12:37
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from df04b3e to 074a93d Compare August 5, 2024 09:45
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from c85e3e7 to 5863d6b Compare December 17, 2024 12:52
@franzpoeschel franzpoeschel force-pushed the topic-json-template branch 2 times, most recently from 2e38ad8 to cd86207 Compare April 4, 2025 08:33