Incorrect ids in dg_regions #18

@dpprdan

Unless I'm entirely mistaken, regions.csv/dg_regions contains incorrect AGS/ids for Gemeinden that belong to Gemeindeverbände in Lower Saxony and Rhineland-Palatinate.

Regionaldatenbank Deutschland used to contain data for both Gemeindeverbände as well as Gemeinden in Lower Saxony and Rhineland-Palatinate. dg_regions contains the corresponding ids:

library(datenguideR)
get_region(name, "Tostedt")
#> # A tibble: 2 x 4
#>   id          name    level parent  
#>   <chr>       <chr>   <chr> <chr>   
#> 1 03353406    Tostedt lau   03353   
#> 2 03353406035 Tostedt lau   03353406
get_region(name, "Cochem")
#> # A tibble: 2 x 4
#>   id         name   level parent 
#>   <chr>      <chr>  <chr> <chr>  
#> 1 0713500020 Cochem lau   07135  
#> 2 0713501020 Cochem lau   0713501

"03353406" is the (shortened) Regionalschlüssel of Samtgemeinde Tostedt, the Gemeindeverband, and "03353406035" is the (shortened) Regionalschlüssel of the Gemeinde Tostedt. ("Shortened" because both are missing a "5" in sixth place; this is how they were used in Regionaldatenbank Deutschland, and I don't know/recall the reason.)

However, these are not used by Regionaldatenbank Deutschland anymore, see the https://www.regionalstatistik.de/ homepage:

"As of 1 February 2019, the entire data holdings of the Regionaldatenbank Deutschland are offered with the correct AGS area keys according to the Gemeindeverzeichnis. The changes affect all municipality tables for the Länder Lower Saxony and Rhineland-Palatinate, which previously contained state-specific area keys. The level of Gemeindeverbände (LAU 1) is no longer reported for either state, because this level is not represented in the AGS."

So unless these ids are somehow necessary for backward compatibility, I think they can be omitted from dg_regions/regions.csv.

More importantly, however, the correct AGS/ids are missing for Gemeinden in Gemeindeverbänden in the two Bundesländer mentioned above. For example, Tostedt's actual AGS is 03353035 and Cochem's is 07135020 (see Destatis' Gemeindeverzeichnis). These are also the ids used by Regionalstatistik, e.g. in table 12411-01-01-5.
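In the two examples above, the AGS can be read off the long legacy id by keeping the first five digits (the Kreis) and the last three digits (the Gemeinde) and dropping the Gemeindeverband digits in between. The sketch below illustrates that observation only. `legacy_id_to_ags()` is a hypothetical helper, not part of datenguideR, and the rule is inferred from the Tostedt and Cochem ids alone, so it would need checking against the Gemeindeverzeichnis before being applied to all of regions.csv.

```r
# Hypothetical helper (not part of datenguideR): derive the 8-digit AGS
# from a legacy id that embeds a Gemeindeverband code.
# Assumption: the pattern seen for Tostedt and Cochem holds in general.
legacy_id_to_ags <- function(id) {
  stopifnot(is.character(id))
  n <- nchar(id)
  # Ids longer than 8 characters: keep the Kreis (first 5 digits) and the
  # Gemeinde (last 3 digits). Shorter ids are returned unchanged; note that
  # an 8-character Gemeindeverband id such as "03353406" cannot be told
  # apart from a real AGS by length alone.
  ifelse(n > 8, paste0(substr(id, 1, 5), substr(id, n - 2, n)), id)
}

legacy_id_to_ags(c("03353406035", "0713501020"))
#> [1] "03353035" "07135020"
```

Both legacy Cochem ids in dg_regions ("0713500020" and "0713501020") map to the same AGS under this rule, so duplicates would have to be removed after such a conversion.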

I strongly suspect that this issue has to be fixed upstream as well. For example

{
  region(id: "03353035") {
    id
    name
    BEVSTD {
      year
      value
    }
  }
}

returns "region": null on https://api-next.datengui.de/graphql, whereas the same query with id: "03353406" or "03353406035" returns id and name (but no BEVSTD data). The following query likewise returns id and name but no data for Gemeinden in Gemeindeverbänden (the first six municipalities are not in Gemeindeverbänden, so their AGS is correct and data is returned):

{
  allRegions(itemsPerPage: 50) {
    page
    total
    itemsPerPage
    regions(lau:1, parent:"03353") {
      id
      name
      BEVSTD(year: 2017) {
        year
        value
      }
    }
  }
}

The corresponding query on tabular.genesapi.org returns the appropriate data, but shows the ids as labels for the Gemeinden in Gemeindeverbänden and not their names.

Querying for the region id directly returns data for 03353035 and not data for 03353406 or 03353406035.
