@HugoOnghai
Collaborator

Summary

New source code to find old DOI records, update them, and post new records.
Some changes and miscellaneous notes:

  • core.py contains the Python functions that will orchestrate the continual updating and matching of OSTI's ELink record database and our Mongo DOI collection
  • the update/posting scripts do not yet handle "missing sponsor organization identifiers"
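
As a rough illustration of the matching that core.py orchestrates, the sketch below splits material IDs into those that still need a new record posted and those whose existing record is a candidate for an update. It works on plain dictionaries rather than the real ELink client or Mongo collection; the function name `partition_materials` and the `material_id` key are hypothetical and not taken from this PR.

```python
from typing import Iterable


def partition_materials(
    material_ids: Iterable[str],
    doi_docs: Iterable[dict],
) -> tuple[list[str], list[str]]:
    """Split material IDs into (to_post, to_update).

    Hypothetical helper: `doi_docs` stands in for documents from the DOI
    collection, each assumed to carry a "material_id" key.
    """
    known = {doc["material_id"] for doc in doi_docs}
    to_post: list[str] = []
    to_update: list[str] = []
    for mid in material_ids:
        # IDs that already have a DOI document are update candidates;
        # all others need a new record posted.
        (to_update if mid in known else to_post).append(mid)
    return to_post, to_update
```

Keeping this step free of network calls would make it easy to unit test before wiring it to the review or production environment.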

test_core.py contains the steps to:

  1. turn a stored json file into an ELink record
  2. post that record to the review environment
  3. take the resulting record response and make a doi document for our mongo doi collection
  4. (temporary fix, needed because we use json files instead of the production environment directly) update missing datetime information in the collection by querying the matching records in the production environment
  5. from this datetime information, find the records that need to be updated with new robocrys descriptions
  6. update those records on the ELink Review Environment
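
Step 5 can be sketched as a timestamp comparison. This is a minimal illustration under stated assumptions, not the code in test_core.py: the function name, the `material_id` and `last_updated` keys, and the choice to treat a missing record timestamp as stale are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def needs_description_update(
    doi_docs: list[dict],
    description_times: dict[str, datetime],
) -> list[str]:
    """Return material IDs whose description is newer than the ELink record.

    Assumed (hypothetical) shapes:
      doi_docs:          [{"material_id": str, "last_updated": datetime or None}]
      description_times: {material_id: time of the latest robocrys description}
    """
    stale = []
    for doc in doi_docs:
        mid = doc["material_id"]
        record_time: Optional[datetime] = doc.get("last_updated")
        description_time = description_times.get(mid)
        if description_time is None:
            continue  # no description available, nothing to push
        # Assumption: a record with no timestamp is refreshed to be safe.
        if record_time is None or description_time > record_time:
            stale.append(mid)
    return stale


# Example with made-up timestamps.
docs = [
    {"material_id": "mp-1", "last_updated": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"material_id": "mp-2", "last_updated": datetime(2025, 6, 1, tzinfo=timezone.utc)},
    {"material_id": "mp-3", "last_updated": None},
]
times = {
    "mp-1": datetime(2025, 3, 1, tzinfo=timezone.utc),
    "mp-2": datetime(2025, 3, 1, tzinfo=timezone.utc),
    "mp-3": datetime(2025, 3, 1, tzinfo=timezone.utc),
}
stale_ids = needs_description_update(docs, times)
```

The IDs returned here would then be the ones passed to the update step against the ELink review environment.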

Todos

  • make a new DOI collection in mp-core; currently all tests run against my local Mongo database
  • build more tests
  • eventually get code operating for mp-core and ELink's production environment
  • use uv to package the project and get it onto TestPyPI (for Dagster use)

HugoOnghai and others added 16 commits August 5, 2025 09:37
after upgrading... pip freeze > requirements.txt

the old requirements are saved in oldrequirements.txt for reference
- deserialization works
- re-serialization does not: there are additional dictionary keys that don't match the provided config_file. Will need to discuss the goal of this doi_builder a bit more to understand it
…on files) on ELink found bug with rows greater than 100 on ElinkAPI query_records (144845 dois under 10.17188, 12 are not titled Materials Data On... (edge cases), 144833 Materials have DOIs)
* move old code to 'legacy'

* setup project using uv

* add license

* testing skeleton

* gh actions skeleton

* remove old reqs file to prevent dependabot alerts

---------

Co-authored-by: Tyler Mathis <35553152+tsmathis@users.noreply.github.com>
Collaborator

@tsmathis tsmathis left a comment

Copied over my review comments from the previous PR (I forgot to submit the review there).

@tsmathis
Collaborator

tsmathis commented Aug 6, 2025

Is oldrequirements.txt needed?

@tsmathis
Collaborator

tsmathis commented Aug 6, 2025

Re: the models.py and doi_builder.py files, would it make sense to rename doi_builder.py to models.py, since it only contains pydantic models?

And also to move MinimumDARecord into models.py?

@tsmathis tsmathis merged commit 2fbaff7 into materialsproject:master Aug 7, 2025
0 of 3 checks passed
@tsmathis tsmathis linked an issue Aug 8, 2025 that may be closed by this pull request
6 tasks

Development

Successfully merging this pull request may close these issues.

Add core functionality for using Elink 2.0 python client

2 participants