@Rahul29999

Description

Fixes issue #122 by shortening embedding output filenames. A new helper function, model_id_to_filename(), keeps only the final segment of the Hugging Face model ID.

Changes Made

  • Added the model_id_to_filename() function to extract short model names
  • Used it in the output JSON file naming logic in cookbook/populate_embeddings.ipynb

Outcome

This resolves the issue of long file names like:
prompt_sentences-sentence-transformers-all-MiniLM-L6-v2.json
and converts them to shorter ones like:
prompt_sentences-all-minilm-l6-v2.json
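
For reviewers, here is a minimal sketch of what such a helper could look like. The diff itself is not shown in this thread, so the body below is an assumption rather than the code in the PR: it takes the text after the last "/" in the model ID and lowercases it, which is inferred only from the example filenames above. The model_id and output_name variables are hypothetical names used for illustration.

```python
def model_id_to_filename(model_id: str) -> str:
    """Return a short, lowercase name from a Hugging Face model ID.

    Assumed behavior (not confirmed by the PR diff): keep only the text
    after the last "/" and lowercase it.
    """
    # Drop any trailing slash, then keep the final path segment.
    short_name = model_id.rstrip("/").split("/")[-1]
    return short_name.lower()


# Hypothetical usage when building the output JSON filename.
model_id = "sentence-transformers/all-MiniLM-L6-v2"
output_name = f"prompt_sentences-{model_id_to_filename(model_id)}.json"
print(output_name)
```

One trade-off of dropping the organization prefix is that two models from different organizations with the same final segment would map to the same filename.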

Let me know if further refinements are needed!

Signed-off-by: Rahulkumarsharma01 <20je0749@iitism.ac.in>
@Rahul29999 force-pushed the fix-short-filenames branch from 7437f29 to b332bf1 on August 6, 2025 19:00
@Rahul29999
Author

Hi @cassiasamp, I noticed the acknowledgment on the issue thread 👍. Just checking in here on the PR (#123) to see if it’s ready to merge or if you’d like me to make any further changes.

@cassiasamp
Collaborator

Hi @Rahul29999, in the notebook I reviewed there was no change regarding the JSON filenames. Just to be sure, am I looking at the correct notebook?

There is also a specification in the issue that I am probably going to fix, because it might have been confusing, but the issue mainly has to do with the generated and translated JSON files.

I believe the populate_embeddings.ipynb cookbook can also be updated, but the idea was to create a new notebook in which these JSON file names are shortened. Does that make sense?

@cassiasamp
Collaborator

Hi @Rahul29999, I'm temporarily unassigning issue #122 due to doubts regarding this PR. If there are any updates, I can reassign it later ✌️
