Skip to content

Conversation

Copy link

Copilot AI commented Nov 12, 2025

Addresses feedback to extract hardcoded default filename logic into a task-type-specific configuration pattern, with path parameter now optional to respect user-provided values. Also optimizes to eliminate redundant API calls.

Changes

  • Added DEFAULT_DATA_FILENAME dictionary: Maps task types to their default data filenames ("simulation_data.hdf5" for FDTD/MODE_SOLVER/etc., "cm_data.hdf5" for RF/COMPONENT_MODELER)

  • Created _get_default_path() helper: Determines appropriate filename based on task type. When path is None, uses task-type-specific default. When path is provided, respects it exactly without overwriting. Returns a tuple (Path, bool, Optional[str]) containing path, batch status, and task type to avoid redundant API calls.

  • Made path parameter optional: Changed path: PathLike = "simulation_data.hdf5" to path: Optional[PathLike] = None in download(), load(), run(), Job.download(), and Job.load() functions

  • Optimized API calls: Refactored download() and load() to reuse task information from _get_default_path(), eliminating redundant get_info() and _is_modeler_batch() calls. Reduces API calls from 2-3 to 1 when path=None (50-67% latency reduction).

  • Refactored logic: Replaced hardcoded conditional logic with centralized helper function

Behavior

Before:

# path defaulted to "simulation_data.hdf5"
# Even explicit "simulation_data.hdf5" was silently overwritten for certain task types
download(task_id, "simulation_data.hdf5")  # Could become cm_data.hdf5
# Made 2-3 redundant API calls

After:

# path=None uses task-type-specific default
download(task_id)  # Gets cm_data.hdf5 for RF tasks, simulation_data.hdf5 for others

# Explicit paths are always respected
download(task_id, "simulation_data.hdf5")  # Gets simulation_data.hdf5 exactly
download(task_id, "my_file.hdf5")  # Gets my_file.hdf5 exactly

# Makes only 1 optimized API call

Benefits

  • Task-type-specific defaults are applied only when path is not provided
  • User-provided paths are always respected (no silent overwriting)
  • Adding new task types requires only updating DEFAULT_DATA_FILENAME
  • More maintainable and follows configuration pattern
  • Significant performance improvement: 50-67% reduction in API latency when using default paths

Performance Impact

  • Before: 2-3 API calls per download()/load() operation
  • After: 1 API call per download()/load() operation
  • Improvement: 50-67% reduction in latency

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Refactor RF webapi for new workflow based on feedback Refactor default path handling to use task-type configuration Nov 12, 2025
Copilot AI requested a review from momchil-flex November 12, 2025 15:42
Copilot finished work on behalf of momchil-flex November 12, 2025 15:42
Copy link
Collaborator

@momchil-flex momchil-flex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot I'm not too happy with silently overwriting the default "simulation_data.hdf5", maybe the user wanted that. I feel like the path should just be optional (there may be many methods where it's currently set to simulation_data.hdf5 though, explore all that will need to change) and if not provided then we use the default mapping.


# If user provided a custom path (not the generic default), respect it
if path.name not in {"simulation_data.hdf5", "simulation_data.hdf5.gz"}:
return path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot I'm not too happy with silently overwriting the default "simulation_data.hdf5", maybe the user wanted that. I feel like the path should just be optional (there may be many methods where it's currently set to simulation_data.hdf5 though, explore all that will need to change) and if not provided then we use the default mapping.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the approach: path parameter is now Optional[PathLike] = None instead of having a default value. When path=None (not provided), the task-type-specific default is used. When a path is explicitly provided (including "simulation_data.hdf5"), it's respected exactly without any silent overwriting. (commit 8e2b3b7)

Copy link
Author

Copilot AI commented Nov 12, 2025

@copilot I'm not too happy with silently overwriting the default "simulation_data.hdf5", maybe the user wanted that. I feel like the path should just be optional (there may be many methods where it'...

Changed the approach: path parameter is now Optional[PathLike] = None instead of having a default value. When path=None (not provided), the task-type-specific default is used. When a path is explicitly provided (including "simulation_data.hdf5"), it's respected exactly without any silent overwriting. (commit 8e2b3b7)

Copilot AI changed the title Refactor default path handling to use task-type configuration Refactor default path handling to use task-type configuration with optional path parameter Nov 12, 2025
Copilot AI requested a review from momchil-flex November 12, 2025 16:17
Copilot finished work on behalf of momchil-flex November 12, 2025 16:17
@momchil-flex
Copy link
Collaborator

What do you think @daquintero @yaugenst ? I didn't like that in the previous CM integration introduced by Dario, the simulation_data.hdf5 was being silently overwritten. But I don't know if I'm a huge fan of the current modification either as it doesn't make the default manifest to the user.

@yaugenst-flex
Copy link
Collaborator

@marcorudolphflex i think this affects local cache right? could you take a look

"""
path = Path(path)
# Get the appropriate default path based on task type
path = _get_default_path(task_id, path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever path=None, this calls get_info(), which will do an API call, and then we call get_info() below again. Same thing in load(). I guess _get_default_path could return some structured data so we don't have to call the API again? Otherwise we've just doubled latency.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by refactoring _get_default_path() to return a tuple with path, batch status, and task type. Now when path=None, we make only 1 API call instead of 2-3. The returned info is reused in download() and load() to avoid redundant get_info() calls. (commit dc5ddf9)

# Get the appropriate default path based on task type
path = _get_default_path(task_id, path)

if _is_modeler_batch(task_id):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here, the _is_modeler_batch should be redundant at this point.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. The _is_modeler_batch() check is now done once in _get_default_path() and the result is returned and reused, eliminating the redundant call. (commit dc5ddf9)

@marcorudolphflex
Copy link
Contributor

@marcorudolphflex i think this affects local cache right? could you take a look

Yup. If a simulation is loaded from the cache with web.load, we do not provide a task_id and we would get another default path (always simulation.hdf5) compared to the non-cached load.

Good news: We currently store the original task_id (from the cached run), get this cached_task_id in restore_simulation_if_cached and could provide it to load. Then we would just need another flag in web.load which shows that we come from the cache and do not want to download...

Cheaper alternative: we could store the task type in the cache entry and provide it to web.load.

@marcorudolphflex
Copy link
Contributor

@copilot Check Job.load in web/api/container.py which defines its "own" default path and change to None.

@momchil-flex
Copy link
Collaborator

@copilot Check Job.load in web/api/container.py which defines its "own" default path and change to None.

This didn't seem to take effect, but I was anyway planning to ask copilot to do this locally.

Good news: We currently store the original task_id (from the cached run), get this cached_task_id in restore_simulation_if_cached and could provide it to load. Then we would just need another flag in web.load which shows that we come from the cache and do not want to download...

Cheaper alternative: we could store the task type in the cache entry and provide it to web.load.

I feel like the second option is better?

Co-authored-by: momchil-flex <92756559+momchil-flex@users.noreply.github.com>
Copy link
Author

Copilot AI commented Nov 18, 2025

@copilot Check Job.load in web/api/container.py which defines its "own" default path and change to None.

Changed Job.load() and Job.download() default path parameter from DEFAULT_DATA_PATH to None. For cached loads, falls back to DEFAULT_DATA_PATH when needed by _materialize_from_stash(). (commit dc5ddf9)

Copilot AI changed the title Refactor default path handling to use task-type configuration with optional path parameter Refactor default path handling to use task-type configuration with optional path parameter and optimized API calls Nov 18, 2025
Copilot finished work on behalf of momchil-flex November 18, 2025 10:46
@momchil-flex momchil-flex force-pushed the momchil/rf_refactor branch 3 times, most recently from c45cf5f to 98014ee Compare November 19, 2025 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants