Skip to content

Conversation

mawad-amd
Copy link
Collaborator

@mawad-amd mawad-amd commented Jul 13, 2025

First attempt at improving docs. local/remote is still a bit confusing.

Closes #46

  • Translate should be private
  • Improve docs

@Copilot Copilot AI review requested due to automatic review settings July 13, 2025 08:09
@mawad-amd mawad-amd requested review from neoblizz and BKP as code owners July 13, 2025 08:09
Copilot

This comment was marked as outdated.

@neoblizz
Copy link
Member

Why do we have to describe it as local/remote only? If we use iris.load it should do local-load if your source and destination ranks are the same, correct?

@mawad-amd
Copy link
Collaborator Author

Correct. It is confusing and I would like to resolve that. Do you have suggestions?

I have a few:

# 1. Emphasizes direction of data flow
def load(pointer, to_rank, from_rank, heap_bases, mask=None):

# 2. Uses 'cur' to indicate the calling rank
def load(pointer, cur_rank, dst_rank, heap_bases, mask=None):

# 3. Generalizes roles as caller/target
def load(pointer, caller_rank, target_rank, heap_bases, mask=None):

The pointer argument can be named local_ptr, sym_ptr, address, or pointer.

I do like the from/to at the moment. Here are all APIs:

def load(pointer, to_rank, from_rank, heap_bases, mask=None):

def store(pointer, val, from_rank, to_rank, heap_bases, mask=None):

def get(from_pointer, to_pointer, to_rank, from_rank, heap_bases, mask=None):

def put(from_pointer, to_pointer, from_rank, to_rank, heap_bases, mask=None):

def atomic_add(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

def atomic_sub(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

def atomic_cas(pointer, cmp, val, from_rank, to_rank, heap_bases, sem=None, scope=None):

def atomic_xchg(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

@neoblizz
Copy link
Member

neoblizz commented Jul 13, 2025

Correct. It is confusing and I would like to resolve that. Do you have suggestions?

I have a few:

# 1. Emphasizes direction of data flow
def load(pointer, to_rank, from_rank, heap_bases, mask=None):

# 2. Uses 'cur' to indicate the calling rank
def load(pointer, cur_rank, dst_rank, heap_bases, mask=None):

# 3. Generalizes roles as caller/target
def load(pointer, caller_rank, target_rank, heap_bases, mask=None):

The pointer argument can be named local_ptr, sym_ptr, address, or pointer.

I do like the from/to at the moment. Here are all APIs:

def load(pointer, to_rank, from_rank, heap_bases, mask=None):

def store(pointer, val, from_rank, to_rank, heap_bases, mask=None):

def get(from_pointer, to_pointer, to_rank, from_rank, heap_bases, mask=None):

def put(from_pointer, to_pointer, from_rank, to_rank, heap_bases, mask=None):

def atomic_add(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

def atomic_sub(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

def atomic_cas(pointer, cmp, val, from_rank, to_rank, heap_bases, sem=None, scope=None):

def atomic_xchg(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

Notes

  • I do like to/from, but not for the from_pointer field.
  • Is there a notion of "triggering" a load/store/put/get from a rank thats not "current" or "remote" rank? If so, current cannot be used.

Other suggestions:

  • Use value instead of val,
  • Use semantics instead of sem
  • The problematic (imo) APIs are:
def get(from_pointer, to_pointer, to_rank, from_rank, heap_bases, mask=None):
def put(from_pointer, to_pointer, from_rank, to_rank, heap_bases, mask=None):

@mawad-amd
Copy link
Collaborator Author

mawad-amd commented Jul 14, 2025

Is there a notion of "triggering" a load/store/put/get from a rank thats not "current" or "remote" rank? If so, current cannot be used.

No. For that reason, I was considering adding a text to warn against that e.g., for load "to_rank must be the same rank issuing the operation." But if there is some an interesting use-case we can accommodate that.
Eventually the "current rank" should be implicit and removed, alongside the heap_bases.

The shortened val and sem are to match Triton but I am okay with either (and I slightly prefer spelling out the complete word).

How about this:

def load(pointer, to_rank, from_rank, heap_bases, mask=None):
def store(pointer, val, from_rank, to_rank, heap_bases, mask=None):
def atomic_add(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):
def atomic_sub(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):
def atomic_cas(pointer, cmp, val, from_rank, to_rank, heap_bases, sem=None, scope=None):
def atomic_xchg(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

and,

def get(dst_ptr, src_ptr, to_rank, from_rank, heap_bases, mask=None):
def put(dst_ptr, src_ptr, to_rank, from_rank, heap_bases, mask=None):

@neoblizz
Copy link
Member

neoblizz commented Jul 14, 2025

No. For that reason, I was considering adding a text to warn against that e.g., for load "to_rank must be the same rank issuing the operation." But if there is some an interesting use-case we can accommodate that.
Eventually the "current rank" should be implicit and removed, alongside the heap_bases.

I am not sure I understand the warning. You mean from_rank must be the same rank issuing the op?
Separately, I think it can be true where thats not the case. Where we can have one GPU thread somehow initiate a copy/move. Lets discuss in a call separately with @BKP.

The shortened val and sem are to match Triton but I am okay with either (and I slightly prefer spelling out the complete word).

How about this:

def load(pointer, to_rank, from_rank, heap_bases, mask=None):
def store(pointer, val, from_rank, to_rank, heap_bases, mask=None):
def atomic_add(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):
def atomic_sub(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):
def atomic_cas(pointer, cmp, val, from_rank, to_rank, heap_bases, sem=None, scope=None):
def atomic_xchg(pointer, val, from_rank, to_rank, heap_bases, mask=None, sem=None, scope=None):

and,

def get(dst_ptr, src_ptr, to_rank, from_rank, heap_bases, mask=None):
def put(dst_ptr, src_ptr, to_rank, from_rank, heap_bases, mask=None):

I prefer spelled out as well --- triton is not consistent with their naming, you can easily find examples of these.

I like this! (with sem, val, maybe even cmp fully spelled out. I know its minor, but sem could be semaphore or semantics or something else, idk)

Copy link
Member

@neoblizz neoblizz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still lots of remote, lets discuss the best way to word some of the descriptions.

@neoblizz neoblizz requested a review from Copilot August 3, 2025 19:19
Copy link
Member

@neoblizz neoblizz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the documentation and API design of the iris library by making the translate function private and enhancing docstrings for memory operations. The changes focus on clarifying parameter naming conventions and providing more detailed documentation for cross-rank memory operations.

Key changes:

  • Made the translate function private by renaming it to __translate and removing it from public exports
  • Improved parameter naming conventions across all memory operation functions (e.g., cur_rankfrom_rank, target_rankto_rank)
  • Enhanced docstrings with detailed explanations of memory translation operations and parameter descriptions

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
iris/iris.py Renamed translate to __translate, improved parameter naming, and added comprehensive docstrings for all memory operation functions
iris/init.py Removed translate from public exports and fixed trailing comma in __all__ list

@neoblizz
Copy link
Member

neoblizz commented Aug 3, 2025

@mawad-amd I just asked copilot to see if it catches something I missed, feel free to ignore that.

@neoblizz neoblizz changed the title Improve docs Improve code documentation Aug 3, 2025
@neoblizz neoblizz changed the title Improve code documentation Improve code documentation and APIs Aug 3, 2025
mawad-amd and others added 2 commits August 3, 2025 16:04
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@mawad-amd
Copy link
Collaborator Author

I think the language is still a bit confusing. I am not happy with the text now but will revisit later.

@mawad-amd mawad-amd merged commit 462e0ed into main Aug 3, 2025
2 checks passed
@mawad-amd mawad-amd deleted the muhaawad/docs branch August 3, 2025 23:05
mawad-amd added a commit that referenced this pull request Aug 6, 2025
@mawad-amd mawad-amd mentioned this pull request Aug 18, 2025
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update get and put docs
2 participants