Conversation


@ssklzx ssklzx commented Oct 15, 2024

accelerator = Accelerator()
model, optimizer, data = accelerator.prepare(model, optimizer, data)
device_map = {}  # device placement mapping (contents omitted here)
model = accelerate.dispatch_model(model, device_map=device_map)
accelerator.save_state(save_path)  # fails

When I call accelerate.dispatch_model after accelerator.prepare, accelerator.save_state raises an error when saving the model.

@tjruwase
Contributor

@ssklzx, thanks for creating this PR. However, I think you misunderstood my response
#6620 (comment).

What I meant is that we need to debug further to understand why some parameters are missing from self.param_names. Are you able to provide a full repro?

@ssklzx
Author

ssklzx commented Oct 16, 2024

> @ssklzx, thanks for creating this PR. However, I think you misunderstood my response #6620 (comment).
>
> What I meant is that we need to debug further to understand why some parameters are missing from self.param_names. Are you able to provide a full repro?

Because after self.param_names is initialized, I change the placement of the parameters, for example moving them from cuda:0 to cuda:1, so those moved parameters can no longer be found in self.param_names.

For example:

model, optimizer, data = accelerator.prepare(model, optimizer, data)  # initializes self.param_names
model = accelerate.dispatch_model(model, device_map=device_map)  # changes parameter placement
accelerator.save_state(save_path)  # raises the error
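The failure mode described above can be reduced to a small, self-contained sketch. This is a toy illustration, not DeepSpeed's actual code: it only assumes that a name mapping is keyed by parameter object identity at prepare time, so a later step that replaces parameter objects (as re-dispatching across devices can) leaves the mapping stale. FakeParam and move are made-up stand-ins.

```python
class FakeParam:
    """Stand-in for a tensor parameter; 'device' mimics its placement."""
    def __init__(self, device):
        self.device = device

def move(param, device):
    # Mimics a dispatch step that REPLACES the parameter object
    # rather than mutating it in place.
    return FakeParam(device)

# "prepare" step: record a name for each parameter object, keyed by identity.
w = FakeParam("cuda:0")
param_names = {id(w): "model.weight"}

# "dispatch_model" step: the parameter is replaced by a new object.
w = move(w, "cuda:1")

# "save_state" step: an identity-keyed lookup on the new object now misses.
print(id(w) in param_names)  # False — the moved parameter is "missing"
```

This matches the symptom in the traceback: the parameter exists in the model, but the mapping built at prepare time no longer recognizes it.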

@tjruwase
Contributor

> Because after self.param_names is initialized, I change the placement of the parameters, for example moving them from cuda:0 to cuda:1, so those moved parameters can no longer be found in self.param_names.

@ssklzx, thanks for the clarification. I think the correct solution here is for accelerate and DeepSpeed to coordinate, ensuring that DeepSpeed is aware of the new parameter locations, including updating self.param_names.
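The coordination suggested here could look roughly like the following toy sketch: after dispatch replaces parameter objects, rebuild the identity-keyed mapping from the module's current parameters so later lookups succeed. Everything here is hypothetical (FakeParam, named_parameters, refresh_param_names are made-up stand-ins, not an existing DeepSpeed or accelerate API); it only demonstrates the rebuild-after-dispatch idea.

```python
class FakeParam:
    """Stand-in for a tensor parameter."""
    def __init__(self, device):
        self.device = device

def named_parameters(module_params):
    # Stand-in for iterating a module's current (name, parameter) pairs.
    return module_params.items()

def refresh_param_names(module_params):
    """Rebuild the identity-keyed name mapping from the current parameters."""
    return {id(p): name for name, p in named_parameters(module_params)}

params = {"model.weight": FakeParam("cuda:0")}
param_names = refresh_param_names(params)

# Dispatch replaces the parameter object, so the old mapping goes stale.
params["model.weight"] = FakeParam("cuda:1")
assert id(params["model.weight"]) not in param_names

# Refreshing the mapping after dispatch makes the lookup succeed again.
param_names = refresh_param_names(params)
assert param_names[id(params["model.weight"])] == "model.weight"
```

In the real libraries the refresh would have to happen inside whatever hook runs after dispatch_model, so that save_state always sees an up-to-date mapping.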
