Description
@thomasyu888 @gkowalski In the current design of the infrastructure, the Orchestrator gets a page (as in "paginated response") of clinical notes from a Data Node (e.g. 50 clinical notes), sends them to the NLP Tool being evaluated, receives the results, and repeats with the next page of clinical notes. In addition to letting us control the flow of information to the NLP Tool, which limits its memory needs, this design lets us evaluate and ideally report the following metrics:
- Completion rate: number of notes processed / number of notes in the dataset
- Time required to process a clinical note (average and standard deviation)
  - the timer starts after the request has been sent (clinical notes sent to the NLP Tool)
  - the timer stops when all the responses have been received from the NLP Tool for the clinical notes sent
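The two metrics above could be collected in the Orchestrator's page loop. Below is a minimal sketch, not the actual Orchestrator code: `evaluate_tool`, `pages`, `annotate` and the report keys are hypothetical names, and the timer simply wraps the synchronous call to the tool as an approximation of "after the request has been sent". Because timing is per page, the average and standard deviation are computed over the per-page time-per-note values.

```python
import statistics
import time
from typing import Callable, Iterable, List, Sequence


def evaluate_tool(
    pages: Iterable[Sequence[str]],
    annotate: Callable[[Sequence[str]], List[object]],
    total_notes: int,
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Send pages of notes to a tool and collect completion/timing metrics.

    `pages` yields one page of clinical notes at a time (hypothetical stand-in
    for the Data Node client); `annotate` sends one page to the NLP Tool and
    returns one result per processed note (hypothetical stand-in as well).
    """
    seconds_per_note: List[float] = []
    processed = 0

    for page in pages:
        if not page:
            continue  # nothing to send for an empty page
        start = clock()              # timer starts when the page is sent
        results = annotate(page)
        elapsed = clock() - start    # timer stops once all results are back
        processed += len(results)
        seconds_per_note.append(elapsed / len(page))

    mean = statistics.mean(seconds_per_note) if seconds_per_note else 0.0
    # The sample standard deviation needs at least two pages.
    std = statistics.stdev(seconds_per_note) if len(seconds_per_note) > 1 else 0.0

    return {
        "notes_processed": processed,
        "completion_rate": processed / total_notes if total_notes else 0.0,
        "mean_seconds_per_note": mean,
        "std_seconds_per_note": std,
    }
```

Calling this after each page rather than once at the end would give the running completion rate that could be pushed to ELK.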
The motivation for reporting the completion rate to users is that it allows them to better predict when the results will be available. Users can also rely on it to check whether the tool is taking too much time to complete. For the staff maintaining a Data Hosting Site, it would be nice to have a report in ELK that shows the Tools that are being evaluated and their completion rate.
The motivation for reporting processing time is that a hospital looking for a tool to use in production may visit a Leaderboard of the NLP Sandbox and find that a Tool would take too long to process its volume of clinical notes. One option would be to extrapolate and show the time required to process 1 million notes. It's important to note that any timing information depends heavily on the specs of the infrastructure used (number and frequency of CPU cores, etc.). We should be able to provide information about the specs used whenever we report timing information. Note that the specs may vary from one Data Hosting Site to another, in which case we would probably want to report the time for each dataset / Data Hosting Site used to evaluate an NLP Tool.
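To make the extrapolation idea concrete, here is a rough sketch of a timing record that carries the infrastructure specs alongside the measurement. The class and field names (`TimingReport`, `cpu_cores`, `cpu_frequency_ghz`, and so on) are illustrative assumptions, not an existing schema, and the projection assumes processing time scales linearly with the number of notes, which may not hold in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingReport:
    """Timing for one NLP Tool on one dataset / Data Hosting Site (hypothetical schema)."""

    data_hosting_site: str
    dataset: str
    cpu_cores: int
    cpu_frequency_ghz: float
    mean_seconds_per_note: float

    def projected_hours(self, note_count: int = 1_000_000) -> float:
        # Linear extrapolation: assumes throughput stays constant at scale.
        return self.mean_seconds_per_note * note_count / 3600


# Example values are made up for illustration only.
report = TimingReport(
    data_hosting_site="example-site",
    dataset="example-dataset",
    cpu_cores=4,
    cpu_frequency_ghz=2.5,
    mean_seconds_per_note=0.36,
)
print(f"{report.projected_hours():.1f} h for 1 million notes "
      f"on {report.cpu_cores} cores @ {report.cpu_frequency_ghz} GHz")
```

Keeping one such record per dataset / Data Hosting Site would let the Leaderboard show the projected time next to the specs it was measured on.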