fix(parser): only keep final traceback for each failing test #160
The Tempest report parser was updated to extract only the final traceback when multiple tracebacks are present in a single log entry. Previously, the parser would capture all tracebacks encountered, along with the logs between them. This could produce inputs too large for our model to handle. For now, let's focus on the last traceback found for each test.
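A minimal sketch of the idea described above, not the parser code from this PR: it assumes each failing test's log entry is available as a single string and that every traceback starts with the standard Python header line. The function name `extract_last_traceback` and the sample log are hypothetical.

```python
# Standard header that Python prints at the start of every traceback.
TRACEBACK_MARKER = "Traceback (most recent call last):"


def extract_last_traceback(log_entry: str) -> str:
    """Return the text from the last traceback header to the end of the entry.

    If the entry contains no traceback header, it is returned unchanged.
    """
    start = log_entry.rfind(TRACEBACK_MARKER)
    if start == -1:
        return log_entry
    return log_entry[start:].strip()


# Hypothetical log entry with two tracebacks and a log line between them.
sample = (
    "Traceback (most recent call last):\n"
    '  File "a.py", line 1, in setUp\n'
    "ValueError: first\n"
    "some log line in the middle\n"
    "Traceback (most recent call last):\n"
    '  File "b.py", line 2, in test_something\n'
    "AssertionError: last\n"
)

last = extract_last_traceback(sample)
print(last)
```

Using `str.rfind` keeps the sketch to a single pass and drops both the earlier tracebacks and the logs between them, which is the size reduction this change is after.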
Tested with this test, which has 7 tracebacks. API logs (with a print enabled to show that only the last traceback was sent): Curl:
LGTM
lpiwowar left a comment
LGTM! 👍
Maybe we could update the tools repo as well, so that only the last traceback is stored in the vector DB [1].
Not necessarily. Isn't it better to have more tracebacks in the vector DB?
The matches (cosine similarity) between the long logs and the short logs we are producing here are not going to be as high as they would be if the vector DB held the short logs too. The long logs also carry a lot of other noise (other tracebacks, etc.), which makes matching harder. The problem, IMO, is not that we are storing more tracebacks in the vector DB. The issue is that in some cases we take almost the entire error string and compute the key for it, instead of computing the key solely from the Traceback section.
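A toy illustration of the matching argument above, under loud assumptions: it uses a bag-of-words vector as a stand-in for whatever embedding the vector DB actually uses, and all log texts and helper names are made up. It only shows the direction of the effect, not real scores.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical final traceback of a failing test (what is sent as the query).
final_traceback = (
    "Traceback (most recent call last): File test_servers.py line 10 "
    "in test_create_server AssertionError: server did not become ACTIVE"
)

# Hypothetical stored entry keyed on almost the entire error string:
# unrelated logs and an earlier traceback precede the final one.
long_entry = (
    "DEBUG request sent to compute endpoint status 200 retrying "
    "Traceback (most recent call last): File base.py line 55 in setUp "
    "TimeoutError: resource cleanup timed out "
    "INFO waiting for volume detach " + final_traceback
)

query = bag_of_words(final_traceback)
sim_short = cosine_similarity(query, bag_of_words(final_traceback))
sim_long = cosine_similarity(query, bag_of_words(long_entry))

print(f"key from final traceback only: {sim_short:.3f}")
print(f"key from long error string:    {sim_long:.3f}")
```

In this toy setup the key computed from the final traceback alone matches the query better than the key computed from the long error string, because the extra logs and earlier tracebacks dilute the vector.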
Good points, let's do it then.
The Tempest report parser was updated to extract only the final traceback
when multiple tracebacks are present in a single log entry.
Previously, the parser would capture all tracebacks encountered and the
logs in between. This could produce inputs too large for our model to
handle. For now, let's focus on the last traceback found for each test.