-
-
Notifications
You must be signed in to change notification settings - Fork 852
fix(openai): remove duplicate schema from messages in JSON_SCHEMA mode #1761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(openai): remove duplicate schema from messages in JSON_SCHEMA mode #1761
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to f3af7fb in 1 minute and 14 seconds. Click for details.
- Reviewed
44
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/providers/openai/utils.py:436
- Draft comment:
Good change: The 'if mode != Mode.JSON_SCHEMA' condition properly avoids adding duplicate schema information in JSON_SCHEMA mode. Consider adding an inline comment to clarify this design decision for future maintainers. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 85% The comment is about documenting a design decision in the code. While the suggestion is reasonable, the function namehandle_json_modes
and the context make it fairly clear what's happening. The code is not complex enough to warrant additional inline documentation. The comment also starts with "Good change:" which is not actionable. The design decision could be non-obvious to new contributors. Documentation can help prevent future regressions. The function name and surrounding context provide sufficient clarity. The code change is straightforward and the reason for skipping JSON_SCHEMA mode is evident from the mode handling logic throughout the file. Delete the comment as it suggests adding documentation that isn't necessary given the clear context and straightforward code change.
2. instructor/providers/openai/utils.py:437
- Draft comment:
Note: The code assumes new_kwargs['messages'] is non-empty. It might be prudent to add a guard or document this requirement to avoid potential IndexError. - Reason this comment was not posted:
Confidence changes required:80%
<= threshold85%
None
Workflow ID: wflow_bjqLYxK1hjtcwoBW
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Hi @jxnl, just following up on this PR. It’s a small change that removes redundant schema duplication in JSON_SCHEMA mode, reducing token usage without affecting JSON or MD_JSON modes. Would appreciate it if you could take a look when you get a chance. |
Hi @jxnl, quick follow-up. I see auto-merge is enabled, but CI is blocking; logs show OIDC token errors in the claude-review job and missing provider API keys in provider tests. Seems unrelated to this diff, but happy to tweak if needed. |
Removes redundant schema information from messages when using
JSON_SCHEMA
mode.Why This Change?
JSON mode (
response_format: {"type": "json_object"}
) - OpenAI docs require explicit JSON instruction in messages since no schema is provided in response_format.https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat#json-mode
JSON_SCHEMA mode (
response_format: {"type": "json_schema", ...}
) - Schema is already provided in response_format. Adding the same schema to messages creates redundancy, increases token consumption unnecessarily, and provides no additional value to the model.Changes
JSON_SCHEMA
mode: No schema added to messages (schema already in response_format)JSON
andMD_JSON
modes: Unchanged behavior (still add schema to messages as required)Important
Removes redundant schema from messages in
JSON_SCHEMA
mode inhandle_json_modes()
inutils.py
, reducing token consumption.handle_json_modes()
inutils.py
,JSON_SCHEMA
mode no longer adds schema to messages, as it's already inresponse_format
.JSON
andMD_JSON
modes remain unchanged, still adding schema to messages.JSON_SCHEMA
mode by not duplicating schema in messages.This description was created by
for f3af7fb. You can customize this summary. It will automatically update as commits are pushed.