feat: Update search-actors tool #321

jirispilka · 2025-10-25T10:03:09Z

I struggled to get GPT models to call the tools, so I analyzed agentic prompts to better understand how they phrase their tool instructions.

I had to update the system prompt so that GPT models would actually recognize that tools are available:

You are a helpful assistant with a set of tools.

Follow these rules regarding tool calls:
1. ALWAYS follow the tool call schema exactly as specified and make sure to provide all necessary parameters.
2. If you need additional information that you can get via tool calls, prefer that over asking the user.
3. Only use the standard tool call format and the available tools.
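For illustration, the prompt above could be assembled and placed in front of the user query roughly as follows. This is a minimal sketch, not the code from this PR: `ChatMessage`, `TOOL_CALL_RULES`, `buildSystemPrompt`, and `buildMessages` are hypothetical names, and the message shape only assumes the common `role`/`content` chat format.

```typescript
// Hypothetical message shape; assumes a generic role/content chat format.
type ChatMessage = { role: 'system' | 'user'; content: string };

// The three rules quoted from the updated system prompt.
const TOOL_CALL_RULES: string[] = [
    'ALWAYS follow the tool call schema exactly as specified and make sure to provide all necessary parameters.',
    'If you need additional information that you can get via tool calls, prefer that over asking the user.',
    'Only use the standard tool call format and the available tools.',
];

// Hypothetical helper: renders the rules as a numbered list under the intro line.
function buildSystemPrompt(rules: string[]): string {
    const numbered = rules.map((rule, i) => `${i + 1}. ${rule}`).join('\n');
    return [
        'You are a helpful assistant with a set of tools.',
        '',
        'Follow these rules regarding tool calls:',
        numbered,
    ].join('\n');
}

// Hypothetical helper: the system prompt always goes first, then the user query.
function buildMessages(userQuery: string): ChatMessage[] {
    return [
        { role: 'system', content: buildSystemPrompt(TOOL_CALL_RULES) },
        { role: 'user', content: userQuery },
    ];
}
```

Keeping the rules in an array makes it easy to add or reorder a rule and rerun the evaluation without touching the rest of the prompt.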

Other changes:

- Changed the tool description and argument descriptions
- Refactored the evaluation

Important: For now, I have disabled the LLM-as-judge evaluation because its results were very misleading. Everything is documented in evals/README.md.
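With the LLM judge disabled, the numbers reported in this PR are tool exact match only. A minimal sketch of such a check is shown here, under the assumption that "exact match" means the set of tool names the model called equals the expected set. The types and function names are hypothetical and are not taken from run-evaluation.ts.

```typescript
// Hypothetical shape of one evaluated test case.
type EvaluatedCase = {
    query: string;
    expectedTools: string[]; // tool names the test case expects
    calledTools: string[];   // tool names the model actually called
};

// Assumption: order and duplicates do not matter, only the set of tool names.
function toolExactMatch(expected: string[], called: string[]): boolean {
    const a = Array.from(new Set(expected)).sort();
    const b = Array.from(new Set(called)).sort();
    return a.length === b.length && a.every((name, i) => name === b[i]);
}

// Fraction of cases where the called tools exactly match the expected tools.
function exactMatchRate(cases: EvaluatedCase[]): number {
    if (cases.length === 0) return 0;
    const hits = cases.filter((c) => toolExactMatch(c.expectedTools, c.calledTools)).length;
    return hits / cases.length;
}
```

Unlike an LLM judge, this check is deterministic, so a change in the score between runs can only come from the model's tool choice.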


jirispilka · 2025-10-25T20:25:39Z

Performance on the search-actors tool only (tool exact match):

jirispilka · 2025-10-25T20:36:13Z

Performance on the complete dataset (tool exact match only):

Before:

After:

jirispilka added 9 commits October 23, 2025 14:26

feat: Add option to create custom dataset

c7588c3

feat: Add CLI to run-evaluation.ts

1fadc04

feat: Refactor run-evaluation.ts and add function to evaluate a single case

78e69eb

feat: Refactor run-evaluation.ts, and eval-single.ts to load more test-cases

715c960

fix: formatting log

970fc52

fix: update search-actors description

55d748d

fix: update search-actors description

0e77ecb

fix: update evaluation prompt

1af8811

fix: update dataset, change description, I will need to run experiment again

e8c7650

github-actions bot assigned jirispilka Oct 25, 2025

github-actions bot added the t-ai Issues owned by the AI team. label Oct 25, 2025
