I ran this stress test with claude-4, gemini, o3, etc. on the latest version of browser use,
with vision, memory, and planning all left at their default settings.
But I kept getting only 8-10 out of 31 on it.
What's the expected score? It feels like this should be passing at a much higher rate.