Include screenshot in MCP response when taking screenshot #194
Hi @lukasvdberk @tedjames Is there any chance you'd be up for modifying this a bit? Or even just a quick change to toggle this feature on/off? I'm having a lot of problems from this change! Although I totally understand it is well meant and useful to have a screenshot in the chat 🙏
Biggest issue for me: it is inserting the screenshot and auto-sending the message in random places if I'm not in the Cursor agent chat. It will put the message in the code, ChatGPT, basically any window that has focus when Cursor takes the screenshot.
In Cursor: the screenshot appears as a new message sent to Cursor (I think this will use an additional request) [screenshot]
In the code :( [screenshot]
I don't understand this:
@lukasvdberk @tedjames This has been happening for me in ChatGPT as well as within Cursor code files. I attached a screenshot of the code insert in my previous message. This one is easy to replicate: I was working on a file, and you can see that it inserted the filename and message when the focus was in the file.
The other issue is that it has been putting the screenshot in a follow-up message instead of updating the agent's response. This uses 2x the API calls within Cursor, so it gets expensive. Basically, I'm getting this any time I let the agent run a UI-based task while multitasking.
@lukasvdberk I actually looked at your commit. My bad, it might have just been a coincidence in timing. This started happening for me yesterday, but your PR doesn't seem to directly trigger it. I'll open a separate issue.
"The other issue is that it has been putting the screenshot in a followup message instead of updating the agent's response. This uses 2x the api calls within cursor. So this gets expensive."
I think that is a separate issue that is not caused by this PR. I think it has something to do with the recent 1.2 update and the "Allow Auto-Paste into Cursor" setting. Which commit of browser-tool-mcp are you running?
@lukasvdberk Thanks for the fast reply! I'm running with this, so I think it auto-pulls. Anyway, I put out a separate issue, sorry for bothering ya. And thanks for the heads-up on auto-paste: maybe I can just turn that off for a bit 😵‍💫
@100kristine No problem :)
@tedjames Is there something I can do to get this PR merged? Kind regards
????????????????
This PR includes the captured screenshot, base64-encoded, in the MCP response alongside the existing text response. The MCP caller (for example Cursor) can now read the image itself.
I have included an example below where Cursor describes the screenshot it took:
Love this MCP server and let me know if I can make any improvements to my PR!
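For readers who want to see what such a response might look like, here is a minimal sketch, not the code from this PR. It assumes the standard MCP tool-result shape, where `content` is an array of blocks and an image block carries base64 `data` plus a `mimeType`. The helper name `buildScreenshotResult`, the message text, and the sample payload are hypothetical.

```typescript
// Content block shapes following the MCP tool-result format.
type TextContent = { type: "text"; text: string };
type ImageContent = { type: "image"; data: string; mimeType: string };
type ToolResult = { content: Array<TextContent | ImageContent> };

// Hypothetical helper (not from the PR): returns the existing text
// response plus the screenshot as a base64-encoded image block.
function buildScreenshotResult(base64Png: string, message: string): ToolResult {
  return {
    content: [
      { type: "text", text: message },
      { type: "image", data: base64Png, mimeType: "image/png" },
    ],
  };
}

// Sample payload: base64 of the 8-byte PNG signature only, used as a
// stand-in for real screenshot data.
const result = buildScreenshotResult("iVBORw0KGgo=", "Successfully saved screenshot");
console.log(JSON.stringify(result, null, 2));
```

Keeping the text block first means a caller that ignores image blocks still gets the same text response as before.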