
Commit fd67f2a

quanru and claude committed
refactor(core,web-integration,docs): rename API methods for clarity
BREAKING CHANGE: Renamed aiAction() to aiAct() and logScreenshot() to recordToReport() for improved naming consistency. The aiAction() method is kept as deprecated for backward compatibility.

Changes:
- Renamed aiAction() to aiAct() across core and web-integration
- Renamed logScreenshot() to recordToReport()
- Updated all English and Chinese documentation
- Updated code examples in README files
- Updated Playwright fixture to support new method names
- Added deprecation warning for aiAction() method
- Updated all test files and examples

This improves API consistency and clarity while maintaining backward compatibility through deprecated methods.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 80a2c97 commit fd67f2a
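The commit message says `aiAction()` is kept as a deprecated alias that warns and delegates to `aiAct()`. A minimal sketch of that backward-compatibility pattern (a hypothetical illustration, not the actual Midscene source) could look like:

```javascript
// Hypothetical sketch of the deprecation pattern described in the commit
// message: keep the old name working, warn once, and delegate to the new one.
class Agent {
  constructor() {
    this.warned = false;
  }

  async aiAct(prompt) {
    // ...real planning and execution would happen here...
    return `executed: ${prompt}`;
  }

  /** @deprecated Use aiAct() instead. */
  async aiAction(prompt) {
    if (!this.warned) {
      console.warn('aiAction() is deprecated, use aiAct() instead');
      this.warned = true;
    }
    return this.aiAct(prompt);
  }
}

const agent = new Agent();
agent.aiAction('click the button').then((result) => console.log(result));
```

Both spellings reach the same implementation, so existing callers keep working while the warning nudges them toward the new name.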


47 files changed: +242 −188 lines

README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -81,7 +81,7 @@ Read more about [Choose a model](https://midscenejs.com/choose-a-model)
 Midscene will automatically plan the steps and execute them. It may be slower and heavily rely on the quality of the AI model.
 
 ```javascript
-await aiAction('click all the records one by one. If one record contains the text "completed", skip it');
+await aiAct('click all the records one by one. If one record contains the text "completed", skip it');
 ```
 
 ### Workflow Style
````

README.zh.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -81,7 +81,7 @@ Midscene.js supports vision-language models such as `Qwen3-VL` and `Doubao-1.6-vision`
 Midscene will automatically plan the steps and execute them. It may be slower and heavily rely on the quality of the AI model.
 
 ```javascript
-await aiAction('click all the records one by one. If one record contains the text "completed", skip it');
+await aiAct('click all the records one by one. If one record contains the text "completed", skip it');
 ```
 
 ### Workflow Style
````

apps/report/src/components/store/index.tsx

Lines changed: 1 addition & 1 deletion
```diff
@@ -31,7 +31,7 @@ export const useBlackboardPreference = create<{
   },
 }));
 export interface HistoryItem {
-  type: 'aiAction' | 'aiQuery' | 'aiAssert';
+  type: 'aiAct' | 'aiQuery' | 'aiAssert';
   prompt: string;
   timestamp: number;
 }
```

apps/site/docs/en/api.mdx

Lines changed: 15 additions & 15 deletions
````diff
@@ -17,7 +17,7 @@ These Agents share some common constructor parameters:
 - `reportFileName: string`: The name of the report file. (Default: generated by midscene)
 - `autoPrintReportMsg: boolean`: If true, report messages will be printed. (Default: true)
 - `cacheId: string | undefined`: If provided, this cacheId will be used to save or match the cache. (Default: undefined, means cache feature is disabled)
-- `actionContext: string`: Some background knowledge that should be sent to the AI model when calling `agent.aiAction()`, like 'close the cookie consent dialog first if it exists' (Default: undefined)
+- `actionContext: string`: Some background knowledge that should be sent to the AI model when calling `agent.aiAct()`, like 'close the cookie consent dialog first if it exists' (Default: undefined)
 - `onTaskStartTip: (tip: string) => void | Promise<void>`: Optional hook that fires before each execution task begins with a human-readable summary of the task (Default: undefined)
 
 In Playwright and Puppeteer, there are some common parameters:
@@ -42,14 +42,14 @@ In Midscene, you can choose to use either auto planning or instant action.
 
 :::
 
-### `agent.aiAction()` or `.ai()`
+### `agent.aiAct()` or `.ai()`
 
 This method allows you to perform a series of UI actions described in natural language. Midscene automatically plans the steps and executes them.
 
 - Type
 
 ```typescript
-function aiAction(
+function aiAct(
   prompt: string,
   options?: {
     cacheable?: boolean;
@@ -72,7 +72,7 @@ function ai(prompt: string): Promise<void>; // shorthand form
 
 ```typescript
 // Basic usage
-await agent.aiAction(
+await agent.aiAct(
   'Type "JavaScript" into the search box, then click the search button',
 );
 
@@ -82,14 +82,14 @@ await agent.ai(
 );
 
 // When using UI Agent models like ui-tars, you can try a more goal-driven prompt
-await agent.aiAction('Post a Tweet "Hello World"');
+await agent.aiAct('Post a Tweet "Hello World"');
 ```
 
 :::tip
 
 Under the hood, Midscene uses AI model to split the instruction into a series of steps (a.k.a. "Planning"). It then executes these steps sequentially. If Midscene determines that the actions cannot be performed, an error will be thrown.
 
-For optimal results, please provide clear and detailed instructions for `agent.aiAction()`. For guides about writing prompts, you may read this doc: [Tips for Writing Prompts](./prompting-tips).
+For optimal results, please provide clear and detailed instructions for `agent.aiAct()`. For guides about writing prompts, you may read this doc: [Tips for Writing Prompts](./prompting-tips).
 
 Related Documentation:
 
@@ -700,7 +700,7 @@ For more information about YAML scripts, please refer to [Automate with Scripts
 
 ### `agent.setAIActionContext()`
 
-Set the background knowledge that should be sent to the AI model when calling `agent.aiAction()` or `agent.ai()`. This will override the previous setting.
+Set the background knowledge that should be sent to the AI model when calling `agent.aiAct()` or `agent.ai()`. This will override the previous setting.
 
 For instant action type APIs, like `aiTap()`, this setting will not take effect.
 
@@ -749,14 +749,14 @@ const result = await agent.evaluateJavaScript('document.title');
 console.log(result);
 ```
 
-### `agent.logScreenshot()`
+### `agent.recordToReport()`
 
 Log the current screenshot with a description in the report file.
 
 - Type
 
 ```typescript
-function logScreenshot(title?: string, options?: Object): Promise<void>;
+function recordToReport(title?: string, options?: Object): Promise<void>;
 ```
 
 - Parameters:
@@ -772,7 +772,7 @@ function logScreenshot(title?: string, options?: Object): Promise<void>;
 - Examples:
 
 ```typescript
-await agent.logScreenshot('Login page', {
+await agent.recordToReport('Login page', {
   content: 'User A',
 });
 ```
@@ -872,7 +872,7 @@ export MIDSCENE_RUN_DIR=midscene_run # The default value is the midscene_run in
 
 ### Customize the replanning cycle limit
 
-Set the `MIDSCENE_REPLANNING_CYCLE_LIMIT` variable to customize the maximum number of replanning cycles allowed during action execution (`aiAction`).
+Set the `MIDSCENE_REPLANNING_CYCLE_LIMIT` variable to customize the maximum number of replanning cycles allowed during action execution (`aiAct`).
 
 ```bash
 export MIDSCENE_REPLANNING_CYCLE_LIMIT=10 # The default value is 10. When the AI needs to replan more than this limit, an error will be thrown suggesting to split the task into multiple steps
@@ -1143,8 +1143,8 @@ describe('Android Settings Test', () => {
     await sleep(1000);
     await adb.shell('am start -n com.android.settings/.Settings');
     await sleep(1000);
-    await agent.aiAction('find and enter WLAN setting');
-    await agent.aiAction(
+    await agent.aiAct('find and enter WLAN setting');
+    await agent.aiAct(
       'toggle WLAN status *once*, if WLAN is off pls turn it on, otherwise turn it off.',
     );
   });
@@ -1154,8 +1154,8 @@ describe('Android Settings Test', () => {
     await sleep(1000);
     await adb.shell('am start -n com.android.settings/.Settings');
     await sleep(1000);
-    await agent.aiAction('find and enter bluetooth setting');
-    await agent.aiAction(
+    await agent.aiAct('find and enter bluetooth setting');
+    await agent.aiAct(
       'toggle bluetooth status *once*, if bluetooth is off pls turn it on, otherwise turn it off.',
     );
   });
````
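The `MIDSCENE_REPLANNING_CYCLE_LIMIT` behavior documented in this file (default 10, error past the limit) can be sketched as a plain guard. This is a hypothetical simplification, not the actual Midscene implementation:

```javascript
// Hypothetical sketch: read the replanning limit from the environment
// (defaulting to 10) and throw once an action replans past it.
function getReplanningLimit() {
  const raw = process.env.MIDSCENE_REPLANNING_CYCLE_LIMIT;
  const parsed = Number.parseInt(raw ?? '', 10);
  return Number.isNaN(parsed) ? 10 : parsed;
}

function checkReplanCount(count) {
  if (count > getReplanningLimit()) {
    throw new Error(
      'Replanning cycle limit exceeded; consider splitting the task into multiple steps',
    );
  }
}
```

Reading the limit lazily on each check means `export MIDSCENE_REPLANNING_CYCLE_LIMIT=10` takes effect without restarting anything that caches configuration at startup.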

apps/site/docs/en/automate-with-scripts-in-yaml.mdx

Lines changed: 4 additions & 4 deletions
````diff
@@ -183,7 +183,7 @@ If you need to use the `aiActionContext` parameter, you can set it through the g
 ```yaml
 # Global AI agent configuration
 agent:
-  # Background knowledge to send to the AI model when calling aiAction, optional.
+  # Background knowledge to send to the AI model when calling aiAct, optional.
   aiActionContext: <string>
 ```
 
@@ -269,12 +269,12 @@ tasks:
 # Auto Planning (.ai)
 # ----------------
 
-# Perform an interaction. `ai` is a shorthand for `aiAction`.
+# Perform an interaction. `ai` is a shorthand for `aiAct`.
 - ai: <prompt>
   cacheable: <boolean> # Optional, whether to cache the result of this API call when the [caching feature](./caching.mdx) is enabled. Defaults to True.
 
 # This usage is the same as `ai`.
-- aiAction: <prompt>
+- aiAct: <prompt>
   cacheable: <boolean> # Optional, whether to cache the result of this API call when the [caching feature](./caching.mdx) is enabled. Defaults to True.
 
 # Instant Action (.aiTap, .aiHover, .aiInput, .aiKeyboardPress, .aiScroll)
@@ -317,7 +317,7 @@ tasks:
   cacheable: <boolean> # Optional, whether to cache the result of this API call when the [caching feature](./caching.mdx) is enabled. Defaults to True.
 
 # Log the current screenshot with a description in the report file.
-- logScreenshot: <title> # Optional, the title of the screenshot. If not provided, the title will be 'untitled'.
+- recordToReport: <title> # Optional, the title of the screenshot. If not provided, the title will be 'untitled'.
   content: <content> # Optional, the description of the screenshot.
 
 # Data Extraction
````
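Since the YAML docs above treat `ai` as a shorthand for `aiAct`, a YAML runner could normalize step keys through a small alias table. This is a hypothetical sketch of that idea, not the actual midscene CLI code:

```javascript
// Hypothetical sketch: map YAML flow step keys to agent method names,
// treating `ai` (and the deprecated spellings) as aliases of the new names.
const STEP_ALIASES = {
  ai: 'aiAct',
  aiAction: 'aiAct', // deprecated spelling
  logScreenshot: 'recordToReport', // deprecated spelling
};

function normalizeStep(step) {
  // A YAML flow step parses to an object like { ai: '<prompt>' }.
  const [key, value] = Object.entries(step)[0];
  return { method: STEP_ALIASES[key] ?? key, arg: value };
}

console.log(normalizeStep({ ai: 'tap search button' }));
```

Keeping the alias table in one place means old YAML scripts written with `aiAction:` or `logScreenshot:` keep running after the rename.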

apps/site/docs/en/blog-programming-practice-using-structured-api.md

Lines changed: 10 additions & 10 deletions
````diff
@@ -1,24 +1,24 @@
 # Use JavaScript to optimize the AI automation code
 
-Many developers love using `ai` or `aiAction` to accomplish complex tasks, and even describe all logic in a single natural language instruction. Although it may seem 'intelligent', in practice, this approach may not provide a reliable and efficient experience, and results in an endless loop of Prompt tuning.
+Many developers love using `ai` or `aiAct` to accomplish complex tasks, and even describe all logic in a single natural language instruction. Although it may seem 'intelligent', in practice, this approach may not provide a reliable and efficient experience, and results in an endless loop of Prompt tuning.
 
 Here is a typical example, developers may write a large logic storm with long descriptions, such as:
 
 ```javascript
 // complex tasks
-aiAction(`
+aiAct(`
 1. click the first user
 2. click the chat bubble on the right side of the user page
 3. if I have already sent a message to him/her, go back to the previous page
 4. if I have not sent a message to him/her, input a greeting text and click send
 `)
 ```
 
-Another common misconception is that the complex workflow can be effectively controlled using `aiAction` methods. These prompts are far from reliable when compared to traditional JavaScript. For example:
+Another common misconception is that the complex workflow can be effectively controlled using `aiAct` methods. These prompts are far from reliable when compared to traditional JavaScript. For example:
 
 ```javascript
 // not stable !
-aiAction('click all the records one by one. If one record contains the text "completed", skip it')
+aiAct('click all the records one by one. If one record contains the text "completed", skip it')
 ```
 
 ## One path to optimize the automation code: use JavaScript and structured API
@@ -27,7 +27,7 @@ From v0.16.10, Midscene provides data extraction methods like `aiBoolean` `aiStr
 
 Combining them with the instant action methods, like `aiTap`, `aiInput`, `aiScroll`, `aiHover`, etc., you can split complex logic into multiple steps to improve the stability of the automation code.
 
-Let's take the first bad case above, you can convert the `.aiAction` method into a structured API call:
+Let's take the first bad case above, you can convert the `.aiAct` method into a structured API call:
 
 Original prompt:
 
@@ -53,7 +53,7 @@ After modifying the coding style, the whole process can be much more reliable an
 Here is another example, this is what it looks like before rewriting:
 
 ```javascript
-aiAction(`
+aiAct(`
 1. click the first unfollowed user, enter the user's homepage
 2. click the follow button
 3. go back to the previous page
@@ -185,14 +185,14 @@ After you input the prompt, the AI IDE will convert the prompt into structured j
 
 Enjoy it!
 
-## Which approach is best: `aiAction` or structured code?
+## Which approach is best: `aiAct` or structured code?
 
 There is no standard answer. It depends on the model's ability, the complexity of the actual business.
 
-Generally, if you encounter the following situations, you should consider abandoning the `aiAction` method:
+Generally, if you encounter the following situations, you should consider abandoning the `aiAct` method:
 
-- The success rate of `aiAction` does not meet the requirements after multiple retries
-- You have already felt tired and spent too much time repeatedly tuning the `aiAction` prompt
+- The success rate of `aiAct` does not meet the requirements after multiple retries
+- You have already felt tired and spent too much time repeatedly tuning the `aiAct` prompt
 - You need to debug the script step by step
 
 ## What's next?
````

apps/site/docs/en/blog-support-android-automation.mdx

Lines changed: 4 additions & 4 deletions
```diff
@@ -47,10 +47,10 @@ android:
 tasks:
   - name: search headphones
     flow:
-      - aiAction: open browser and navigate to ebay.com
-      - aiAction: type 'Headphones' in ebay search box, hit Enter
+      - aiAct: open browser and navigate to ebay.com
+      - aiAct: type 'Headphones' in ebay search box, hit Enter
       - sleep: 5000
-      - aiAction: scroll down the page for 800px
+      - aiAct: scroll down the page for 800px
 
   - name: extract headphones info
     flow:
@@ -88,7 +88,7 @@ Promise.resolve(
   await sleep(5000);
 
   // 👀 type keywords, perform a search
-  await agent.aiAction('type "Headphones" in search box, hit Enter');
+  await agent.aiAct('type "Headphones" in search box, hit Enter');
 
   // 👀 wait for the loading
   await agent.aiWaitFor("there is at least one headphone item on page");
```

apps/site/docs/en/blog-support-ios-automation.mdx

Lines changed: 6 additions & 6 deletions
```diff
@@ -42,11 +42,11 @@ ios:
 tasks:
   - name: search content
     flow:
-      - aiAction: tap address bar
-      - aiAction: input 'Midscene AI automation'
-      - aiAction: tap search button
+      - aiAct: tap address bar
+      - aiAct: input 'Midscene AI automation'
+      - aiAct: tap search button
       - sleep: 3000
-      - aiAction: scroll down 500px
+      - aiAct: scroll down 500px
 
   - name: extract search results
     flow:
@@ -89,10 +89,10 @@ Promise.resolve(
   await sleep(3000);
 
   // 👀 tap address bar and input search keywords
-  await agent.aiAction('tap address bar and input "Midscene automation"');
+  await agent.aiAct('tap address bar and input "Midscene automation"');
 
   // 👀 perform search
-  await agent.aiAction('tap search button');
+  await agent.aiAct('tap search button');
 
   // 👀 wait for loading to complete
   await agent.aiWaitFor("there is at least one search result on the page");
```

apps/site/docs/en/caching.mdx

Lines changed: 1 addition & 1 deletion
```diff
@@ -19,7 +19,7 @@ With caching hit, time cost is significantly reduced. For example, in the follow
 Midscene's caching mechanism is based on input stability and output reusability. When the same task instructions are repeatedly executed in similar page environments, Midscene will prioritize using cached results to avoid repeated AI model calls, significantly improving execution efficiency.
 
 The core caching mechanisms include:
-- **Task instruction caching**: For planning operations (such as `ai`, `aiAction`), Midscene uses the prompt instruction as the cache key to store the execution plan returned by AI
+- **Task instruction caching**: For planning operations (such as `ai`, `aiAct`), Midscene uses the prompt instruction as the cache key to store the execution plan returned by AI
 - **Element location caching**: For location operations (such as `aiLocate`, `aiTap`), the system uses the location prompt as the cache key to store element XPath information, and verifies whether the XPath is still valid on the next execution
 - **Invalidation mechanism**: When cache becomes invalid, the system automatically falls back to AI model for re-analysis
 - **Never cache query results**: The query results like `aiBoolean`, `aiQuery`, `aiAssert` will never be cached.
```
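The task-instruction caching described in this file, prompt string as cache key, fall back to the model on a miss, can be sketched as follows. This is a hypothetical simplification of Midscene's cache, with the model call stubbed out:

```javascript
// Hypothetical sketch: cache planning results keyed by the prompt string,
// and fall back to the (stubbed) AI model only when the cache misses.
function createPlanCache(planWithModel) {
  const cache = new Map();
  return async function plan(prompt) {
    if (cache.has(prompt)) {
      return { ...cache.get(prompt), fromCache: true };
    }
    const result = await planWithModel(prompt); // expensive model call
    cache.set(prompt, result);
    return { ...result, fromCache: false };
  };
}

// Stub model: counts how many times it is actually invoked.
let modelCalls = 0;
const plan = createPlanCache(async (prompt) => {
  modelCalls += 1;
  return { steps: [`do: ${prompt}`] };
});
```

Query-style results (`aiBoolean`, `aiQuery`, `aiAssert`) would simply bypass a cache like this, matching the "never cache query results" rule above.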

apps/site/docs/en/changelog.mdx

Lines changed: 7 additions & 7 deletions
````diff
@@ -13,8 +13,8 @@ We've adapted the latest Qwen `Qwen3-VL` model, giving developers faster and mor
 
 ### 🤖 AI core capability enhancement
 
-- **UI-TARS Model Performance Optimization**: Optimized aiAction planning, improved dialogue history management, and provided better context awareness capabilities
-- **AI Assertion and Action Optimization**: We updated the prompt for `aiAssert` and optimized the internal implementation of `aiAction`, making AI-driven assertions and action execution more precise and reliable
+- **UI-TARS Model Performance Optimization**: Optimized aiAct planning, improved dialogue history management, and provided better context awareness capabilities
+- **AI Assertion and Action Optimization**: We updated the prompt for `aiAssert` and optimized the internal implementation of `aiAct`, making AI-driven assertions and action execution more precise and reliable
 
 ### 📊 Reporting and debugging experience optimization
 - **URL Parameter Playback Control**: To improve debugging experience, you can now directly control the default behavior of report playback through URL parameters
@@ -194,7 +194,7 @@ Based on the introduction of [Rslib](https://github.com/web-infra-dev/rslib) in
 - Support storing more complex data structures, laying the foundation for future feature extensions
 
 #### 3️⃣ Customize replanning cycle limit
-- Set the `MIDSCENE_REPLANNING_CYCLE_LIMIT` environment variable to customize the maximum number of re-planning cycles allowed when executing operations (aiAction).
+- Set the `MIDSCENE_REPLANNING_CYCLE_LIMIT` environment variable to customize the maximum number of re-planning cycles allowed when executing operations (aiAct).
 - The default value is 10. When the AI needs to re-plan more than this limit, an error will be thrown and suggest splitting the task.
 - Provide more flexible task execution control, adapting to different automation scenarios
 
@@ -306,14 +306,14 @@ Reduce the size of the generated report by trimming redundant data, significantl
 
 ### Custom node in report
 
-* Add the `logScreenshot` API to the agent. Take a screenshot of the current page as a report node, and support setting the node title and description to make the automated testing process more intuitive. Applicable for capturing screenshots of key steps, error status capture, UI validation, etc.
+* Add the `recordToReport` API to the agent. Take a screenshot of the current page as a report node, and support setting the node title and description to make the automated testing process more intuitive. Applicable for capturing screenshots of key steps, error status capture, UI validation, etc.
 
-![](/blog/logScreenshot-api.png)
+![](/blog/recordToReport-api.png)
 
 * Example:
 
 ```javascript
-test('login github', async ({ ai, aiAssert, aiInput, logScreenshot }) => {
+test('login github', async ({ ai, aiAssert, aiInput, recordToReport }) => {
   if (CACHE_TIME_OUT) {
     test.setTimeout(200 * 1000);
   }
@@ -322,7 +322,7 @@ test('login github', async ({ ai, aiAssert, aiInput, logScreenshot }) => {
   await aiInput('123456', 'password');
 
   // log by your own
-  await logScreenshot('Login page', {
+  await recordToReport('Login page', {
     content: 'Username is quanru, password is 123456',
   });
 
````