feat(inspect): Add train model dialog, job logs and models dataset #3053

MarkRedeman · 2025-10-24T09:14:07Z

📝 Description

This PR contains the changes I made for yesterday's demo. Please check the individual commits for more specific changes.
In summary this:

Adds a "Train model" button that opens a dialog to let the user choose their model
Adds a "Show logs" button that can be used to see training logs in the UI
Use query parameters to determine which tab is opened
Adds the models tab showing all trained models and non-completed jobs
Invoke inference when changing to a different model

On the backend I made two changes:

Update the end time of a job so that we can compute its duration (not yet used in the UI)
Change the formatting of the SSE endpoint

✨ Changes

Select what type of change your PR is:

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

📚 I have made the necessary updates to the documentation (if applicable).
🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).
🏷️ My PR title follows conventional commit format.

For more information about code review checklists, see the Code Review Checklist.

Set end_time to the current UTC time when a job reaches COMPLETED, FAILED, or CANCELED.

Emit SSE-formatted messages (prepend "data: " and send a DONE sentinel) from the backend stream so clients can detect completion. Add a new fullscreen Job Logs dialog component (show-job-logs.component.tsx) that uses EventSource and react-query streamedQuery to stream and render job logs in the UI.

This allows us to change the tab that's been opened by changing the url. Which is useful when we want to focus the models tab after training a model.

Load the '/thumbnail' image endpoint instead of '/full' for dataset items to reduce bandwidth and improve list rendering performance.

Introduce TrainModelButton, TrainModelDialog, and TrainableModelListBox to enable starting model training from the UI. Show a persistent toast with a Train action when a project has >= 20 images. Wire the button into the Dataset header, simplify Dataset content (remove status panel/divider), and adjust upload button styling.

Reorder providers so SelectedMediaItemProvider wraps InferenceProvider. Use useSelectedMediaItem inside InferenceProvider and add an onSetSelectedModelId wrapper that sets the model and invokes onInference when a media item is selected. Pass the wrapper into the context.

Add a new Models component to list project models and active training jobs. Export useProjectTrainingJobs and useRefreshModelsOnJobUpdates and improve the logic for detecting job completion to trigger model refreshes. Refactor TrainingInProgress: extract status/heading mapping, consolidate rendering, show job logs, and simplify imports. Remove the ReadyToTrain UI and wire the new Models view into the sidebar (fix icon import and props).

Copilot

Pull Request Overview

This PR adds a complete workflow for model training and inspection including a train model dialog with model selection, real-time job logs viewing, and a models tab to track training progress. The changes enable users to select and train models, monitor their progress, and view associated logs through the UI. Query parameters are now used to persist the selected sidebar tab across navigation.

Key Changes

Added a train model dialog allowing users to select from available models and initiate training
Implemented real-time job logs viewer using Server-Sent Events (SSE)
Created a models tab displaying trained models and in-progress training jobs
Updated navigation to use query parameters for tab state management

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
application/ui/src/routes/welcome.tsx	Sets default mode to Dataset tab via query parameter
application/ui/src/routes/router.tsx	Updates redirect to include Dataset mode query parameter
application/ui/src/routes/inspect/inspect.tsx	Reorders provider hierarchy for inference context access
application/ui/src/features/inspect/train-model/trainable-model-list-box.component.tsx	Implements model selection UI with performance ratings
application/ui/src/features/inspect/train-model/train-model.module.scss	Adds styles for model selection cards and ratings
application/ui/src/features/inspect/train-model/train-model-dialog.component.tsx	Creates dialog component for initiating model training
application/ui/src/features/inspect/train-model/train-model-button.component.tsx	Adds button to trigger train model dialog with validation
application/ui/src/features/inspect/sidebar.component.tsx	Integrates query parameter-based tab navigation
application/ui/src/features/inspect/projects-management/project-list-item/project-list-item.component.tsx	Updates project navigation to set Dataset mode
application/ui/src/features/inspect/models/models.component.tsx	Creates models view displaying training jobs and completed models
application/ui/src/features/inspect/jobs/show-job-logs.component.tsx	Implements SSE-based job logs viewer
application/ui/src/features/inspect/inference-provider.component.tsx	Adds automatic inference triggering on model selection
application/ui/src/features/inspect/dataset/dataset.component.tsx	Integrates train button and training notification toast
application/ui/src/features/inspect/dataset/dataset-status-panel.component.tsx	Refactors training status display and adds job logs button
application/ui/src/features/inspect/dataset/dataset-item/dataset-item.component.tsx	Changes image endpoint from full to thumbnail
application/backend/src/services/job_service.py	Updates job end time tracking and SSE message formatting

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

application/ui/src/features/inspect/dataset/dataset-status-panel.component.tsx

application/ui/src/features/inspect/jobs/show-job-logs.component.tsx

application/ui/src/features/inspect/train-model/trainable-model-list-box.component.tsx

application/ui/src/features/inspect/models/models.component.tsx

Log the original message on SSE parse failures to aid debugging. Change duration to allow null and use null for in-progress jobs instead of Infinity. Remove unused InfoTooltip and related imports from the trainable model list to clean up dead code.

Fixes PLR1714

camiloHimura · 2025-10-24T11:28:14Z

application/ui/src/features/inspect/dataset/dataset.component.tsx

+        await queryClient.invalidateQueries({ queryKey: imagesOptions.queryKey });
+        const images = await queryClient.ensureQueryData(imagesOptions);
+
+        if (images.media.length >= 20) {


Is 20 going to be the constant number used for training?

Yes, we already have a constant for it but I forgot to use it here, will update.

We might also change this to a lower number in the future.

camiloHimura · 2025-10-24T11:30:42Z

application/ui/src/features/inspect/jobs/show-job-logs.component.tsx

+                        console.error('Could not parse message:', message);
+                    }
+
+                    ({ promise, resolve, reject } = Promise.withResolvers<string>());


First time using this method, felt like a nice use case. :D
I did worry a bit that maybe this code is a bit too weird, but I expect we'd only need it here so I don't see too many issues with it.

camiloHimura · 2025-10-24T11:32:18Z

application/ui/src/features/inspect/jobs/show-job-logs.component.tsx

+                    padding='size-200'
+                    backgroundColor={'gray-50'}
+                    UNSAFE_style={{
+                        fontSize: '11px',


Shouldn’t we use the Spectrum CSS variables instead?

camiloHimura · 2025-10-24T11:36:07Z

application/ui/src/features/inspect/models/models.component.tsx

+            if (job) {
+                const end = job.end_time ? new Date(job.end_time) : new Date();
+                // Job duration in seconds
+                duration = Math.floor((end.getTime() - start.getTime()) / 1000);


Let’s rename it to durationInSeconds so the comment isn’t needed anymore

camiloHimura · 2025-10-24T11:38:47Z

application/ui/src/features/inspect/models/models.component.tsx

+            }
+
+            return {
+                id: model.id!,


If possible, let’s avoid using the Non-Null Assertion Operator. When we’re sure about the value, we can just use String(model.id) instead

Yeah, the one thing I wanted to check as well is if we can make these ids non-nullable as that would save us a lot of effort. @maxxgx can you look into this?

I don't think I understand where nullable model.id is coming from? Where does the UI get the schema from?

We get this from the OpenAPI schema from the server. If you have the server running (on the default port) then you can run,

npm run build:api

from the ui folder and this will generate two files: a openapi-spec.json and openapi-spec.d.ts. The first is taken directly from the backend's OpenAPI json schema. The second is typescript types inferred based on that.

In this case the type inference happens on the models endpoint which returns a ModelList which in turn returns the Model pydantic model.
The nullable propery comes from BaseIDModel's ID field which takes a default_factory=uuid which makes it nullable. Removing that default factory should solve it from the UI's point of view, but we might need to double check that it doesn't introduce regressions in the other models.

camiloHimura · 2025-10-24T11:43:04Z

application/ui/src/features/inspect/train-model/trainable-model-list-box.component.tsx

+                <Flex alignItems={'center'} gap={'size-100'}>
+                    {RateColors[rating].map((color, idx) => (
+                        <View
+                            key={idx}


Declare const rateLabels = Object.keys(RateColors); outside the component

Suggested change

key={idx}

key={`rate-${rateLabels[idx]}`}

camiloHimura · 2025-10-24T11:44:51Z

application/ui/src/features/inspect/inference-provider.component.tsx

+        setSelectedModelId(modelId);
+
+        if (modelId) {
+            if (selectedMediaItem) {


Any particular reason for having them in separate if statements instead of combining them into one?

Not anymore, will combine them. I think before I had some extra stuff in the first body but that's no longer needed.

application/backend/src/services/job_service.py

Copilot

Pull Request Overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-24T15:37:34Z

application/ui/src/features/inspect/dataset/dataset-status-panel.component.tsx

+            const shouldRefetchModels = jobs.some((job, idx) => {
+                // NOTE: assuming index stays the same
+                return job.status === 'completed' && job.status !== prevJobsRef.current.at(idx)?.status;
+            });


The logic assumes array indices remain stable between job updates, which may not be true if jobs are reordered, filtered, or removed. This could cause the model refresh to be triggered incorrectly or not at all. Consider comparing jobs by ID instead of index.

Copilot · 2025-10-24T15:37:35Z

application/ui/src/features/inspect/train-model/trainable-model-list-box.component.tsx

+    // NOTE: we will need to update the trainable models endpoint to return more info
+    const models = trainableModels.map((model) => {
+        return {
+            modelTemplateId: model.id,
+            name: capitalize(model.name),
+            license: 'Apache 2.0',
+            performanceRatings: {
+                accuracy: 1,
+                inferenceSpeed: 1,
+                trainingTime: 1,
+            },
+            performanceCategory: PerformanceCategory.OTHER,
+        } satisfies SupportedAlgorithm;
+    });


Hardcoded placeholder values (all ratings set to 1, license 'Apache 2.0', category 'OTHER') reduce the usefulness of the performance rating UI. The TODO comment indicates this is temporary, but consider adding validation or warnings if the backend doesn't provide this data to prevent misleading users with incorrect performance metrics.

Copilot · 2025-10-24T15:37:35Z

application/ui/src/features/inspect/jobs/show-job-logs.component.tsx

+                    } catch {
+                        console.error('Could not parse message:', message);


The error message logs to console.error but doesn't expose the parsing error details to help diagnose why JSON parsing failed. Consider logging the actual error or the problematic message content for better debugging.

Suggested change

} catch {

console.error('Could not parse message:', message);

} catch (err) {

console.error('Could not parse message:', message, 'Error:', err);

MarkRedeman added 7 commits October 24, 2025 11:04

fix(job_service): Record end_time on terminal job statuses

f74fee8

Set end_time to the current UTC time when a job reaches COMPLETED, FAILED, or CANCELED.

feat(inspect): Use URLSearchParams to decide the tab that's opened

cee0c21

This allows us to change the tab that's been opened by changing the url. Which is useful when we want to focus the models tab after training a model.

perf(inspect/dataset-item): Use thumbnail for dataset item media URL

9363e06

Load the '/thumbnail' image endpoint instead of '/full' for dataset items to reduce bandwidth and improve list rendering performance.

MarkRedeman requested a review from samet-akcay as a code owner October 24, 2025 09:14

Copilot AI review requested due to automatic review settings October 24, 2025 09:14

Copilot AI reviewed Oct 24, 2025

View reviewed changes

MarkRedeman added 2 commits October 24, 2025 11:23

refactor(job_service): Use set membership for status checks

a6fc18f

Fixes PLR1714

ashwinvaidya17 approved these changes Oct 24, 2025

View reviewed changes

camiloHimura reviewed Oct 24, 2025

View reviewed changes

maxxgx reviewed Oct 24, 2025

View reviewed changes

application/backend/src/services/job_service.py Show resolved Hide resolved

Apply code review feedback

ab1e8c1

Copilot AI review requested due to automatic review settings October 24, 2025 15:36

Copilot AI reviewed Oct 24, 2025

View reviewed changes

camiloHimura approved these changes Oct 27, 2025

View reviewed changes

maxxgx mentioned this pull request Oct 27, 2025

chore(inspect): improve SSE generator #3060

Merged

15 tasks

MarkRedeman merged commit 6bf1e55 into open-edge-platform:feature/geti-inspect Oct 27, 2025
14 of 17 checks passed

		} catch {
		console.error('Could not parse message:', message);

feat(inspect): Add train model dialog, job logs and models dataset #3053

feat(inspect): Add train model dialog, job logs and models dataset #3053

Uh oh!

Conversation

MarkRedeman commented Oct 24, 2025

📝 Description

✨ Changes

✅ Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Key Changes

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants