Image generation tool #6749
Draft: ericstj wants to merge 26 commits into dotnet:main from ericstj:ImageGenerationTool
26 commits
ffe9a92 Prototype of using ImageGenerationTool (ericstj)
e5edc77 Handle DataContent returned from ImageGen (ericstj)
2d19cce React to rename and improve metadata (ericstj)
5eef474 Handle image_generation tool content from streaming (ericstj)
ff80804 Add handling for combining updates with images (ericstj)
1725ce1 Add tests for new ChatResponseUpdateExtensions (ericstj)
c44f5fb Merge branch 'main' of https://github.com/dotnet/extensions into Imag… (ericstj)
b4fe94b Rename ImageGenerationTool to HostedImageGenerationTool (ericstj)
06bfa30 Remove ChatResponseUpdateCoalescingOptions (ericstj)
ca8b15d Add ImageGeneratingChatClient (ericstj)
62e0ac5 Fix namespace of tool (ericstj)
81e6e5a Replace traces of function calling (ericstj)
6559a66 More namepsace fix (ericstj)
398bbdb Enable editing (ericstj)
ac2de35 Merge branch 'main' of https://github.com/dotnet/extensions into Imag… (ericstj)
1d96532 Update to preview OpenAI with image tool support (ericstj)
6a6ffa2 Temporary OpenAI feed (ericstj)
94ceab2 Fix tests (ericstj)
96e9747 Add integration tests for ImageGeneratingChatClient (ericstj)
9ddc91a Remove ChatRole.Tool -> Assistant workaround (ericstj)
3b589ac Remove use of private reflection for Image results (ericstj)
20919ab Add ChatResponseUpdate.Clone (ericstj)
e5f68a6 Move all mutable state into RequestState object (ericstj)
9f9a430 Adjust prompt to improve integration test reliability (ericstj)
799a72e Refactor tool initialization (ericstj)
6029b01 Add integration tests for streaming (ericstj)
@@ -119,6 +119,95 @@ static async Task AddMessagesAsync(
            list.AddMessages(await updates.ToChatResponseAsync(cancellationToken).ConfigureAwait(false));
    }

    /// <summary>Applies a <see cref="ChatResponseUpdate"/> to an existing <see cref="ChatResponse"/>.</summary>
    /// <param name="response">The response to which the update should be applied.</param>
    /// <param name="update">The update to apply to the response.</param>
    /// <exception cref="ArgumentNullException"><paramref name="response"/> is <see langword="null"/>.</exception>
    /// <exception cref="ArgumentNullException"><paramref name="update"/> is <see langword="null"/>.</exception>
    /// <remarks>
    /// This method modifies the existing <paramref name="response"/> by incorporating the content and metadata
    /// from the <paramref name="update"/>. This includes using <see cref="ChatResponseUpdate.MessageId"/> to determine
    /// message boundaries, as well as coalescing contiguous <see cref="AIContent"/> items where applicable, e.g. multiple
    /// <see cref="TextContent"/> instances in a row may be combined into a single <see cref="TextContent"/>.
    /// </remarks>
    [Experimental("MEAI001")]
    public static void ApplyUpdate(this ChatResponse response, ChatResponseUpdate update)
    {
        _ = Throw.IfNull(response);
        _ = Throw.IfNull(update);

        ProcessUpdate(update, response);
        FinalizeResponse(response);
    }

    /// <summary>Applies <see cref="ChatResponseUpdate"/> instances to an existing <see cref="ChatResponse"/>.</summary>
    /// <param name="response">The response to which the updates should be applied.</param>
    /// <param name="updates">The updates to apply to the response.</param>
    /// <exception cref="ArgumentNullException"><paramref name="response"/> is <see langword="null"/>.</exception>
    /// <exception cref="ArgumentNullException"><paramref name="updates"/> is <see langword="null"/>.</exception>
    /// <remarks>
    /// This method modifies the existing <paramref name="response"/> by incorporating the content and metadata
    /// from the <paramref name="updates"/>. This includes using <see cref="ChatResponseUpdate.MessageId"/> to determine
    /// message boundaries, as well as coalescing contiguous <see cref="AIContent"/> items where applicable, e.g. multiple
    /// <see cref="TextContent"/> instances in a row may be combined into a single <see cref="TextContent"/>.
    /// </remarks>
    [Experimental("MEAI001")]
    public static void ApplyUpdates(this ChatResponse response, IEnumerable<ChatResponseUpdate> updates)
    {
        _ = Throw.IfNull(response);
        _ = Throw.IfNull(updates);

        if (updates is ICollection<ChatResponseUpdate> { Count: 0 })
        {
            return;
        }

        foreach (var update in updates)
        {
            ProcessUpdate(update, response);
        }

        FinalizeResponse(response);
    }

    /// <summary>Applies <see cref="ChatResponseUpdate"/> instances to an existing <see cref="ChatResponse"/> asynchronously.</summary>
    /// <param name="response">The response to which the updates should be applied.</param>
    /// <param name="updates">The updates to apply to the response.</param>
    /// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
    /// <returns>A <see cref="Task"/> representing the completion of the operation.</returns>
    /// <exception cref="ArgumentNullException"><paramref name="response"/> is <see langword="null"/>.</exception>
    /// <exception cref="ArgumentNullException"><paramref name="updates"/> is <see langword="null"/>.</exception>
    /// <remarks>
    /// This method modifies the existing <paramref name="response"/> by incorporating the content and metadata
    /// from the <paramref name="updates"/>. This includes using <see cref="ChatResponseUpdate.MessageId"/> to determine
    /// message boundaries, as well as coalescing contiguous <see cref="AIContent"/> items where applicable, e.g. multiple
    /// <see cref="TextContent"/> instances in a row may be combined into a single <see cref="TextContent"/>.
    /// </remarks>
    [Experimental("MEAI001")]
    public static Task ApplyUpdatesAsync(
        this ChatResponse response,
        IAsyncEnumerable<ChatResponseUpdate> updates,
        CancellationToken cancellationToken = default)
    {
        _ = Throw.IfNull(response);
        _ = Throw.IfNull(updates);

        return ApplyUpdatesAsync(response, updates, cancellationToken);

        static async Task ApplyUpdatesAsync(
            ChatResponse response,
            IAsyncEnumerable<ChatResponseUpdate> updates,
            CancellationToken cancellationToken)
        {
            await foreach (var update in updates.WithCancellation(cancellationToken).ConfigureAwait(false))
            {
                ProcessUpdate(update, response);
            }

            FinalizeResponse(response);
        }
    }

    /// <summary>Combines <see cref="ChatResponseUpdate"/> instances into a single <see cref="ChatResponse"/>.</summary>
    /// <param name="updates">The updates to be combined.</param>
    /// <returns>The combined <see cref="ChatResponse"/>.</returns>
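The ApplyUpdatesAsync method in the hunk above is a non-async wrapper that validates its arguments and then delegates to an async local function. The likely motivation is that the null checks then throw at the call site instead of being captured in the returned Task. The following is a minimal, self-contained sketch of that pattern only; `SumAsync`, `Core`, and `Numbers` are hypothetical names for illustration and are not part of the PR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper with the same shape as ApplyUpdatesAsync: the outer function is
// not async, so argument validation runs synchronously when it is called.
Task<int> SumAsync(IAsyncEnumerable<int>? values, CancellationToken cancellationToken = default)
{
    if (values is null)
    {
        throw new ArgumentNullException(nameof(values));
    }

    return Core(values, cancellationToken);

    // The async work lives in a static local function, so it cannot capture outer state.
    static async Task<int> Core(IAsyncEnumerable<int> values, CancellationToken cancellationToken)
    {
        int total = 0;
        await foreach (int value in values.WithCancellation(cancellationToken).ConfigureAwait(false))
        {
            total += value;
        }

        return total;
    }
}

// Hypothetical async stream used only to drive the example.
async IAsyncEnumerable<int> Numbers()
{
    yield return 1;
    await Task.Yield();
    yield return 2;
    yield return 3;
}

bool threwSynchronously = false;
try
{
    _ = SumAsync(null); // throws here, before any Task is returned
}
catch (ArgumentNullException)
{
    threwSynchronously = true;
}

int sum = await SumAsync(Numbers());
Console.WriteLine(threwSynchronously); // True
Console.WriteLine(sum);                // 6
```

Had `SumAsync` itself been declared `async`, the ArgumentNullException would surface only when the returned Task is awaited.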
@@ -372,6 +461,29 @@ private static void ProcessUpdate(ChatResponseUpdate update, ChatResponse respon
                    (response.Usage ??= new()).Add(usage.Details);
                    break;

                case DataContent dataContent when
                    !string.IsNullOrEmpty(dataContent.Name):
                    // Check if there's an existing DataContent with the same name to replace
                    for (int i = 0; i < message.Contents.Count; i++)
Review comment: Does this change the algorithm to be O(N^2)?
                    {
                        if (message.Contents[i] is DataContent existingDataContent &&
                            string.Equals(existingDataContent.Name, dataContent.Name, StringComparison.Ordinal))
                        {
                            // Replace the existing DataContent
                            message.Contents[i] = dataContent;
                            dataContent = null!;
                            break;
                        }
                    }

                    if (dataContent is not null)
                    {
                        // No existing DataContent with the same name, add it normally
                        message.Contents.Add(dataContent);
                    }

                    break;

                default:
                    message.Contents.Add(content);
                    break;
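On the reviewer's complexity question: the loop above scans the message's contents once for every named DataContent, so a message that accumulates N such items costs on the order of N^2 name comparisons in the worst case. One possible mitigation, sketched below under stated assumptions, is a side dictionary from name to list position. The tuple list stands in for `message.Contents`, and `indexByName` and `AddOrReplace` are hypothetical names that do not appear in the PR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for message.Contents: each item is (Name, Payload).
// A null Name models content that is not a named DataContent.
var contents = new List<(string? Name, string Payload)>();

// Hypothetical side index: content name -> position in `contents`.
var indexByName = new Dictionary<string, int>(StringComparer.Ordinal);

void AddOrReplace((string? Name, string Payload) item)
{
    if (item.Name is { Length: > 0 } name)
    {
        if (indexByName.TryGetValue(name, out int existing))
        {
            // Same name seen before: replace in place, preserving its position.
            contents[existing] = item;
            return;
        }

        // First item with this name: remember where it is about to be appended.
        indexByName[name] = contents.Count;
    }

    contents.Add(item);
}

AddOrReplace((null, "text"));
AddOrReplace(("image_1", "partial bytes"));
AddOrReplace(("image_1", "final bytes"));
AddOrReplace(("image_2", "other image"));

Console.WriteLine(contents.Count);      // 3
Console.WriteLine(contents[1].Payload); // final bytes
```

The index stays valid only while items are appended or replaced in place; any removal or reordering of the list would require rebuilding it. Whether the extra state is worth it depends on how many named items a single message realistically carries.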
src/Libraries/Microsoft.Extensions.AI.Abstractions/Tools/HostedImageGenerationTool.cs
49 changes: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.Diagnostics.CodeAnalysis;
using System.Text.Json.Serialization;

namespace Microsoft.Extensions.AI;

/// <summary>Represents a hosted tool that can be specified to an AI service to enable it to perform image generation.</summary>
/// <remarks>
/// This tool does not itself implement image generation. It is a marker that can be used to inform a service
/// that the service is allowed to perform image generation if the service is capable of doing so.
/// </remarks>
[Experimental("MEAI001")]
public class HostedImageGenerationTool : AITool
{
    /// <summary>
    /// Initializes a new instance of the <see cref="HostedImageGenerationTool"/> class.
    /// </summary>
    public HostedImageGenerationTool()
        : base()
    {
    }

    /// <summary>
    /// Gets or sets the options used to configure image generation.
    /// </summary>
    public ImageGenerationOptions? Options { get; set; }

    /// <summary>
    /// Gets or sets a callback responsible for creating the raw representation of the image generation tool from an underlying implementation.
    /// </summary>
    /// <remarks>
    /// The underlying <see cref="IChatClient" /> implementation can have its own representation of this tool.
    /// When <see cref="IChatClient.GetResponseAsync" /> or <see cref="IChatClient.GetStreamingResponseAsync" /> is invoked with an
    /// <see cref="HostedImageGenerationTool" />, that implementation can convert the provided tool and options into its own representation in
    /// order to use it while performing the operation. For situations where a consumer knows which concrete <see cref="IChatClient" /> is being used
    /// and how it represents this tool, a new instance of that implementation-specific tool type can be returned by this
    /// callback for the <see cref="IChatClient" /> implementation to use instead of creating a new instance.
    /// Such implementations might mutate the supplied options instance further based on other settings supplied on this
    /// <see cref="HostedImageGenerationTool" /> instance or from other inputs; therefore, it is <b>strongly recommended</b> to not
    /// return shared instances and instead make the callback return a new instance on each call.
    /// This is typically used to set an implementation-specific setting that isn't otherwise exposed from the strongly typed
    /// properties on <see cref="ImageGenerationOptions" />.
    /// </remarks>
    [JsonIgnore]
    public Func<IChatClient, object?>? RawRepresentationFactory { get; set; }
}
Is the existing ToChatResponse then just:
or is there a meaningful behavioral or performance difference with that?