Request for .NET Example Using InvokeModelWithBidirectionalStreamAsync with Nova Sonic #4065
Hi 👋 I'm currently working on integrating Amazon Bedrock into a .NET application and would greatly appreciate an example demonstrating how to use `InvokeModelWithBidirectionalStreamAsync` in C# with the Nova Sonic model. Specifically, I'm looking for guidance on:

I've seen examples in other languages (like Python and JavaScript), but the .NET implementation details are still unclear, especially around the event-driven streaming model. An official example or even a minimal working snippet would be incredibly helpful for developers trying to build real-time applications with Bedrock in .NET. Thanks in advance for your support!
Hi @tlierdotfr

Here is a sample app I wrote a while back that uses the `InvokeModelWithBidirectionalStreamAsync` API and the `NAudio.Wave` NuGet package for handling the audio input and output. Keep in mind with the audio stuff I have only tested it on my laptop and I wouldn't say the code is the most durable. But hopefully it shows how to publish and receive events using the `InvokeModelWithBidirectionalStreamAsync` API. The app is a console .NET 8 app, but due to the use of `NAudio.Wave` it is restricted to Windows.

**Program.cs**

```csharp
using System.Text;
using System.Text.Json;
using NAudio.Wave;
using Amazon;
using Amazon.BedrockRuntime.Model;
using Amazon.BedrockRuntime;
using Amazon.Runtime.EventStreams;
using AudioStreamingAI;
// Configure the input device
WaveInEvent waveIn = new WaveInEvent();
waveIn.WaveFormat = new WaveFormat(16000, 16, 1);
waveIn.DeviceNumber = ChooseInputDevice();
// Configure the output device
WaveOutEvent waveOut = new WaveOutEvent();
waveOut.DeviceNumber = ChooseOutputDevice();
// The bufferedWaveProvider is used by the InvokeModelWithBidirectionalStreamAsync response streaming to add
// audio data that should be played by the output device.
BufferedWaveProvider bufferedWaveProvider = new BufferedWaveProvider(waveIn.WaveFormat)
{
// I found I needed to expand the default buffer size otherwise I would occasionally get errors from NAudio.
BufferDuration = TimeSpan.FromMinutes(1)
};
//
waveOut.Init(bufferedWaveProvider);
waveOut.Play();
var request = new InvokeModelWithBidirectionalStreamRequest
{
ModelId = "amazon.nova-sonic-v1:0"
};
// Hook up our application's event publisher as the Func on the InvokeModelWithBidirectionalStreamRequest
var publisher = new EventPublisher(waveIn.WaveFormat);
request.BodyPublisher = publisher.PublishAsync;
// As the input device receives audio data, add it to our application's publisher.
waveIn.DataAvailable += (o, e) =>
{
publisher.AddAudioEvent(e.Buffer, 0, e.BytesRecorded);
};
// Start recording from the microphone
waveIn.StartRecording();
// Start the bi-directional call. Be sure the input device has started recording and the DataAvailable callback is setup
// before starting the call.
CancellationTokenSource cancelSource = new CancellationTokenSource();
using var client = new AmazonBedrockRuntimeClient(RegionEndpoint.USEast1);
using var response = await client.InvokeModelWithBidirectionalStreamAsync(request, cancelSource.Token);
// Setup the response callbacks to get the audio output.
response.Body.ChunkReceived += Body_ChunkReceived;
response.Body.ExceptionReceived += Body_ExceptionReceived;
_ = response.Body.StartProcessingAsync();
Console.WriteLine("Begin speaking with Amazon Nova Sonic...");
Console.WriteLine("Press any key to stop.");
Console.ReadKey();
waveIn.StopRecording();
cancelSource.Cancel();
void Body_ChunkReceived(object? sender, EventStreamEventReceivedArgs<BidirectionalOutputPayloadPart> e)
{
var str = Encoding.UTF8.GetString(e.EventStreamEvent.Bytes.ToArray());
// Look to see if the incoming event is an audio event. If so add the
// audio event output to the BufferedWaveProvider for the output device to play.
var jsonDoc = JsonDocument.Parse(str);
if (!jsonDoc.RootElement.TryGetProperty("event", out var evnt))
{
return;
}
if (!evnt.TryGetProperty("audioOutput", out var audioOutput))
{
return;
}
if (!audioOutput.TryGetProperty("content", out var base64Content))
{
return;
}
var data = Convert.FromBase64String(base64Content.GetString()!);
bufferedWaveProvider.AddSamples(data, 0, data.Length);
}
void Body_ExceptionReceived(object? sender, EventStreamExceptionReceivedArgs<BedrockRuntimeEventStreamException> e)
{
Console.WriteLine(e.EventStreamException.Message);
cancelSource.Cancel();
}
int ChooseInputDevice()
{
var deviceDescriptions = new List<string>();
for (var i = 0; i < WaveIn.DeviceCount; i++)
{
var capabilities = WaveIn.GetCapabilities(i);
deviceDescriptions.Add(capabilities.ProductName);
}
return ChooseDevice("Choose input device:", deviceDescriptions);
}
int ChooseOutputDevice()
{
var deviceDescriptions = new List<string>();
for (var i = 0; i < WaveOut.DeviceCount; i++)
{
var capabilities = WaveOut.GetCapabilities(i);
deviceDescriptions.Add(capabilities.ProductName);
}
return ChooseDevice("Choose output device:", deviceDescriptions);
}
int ChooseDevice(string label, IList<string> descriptions)
{
while (true)
{
Console.WriteLine(label);
Console.WriteLine("--------------------");
for (var i = 0; i < descriptions.Count; i++)
{
Console.WriteLine($"{i}: {descriptions[i]}");
}
var input = Console.ReadKey();
if (!char.IsDigit(input.KeyChar))
{
Console.Error.WriteLine("Invalid device number");
continue;
}
Console.WriteLine(string.Empty);
var device = (int)(input.KeyChar - '0');
if (device < 0 || device >= descriptions.Count)
{
Console.Error.WriteLine("Invalid device number");
continue;
}
return device;
}
}
```

**EventPublisher.cs**

```csharp
using System.Collections.Concurrent;
using System.Text;
using Amazon.BedrockRuntime.Model;
using NAudio.Wave;
namespace AudioStreamingAI;
internal class EventPublisher
{
BlockingCollection<IInvokeModelWithBidirectionalStreamInputEvent> _queuedEvents = new BlockingCollection<IInvokeModelWithBidirectionalStreamInputEvent>(new ConcurrentQueue<IInvokeModelWithBidirectionalStreamInputEvent>());
public EventPublisher(WaveFormat waveFormat)
{
AddInitialEvents(waveFormat);
}
/// <summary>
/// This instance method will be set as the BodyPublisher for the InvokeModelWithBidirectionalStreamRequest.
/// </summary>
/// <returns></returns>
public Task<IInvokeModelWithBidirectionalStreamInputEvent> PublishAsync()
{
// If there are no events in the queue the Take method will block.
var evnt = _queuedEvents.Take();
return Task.FromResult(evnt);
}
/// <summary>
/// These are the initial events used to start a speech-to-speech conversation.
/// </summary>
/// <param name="waveFormat"></param>
private void AddInitialEvents(WaveFormat waveFormat)
{
var json1 = """
{
"event": {
"sessionStart": {
"inferenceConfiguration": {
"maxTokens": 1024,
"topP": 0.95,
"temperature": 0.7
}
}
}
}
""".Trim();
_queuedEvents.Add(new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(json1) });
var json2 = """
{
"event": {
"promptStart": {
"promptName": "126680f5-5859-4d15-ae70-488de4146484",
"textOutputConfiguration": {
"mediaType": "text/plain"
},
"audioOutputConfiguration": {
"mediaType": "audio/lpcm",
"sampleRateHertz": TOKEN_HERTZ,
"sampleSizeBits": TOKEN_BITS,
"channelCount": TOKEN_CHANNEL,
"voiceId": "en_us_matthew",
"encoding": "base64",
"audioType": "SPEECH"
},
"toolUseOutputConfiguration": {
"mediaType": "application/json"
},
"toolConfiguration": {
"tools": []
}
}
}
}
""".Trim()
.Replace("TOKEN_HERTZ", waveFormat.SampleRate.ToString())
.Replace("TOKEN_BITS", waveFormat.BitsPerSample.ToString())
.Replace("TOKEN_CHANNEL", waveFormat.Channels.ToString());
_queuedEvents.Add(new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(json2) });
var json3 = """
{
"event": {
"contentStart": {
"promptName": "126680f5-5859-4d15-ae70-488de4146484",
"contentName": "a6431ef2-e23c-4f8c-a552-3f308629d3c3",
"type": "TEXT",
"interactive": true,
"textInputConfiguration": {
"mediaType": "text/plain"
}
}
}
}
""".Trim();
_queuedEvents.Add(new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(json3) });
var json4 = """
{
"event": {
"textInput": {
"promptName": "126680f5-5859-4d15-ae70-488de4146484",
"contentName": "a6431ef2-e23c-4f8c-a552-3f308629d3c3",
"content": "You are a friend. The user and you will engage in a spoken dialog exchanging the transcripts of a natural real-time conversation. Keep your responses short but talk with a pirate accent, generally two or three sentences for chatty scenarios.",
"role": "SYSTEM"
}
}
}
""".Trim();
_queuedEvents.Add(new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(json4) });
var json5 = """
{
"event": {
"contentEnd": {
"promptName": "126680f5-5859-4d15-ae70-488de4146484",
"contentName": "a6431ef2-e23c-4f8c-a552-3f308629d3c3"
}
}
}
""".Trim();
_queuedEvents.Add(new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(json5) });
var json6 = """
{
"event": {
"contentStart": {
"promptName": "126680f5-5859-4d15-ae70-488de4146484",
"contentName": "b3917935-2398-4889-94a8-e677f6c3e351",
"type": "AUDIO",
"interactive": true,
"audioInputConfiguration": {
"mediaType": "audio/lpcm",
"sampleRateHertz": 16000,
"sampleSizeBits": 16,
"channelCount": 1,
"audioType": "SPEECH",
"encoding": "base64"
}
}
}
}
""".Trim();
_queuedEvents.Add(new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(json6) });
}
MemoryStream ConvertStringToMemoryStream(string str)
{
var bytes = Encoding.UTF8.GetBytes(str);
return new MemoryStream(bytes);
}
/// <summary>
/// As the application receives audio data it will call this method with the audio
/// data, convert it into an audio event, and add it to the queue.
/// </summary>
/// <param name="buffer"></param>
/// <param name="offset"></param>
/// <param name="length"></param>
public void AddAudioEvent(byte[] buffer, int offset, int length)
{
var audioString = Convert.ToBase64String(buffer, 0, length);
var audioJson = """
{
"event": {
"audioInput": {
"promptName": "126680f5-5859-4d15-ae70-488de4146484",
"contentName": "b3917935-2398-4889-94a8-e677f6c3e351",
"content": "BASE64_AUDIO",
"role": "USER"
}
}
}
""".Trim().Replace("BASE64_AUDIO", audioString);
var audioEvent = new BidirectionalInputPayloadPart { Bytes = ConvertStringToMemoryStream(audioJson) };
_queuedEvents.Add(audioEvent);
}
}
```
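One thing the sample skips is a graceful shutdown: it simply cancels the call when a key is pressed. The Nova Sonic event protocol also defines explicit closing events, and a tidier teardown would publish them before cancelling. The payloads below are a hedged sketch (not tested end-to-end) that reuses the same `promptName` and audio `contentName` GUIDs hard-coded in the sample; they would be queued through `EventPublisher` just like the startup events:

```json
{ "event": { "contentEnd": { "promptName": "126680f5-5859-4d15-ae70-488de4146484", "contentName": "b3917935-2398-4889-94a8-e677f6c3e351" } } }
{ "event": { "promptEnd": { "promptName": "126680f5-5859-4d15-ae70-488de4146484" } } }
{ "event": { "sessionEnd": {} } }
```

Sending these in order (audio content end, then prompt end, then session end) mirrors the nesting of the startup events, so the service can close the stream cleanly instead of seeing an aborted connection.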
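For anyone wanting to run the sample, a minimal project setup might look like the following. The package ids `AWSSDK.BedrockRuntime` and `NAudio` are the standard NuGet names; pinning specific versions is left out here, but note you need a recent enough `AWSSDK.BedrockRuntime` to include the bidirectional streaming API:

```shell
# Create the .NET 8 console project the sample's namespace assumes
dotnet new console -n AudioStreamingAI -f net8.0
cd AudioStreamingAI

# Bedrock Runtime client (InvokeModelWithBidirectionalStreamAsync) and NAudio for mic/speaker I/O
dotnet add package AWSSDK.BedrockRuntime
dotnet add package NAudio
```

Drop the `Program.cs` and `EventPublisher.cs` from the reply into the project, make sure your AWS credentials and region grant access to `amazon.nova-sonic-v1:0`, and run with `dotnet run` (Windows only, because of `NAudio.Wave`).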