How do I chunk audio from a stream for downstream processing? #618

@shashi-netra

Overview

I need to run a speech recognition engine on an audio stream in (near) real time. My idea is to use PyAV to buffer a small batch of audio, say 10 audio frames, and then run speech recognition on that batch. Is there a recommended way to chunk an incoming audio stream?

Expected behavior

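import av

# Assumes an input `container` opened with av.open(...) and its `audio_stream`.
# File container for output:
out_container = av.open('test.wav', 'w')
out_stream = out_container.add_stream(template=audio_stream)
for i, packet in enumerate(container.demux(audio_stream)):
    if packet.dts is None:
        continue  # skip the empty flush packet that demux() yields at the end
    print(float(packet.pts * packet.stream.time_base))
    packet.stream = out_stream  # attach the packet to the output stream before muxing
    out_container.mux(packet)
    if i % 10 == 9:
        # Run the speech recognition module after every 10th audio packet
        speech_recog()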

Actual behavior

Is this the recommended approach to chunk audio from a stream?
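
To make the question concrete, here is a minimal sketch of the batching idea that works on decoded frames instead of remuxed packets. It is not a recommended PyAV pattern, only one way the "every 10 frames" grouping could look. The names `chunk_frames`, `recognize_stream`, and the `recognize` callback are hypothetical; `av.open` and `container.decode(audio=0)` are the only PyAV calls used, and the sketch assumes the first audio stream is the one of interest.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk_frames(frames: Iterable[T], frames_per_chunk: int = 10) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `frames_per_chunk` items from `frames`."""
    if frames_per_chunk < 1:
        raise ValueError("frames_per_chunk must be at least 1")
    iterator = iter(frames)
    while True:
        chunk = list(islice(iterator, frames_per_chunk))
        if not chunk:
            return
        yield chunk  # the final chunk may be shorter than frames_per_chunk


def recognize_stream(source: str,
                     recognize: Callable[[list], None],
                     frames_per_chunk: int = 10) -> None:
    """Decode the first audio stream of `source` and call `recognize` per chunk.

    `recognize` is a hypothetical callback standing in for speech_recog().
    """
    import av  # imported lazily so chunk_frames can be used without PyAV

    container = av.open(source)
    try:
        frames = container.decode(audio=0)  # assumption: stream 0 is the wanted one
        for chunk in chunk_frames(frames, frames_per_chunk):
            recognize(chunk)
    finally:
        container.close()
```

For example, `list(chunk_frames(range(25), 10))` gives three chunks of 10, 10, and 5 items. One caveat with this design: the number of samples per frame depends on the codec, so a fixed frame count does not correspond to a fixed duration. If the recognizer expects fixed-length windows, chunking by sample count may be a better fit than chunking by frame count.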
