How do I manage multi rtsp more efficiently? #585

@mifenfeifen

Overview

I have tens of RTSP video streams to receive and process frame by frame. The problem is that receiving many RTSP streams at the same time uses most of the CPU.

ffmpeg command-line

I tried using subprocess to run the ffmpeg command, and the CPU goes to 70%, which I think makes sense. The only problem is that it seems it can only save the images to disk, so I have to read the pictures back from disk. That's a bit complicated.

import delegator  # delegator.py: a friendlier wrapper around subprocess

num = 32
# -r 1 writes one JPEG per second into a per-stream directory
cmd = 'ffmpeg -y -rtsp_transport tcp -i {} -r 1 /data/test_ffmpeg/stream_{:02}/img_{:02}_%03d.jpg'
for i in range(num):
    idx = i % 3  # cycle through the first three URLs in rtsp_ls (defined elsewhere)
    url = rtsp_ls[idx]
    # p = Thread(target=lambda url, i: delegator.run(cmd.format(url, i, i)), args=(url, i))
    # p.start()
    p = delegator.run(cmd.format(url, i, i), block=False)  # non-blocking: all streams run in parallel
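
One way to avoid the disk round trip, as a rough sketch: ffmpeg can write decoded frames as raw video to stdout (`-f rawvideo -pix_fmt bgr24 pipe:1`), and the parent process reads fixed-size chunks from the pipe. The helper names `build_ffmpeg_cmd`, `iter_raw_frames` and `read_stream` and the 640x360 output size are assumptions for illustration, not part of the setup above. The frame size has to be known (here forced with `-s`) so the byte stream can be split into frames.

```python
import subprocess


def build_ffmpeg_cmd(url, width, height, fps=1):
    """Build an ffmpeg argument list that writes raw BGR frames to stdout."""
    return [
        'ffmpeg', '-loglevel', 'error',
        '-rtsp_transport', 'tcp',
        '-i', url,
        '-r', str(fps),                        # same 1 fps sampling as the JPEG command
        '-s', '{}x{}'.format(width, height),   # fixed size so chunks can be split
        '-f', 'rawvideo', '-pix_fmt', 'bgr24',
        'pipe:1',
    ]


def iter_raw_frames(stream, width, height):
    """Yield one bytes object of width*height*3 bytes per frame."""
    frame_size = width * height * 3
    while True:
        buf = stream.read(frame_size)
        if len(buf) < frame_size:   # EOF or truncated frame: stop
            return
        yield buf


def read_stream(url, width=640, height=360):
    """Hypothetical driver: spawn ffmpeg and yield frames from its stdout."""
    proc = subprocess.Popen(build_ffmpeg_cmd(url, width, height),
                            stdout=subprocess.PIPE)
    try:
        for raw in iter_raw_frames(proc.stdout, width, height):
            # With NumPy: np.frombuffer(raw, np.uint8).reshape(height, width, 3)
            yield raw
    finally:
        proc.kill()
        proc.wait()
```

Each stream would still need its own ffmpeg process, so the CPU cost should be close to the JPEG variant minus the encoding and file I/O; I have not measured this.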

pyav

So I switched to the PyAV library to do the job.

  • To avoid the GIL, I use a separate Process for each decoding job.
  • A multiprocessing.Queue connects the video stream decoding to the calculation work.

import av
from multiprocessing import Process


class Decode(Process):
    options = {'rtsp_transport': 'tcp', 'stimeout': '5000000', 'max_delay': '5000000'}

    def __init__(self, logger, queue_frame, stream_id, stream_data, time_sample=1):
        Process.__init__(self, daemon=True)
        # ...

    def run(self):
        rtsp_url = self.stream_data['stream_url']
        cap = av.open(rtsp_url, 'r', options=self.options)
        # cap.streams.video[0].thread_type = 'NONE'
        ## interval: number of decoded frames per sampled frame
        try:
            fps = int(cap.streams.video[0].average_rate)
        except Exception:
            fps = 25
        num_interval = max(1, round(fps * self.time_sample))
        ## start to run
        cnt_frame = 0
        # Reuse one decode generator instead of creating a new one per frame
        for vf in cap.decode(video=0):
            cnt_frame += 1
            if cnt_frame == num_interval:
                cnt_frame = 0  # reset so a frame is sent every interval, not just once
                frame = vf.to_ndarray(format='bgr24')
                self.queue_frame.put(frame)
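
The sampling step in `run` (keep one frame per `num_interval` decoded frames) can be isolated into a small generator, which makes the counter reset explicit. This is only a sketch: `every_nth` is a hypothetical helper name, and in the real code it would wrap `cap.decode(video=0)`.

```python
def every_nth(frames, n):
    """Yield every n-th item of `frames` (the n-th, the 2n-th, ...)."""
    if n < 1:
        raise ValueError('n must be >= 1')
    count = 0
    for frame in frames:
        count += 1
        if count == n:
            count = 0      # reset so sampling repeats every n frames
            yield frame
```

Usage would look like `for vf in every_nth(cap.decode(video=0), num_interval): ...`. Note that this still decodes every frame; it only limits how many frames are converted with `to_ndarray` and put on the queue.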

The following snippet does the calculation:

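from multiprocessing import Process
from time import sleep


class Calc(Process):
    def __init__(self, logger, queue_frame):
        Process.__init__(self)
        # ...

    def run(self):
        while True:
            # Poll the shared queue; back off briefly when no frame is waiting
            if self.queue_frame.empty():
                sleep(0.1)
                continue
            temp = self.queue_frame.get()
            # calculation work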
So in my code there are multiple Decode processes and one Calc process.

Expected behavior

The code should do the same work as the command line, with similar performance.

Actual behavior

My PyAV code works, but the performance is poor. Even ignoring the Calc process, the 32 Decode processes drive the CPU to 100%.

Investigation

  1. Originally I used OpenCV for this work, but it used too many resources. With only 20 RTSP streams, the CPU already reached 100%.
  2. I thought that maybe Process cost too much, so I changed to Thread, but it made no difference.
  3. If I use ProcessPool to manage the processes, it seems that I can't get the status of each RTSP stream in real time, so I haven't done that.
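
For item 3, plain `Process` objects can be supervised directly without a pool: `Process.is_alive()` reports each worker's state, and a dead worker can be replaced. A rough sketch, where `supervise_once` and the `make_worker` factory are hypothetical names, and any object with `start()` and `is_alive()` (a `Process` or a `Thread`) will do:

```python
def supervise_once(workers, make_worker):
    """Restart dead workers in place; return the ids that were restarted.

    workers: dict mapping stream_id -> started worker (e.g. a Process).
    make_worker: callable(stream_id) -> new, not yet started worker.
    """
    restarted = []
    for stream_id, worker in list(workers.items()):
        if not worker.is_alive():
            replacement = make_worker(stream_id)
            replacement.start()
            workers[stream_id] = replacement
            restarted.append(stream_id)
    return restarted
```

The main process could call this every few seconds, which gives a per-stream status (alive or restarted) without giving up direct control of the processes.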
