
How can I distribute PyAV jobs on many servers with Spark? #659


Description

@vtexier

Overview

I am using Spark (pyspark) to distribute big data jobs across servers. I would like to distribute PyAV encoding and filtering across many servers with Spark.
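
Since frames are hard to ship between machines (see below), one pattern worth considering is to parallelize at the segment level instead of the frame level: compute keyframe boundaries on the driver, then let each Spark task open the source itself and decode only its own keyframe-aligned segment. This is a minimal sketch, not an established recipe; it assumes a hypothetical `input.mp4` that every worker can read (e.g. shared storage), and the per-frame work is a placeholder.

```python
# Sketch: keyframe-aligned segments, one Spark task per segment.
# Assumes SOURCE is readable from every worker node.
import av
from pyspark import SparkContext

SOURCE = "input.mp4"  # hypothetical path, must exist on all workers

def keyframe_timestamps(path):
    """Collect the pts of every keyframe packet (in stream.time_base units)."""
    with av.open(path) as container:
        stream = container.streams.video[0]
        return [p.pts for p in container.demux(stream)
                if p.is_keyframe and p.pts is not None]

def process_segment(bounds):
    """Decode one keyframe-aligned segment and apply per-frame work."""
    start_pts, end_pts = bounds
    results = []
    with av.open(SOURCE) as container:
        stream = container.streams.video[0]
        container.seek(start_pts, stream=stream)  # lands on a keyframe
        for frame in container.decode(stream):
            if end_pts is not None and frame.pts is not None and frame.pts >= end_pts:
                break
            results.append(frame.pts)  # placeholder for real filtering/encoding
    return results

if __name__ == "__main__":
    sc = SparkContext(appName="pyav-segments")
    kf = keyframe_timestamps(SOURCE)
    segments = list(zip(kf, kf[1:] + [None]))  # [keyframe, next keyframe) pairs
    out = sc.parallelize(segments, len(segments)).map(process_segment).collect()
```

Only the small `(start_pts, end_pts)` pairs cross the network; each worker does its own demux/decode locally.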

Expected behavior

The distributed function should execute PyAV operations that need to be aware of previous and/or next frames in the GOP (group of pictures).

How can I implement these operations with GOP awareness?
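
One way to get that GOP awareness, assuming the keyframe-aligned segmentation sketched above, is to decode the whole GOP into memory inside each task, so an operation can freely look at previous and next frames. A sketch, where the temporal averaging is just a stand-in for a real filter:

```python
# Sketch: decode one whole GOP, then run an operation that needs
# neighbouring frames (here, averaging each frame with its neighbours).
import av
import numpy as np

def temporal_smooth_gop(path, start_pts, end_pts):
    frames = []
    with av.open(path) as container:
        stream = container.streams.video[0]
        container.seek(start_pts, stream=stream)  # start of the GOP
        for frame in container.decode(stream):
            if end_pts is not None and frame.pts is not None and frame.pts >= end_pts:
                break
            frames.append(frame.to_ndarray(format="rgb24").astype(np.float32))
    # Every frame of the GOP is in memory, so previous/next access is trivial.
    smoothed = []
    for i, f in enumerate(frames):
        prev_f = frames[max(i - 1, 0)]
        next_f = frames[min(i + 1, len(frames) - 1)]
        smoothed.append(((prev_f + f + next_f) / 3).astype(np.uint8))
    return smoothed
```

The cost is that a whole GOP of raw frames must fit in a worker's memory, which bounds how long your GOPs can be.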

Actual behavior

A frame isolated in a separate job cannot be used on its own, because it depends on other frames in its GOP.

Investigation

Giving Spark the context by serializing the Frame object with pickle is not enough; see #652.
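
If frames really have to cross the Spark boundary, one possible workaround, given that `av.VideoFrame` itself does not pickle (#652), is to ship raw pixel data instead: `frame.to_ndarray()` produces a plain numpy array that pickles fine, and `av.VideoFrame.from_ndarray()` rebuilds a frame on the other side. A sketch, with the record layout being my own assumption:

```python
# Sketch: make frames picklable by converting to (pixels, pts, time_base)
# records, then rebuilding VideoFrames on the worker.
import av
from fractions import Fraction

def frame_to_record(frame):
    """Turn a VideoFrame into a picklable record (raw pixels + timing).
    Assumes frame.time_base is set on the decoded frame."""
    return (frame.to_ndarray(format="rgb24"),
            frame.pts,
            (frame.time_base.numerator, frame.time_base.denominator))

def record_to_frame(record):
    """Rebuild a VideoFrame from a record on the worker."""
    array, pts, (tb_num, tb_den) = record
    frame = av.VideoFrame.from_ndarray(array, format="rgb24")
    frame.pts = pts
    frame.time_base = Fraction(tb_num, tb_den)
    return frame
```

This makes each frame self-contained but trades compressed packets for raw RGB pixels, so the keyframe-aligned segmenting above is usually much cheaper on network and memory.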

Research

I have asked about this on the PyAV Gitter channel:

https://gitter.im/mikeboers/PyAV?at=5eab0dd59f0c955d7d97bbb1

Additional context

@koenvo, maybe you can help me with this, as you proposed in #652 (comment).
