
feat: add frame_converter and provide consumers opaque frame type #1609


Open · wants to merge 3 commits into master

Conversation

Julusian
Member

@Julusian Julusian commented Mar 26, 2025

This is the first polished portion of #1581

This removes the image_data on const_frame and replaces it with an opaque image_ptr. Consumers can use the frame_converter to convert that into a pixel buffer.
For now this is the same basic texture-to-buffer copy as before; #1610 will replace that with a proper conversion step.

Most of the consumers need testing to make sure they work and don't crash immediately. ndi and ffmpeg appear not to work for me, but the same is true on master, so it is probably something weird on my machine today.

@Julusian Julusian force-pushed the feat/shader-conversions-1-frame-converter branch from c6e2afc to e5dbb26 Compare April 11, 2025 16:15
@niklaspandersson
Member

I like this in principle.
I'm a bit worried about compatibility and performance of the consumers that we're not actively testing. From what I can tell, we now need to deal with the async nature of rendering / processing the frame on the GPU in both the image_mixer (actual compositing) and in every consumer (downloading from GPU).
I haven't measured, but I reckon consumers will need to buffer/queue converted frames to minimise latency. Should we make some general-purpose buffer that all consumers (that don't get special treatment) can use to get the same latency as before this change?

Other than that, I really like what this enables, and can't wait for it to land.

@Julusian
Member Author

Yeah, I am a little worried about the potential performance impact, but I also wonder if it will just be fine? I haven't done any performance testing though, just some light usage.

The theory I have in my mind from the code is that this should be minimally different; a key thing is that I have made sure to queue the download before buffering the frames in consumers.
As for the downloads themselves, they are done on the ogl thread and don't rely on being awaited by the consumers to make progress. So unless the system is struggling, I think this should mean that frames will download in plenty of time. If the frame isn't ready and the consumer is awaiting it, this is no different to the existing behaviour of blocking while waiting for a frame to be pushed to the buffer.


I am currently thinking of releasing 2.5.0 any day now.
This will merge into master (aka 2.6) shortly after, with any follow-ups when they look ready.

@niklaspandersson
Member

niklaspandersson commented Jun 12, 2025

I think an architecture where we isolate the usage of the frame_converter from the consumers would be cleaner and keep the complexity of consumers down.

Something like this:

  1. Channels instantiate a frame_converter
  2. A sub-interface of the frame_converter is passed to each consumer in the constructor, only allowing the consumer to set/update its required frame_conversion_format (or an array of formats if the consumer is made up of multiple ports).
  3. During rendering, the channel automatically runs all required conversions and pushes the final, converted frames to each consumer.
  4. Consumers behave exactly as today, except that the frame data received is already converted to each consumer's specific needs.

Pros:

  • Consumer complexity would not be any greater than today.
  • It would also allow for optimizations in the case where multiple consumers want the same "view" of the rendered frame (it would only need to be converted and downloaded once).
  • It would take one step closer to bringing the ports-functionality to more consumers (subregions for non-decklink consumer #1587).
  • We don't have to deal with asynchronous frame conversions in every consumer, and all complexity that entails

The total complexity would not be greater than the current implementation in the PR.

@Julusian
Member Author

I did consider doing something like that, but decided against it for some reasons that I don't 100% remember.
I think one reason for this was to avoid having to deal with race conditions when the consumer reinitialises and starts wanting a different format (although thinking about it, maybe that could still be a problem in this implementation if a consumer isn't careful). The other case I can think of is that, depending on where this bulk downloading happens, could another consumer be added while it is in progress?

The other part of the reason was that I think it will add a little complexity to consumers, as they will need to declare the array of frames they need. And then, when consuming the frames, they will need to guard against the frames vector being the wrong length. At the time I didn't have an idea of how to do this and it felt like it would become a mess, but maybe it won't be as bad as I was worried.

> It would also allow for optimizations in the case where multiple consumers want the same "view" of the rendered frame

Yeah, it would do that better/cleaner. I have tried to handle this today with some caching, but I acknowledge that it could be prone to race conditions resulting in an unnecessary cache miss, because the downloads are scheduled on a different thread. I felt that a cache miss was better than wrapping it in a mutex.

> It would take one step closer to bringing the ports-functionality to more consumers

I don't think it would change anything. The current approach of this PR allows consumers to download as many versions of the frame as they want. All the subregion logic is being moved into this download step, so I don't see how that would be any better.

> The total complexity would not be greater than the current implementation in the PR.

I'm not 100% convinced that will be the case. I suspect it's going to add a bunch of complexity to the channel (output?) at the point where it needs to do the conversions.
Sure, this PR does require some changes in every consumer, but they are pretty small, and while it does require some thinking about async, not much.

I am also curious what the performance implications would be. Doing the conversions in the channel would mean that (unless a new thread is introduced for this purpose, with some handling for race conditions) some part of the channel will need to block while these downloads are being done, which goes against the way the mixer pipelines the compositing.
Perhaps it would be best to do these downloads as part of that compositing (like the old rgba8 download used to be), but that really raises the risk of race conditions (consumers being added/reconfigured).
