Author: eilemann@gmail.com
State: Partly implemented
Background
Frames are used by compounds during the recomposition phase. This document discusses the implementation details of this part of the compound spec.
Overview
Frames are either input frames or output frames. Output frames produce the data consumed by input frames. The common link between an output frame and its input frames is the frame data, which holds the pixel data organized into a set of images. Each image represents a 2D array of pixels for each (OpenGL) framebuffer attachment: color, depth and stencil. The default implementation uses one image per frame data. The application can implement a custom readback producing multiple images per type, in order to further reduce the amount of pixel data.
A frame has a buffer format, which defines the components of the framebuffer to be read back (output frame) or assembled (input frame). It also has a viewport, which defines the fractional portion of the channel that has to be read back (output frame), or the fractional portion of the corresponding output frame that has to be assembled (input frame).
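The fractional viewport described above can be illustrated with a small sketch. The `Viewport` and `PixelViewport` structs and the `applyViewport` function are hypothetical names chosen for this example, not the actual implementation; the sketch only assumes that a fractional viewport is applied to a parent pixel area by simple scaling.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only.
struct Viewport      { float x, y, w, h; };   // fractions in [0, 1]
struct PixelViewport { int32_t x, y, w, h; }; // pixels

// Applies a fractional viewport to a parent pixel area, for example the
// pixel area of the channel when reading back an output frame.
inline PixelViewport applyViewport( const PixelViewport& parent,
                                    const Viewport& vp )
{
    // Assumption: the fractional viewport lies within the parent area.
    assert( vp.x >= 0.f && vp.y >= 0.f && vp.w >= 0.f && vp.h >= 0.f );
    assert( vp.x + vp.w <= 1.f && vp.y + vp.h <= 1.f );

    PixelViewport result;
    result.x = parent.x + static_cast< int32_t >( vp.x * parent.w );
    result.y = parent.y + static_cast< int32_t >( vp.y * parent.h );
    result.w = static_cast< int32_t >( vp.w * parent.w );
    result.h = static_cast< int32_t >( vp.h * parent.h );
    return result;
}
```

For an input frame, the same computation would be applied relative to the pixel area of the corresponding output frame instead of the channel.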
Requirements
Frames are used in the following way:
- Output images for one output frame are produced by a single node.
- Images are sent from the 'output' node to multiple 'input' nodes.
- The input frame buffers may be a subset of the output frame buffers. Consequently, the node producing the output images only sends images to nodes where the input buffers match.
- The same optimisation should apply to other attributes, e.g., viewport and eye(?).
- Frames and images are relatively static during rendering. The objects should be reused from (config) frame to frame to avoid costly network instantiation and reallocation.
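The buffer matching in the list above can be sketched as a bitmask intersection. This is a minimal illustration under assumptions: `InputFrame`, `buffersPerNode` and the `BUFFER_*` names are hypothetical, and the bit values merely mirror the OpenGL `*_BUFFER_BIT` constants so that the sketch compiles without GL headers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Buffer bits; the values mirror the OpenGL *_BUFFER_BIT constants.
enum Buffer : uint32_t
{
    BUFFER_COLOR   = 0x4000, // GL_COLOR_BUFFER_BIT
    BUFFER_DEPTH   = 0x0100, // GL_DEPTH_BUFFER_BIT
    BUFFER_STENCIL = 0x0400  // GL_STENCIL_BUFFER_BIT
};
const uint32_t BUFFER_ALL = BUFFER_COLOR | BUFFER_DEPTH | BUFFER_STENCIL;

// Hypothetical description of one input frame located on some node.
struct InputFrame
{
    uint32_t nodeID;
    uint32_t buffers; // bitwise OR of Buffer values
};

// Returns, for each node, the buffers of the output frame that have to be
// sent to it. Several input frames on one node are merged into one entry,
// and nodes without a matching buffer do not appear in the result.
inline std::map< uint32_t, uint32_t > buffersPerNode(
    const uint32_t outputBuffers, const std::vector< InputFrame >& inputs )
{
    assert( ( outputBuffers & ~BUFFER_ALL ) == 0 ); // only known bits

    std::map< uint32_t, uint32_t > result;
    for( const InputFrame& input : inputs )
    {
        const uint32_t matching = input.buffers & outputBuffers;
        if( matching != 0 )
            result[ input.nodeID ] |= matching;
    }
    return result;
}
```

Keying the result by node rather than by input frame also reflects the requirement that multiple consumers on one node receive an image only once.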
The following features must be implementable:
- It must be possible to select a ready frame from a list of input frames, to allow early assembly when using multiple input frames.
- Different readback mechanisms: readPixels, FBO, hardware framegrabbers
- Different assemble mechanisms
- On-GPU storage if input and output frame are on the same pipe, using textures or FBOs
Implementation
The implementation should execute as follows:
    [readback callback]
        foreach frame
            start readback of image[s]
        foreach frame
            sync readback of image[s]
            foreach image of frame
                send image to all 'input' nodes
                    Note: multiple consumers on a node should receive only one update
        foreach frame
            send frame ready to all 'input' nodes

    [assemble callback] (background receiving in recv thread)
        foreach frame
            sync frame ready .OR. select ready frame from frame list
            start draw frame
        foreach frame
            sync draw frame (omit or defer until frame is reused?)

Use a monitor to determine the readiness of the frame data.
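A monitor for frame data readiness could look roughly like the following sketch. `VersionMonitor`, `PendingFrame` and `selectReady` are hypothetical names for this example and are not the eq::base::Monitor implementation; the sketch assumes that readiness is expressed as a version number which only increases.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Minimal monitor sketch: a value which threads can wait on.
class VersionMonitor
{
public:
    // Called by the receiver thread once a version is complete.
    void set( const uint32_t version )
    {
        {
            std::lock_guard< std::mutex > lock( _mutex );
            _version = version;
        }
        _cond.notify_all();
    }

    // Blocks until the stored version is at least the given version.
    void waitGE( const uint32_t version )
    {
        std::unique_lock< std::mutex > lock( _mutex );
        _cond.wait( lock, [ this, version ]{ return _version >= version; } );
    }

    // Non-blocking check, usable to pick a ready frame from a list.
    bool isReady( const uint32_t version ) const
    {
        std::lock_guard< std::mutex > lock( _mutex );
        return _version >= version;
    }

private:
    mutable std::mutex _mutex;
    std::condition_variable _cond;
    uint32_t _version = 0;
};

// Hypothetical pairing of an input frame's monitor and required version.
struct PendingFrame
{
    const VersionMonitor* ready;
    uint32_t version;
};

// Returns the index of the first frame whose data is ready, or -1 if none
// is ready yet. Supports early assembly with multiple input frames.
inline int selectReady( const std::vector< PendingFrame >& frames )
{
    for( std::size_t i = 0; i < frames.size(); ++i )
        if( frames[ i ].ready->isReady( frames[ i ].version ))
            return static_cast< int >( i );
    return -1;
}
```

In this sketch the assemble callback would call `waitGE` for the "sync frame ready" step, or poll `selectReady` to assemble whichever input frame arrives first.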
Q: scope of frames and images when using multiple pipes per node.
A: SCOPE_THREAD for frames to reference correct frame data for each thread.
SCOPE_NODE for frame data, with merged data of all input frames on node.
Images are not distributed objects. They are managed directly by the frame data, and their data is distributed using command packets.
Q: latency
A: New frame data for each config frame (and recycle once obsolete)
Images have to be pushed by the output node for optimal performance. The subscriber model of versioned objects does not work for this purpose, since the frame input and output relations may change every frame, due to load balancing, DPlex phase or other per-frame compound changes. Furthermore, the number of images per frame can vary.
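The "new frame data for each config frame, recycled once obsolete" policy can be sketched as a small pool. This is only an illustration: `FrameDataPool` is a hypothetical name, `FrameData` is a stand-in struct, and the rule that a frame data becomes obsolete once it is more than `latency` frames old is an assumption of this sketch, not something the text above specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <vector>

// Stand-in for the real frame data; only the frame number matters here.
struct FrameData { uint32_t frameNumber = 0; };

class FrameDataPool
{
public:
    explicit FrameDataPool( const uint32_t latency ) : _latency( latency ) {}

    // Returns the frame data for a config frame, reusing an obsolete
    // instance when one is available.
    FrameData* acquire( const uint32_t frameNumber )
    {
        // Assumption: frame numbers are strictly increasing.
        assert( _active.empty() ||
                _active.back()->frameNumber < frameNumber );

        // Release frame datas which are older than the assumed latency.
        while( !_active.empty() &&
               _active.front()->frameNumber + _latency < frameNumber )
        {
            _cache.push_back( std::move( _active.front( )));
            _active.pop_front();
        }

        std::unique_ptr< FrameData > data;
        if( _cache.empty( ))
            data.reset( new FrameData );
        else
        {
            data = std::move( _cache.back( ));
            _cache.pop_back();
        }

        data->frameNumber = frameNumber;
        _active.push_back( std::move( data ));
        return _active.back().get();
    }

    std::size_t numAllocated() const
        { return _active.size() + _cache.size(); }

private:
    const uint32_t _latency;
    std::deque< std::unique_ptr< FrameData > > _active;  // oldest first
    std::vector< std::unique_ptr< FrameData > > _cache;  // reusable
};
```

With a latency of one frame, the pool in this sketch settles at two allocated instances, since the instance of frame N-2 is reused for frame N.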
On the server:
    Compound::update
        prepare output frames (1st tree traversal)
            release old frame datas based on frame number
            recycle new frame data
            set frame data (output frame data)
            commit frame data (new version)
            set frame data (frame data id and version, viewport, buffers, eye, ...)
            commit frame
        prepare input frames (2nd tree traversal)
            set current data (viewport, buffers, eye, ...)
            set frame data id and version from output frame
            commit frame

    Compound::updateChannel (per-channel task generation)
        foreach output frame
            if applicable (filter: buffers, vp, eye)
                add to output frame list
        if output frame list not empty
            send readback task w/ output frame list [ output frame id and version ]
                -> holds frame data id and version
                    -> holds frame number
            foreach frame in output frame list
                foreach input frame of output frame
                    if applicable (filter: buffers, vp, eye)
                        filter duplicate nodes
                        send transmit task [ output frame id and version, attrs ]

        foreach input frame
            if applicable (filter: buffers, vp, eye, already sent to node)
                add to assemble task frames [ input frame id and version ]
                    -> holds frame data id and version
                        -> holds frame number
        if assemble task frame list not empty
            send assemble task containing frame list

On the "output" node:
    readback task:
        start readback
    transmit task:
        sent by the output frame's FrameData:
            foreach matching image
                sync readback
                transmit image packet [ frame data version, type, vp, data ]
            transmit frame data ready [ frame number ]

On the "input" node:
    assemble task
        foreach input frame
            sync to frame version
        call assemble()
            sync frame[s]
                wait for frame data images to reach version
            assemble frame[s]

    [recv thread]
        receive input image
            if version of images older than received image
                release images
                set version of images to received version
            save image into recycled Image object
        receive ready
            set image ready version
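The receiver-side bookkeeping on the input node can be sketched as follows. `InputFrameData` and its members are hypothetical names for this example, the `Image` struct is a simplified stand-in, and the sketch is single-threaded; in a threaded setup the ready version would be guarded by the monitor mentioned earlier. It also assumes that image packets never arrive with a version older than the current one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for an image: a type and its pixel data.
struct Image
{
    uint32_t type = 0;
    std::vector< uint8_t > pixels;
};

class InputFrameData
{
public:
    // Handles an image packet [ frame data version, type, data ].
    void receiveImage( const uint32_t version, const uint32_t type,
                       const std::vector< uint8_t >& pixels )
    {
        assert( version >= _versionImages ); // assumption of this sketch

        if( _versionImages < version )
        {
            // The held images belong to an older version: release them
            // into the cache so that their storage can be reused.
            for( Image& image : _images )
                _cache.push_back( std::move( image ));
            _images.clear();
            _versionImages = version;
        }

        // Save the data into a recycled Image object when available.
        Image image;
        if( !_cache.empty( ))
        {
            image = std::move( _cache.back( ));
            _cache.pop_back();
        }
        image.type = type;
        image.pixels = pixels; // copy reuses the recycled capacity
        _images.push_back( std::move( image ));
    }

    // Handles the frame data ready packet.
    void receiveReady( const uint32_t version ) { _versionReady = version; }

    bool isReady( const uint32_t version ) const
        { return _versionReady >= version; }
    uint32_t versionImages() const { return _versionImages; }
    std::size_t numImages() const { return _images.size(); }

private:
    std::vector< Image > _images; // images of the current version
    std::vector< Image > _cache;  // released images for reuse
    uint32_t _versionImages = 0;
    uint32_t _versionReady = 0;
};
```

Keeping the image version separate from the ready version mirrors the two packets of the transmit task: images may arrive one by one, and assembly only proceeds once the ready packet for that version has been received.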
API
Incomplete pseudocode of the API; see the implementation for full details. An object of a given type has different implementations on the server side and on the client side, which share the same distributed data.
    class Frame
    {
        uint32_t frameDataID;
        uint32_t frameDataVersion;
    };

    class FrameData
    {
        void addImage( Image::Type type, Image* image );
        // corresponding getters

        uint32_t          _versionImages;
        eq::base::Monitor _versionReady;
    };

    class Image
    {
        enum Type
        {
            TYPE_COLOR   = GL_COLOR_BUFFER_BIT,
            TYPE_DEPTH   = GL_DEPTH_BUFFER_BIT,
            TYPE_STENCIL = GL_STENCIL_BUFFER_BIT
        };

        void startReadback( buffers, type, x, y, w, h );
        void syncReadback();
    };
Open Issues
Send optimisation for partial input frames; selection of ready frames from a list for early assembly.
Restrictions
Only full images are sent to the nodes. If the input frame specifies a non-full viewport (which is relative to the output frame), the amount of data sent is currently not optimized. The same applies to the other attributes (buffers, eye, etc.).