Equalizer logo
Collage logo
GPU-SD logo

Subpixel Compounds

Author: eilemann@gmail.com
State: Implemented in 0.9.1

Overview

This features adds support for a new type of decomposition and recomposition, whereby each contributing channel renders one or multiple subsamples for full-scene anti-aliasing (FSAA) or depth-of-field (DOF).

Applications might already do multi-pass software FSAA/DOF, either during rendering or when the application is idle. Equalizer does not yet have a notion of idle/non-idle rendering.

Design

The application will do the adding of the frustum jitter, since it is the one which knows how many FSAA and/or DOF samples are desired. Therefore it needs to know in Channel::frameDraw which sample out of how many it should render.

The default implementation of Channel::applyFrustum will use the subpixel sample description to compute an FSAA jitter using a pre-defined lookup table. It will add this jitter to the frustum supplied by getFrustum.

Applications which have their own SWAA settings will use the subpixel sample description to calculate how many passes with which samples have to be rendered, e.g., if it desires to render 16 samples on a 4-time decomposition, the application will render 4 passes out of a 16-value jitter lookup table on each channel.

It is the application's responsibility to provide a blended result of the sub-passes on each channel. This should not be an overhead, since the application could already compute the accumulation and averaging before.

Compositing

API

  class SubPixel
  {
  public:
    uint32_t index;
    uint32_t size;
  }
  
  vmml::Vector2f Channel::getJitter() const;
  vmml::Vector2i Channel::getSubpixel() const;

File Format

    compound
    {
        subpixel [ index size ]
    }

Implementation

Idle SWAA in eqPly

  class FrameData
    bool _idleMode         // idle AA mode

  class Config
    uint32_t _nbFramesAA   // number of frames left
  	
  class Channel
    Accum* _accum          // accumulation buffer
    uint32_t _jitterStep   // current jitter step decrementing
    uint32_t _totalSteps   // total steps to be done
    uint32_t _subpixelStep // number of ressources (subpixel.size)

    bool _needsTransfer    // combining idle AA with AA compound
    PixelViewport _currentPVP // current saved pvp

-------------------------------------------------------------------

  Config::startFrame
    if view is idle AA
      assert( nbFramesAA > 0 )
      --nbFramesAA
      set idleMode
    else
      nbFrames = 0
      clear idleMode
    commit frame data
    
  Config::handleEvent
    event = get IDLE_AA event
    nbFramesAA = MAX( nbFramesAA, event->jitter )

  Channel::frameStart
    if view is not in idle AA mode
      clear accum
      reset idle AA

  Channel::frameViewStart
    save _currentPVP

  Channel::frameDraw
    [...]
    _subpixelStep = MAX( _subpixelStep, getSubPixel().size )
    set _needsTransfer

  Channel::frameAssemble
    if getPVP != _currentPVP
      if accum doesn't use an FBO
        warning, idle AA not implemented
        stop idle AA
      do normal assembly
      set _needsTransfer
    else
      if all input frames have SubPixel::ALL
        do normal assembly
        set _needsTransfer
      else
        for each input frame
          assemble each frame with subpixel decomp
          accumulate
        clear _needsTransfer

  Channel::frameViewFinish
    if PVP has changed
      resize accum buffer
      clear accum
      reset idle AA
    else if idle mode && _needsTransfer
      if first idle step
        reset accum buffer
      accumulate back buffer to accum buffer
      display average result in back buffer

  Channel::frameFinish
    if idle mode && _jitterStep > 0
      _jitterStep -= subpixelStep

    create IDLE_AA event
    send event to the config

  Channel::applyFrustum
    if idle mode && _jitterStep > 0
      jitter = _getJitter();
      apply jitter vector to frustum
    else
      do normal apply frustum

  Channel::_getJitterStep
    idx = channelID * _totalSteps;
    idx += ( _jitterStep * primeNumber[channelID] ) % subset_size
    return ( idx % sampleSize, idx / sampleSize )

  Channel::_getJitter
    compute pixel size and subpixel size on near plane (frustum)
    generate random sample position in _getJitterStep() subpixel

AA compound in Equalizer

AA compounds without idle AA:
  eq::Channel::applyFrustum / applyOrtho
    -> add jitter
  eq::Channel::frameAssemble
    Step 1: Do accumulation using GPU:
      for each jitter decomp in input frames
        assemble all input frames with jitter decomp
        accumulate result
      return result
    Step 2: Opt: Do accumulate using CPU (later)

AA compounds with idle AA:
  See also Issue 6.
  -> extend compositor functions so the accumulation buffer (and state?) can be
  given as parameter

Server

loader
  add subpixel token

Compound::InheritData
  add subpixel param

Compound::updateInheritData
  update subpixel

CompoundUpdateOutput::_updateOutput
  update subpixel

ChannelUpdateVisitor::_setupRenderContext
  update subpixel

FrameData::setSubPixel

Client

FrameData::Data
  add subpixel param

Channel::getSubPixel

Channel::applyFrustum
  if subpixel decomp
    lookup on precomputed jitter tables
    if no table is found
      compute the sample randomly
    apply the jitter vector

Compositor::obtainAcum
  get accum object from ObjectManager
  if accum doesn't exist
    create new accum object
  else
    resize accum

Compositor::assembleFramesSorted/Unsorted
  if input frames have different subpixel decomp
    if accum doesn't exist
      obtain accum from compositor
      clear accum
    extract similar subpixel decomp
    assemble frames (recursion on assembleFramesSorted/Unsorted)
    accumulate frames
    display result
  [...]

Issues

1. Do we need a one-dimensional or two-dimensional subpixel description for each source channel?

This depends on Q2. My feeling is a one-dimensional attribute is easier.

2. How do I compute my current jitter area?

The number of jitter areas, i.e, the number of FSAA or DOF samples, is an application parameter. The current jitter area within each Channel::frameDraw is a function of the current idle step and the channel's subpixel description.

Open Issue: the function itself. The jitter values for all channels in the first pass should be evenly distributed over the total subpixel grid.

Idea: Divide the sample grid into the number of channels to get a subset of pixels for each channel. For each pass, compute (pass_number * rnd_prime_number) % size to get kind of a randomized distribution. The size parameter is the size of the pixel subset for the current channel.

3. How do I disable FSAA when it's not supported?

Depending on the OpenGL version, we run either the glAccum method either the FBO's one. On MacOSX, a different implementation of glAccum is needed.

4. How do I compute the jitter steps for FSAA compound?

The destination channel asks for a new frame instead that the config starts a new frame itself. Each destination channel sends an event to the config which takes the maximum number of steps needed and enables to start a new frame.

5. How do I decrement the jitter steps done in the current frame?

The destination channel might render 0..n jitter steps and receive 0..n jitter steps as input frames during compositing.

We could assume that a subpixel decomposition parameter is set for the destination channel. If the destination channel does a frameDraw, he will decrement the jitter step by the subpixel decomposition size in frameDraw, otherwise he will decrement it by the subpixel decomposition size of the input frames in frameAssemble:

  Channel::frameStart
    _subpixelSteps = 0;
  Channel::frameDraw
    _subpixelSteps = EQ_MAX( _subpixelSteps, subpixel.size );
  Channel::frameAssemble
    for each input frame
      _subpixelSteps = EQ_MAX( _subpixelSteps, frame.subpixel.size );
  Channel::frameFinish
    _jitterSteps -= _subpixelSteps;
    send event

6. How does the accumulation in frameAssemble work together with the one in frameViewFinish?

Without any scalability, each destination channel accumulates in idle mode the results of different frames into an accumulation buffer. With scalability, frameAssemble might accumulate the results of different, jittered input frames.

A simple, but non-optimal solution is to assemble and average all jitter steps in frameAssemble, and to transfer the result into the accumulation buffer n times in frameViewFinish:

  Channel::frameAssemble
    accumulate all input frames in separate accum buffer
    return accum buffer to frame buffer
  Channel::frameViewFinish
    copy frame buffer _subpixelSteps times into accum buffer
    _stepsDone += _subpixelSteps;
    return accum buffer

In the optimal case, frameAssemble and frameViewFinish should share the accumulation buffer:

  Channel::frameDraw
    set _needsTransfer
  Channel::frameViewStart
    save current pvp
  Channel::frameAssemble
    if( current pvp != saved pvp )
      use separate accum buffer
      return accum buffer after assembly
      set _needsTransfer
      Q: what if glAccum is used? Disable idle AA for this channel?
    if( all input frames have Subpixel::ALL )
      do normal assembly
      set _needsTransfer
    else if for each jitter decomp in input frames
      assemble all input frames with jitter decomp
      accumulate result
      clear _needsTransfer
  Channel::frameViewFinish
    if _needsTransfer
      accumulate FB into accum buffer
    return accum buffer