Add EvalStencils and EvalPatches API for most of CPU and GPU evaluators.
with this change, Eval API in the osd layer consists of following parts:
- Evaluators (Cpu, Omp, Tbb, Cuda, CL, GLXFB, GLCompute, D3D11Compute)
implements EvalStencils and EvalPatches(*). Both supports derivatives
(not fully implemented though)
- Interop vertex buffer classes (optional, same as before)
Note that these classes are not necessary to use Evaluators.
All evaluators have EvalStencils/Patches which take device-specific
buffer objects. For example, GLXFBEvaluator can take GLuint directly
for both stencil tables and input primvars. Although using these
interop classes makes it easy to integrate osd into relatively
simple applications.
- device-dependent StencilTable and PatchTable (optional)
These are also optional, but can be used simply a substitute of
Far::StencilTable and Far::PatchTable for osd evaluators.
- PatchArray, PatchCoord, PatchParam
They are tiny structs used for GPU based patch evaluation.
(*) TODO and known issues:
- CLEvaluator and D3D11Evaluator's EvalPatches() have not been implemented.
- GPU Gregory patch evaluation has not been implemented in EvalPatches().
- CudaEvaluator::EvalPatches() is very unstable.
- All patch evaluation kernels have not been well optimized.
- Currently GLXFB kernel doesn't support derivative evaluation.
There's a technical difficulty for the multi-stream output.
- it takes number and pointer for the input PatchCoords.
- add derivative evaluations.
- enhance glEvalLimit example to see the derivative evaluation works.
In osd layer, we use GLPatchTable (D3D11PatchTable) as a
device-specific representation of FarPatchTables instead of
DrawContext. GLPatchTable may be used not only for drawing
but also for GPU eval APIs (not yet supported though.
We may add CudaPatchTable etc as needed).
The legacy gregory patch drawing buffers are carved out to
the separate class, named GLLegacyGregoryPatchTable.
Also face-varying data are split into client side for now, until
we add new and more robust face-varying drawing structure
(scheduled at 3.1 release)
Tentatively replicate PatchArray structure in GLPatchTables. It will
be revised in the upcoming change.
Shifting hard-coded SRV locations of legacy gregory buffers in HLSL shaders.
As a preparation for retiring DrawContext, move SupportsAdaptiveTessellation
method to examples/common/glUtils, which is renamed and namespaced
from gl_common.{cpp,h} to be consistent to other files.
Same renamings applied to other example files.
My earlier change which simplified the categorization of
patch types broke evaluation for boundary and corner patches.
Previously, boundary and corner patches were always rotated
into a canoncial orientation by permuting the point indices
of the patch. This was convenient in some cases, but generally
made things unecessarily complicated, since the parameterization
of the patch had to be counter-rotated to compensate.
Now patches always remain correctly oriented with respect
to the underlying surface topology and evaluation of boundary
and corner patches is accommodated by simply adjusting the
spline weights to account for the missing/invalid patch
points along boundary and corner edges.
There is more to clean up and optimize, but this restores
correct behavior.
In OpenSubdiv 2.x, we encapsulated subdivision tables into
compute context in osd layer since those tables are order-dependent
and have to be applied in a certain manner. In 3.0, we adopted stencil
table based refinement. It's more simple and such an encapsulation is
no longer needed. Also 2.0 API has several ownership issues of GPU
kernel caching, and forces unnecessary instantiation of controllers
even though the cpu kernels typically don't need instances unlike GPU ones.
This change completely revisit osd client facing APIs. All contexts and
controllers were replaced with device-specific tables and evaluators.
While we can still use consistent API across various device backends,
unnecessary complexities have been removed. For example, cpu evaluator
is just a set of static functions and also there's no need to replicate
FarStencilTables to ComputeContext.
Also the new API delegates the ownership of compiled GPU kernels
to clients, for the better management of resources especially in multiple
GPU environment.
In addition to integrating ComputeController and EvalStencilController into
a single function Evaluator::EvalStencils(), EvalLimit API is also added
into Evaluator. This is working but still in progress, and we'll make a followup
change for the complete implementation.
-some naming convention changes:
GLSLTransformFeedback to GLXFBEvaluator
GLSLCompute to GLComputeEvaluator
-move LimitLocation struct into examples/glEvalLimit.
We're still discussing patch evaluation interface. Basically we'd like
to tease all ptex-specific parametrization out of far/osd layer.
TODO:
-implments EvalPatches() in the right way
-derivative evaluation API is still interim.
-VertexBufferDescriptor needs a better API to advance its location
-synchronization mechanism is not ideal (too global).
-OsdMesh class is hacky. need to fix it.
- rename "Regular end cap" to "BSplineBasis end cap"
- revert templating and add EndCapType into PatchTablesFactory::Options.
- make EndCapFactories internal in PatchTablesFactory.
- move end cap stencils into PatchTables, keep them relative to the max level.
- add a utility StencilTablesFactory::AppendEndCapStencilTables to splice and factorize endcap stencil tables.
Remove the ptex-specific code from the Far::TopologyRefiner and instead provide it in a separate class Far::PtexIndices. Clients who need to use the Ptex API need to first build a Far::PtexIndices object by providing it with a refiner.
This has the advantage of keeping the API on the TopologyRefiner a little cleaner. The ptex methods were const but were mutating state with const_casts. The new mechanism still achieves the same lazy initialization behavior by forcing clients to instantiate them exactly when needed.
A disadvantage of this approach is that the PatchTablesFactory creates its own PtexIndices and throws it out after the patch tables are created. This is great if you're never going to need the ptex indices again, but not so great if you will need them again.
This change moves all gregory patch generation from Far::PatchTablesFactory
so that we can construct patch tables without stencil tables as well as client
can chose any end patch strategies (we have 3 options for now: legacy 2.x style
gregory patch, gregory basis patch and experimental regular patch approximation).
Also Far::EndCapGregoryBasisPatchFactory provides index mapping from patch index
to vtr face index, which can be used for single gregory patch evaluation on top
of refined points, without involving heavier stencil tables generation.
This is the first step to tease off Osd compute controller/contexts
from Far API.
Currently FarStencilTable only creates a kernelbatch for the entire range,
so we can use [0, numStencils) for all cases instead of KernelBatch.
This might not be true if we apply non-factorized level-wise stencils,
then we'll add another modular utility to serve those cases.
PatchTablesFactory fills 20 indices topology into patchtable, and use it for eval and draw.
note: currently screen-space adaptive tessellation of gregory basis patches is
broken and cracks appear around them.
- extend Far::PatchTables data structures & interfaces to store requisite
information for channels of face-varying bi-cubic patches
- implement gather function in Far::PatchTablesFactory to populate face-varying
channels with adaptive patches
- extend accessor interface in Vtr::Level
- propagate code fall-out throughout OpenSubdiv code base, examples & tutorials
- extend vtrViewer code to visualize tessellated bi-cubic face-varying patches
- renamed Sdc::Type to SchemeType and TypeTraits to SchemeTypeTraits
- renamed TYPE_ prefix to SCHEME_
- updated all usage within core library
- updated all usage in examples, tutorials, etc.
- move level of refinement / isolation into the Options structs
- fix splash damage in rest of the code
note 1: this is less than ideal, because most compilers accept the previous
call to these functions with an incorrect parameter list (ie. passing
the level instead of the struct issues no warnings and compiles...)
caveat emptor...
note 2: the level parameter names may not be final for adaptive modes
as we will likely want independent controls over crease vs.
extraordinary vertex isolation.
- don't rotate (s,t) coordinates but rotate the patch instead !
- refactor osd/cpuEvalLimitKernels to share Far::PatchTables cubic spline
interpolation functions : this replaces tensor product formulation with
weight matrices, which does not really impact performance here, but would
have to be replaced when implementing regular gridding functions.
- fix OsdCpuEvalLimitController to not rotate coordinates and pass the rotation bitfields
- expose Far::PatchTables spline interpolation API (protected -> public)
- fix glEvalLimit tangent buffers (remove empty padding - see below)
- change policy for tangent buffers : the output buffer descriptor is
**NO LONGER APPLIED** to tangent output buffers. Tangent primvar data
buffers are no longer applying the offset and stride from the descriptor
(because it doesn't make sense to share it). If more flexiblity is
required, we will consider adding independent descriptors for the tangent
buffers. This change will impact existing code that generates tangents
with the EvalLimit controller.
fixes#370
Const' declared instances of Vtr::Array do not protect the pointer held
privately by the class properly. In order to force the compiler to
protect this pointer, we removed all non-const accessors from Vtr::Array
(now renamed Vtr::ConstArray) and moved them to a child class (Vtr::Array),
which requires const_cast<> operators internally to allow access.
The change & renaming is then propagated to all internal dependencies.
- change error codes from situational to general (fatal / coding / run-time...)
- pull error functions from Osd into Far
- add a templated topology validation reporting system to Far::TopologyRefinerFactory
- fix fallout on rest of code-base
- split Far::PatchDescriptor into its own class (mirrors Far::PatchParam)
- hide PatchArray as a private internal structure
- add public accessors patterned after Far::TopologyRefiner (returning Vtr::Arrays)
- propagate new API to all dependent code
note: some direct table accessors have not been removed *yet* - see code for details
- re-implement the pool allocator
- use templates to remove code redundancy between regular & limit stencils
- leverage [] operator overloading to simplify stencil factorization
- add the ability to treat subdivision levels independently (see below)
- refactor Far::TopologyRefiner::Interpolate<>() methods to pass buffers by reference
(allows overloading of [] operator)
- rename some of the stencil factory options
- propagate changes to Osd / examples / tutorials...
Make the sample locations dynamic by adding a velocity vector. Face boundary
crossing is handled using the new ptex adjacency functionality recently
added to the Far::TopologyRefiner.
Sync'ing the 'dev' branch with the 'feature_3.0dev' branch at commit 68c6d11fc36761ae1a5e6cdc3457be16f2e9704a
The branch 'feature_3.0dev' is now locked and preserved for historical purposes.