Now StencilTableFactory::AppendLocalPointStencilTable() does
nothing when the localPointStencilTable is empty. This avoids
a potential crash (failed assertion) when both the baseStencilTable
and the localPointStencilTable are empty, as is the case for
simple geometry like the all-quads torus regression test shape.
For now, the common patch shader code supports fractional spacing
modes only when screen-space tessellation is also enabled.
It's possible to relax this restriction, but that requires changing
the client shader interface.
This change includes support for both fractional_even_spacing
and fractional_odd_spacing.
The implementation follows the existing pattern of re-parameterizing
the tessellation domain only along transition boundary edges. This
allows for crack-free tessellation, but it might be better to
consistently re-parameterize all of the outer edges of all patches,
which also would be required for numerically watertight tessellation.
This is implemented in a way that requires no changes to the client
shader API. It should be more efficient to move some computations to
the control/hull shaders and reduce divergence in the execution of
eval/domain shaders.
- instead of accumulating GregoryBasis::Point (fixed size stencils
backed by stackbuffer), pack the stencils into StencilTable as they
are evaluated
- use single integer for varying stencils of patch points, not
a GregoryBasis::Point
- cap the reserved stencil entry size.
optimize PatchTableFactory::Create performance
The primary hesitation here is that this change alters the approximation for BSpline end caps, however the consensus is that BSpline end caps were already the most extreme approximation, so it makes sense to sacrifice quality for speed.
Reviewed by David, Barry, George and myself.
- replace std::vector with vtr::StackBuffer in GregoryBasis::Point
- remote getQuadOffsets call from ProtoBasis
- rewrite some inefficient code in the endcap generation.
Note that this is a temporary remedy for the performance issue in 3.0.
We'll fix it again in the later release.
This fixes a regression in the function used to identify single crease
patches. This also updates the patch color values used by the glImaging
regression test to match the colors used in other example viewers so
that patch types can be more easily identified during automated testing.
generateAllLevels flags is used when we want to have uniform refined
patch array containing all levels together. There's an index offsetting
bug when this flag is enabled.
- eliminated need for disabling warning 177 in far/patchTableFactory.cpp
- encapsulated all floating point equality tests (1572) in local functions
- bracketed all ICC specific warning pragmas with #ifdef __INTEL_COMPILER
- avoided GCC's stricter shadowing warning in vtr/array.h
This workaround actually caused linking errors related on missing
symbols and removing the workaround does not cause any duplicated
symbols with MSVC 2013.
Te issue here is that some of the functions were not considered templated
anymore because all their template arguments were specified, which made it
so compiler was creating implementation for them in every file from where
the header was included. This causes errors during linking related on the
same symbol implemented in several places.
Marking those functions as inlined solves the problem and should not cause
any bad side effects because they're small enough and likely being inlined
by an optimizer anyway.
- correctly initialize FVar tag and source entry for unconnected verts
- added regression/shape with unconnected vertices and fvar data
- fixed edge-face vector access when unconnected edges are last
In one case, we were comparing int and unsigned int.
In primvarRefiner, some values were safely uninitialized, but older compilers
(GCC 4.1) were complaining.
This change restores the use of 4-bits in Far::PatchParam to
encode the refinement level of a patch. This restores one bit
that was stolen to allow for more general encoding of boundary
edge and transition edge masks. In order to accommodate all
of the bits that are required, the transition edge mask bits
are now stored along with the faceId bits.
Also, accessors are now exposed directly as members of Far::PatchParam
and the internal bitfield class is no longer directly exposed.
If the input cv stencil is given and it includes the local point
stencils for endcaps, LimitStencilTableFactory::Create failed
because of incorrect sanity checking.
Unified transition patch drawing affects the calculation of
tessellation level metrics. Because a single edge of a shader
patch might be split into two halfs along a transition edge,
the effective maximum number of spans along any adaptive edge
is limited to half of the device maximum.
Although valence 2 gregory patch is not well supported yet, this fix
mitigates artifacts around such a vertex.
Adding a shape catmark_gregory_test8 to see this issue.
stb - potential use of uninitialized variable (this may have been safe)
farViewer - unused variable
patchTableFactory - _channelIndices potentially used uninitialized
FVarLevel - valueIndexInFace0 potentially used used uninitialized (was safe)
We'll restore this code and finish it up for the next release.
For now, removing this code restores parity with the 3.0 beta,
i.e. face-varying patches are always all bilinear.
Besides we've not been computing accurate derivatives on gregory patch,
there was a separate bug in shaders which gives completely bogus dUdV on
corner vertices. This change fixes that significant artifact, however,
is still approximating derivatives by ignoring rational components.
- moved TopologyRefiner out of the RefinerFactoryBase into Far
- moved implementation of its Factory<MESH> to far/topologyDescriptor.*
- updated examples and tutorials (no more references to FactoryBase)
There's a lot of good foundational work here to eventually support
smooth interpolation of face-varying patches. Unfortunately, this
is not quite ready to release. Therefore, we've decided to defer this
feature until a later release.
This change hides this code behind the FAR_FVAR_SMOOTH_PATCH macro.
Now the channel specifier is the last parameter in a method's
parameter list with a default of 0. This is consistent with the
topological face-varying queries and also simplifies the common
case of just a single face-varying channel.
- add HLSL equivalents of the previous GLSL change
- rename OsdGetSingleCreaseSegmentParameter to
OsdGetPatchSingleCreaseSegmentParameter.
- add shadingMode UI for dxViewer similar to glViewer
use boundaryMask to identify the crease edge from 4 edges.
with this change, single-crease patch no longer needs to be rotated on
its population.
In shader, experimentally use same infinite sharp matrix for both
boundary and single-crease patch.
- change public status of members back to protected/private
- mimized friends (primarily Refinements as builders for Levels)
- added any missing accessors to prevent member access (mainly Tags)
- added the Tri/Quad refinement subclasses to private header list
Added a size specifier to the shader output array declaration
in the GregoryBasis and Gregory control shaders. This seems
to be required by the GLSL compiler on AMD and is harmless elsewhere.
Added a size specifier to the shader output array declaration
in the BSpline control shader. This seems to be required by the
GLSL compiler on AMD and is harmless elsewhere.
This change refactors the GLSL and HLSL patch shader code so that
most of the work is implemented within a library of common functions
and the remaining shader snippets just manage plumbing.
There is more to do here:
- varying and face-varying data can be managed entirely by the client
- similarly, displacement can be implemented in client code
- there's still quite a bit of residual boiler-plate code needed
in each shader stage that we should be able to wrap up in a more
convenient form.
- removed all of the multi-level Interpolate...() methods taking T*, U*
- made all single-level methods consistent wrt usage of T&, U&
- replaced usage in regressions, tutorials and examples
- additional minor improvements to far/tutorials
To encapsulate endcap functions from public API, add methods to
tell the number of patch points needed (GetNumLocalPoints()) and
to compute those patch points as a result of change of basis from
the refined vertices (ComputeLocalPointValues()).
ComputeLocalPointValues takes contiguous source data of all levels
including level0 control vertices.
- removed AddVaryingWithWeight from Far::PrimvarRefiner interpolation
- removed Far::StencilBuilder dependencies on varying
- updated Far::StencilTableFactory use of StencilBuilder constructor
- updated far/tutorial_2 to use vertex colors vs varying (for now)
- added TopologyRefiner base level modifiers to TopologyRefinerFactoryBase
- removed old modifiers from TopologyRefiner (unused by anything else)
- updated existing Factory<MESH> definitions to use new methods
- updated PrimvarRefiner to make use of now-local Mask class
- renamed vtr/maskInterfaces.h to vtr/componentInterfaces.h
- updated usage of renamed header file and CMakeLists.txt
It looks like there's a compiler bug in some earlier nvidia driver 340/346 releases.
It has been fixed in 348.07 (win) as far as I can tell.
Following code behaves incorrectly.
void f(int a) {
for (int i=0; i<3; ++i) doSomething(a, i);
}
void g() {
for (int i=0; i<100; ++i) f(i);
}
The workaround is to use different identifiers for each function.
- removed TopologyRefiner options to propagate base face in Refinememt
- removed all Vtr storage and management of base face
- added PrimvarRefiner methods to "interpolate" per-face primvar data
Add EvalStencils and EvalPatches API for most of CPU and GPU evaluators.
with this change, Eval API in the osd layer consists of following parts:
- Evaluators (Cpu, Omp, Tbb, Cuda, CL, GLXFB, GLCompute, D3D11Compute)
implements EvalStencils and EvalPatches(*). Both supports derivatives
(not fully implemented though)
- Interop vertex buffer classes (optional, same as before)
Note that these classes are not necessary to use Evaluators.
All evaluators have EvalStencils/Patches which take device-specific
buffer objects. For example, GLXFBEvaluator can take GLuint directly
for both stencil tables and input primvars. Although using these
interop classes makes it easy to integrate osd into relatively
simple applications.
- device-dependent StencilTable and PatchTable (optional)
These are also optional, but can be used simply a substitute of
Far::StencilTable and Far::PatchTable for osd evaluators.
- PatchArray, PatchCoord, PatchParam
They are tiny structs used for GPU based patch evaluation.
(*) TODO and known issues:
- CLEvaluator and D3D11Evaluator's EvalPatches() have not been implemented.
- GPU Gregory patch evaluation has not been implemented in EvalPatches().
- CudaEvaluator::EvalPatches() is very unstable.
- All patch evaluation kernels have not been well optimized.
- Currently GLXFB kernel doesn't support derivative evaluation.
There's a technical difficulty for the multi-stream output.
* Added stencilTable.cpp
* Fixed the "off" variable shadow warning
* Moved constructors to cpp file
* Made LimitStencilTable constructor private
* other minor clean up.
- created new class Far::PrimvarRefiner with interpolation methods
- removed interpolation and limit methods from Far::TopologyRefiner
- replaced internal usage in Far::StencilTableFactory
- replaced usage in regressions, tutorials and examples
This is a new implementation of the stencil table construction algorithm found
in protoStencil.h. In local tests with production assets, the new algorithm is
~25% faster and significantly more stable, in terms of average performance
In one asset test, generating stencils for level 10 adaptive refinement of
BuzzLightyear was reduced from 18s to 13s.
- removed all friend declarations (no more Far declarations with Vtr)
- all protected methods made public
- intent is to move these into namespace internal
- access between Vtr classes is under review
- it takes number and pointer for the input PatchCoords.
- add derivative evaluations.
- enhance glEvalLimit example to see the derivative evaluation works.
Cleaned up the Legacy Gregory shader source by accessing buffer
data through helper functions.
Switched to performing tessellation in untransformed (object) space.
this change fixes a crash when selecting catmark_hole_test1
in glStencilViewer. If there's a hole limit stencils may not
be found for given limit location, resulting invalid stencil
entries.
- changed Vtr::LocalIndex to 16-bit integer from 8-bit
- added test shapes including valence 360 vertices
- disabled new shapes in far/regression until improved accuracy accepted
PatchTables no longer needs to friend EndCapLegacyGregoryPatchFactory.
Instead we now make the patch table factory pass in the data that needs
to be updated directly to EndCapLegacyGregoryPatchFactory.
In osd layer, we use GLPatchTable (D3D11PatchTable) as a
device-specific representation of FarPatchTables instead of
DrawContext. GLPatchTable may be used not only for drawing
but also for GPU eval APIs (not yet supported though.
We may add CudaPatchTable etc as needed).
The legacy gregory patch drawing buffers are carved out to
the separate class, named GLLegacyGregoryPatchTable.
Also face-varying data are split into client side for now, until
we add new and more robust face-varying drawing structure
(scheduled at 3.1 release)
Tentatively replicate PatchArray structure in GLPatchTables. It will
be revised in the upcoming change.
Shifting hard-coded SRV locations of legacy gregory buffers in HLSL shaders.
- changes completely deprecate AddWithWeight(T, float, float, float)
- added new EvaluateBasis() method to PatchTables
- replaced usage of old Evaluate...<T,U>() methods with EvaluateBasis()
- removed old Evaluate...<T,U>() methods
- removed now unused Interpolate...<T,U>() functions in far/interpolate.h
- moved low-level basis code from far/interpolate.* to patchBasis.*
hlslPatchGregoryBasis.hlsl is an equivalent to glslPatchGregoryBasis.
Update dxViewer to be able to switch among bspline, gregorybasis, legacy
end capping.
also fixes a bug of GLSL legacy gregory shader which had an inconsistent
resource naming with example codes.
It looks like there's still an issue of D3D11 patchParam data fetching.
we'll come back to that bug.
While this may be worth revisiting, we should first quantify the benefits and
identify the compilers that support it. Ultimately, we may never use pragma
once in favor of strictly using standard C++.
- added new class to far/topologyLevel.h
- updated TopologyRefiner to manage set of TopologyLevels internally
- added TopologyRefiner method to retrieve TopologyLevel
- redefined obsolete TopologyRefiner methods in terms of TopologyLevel
As a preparation for retiring DrawContext, move SupportsAdaptiveTessellation
method to examples/common/glUtils, which is renamed and namespaced
from gl_common.{cpp,h} to be consistent to other files.
Same renamings applied to other example files.
Remove DrawRegistry from osd layer and put a simple shader caching
utility into examples/common. osd layer only provides patch shader
snippet and let client configure and compile the code. Clients also
maintain the lifetime of shader object, which is preferable for the
actual application integration.
update all examples to use the new scheme.
These are now redundant since all bspline patches are encoded in
the patch tables consistently using 16 point indices with boundary
and corner edges indicated in the boundary mask of the patch params.
My earlier change which simplified the categorization of
patch types broke evaluation for boundary and corner patches.
Previously, boundary and corner patches were always rotated
into a canoncial orientation by permuting the point indices
of the patch. This was convenient in some cases, but generally
made things unecessarily complicated, since the parameterization
of the patch had to be counter-rotated to compensate.
Now patches always remain correctly oriented with respect
to the underlying surface topology and evaluation of boundary
and corner patches is accommodated by simply adjusting the
spline weights to account for the missing/invalid patch
points along boundary and corner edges.
There is more to clean up and optimize, but this restores
correct behavior.
Since unified shading work already removed subPatch info from
Osd::PatchDescriptor, the difference between Far::PatchDescriptor and
Osd::PatchDescriptor is just maxValence and numElements. They are used
for legacy gregory patch drawing.
Both maxValence and numElements are actually constant within a topology
(drawContext). This change move maxValence to DrawContext and let client
manage numElements, then we can eliminate Osd::PatchDescriptor and simply
use Far::PatchDescritor instead.
This is still an intermediate step toward further DrawRegistry refactoring.
For the time being, adding EffectDesc struct to include maxValence and
numValence to be maintained by the clients. They will be cleaned up later.
The side benefit of this change is we no longer need to recompile regular b-spline
shaders for the different max-valences.
- added Far::GetGregoryWeights() to work directly with 20 weights
- simplified tensor-product evaluation of Bezier and BSpline (will more
readily support higher order derivatives in near future)
- fixed Bezier derivative scaling issue (off by factor of 3.0)
* noted incorrectness of Gregory derivatives (correction will accompany
support for higher order derivatives in near future)
- Remove MeshPtexData bit from Osd::MeshBits. It's not used any more
- Rename ptexIndexBuffer in D3D11DrawContext to paramParamBuffer
- Remove Is/SetPtexEnabled from D3D11DrawRegistry
In OpenSubdiv 2.x, we encapsulated subdivision tables into
compute context in osd layer since those tables are order-dependent
and have to be applied in a certain manner. In 3.0, we adopted stencil
table based refinement. It's more simple and such an encapsulation is
no longer needed. Also 2.0 API has several ownership issues of GPU
kernel caching, and forces unnecessary instantiation of controllers
even though the cpu kernels typically don't need instances unlike GPU ones.
This change completely revisit osd client facing APIs. All contexts and
controllers were replaced with device-specific tables and evaluators.
While we can still use consistent API across various device backends,
unnecessary complexities have been removed. For example, cpu evaluator
is just a set of static functions and also there's no need to replicate
FarStencilTables to ComputeContext.
Also the new API delegates the ownership of compiled GPU kernels
to clients, for the better management of resources especially in multiple
GPU environment.
In addition to integrating ComputeController and EvalStencilController into
a single function Evaluator::EvalStencils(), EvalLimit API is also added
into Evaluator. This is working but still in progress, and we'll make a followup
change for the complete implementation.
-some naming convention changes:
GLSLTransformFeedback to GLXFBEvaluator
GLSLCompute to GLComputeEvaluator
-move LimitLocation struct into examples/glEvalLimit.
We're still discussing patch evaluation interface. Basically we'd like
to tease all ptex-specific parametrization out of far/osd layer.
TODO:
-implments EvalPatches() in the right way
-derivative evaluation API is still interim.
-VertexBufferDescriptor needs a better API to advance its location
-synchronization mechanism is not ideal (too global).
-OsdMesh class is hacky. need to fix it.
Changing all device kernels to take two buffer identifiers for
source and destination separately.
This change is an intermediate step toward upcoming context/controller
refactoring.
Previously we have a limitation that the source and destination
vertex buffer has to be a single buffer, since the subdivision
kernels are iteratively applied by level.
With stencil tables, we don't have such a limitation any more,
so we may want to apply stencils from seprate source buffer to
another.
To specifiy the output location within the destination buffer,
we can use VertexBufferDescriptor.offset. This allows us not only
configuring arbitrary batching scheme, but also relaxing the
limitation that source and destination buffers are in same
interleaved layout. For examples, we could include derivatives only
in the destination buffer, which doesn't need to be allocated in
the source buffer.
we're teasing out ptex specific data from core osd entities,
so there's no reason to keep ptex texturing utilities in core osd.
move them into example libs and let clients assemble shader snippets
as needed.
Also removing older ptex texturing code (without mipmap)
Each patch has a corresponding patchParam. This is a set of three values
specifying additional information about the patch:
faceId -- topological face identifier (e.g. Ptex FaceId)
bitfield -- refinement-level, non-quad, boundary, transition, uv-offset
sharpness -- crease sharpness for a single-crease patch
These are stored in OsdPatchParamBuffer indexed by the value returned
from OsdGetPatchIndex() which is a function of the current PrimitiveID
along with an optional client provided offset.
Accessors are provided to extract values from a patchParam. These are
all named OsdGetPatch*().
While drawing patches, the patchParam is condensed into a patchCoord which
has four values (u, v, faceLevel, faceId). These patchCoords are treated
as int values during per-prim processing but are converted to float values
during per-vertex processing where the values are interpolated.
Also, cleaned up more of the shader namespace by giving an Osd prefix
to public functions, and consolidated boundary and transition handling
code into the PatchCommon shader files. The functions determining
tessellation levels are now all named OsdGetTessLevel*().
- resolves DX-CL interop functions in Osd::ClD3D11VertexBuffer.
- enable CL kernels in DX build.
- more cleanup in test harnesses, adding D3D11 initializations into DeviceContext.
- add new defines OPENSUBDIV_HAS_OPENGL and OPENSUBDIV_HAS_DX for convenience.
- removed default value for its <SIZE> parameter
- updated all usage to specify a value for <SIZE>
- added explicit element destruction missing from destructor
- corrected comment regarding VLA's being non-standard
refactor CL/CUDA specific initialization stuffs into
examples/common/clDeviceContext and cudaDeviceContext, and
update examples to use those structs.
also
- remove CL/CUDA tests from osd_regression. The tests for those kernels will be covered by glImaging.
- update cuda initialization to use the GL-interoperable device if available.
- remove CL specialization from glShareTopology, following the same pattern as we took in the previous OsdGLMesh refactoring. (still something strange with XFB kernels though)
- fix file permissions.
The previous change to the gathering of patch points went
a bit too far. Near non-manifold features it is important
to be careful when traversing the faces in a level to avoid
assumptions that are valid only for manifold topology.
Also, removed Vtr::Level::gatherQuadRegularPatchPoints().
This method was added in my previous change, but it is
unsafe to use in the presence of non-manifold topology.
Removed OpenCL/D3D11 specialization and add DEVICE_CONTEXT as a template
parameter. For the kernels which don't need a context object (e.g.
CPU, OpenGL, cuda) just ignore the context, and for the kernels which
use a context (e.g. OpenCL, DirectX) takes a context or a user-defined
class as which encapsulates device contexts. Note that OpenCL requires
two objects, cl_context and cl_command_queue. The user-defined
class must provide GetContext() and GetCommandQueue() for strongly typed
binding to osd VertexBuffers and ComputeContexts.
Osd::Mesh and MeshInterface have been used as a handy harness to host
multiple GPU kernels and graphics APIs. However it has CL/DirectX
specializations and duplicates large amount of plubming code. With this
change, glMesh.h and d3d11Mesh.h become just typedefs and all logic is
put into mesh.h without specializations.
Also cleaned up unused header files and code formatting.
Now a boundary and corner patch remains
aligned with its underlying parametric
orientation. This simplifies both the
gathering of patch vertices and downstream
evaluation of patches.
Added a method to Vtr::Level which gathers
the 16 patch points for a B-spline patch
even if the patch has boundary or corner
edges. The undefined patch vertex index
values along boundary and corner edges are
assigned Vtr::INDEX_INVALID.
In order to simplifiy the process of drawing
B-spline patches with boundary or corner
edges, the Far::PatchTablesFactory will
replace any invalid vertex indices with
a known good value, i.e. the index of the
first patch face vertex.
Single-crease patches are still a slightly
special case, which will be resolved later.
- rename "Regular end cap" to "BSplineBasis end cap"
- revert templating and add EndCapType into PatchTablesFactory::Options.
- make EndCapFactories internal in PatchTablesFactory.
- move end cap stencils into PatchTables, keep them relative to the max level.
- add a utility StencilTablesFactory::AppendEndCapStencilTables to splice and factorize endcap stencil tables.
Remove the ptex-specific code from the Far::TopologyRefiner and instead provide it in a separate class Far::PtexIndices. Clients who need to use the Ptex API need to first build a Far::PtexIndices object by providing it with a refiner.
This has the advantage of keeping the API on the TopologyRefiner a little cleaner. The ptex methods were const but were mutating state with const_casts. The new mechanism still achieves the same lazy initialization behavior by forcing clients to instantiate them exactly when needed.
A disadvantage of this approach is that the PatchTablesFactory creates its own PtexIndices and throws it out after the patch tables are created. This is great if you're never going to need the ptex indices again, but not so great if you will need them again.
computes edge lengths using limit surface points. Made this
the default screen-space metric so that we avoid cracks when
using Gregory Basis or Regular B-spline end caps.
The alternative method which computes edge lengths using the
distance between B-spline control points is still available.
Added a diagram and comments to explain how the control
points and limit points are organized.
This change moves all gregory patch generation from Far::PatchTablesFactory
so that we can construct patch tables without stencil tables as well as client
can chose any end patch strategies (we have 3 options for now: legacy 2.x style
gregory patch, gregory basis patch and experimental regular patch approximation).
Also Far::EndCapGregoryBasisPatchFactory provides index mapping from patch index
to vtr face index, which can be used for single gregory patch evaluation on top
of refined points, without involving heavier stencil tables generation.