Merge pull request #639 from takahito-tejima/doc

update osd layer documents
2024-12-26 09:41:08 +00:00 · 2015-06-17 18:10:35 -07:00 · 2015-06-17 18:10:35 -07:00 · 0e5b504bca
commit 0e5b504bca
parent bb72958ab4 c9fbc2c49d
25 changed files with 462 additions and 156 deletions
--- a/documentation/CMakeLists.txt
+++ b/documentation/CMakeLists.txt
@@ -87,6 +87,7 @@ if (DOCUTILS_FOUND AND PYTHONINTERP_FOUND)
        mod_notes.rst
        maya_osdpolysmooth.rst
        osd_overview.rst
+        osd_shader_interface.rst
        porting.rst
        release_notes.rst
        release_notes_2x.rst
--- a/documentation/api_overview.rst
+++ b/documentation/api_overview.rst
@@ -175,7 +175,7 @@ Using the Right Tools

 OpenSubdiv's tiered interface offers a lot flexibility to make your application
 both fast and robust. Because navigating through the large collection of classes and
-features can be challenging, here is a flow-chart that should help sketch
+features can be challenging, here are use cases that should help sketch
 the broad lines of going about using subdivisions in your application.

 General client application requirements:
@@ -206,10 +206,3 @@ General client application requirements:
 |                      | back-ends provide full support for programmable       |
 |                      | shading.                                              |
 +----------------------+-------------------------------------------------------+
-
-Flow-chart:
-
-.. image:: images/osd_flow.png
-   :align: center
-   :target: images/osd_flow.png 
-
--- a/documentation/diagrams.odg
+++ b/documentation/diagrams.odg
--- a/documentation/images/api_layers_3_0.png
+++ b/documentation/images/api_layers_3_0.png
--- a/documentation/images/osd_backends.png
+++ b/documentation/images/osd_backends.png
--- a/documentation/images/osd_context_controller.png
+++ b/documentation/images/osd_context_controller.png
--- a/documentation/images/osd_controllers.png
+++ b/documentation/images/osd_controllers.png
--- a/documentation/images/osd_controllers_example0.png
+++ b/documentation/images/osd_controllers_example0.png
--- a/documentation/images/osd_controllers_example1.png
+++ b/documentation/images/osd_controllers_example1.png
--- a/documentation/images/osd_draw.png
+++ b/documentation/images/osd_draw.png
--- a/documentation/images/osd_flow.png
+++ b/documentation/images/osd_flow.png
--- a/documentation/images/osd_limiteval.png
+++ b/documentation/images/osd_limiteval.png
--- a/documentation/images/osd_limitstencil.png
+++ b/documentation/images/osd_limitstencil.png
--- a/documentation/images/osd_refinement.png
+++ b/documentation/images/osd_refinement.png
--- a/documentation/images/osd_shader_bspline.png
+++ b/documentation/images/osd_shader_bspline.png
--- a/documentation/images/osd_shader_gregory.png
+++ b/documentation/images/osd_shader_gregory.png
--- a/documentation/images/osd_shader_legacy_gregory.png
+++ b/documentation/images/osd_shader_legacy_gregory.png
--- a/documentation/images/osd_shader_param_remap.png
+++ b/documentation/images/osd_shader_param_remap.png
--- a/documentation/images/osd_shader_patch.png
+++ b/documentation/images/osd_shader_patch.png
--- a/documentation/images/osd_shader_transition.png
+++ b/documentation/images/osd_shader_transition.png
--- a/documentation/nav_template.txt
+++ b/documentation/nav_template.txt
@@ -66,6 +66,9 @@
                <ul>
                    <li><a href="api_overview.html">API Overview</a>
                        <li><a href="osd_overview.html">Osd</a></li>
+                        <ul>
+                            <li><a href="osd_shader_interface.html">Shader Interface</a></li>
+                        </ul>
                        <li><a href="far_overview.html">Far</a></li>
                        <ul>
                            <li><a href="far_overview.html#far-topologyrefiner">Topology Refiner</a></li>
--- a/documentation/osd_overview.rst
+++ b/documentation/osd_overview.rst
@@ -36,91 +36,195 @@ OSD Overview
 OpenSubdiv (Osd)
 ================

-**Osd** contains client-level code that uses *Far* to create concrete instances of
-meshes. These meshes use precomputed tables from *Far* to perform table-driven
-subdivision steps with a variety of massively parallel computational backend
-technologies. **Osd** supports both `uniform subdivision <subdivision_surfaces.html#uniform-subdivision>`__
-and `adaptive refinement <subdivision_surfaces.html#feature-adaptive-subdivision>`__
-with cubic patches.
+**Osd** contains device-dependent code that makes *Far* structures
+available on various backends such as TBB, CUDA, OpenCL, GLSL, etc.
+The main roles of **Osd** are:

----
-
-Modular Architecture
-====================
-
-With uniform subdivision the computational backend code performs Catmull-Clark
-splitting and averaging on each face.
-
-With adaptive subdivision, the Catmull/Clark steps are used to compute the CVs
-of cubic Bezier patches. On modern GPU architectures, bicubic patches can be
-drawn directly on screen at very high resolution using optimized tessellation
-shader paths.
-
-.. image:: images/osd_layers.png
-
-Finally, the general manipulation of high-order surfaces also requires functionality
-outside of the scope of pure drawing.
-
-Following this pattern of general use, **Osd** can be broken down into 3 main
-modules : **Compute**, **Draw** and **Eval**.
-
-.. image:: images/osd_modules.png
-   :align: center
-
-The modules are designed so that the data being manipulated can be shared and
-interoperated between modules (although not all paths are possible).
-
-These modules are identified by their name spaces (**Compute**, **Draw**,
-**Eval**) and encapsulate atomic functationality. The vertex data is carried
-in interoperable buffers that can be exchanged between modules.
-
-The typical use pattern is to pose the coarse vertices of a mesh for a given frame.
-The buffer is submitted to the **Refine** module which applies the subdivision rules
-and produces refined control vertices. This new buffer can be passed to the **Draw**
-module which will draw them on screen.
-
-However, the same buffer of refined control vertices could be passed instead to
-the **Eval** module (and be projected onto another surface for instance) before
-being sent for display to the **Draw** module.
-
----
-
-OsdCompute
-**********
-
-The Compute module contains the code paths that manage the application of the
-subdivision rules to the vertex data. This module is sufficient for uniform
-subdivision applications.
-
----
-
-OsdDraw
-*******
-
-The Draw module manages interactions with discrete display devices and provide
-support for interactive drawing of the subdivision surfaces.
-
----
-
-OsdEval
-*******
-
-The Eval module provides computational APIs for the evaluation of vertex data at
-the limit, ray intersection and point projection.
+Refinement
+    Compute stencil-based uniform/adaptive subdivision on CPU/GPU backends
+Limit Stencil Evaluation
+    Compute limit surfaces by limit stencils on CPU/GPU backends
+Limit Evaluation with PatchTable
+    Compute limit surfaces by patch evaluation on CPU/GPU backends
+OpenGL/DX11 Drawing with hardware tessellation
+    Provide GLSL/HLSL tessellation functions for patch tables
+Interleaved/Batched buffer configuration
+    Provide a consistent buffer descriptor to deal with arbitrary buffer layouts
+Cross-Platform Implementation
+    Provide convenient classes to interop between compute and draw APIs

+These features can be used independently of one another. For example, a client
+can use only the stencil table evaluation, or call **Osd** compute functions
+on its own vertex buffers.

 OpenSubdiv enforces the same results for the different computation backends with
 a series of regression tests that compare the methods to each other.

+----

+Refinement
+==========
+
+**Osd** supports both `uniform subdivision <subdivision_surfaces.html#uniform-subdivision>`__
+and `adaptive subdivision <subdivision_surfaces.html#feature-adaptive-subdivision>`__.
+
+
+.. image:: images/osd_refinement.png
+   :align: center
+
+Clients first create a Far::StencilTable for the topology, then convert it into a
+device-specific stencil table if necessary. The following table shows which evaluator
+classes and stencil table interfaces can be used together. Note that while **Osd**
+provides these stencil table classes, which can be easily constructed from Far::StencilTable,
+clients aren't required to use them. Clients may use their own stencil table
+types as long as Evaluator::EvalStencils() can access the necessary interfaces.
+
++-----------------------------+-----------------------+-------------------------+
+| Backend                     | Evaluator class       | Compatible stencil table|
++=============================+=======================+=========================+
+| CPU (CPU single-threaded)   | CpuEvaluator          | Far::StencilTable       |
++-----------------------------+-----------------------+-------------------------+
+| TBB (CPU multi-threaded)    | TbbEvaluator          | Far::StencilTable       |
++-----------------------------+-----------------------+-------------------------+
+| OpenMP (CPU multi-threaded) | OmpEvaluator          | Far::StencilTable       |
++-----------------------------+-----------------------+-------------------------+
+| CUDA (GPU)                  | CudaEvaluator         | CudaStencilTable        |
++-----------------------------+-----------------------+-------------------------+
+| OpenCL (CPU/GPU)            | CLEvaluator           | CLStencilTable          |
++-----------------------------+-----------------------+-------------------------+
+| GL ComputeShader (GPU)      | GLComputeEvaluator    | GLStencilTableSSBO      |
++-----------------------------+-----------------------+-------------------------+
+| GL Transform Feedback (GPU) | GLXFBEvaluator        | GLStencilTableTBO       |
++-----------------------------+-----------------------+-------------------------+
+| DX11 ComputeShader (GPU)    | D3D11ComputeEvaluator | D3D11StencilTable       |
++-----------------------------+-----------------------+-------------------------+
+
+
+Limit Stencil Evaluation
+========================
+
+Limit stencil evaluation is quite similar to refinement in **Osd**. Clients
+create a Far::LimitStencilTable for the locations they need to evaluate, then create
+an evaluator-compatible stencil table and call Evaluator::EvalStencils().
+
+.. image:: images/osd_limitstencil.png
+   :align: center
+
+Limit Evaluation with PatchTable
+================================
+
+In **Osd**, the limit surfaces can also be evaluated by PatchTable once all
+control vertices and local points are resolved by the stencil evaluation.
+
+.. image:: images/osd_limiteval.png
+   :align: center
+
++-----------------------------+-------------------------+-------------------------+
+| Backend                     | Evaluator class         | Compatible patch table  |
++=============================+=========================+=========================+
+| CPU (CPU single-threaded)   | CpuEvaluator            | CpuPatchTable           |
++-----------------------------+-------------------------+-------------------------+
+| TBB (CPU multi-threaded)    | TbbEvaluator            | CpuPatchTable           |
++-----------------------------+-------------------------+-------------------------+
+| OpenMP (CPU multi-threaded) | OmpEvaluator            | CpuPatchTable           |
++-----------------------------+-------------------------+-------------------------+
+| CUDA (GPU)                  | CudaEvaluator           | CudaPatchTable          |
++-----------------------------+-------------------------+-------------------------+
+| OpenCL (CPU/GPU)            | CLEvaluator             | CLPatchTable            |
++-----------------------------+-------------------------+-------------------------+
+| GL ComputeShader (GPU)      | GLComputeEvaluator      | GLPatchTable            |
++-----------------------------+-------------------------+-------------------------+
+| GL Transform Feedback (GPU) | GLXFBEvaluator          | GLPatchTable            |
++-----------------------------+-------------------------+-------------------------+
+| DX11 ComputeShader (GPU)    | | D3D11ComputeEvaluator | D3D11PatchTable         |
+|                             | | (*)not yet supported  |                         |
++-----------------------------+-------------------------+-------------------------+

 .. container:: impnotip

-   * **Release Notes (3.0.0)**
+ **Release Notes (3.0.0)**
+
+ * GPU limit evaluation backends (Evaluator::EvalPatches()) only support
+   BSpline patches. Clients need to specify BSpline approximation for endcaps
+   when creating a patch table. See `end capping <far_overview.html#endcap>`__.
+
+OpenGL/DX11 Drawing with hardware tessellation
+==============================================
+
+One of the most interesting use cases of the **Osd** layer is real-time drawing of
+subdivision surfaces using hardware tessellation. This is somewhat similar to
+limit evaluation with PatchTable described above. Drawing differs from limit
+evaluation in that **Osd** provides shader snippets for patch evaluation and
+clients inject them into their own shader source.
+
+.. image:: images/osd_draw.png
+   :align: center
+
+See `shader interface <osd_shader_interface.html>`__ for more detail.
+
+----
+
+Interleaved/Batched buffer configuration
+========================================
+
+All **Osd** layer APIs assume that each primitive variable to be computed
+(points, colors, uvs ...) is a contiguous array of 32-bit floating point values.
+The **Osd** API refers to this array as a "buffer". A buffer can reside in CPU memory or
+GPU memory. **Osd** Evaluators typically take one source buffer and one destination
+buffer, or three destination buffers if derivatives are being computed.
+**Osd** Evaluators also take BufferDescriptors,
+which specify the layout of the source and destination buffers.
+A BufferDescriptor is a struct of three integers: offset, length and stride.
+
+For example:
+
+ +-----------+-----------+-----------+
+ | Vertex 0  |  Vertex 1 | ...       |
+ +---+---+---+---+---+---+-----------+
+ | X | Y | Z | X | Y | Z | ...       |
+ +---+---+---+---+---+---+-----------+
+
+The layout of this buffer can be described as
+
+.. code:: c++
+
+  Osd::BufferDescriptor desc(/*offset = */ 0, /*length = */ 3, /*stride = */ 3);
+
+A BufferDescriptor can describe interleaved buffers too.
+
+ +---------------------------+---------------------------+-------+
+ | Vertex 0                  | Vertex 1                  | ...   |
+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+-------+
+ | X | Y | Z | R | G | B | A | X | Y | Z | R | G | B | A | ...   |
+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+-------+
+
+.. code:: c++
+
+  Osd::BufferDescriptor xyzDesc(0, 3, 7);
+  Osd::BufferDescriptor rgbaDesc(3, 4, 7);
+
+Although the source and destination buffers don't have to be the same buffer for
+EvalStencils(), adaptive patch tables are constructed to index the coarse vertices
+first, immediately followed by the refined vertices. In this case, the
+BufferDescriptor for the destination should set its offset to skip the coarse
+vertices.
+
+ +-----------------------------------+-----------------------------------+
+ |  Coarse vertices (n) : Src        |  Refined vertices : Dst           |
+ +-----------+-----------+-----------+-----------+-----------+-----------+
+ | Vertex 0  | Vertex 1  | ...       | Vertex n  | Vertex n+1|           |
+ +---+---+---+---+---+---+-----------+---+---+---+---+---+---+-----------+
+ | X | Y | Z | X | Y | Z | ...       | X | Y | Z | X | Y | Z | ...       |
+ +---+---+---+---+---+---+-----------+---+---+---+---+---+---+-----------+
+
+.. code:: c++
+
+  Osd::BufferDescriptor srcDesc(0, 3, 3);
+  Osd::BufferDescriptor dstDesc(n*3, 3, 3);
+
+Also note that the source descriptor doesn't have to start from offset = 0.
+This is useful when a client has a big buffer with multiple objects batched together.

-      Face-varying smooth data interpolation is currently only supported in 
-      **Osd** through refinement and limit points but not in the PatchTable.
-      A more complete implementation is currently slated for a 3.1 release.

 ----

@@ -130,79 +234,41 @@ Cross-Platform Implementation
 One of the key goals of OpenSubdiv is to achieve as much cross-platform flexibility
 as possible and leverage all optimized hardware paths where available. This can
 be very challenging however, as there is a very large variety of plaftorms and
-matching APIs available, with very distinct capabilities. The following chart
-illustrates the matrix of back-end APIs supported for each module.
+matching APIs available, with very distinct capabilities.

-.. image:: images/osd_backends.png
-   :align: center
+In **Osd**, Evaluators are not concerned with interop between those APIs. All Evaluators
+have two kinds of signatures for both EvalStencils() and EvalPatches():

-Since the **Compute** module performs mostly specialized interpolation
-computations, most GP-GPU and multi-core APIs can be deployed. If the end-goal
-is to draw the surface on screen, it can be very beneficial to move as much of
-these computations to the same GPU device in order to minimize data transfers.
+ - Explicit signatures which directly take a device-specific buffer representation
+   (e.g. a pointer for CpuEvaluator, a GLuint buffer for GLComputeEvaluator)
+ - Generic signatures which take arbitrary buffer classes. The buffer class
+   is required to have a certain method to return the device-specific buffer representation.

-For instance: pairing a CUDA **Compute** back-end to an OpenGL **Draw** backend
-could be a good choice on hardware and OS that supports both. Similarly, a DX11
-HLSL-Compute **Compute** back-end can be paired effectively with a DX11
-HLSL-Shading **Draw** back-end. Some pairings however are not possible, as
-there may be no data inter-operation paths available (ex: transferring DX11
-compute SRVs to GL texture buffers).
+The latter interface is useful if the client supports multiple backends at the same time.
+The methods that need to be implemented for each Evaluator are:

----
++-----------------------+------------------------+------------------+
+| Evaluator class       | Object                 | Method           |
++=======================+========================+==================+
+| | CpuEvaluator        | pointer to cpu memory  | BindCpuBuffer()  |
+| | TbbEvaluator        |                        |                  |
+| | OmpEvaluator        |                        |                  |
++-----------------------+------------------------+------------------+
+| CudaEvaluator         | pointer to cuda memory | BindCudaBuffer() |
++-----------------------+------------------------+------------------+
+| CLEvaluator           | cl_mem                 | BindCLBuffer()   |
++-----------------------+------------------------+------------------+
+| | GLComputeEvaluator  | GL buffer object       | BindVBO()        |
+| | GLXFBEvaluator      |                        |                  |
++-----------------------+------------------------+------------------+
+| D3D11ComputeEvaluator | D3D11 UAV              | BindD3D11UAV()   |
++-----------------------+------------------------+------------------+

-Contexts & Controllers
-======================
-
-At the core of **Osd** modularization is the need for inter-operating vertex buffer
-data between different APIs. This is achieved through a *"binding"* mechanism.
-
-Binding Vertex Buffers
-**********************
-
-Each back-end manages data of 2 types: specific to each primitive manipulated
-(topology, vertex data...), and general state data that is shared by all the
-primitives (compute kernels, device ID...). The first type is contained in a
-"Context" object, the latter manipulated through a singleton "Controller".
-
-.. image:: images/osd_context_controller.png
-   :align: center
-
-The Context itself holds the data that is specific to both the primitive and
-the operation that needs to be appled (ex: *"drawing"*). It also owns multiple
-buffers of vertex data. Contexts and Controller each have a specific back-end
-API, so only matching back-ends can be paired (ex: an OpenCL Context cannot be
-paired with a CUDA Controller).
-
-Vertex Buffer Inter-Op
-**********************
-
-When a Controller needs to perform an operation, it *"binds"* the Context, which
-is the trigger to move the vertex data into the appropriate device memory pool
-(CPU to GPU, GPU to GPU...).
-
-.. image:: images/osd_controllers.png
-   :align: center
+Buffer classes can use these methods as a trigger for interop. **Osd** provides default
+implementations of interop buffers for most combinations of backends.
+For example, if the client wants to use CUDA as the computation backend and OpenGL
+as the drawing API, Osd::CudaGLVertexBuffer fits the case since it implements
+BindCudaBuffer() and BindVBO(). Again, clients can implement their own buffer
+class and pass it to Evaluators.


-In practice, a given application will maintain singletons of the controllers for
-each of the modules that it uses, and pair them with the Contexts associated with
-each primitive. A given primitive will use one Context for each of the modules that
-it uses.
-
-Example
-*******
-
-Here is an example of client code implementation for drawing surfaces using a
-CUDA **Compute** module and an OpenGL **Draw** module.
-
-.. image:: images/osd_controllers_example1.png
-   :align: center
-
-The client code will construct a CudaComputeController and CudaComputeContext
-for the **Compute** stage, along with an GLDrawController and a GLDrawContext.
-
-The critical components are the vertex buffers, which must be of type
-CudaGLVertexBuffer. The Contexts and Controllers classes all are
-specializations of a templated *"Bind"* function which will leverage API
-specific code responsible for the inter-operation of the data between the
-API-specific back-ends.
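The offset/length/stride layouts described in the new osd_overview.rst text above can be illustrated without the library. The following is a minimal, self-contained C++ sketch under stated assumptions: `Desc`, `Stencil` and `evalStencils` are hypothetical stand-ins written for this note, not the Osd classes, and all three descriptor fields are assumed to be counted in floats rather than bytes. It shows the "coarse vertices first, refined vertices after" arrangement, where the destination descriptor's offset skips the coarse block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a buffer descriptor: three integers, all counted
// in floats (an assumption of this sketch).
struct Desc {
    int offset;  // index of the first float of element 0
    int length;  // number of floats per element that are computed
    int stride;  // number of floats between consecutive elements
};

// One stencil: result = sum over k of weights[k] * source element indices[k].
struct Stencil {
    std::vector<int> indices;
    std::vector<float> weights;
};

// Evaluate stencil i over `src` and write the result to element i of `dst`.
// `src` and `dst` may point into the same array as long as the ranges
// described by the two descriptors do not overlap.
void evalStencils(const float* src, Desc srcDesc,
                  float* dst, Desc dstDesc,
                  const std::vector<Stencil>& stencils) {
    assert(srcDesc.length == dstDesc.length);
    for (std::size_t i = 0; i < stencils.size(); ++i) {
        float* out = dst + dstDesc.offset + i * dstDesc.stride;
        for (int c = 0; c < dstDesc.length; ++c) out[c] = 0.0f;
        const Stencil& s = stencils[i];
        for (std::size_t k = 0; k < s.indices.size(); ++k) {
            const float* in =
                src + srcDesc.offset + s.indices[k] * srcDesc.stride;
            for (int c = 0; c < srcDesc.length; ++c) {
                out[c] += s.weights[k] * in[c];
            }
        }
    }
}
```

With two coarse XYZ vertices followed by room for one refined vertex in the same array, the source descriptor would be `Desc{0, 3, 3}` and the destination descriptor `Desc{2 * 3, 3, 3}`, mirroring the `srcDesc`/`dstDesc` pair in the documentation.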
--- a/documentation/osd_shader_interface.rst
+++ b/documentation/osd_shader_interface.rst
@@ -0,0 +1,243 @@
+..
+     Copyright 2015 Pixar
+
+     Licensed under the Apache License, Version 2.0 (the "Apache License")
+     with the following modification; you may not use this file except in
+     compliance with the Apache License and the following modification to it:
+     Section 6. Trademarks. is deleted and replaced with:
+
+     6. Trademarks. This License does not grant permission to use the trade
+        names, trademarks, service marks, or product names of the Licensor
+        and its affiliates, except as required to comply with Section 4(c) of
+        the License and to reproduce the content of the NOTICE file.
+
+     You may obtain a copy of the Apache License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the Apache License with the above modification is
+     distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+     KIND, either express or implied. See the Apache License for the specific
+     language governing permissions and limitations under the Apache License.
+
+
+OSD Tessellation shader Interface
+---------------------------------
+
+.. contents::
+   :local:
+   :backlinks: none
+
+Basic
+=====
+
+From 3.0, **Osd** tessellation shaders can be used as a set of functions from
+client shader code. In order to tessellate **Osd** patches, client shader
+code should perform the following steps (regular B-spline patch case):
+
+* In a tessellation control shader
+    1. fetch a PatchParam for the current patch.
+    2. call OsdComputePerPatchVertexBSpline() to compute OsdPerPatchVertexBezier.
+    3. compute tessellation levels. To prevent cracks on transition patches,
+       two vec4 parameters (tessOuterHi, tessOuterLo) are needed in addition to the built-in gl_TessLevelInner/gl_TessLevelOuter.
+
+* In a tessellation evaluation shader
+    1. call OsdGetTessParameterization() to remap gl_TessCoord to the patch parameter to be evaluated.
+    2. call OsdEvalPatchBezier()/OsdEvalPatchGregory() to evaluate the current patch.
+
+The following is a minimal example of GLSL code showing how client shader code
+uses OpenSubdiv shader functions to tessellate the patches of a patch table.
+
+
+Tessellation Control Shader example (for BSpline patches)
+*********************************************************
+
+.. code:: glsl
+
+    layout (vertices = 16) out;
+    in vec3 position[];
+    patch out vec4 tessOuterLo, tessOuterHi;
+    out OsdPerPatchVertexBezier v[];
+
+    void main()
+    {
+        // Get a patch param from texture buffer.
+        ivec3 patchParam = OsdGetPatchParam(gl_PrimitiveID);
+
+        // Compute per-patch vertices.
+        OsdComputePerPatchVertexBSpline(patchParam, gl_InvocationID, position, v[gl_InvocationID]);
+
+        // Compute tessellation factors.
+        if (gl_InvocationID == 0) {
+            vec4 tessLevelOuter = vec4(0);
+            vec2 tessLevelInner = vec2(0);
+            OsdGetTessLevelsUniform(patchParam,
+                                    tessLevelOuter, tessLevelInner,
+                                    tessOuterLo, tessOuterHi);
+
+            gl_TessLevelOuter[0] = tessLevelOuter[0];
+            gl_TessLevelOuter[1] = tessLevelOuter[1];
+            gl_TessLevelOuter[2] = tessLevelOuter[2];
+            gl_TessLevelOuter[3] = tessLevelOuter[3];
+
+            gl_TessLevelInner[0] = tessLevelInner[0];
+            gl_TessLevelInner[1] = tessLevelInner[1];
+        }
+    }
+
+
+
+Tessellation Evaluation Shader example (for BSpline patches)
+************************************************************
+
+.. code:: glsl
+
+    layout(quads) in;
+    patch in vec4 tessOuterLo, tessOuterHi;
+    in OsdPerPatchVertexBezier v[];
+    uniform mat4 mvpMatrix;
+
+    void main()
+    {
+        // Compute tesscoord.
+        vec2 UV = OsdGetTessParameterization(gl_TessCoord.xy, tessOuterLo, tessOuterHi);
+
+        vec3 P = vec3(0), dPu = vec3(0), dPv = vec3(0);
+        vec3 N = vec3(0), dNu = vec3(0), dNv = vec3(0);
+        ivec3 patchParam = v[0].patchParam;
+
+        // Evaluate patch at the tess coord UV
+        OsdEvalPatchBezier(patchParam, UV, v, P, dPu, dPv, N, dNu, dNv);
+
+        // Apply model-view-projection matrix.
+        gl_Position = mvpMatrix * vec4(P, 1);
+    }
+
+Basis Conversion
+================
+
+B-spline patch
+**************
+
+The following diagram shows how the **Osd** shader processes B-spline patches.
+
+.. image:: images/osd_shader_bspline.png
+
+While regular patches are expressed as B-spline patches in Far::PatchTable,
+the **Osd** shader converts them into Bezier basis patches for simplicity and efficiency.
+This conversion is performed in the tessellation control stage. The boundary edge evaluation
+and single crease matrix evaluation are also resolved during this conversion.
+OsdComputePerPatchVertexBSpline() can be used for this process.
+The resulting Bezier control vertices are stored in the OsdPerPatchVertexBezier struct.
+
+.. code:: glsl
+
+  void  OsdComputePerPatchVertexBSpline(
+      ivec3 patchParam, int ID, vec3 cv[16], out OsdPerPatchVertexBezier result);
+
+The tessellation evaluation shader takes an array of OsdPerPatchVertexBezier structs,
+and then evaluates the patch using the OsdEvalPatchBezier() function.
+
+.. code:: glsl
+
+  void OsdEvalPatchBezier(ivec3 patchParam, vec2 UV,
+                          OsdPerPatchVertexBezier cv[16],
+                          out vec3 P, out vec3 dPu, out vec3 dPv,
+                          out vec3 N, out vec3 dNu, out vec3 dNv)
+
+
+Gregory Basis patch
+*******************
+
+In a similar way, Gregory basis patches are processed as follows:
+
+.. image:: images/osd_shader_gregory.png
+
+OsdComputePerPatchVertexGregoryBasis() can be used for Gregory patches
+(although no basis conversion is involved for Gregory patches) and the resulting vertices
+are stored in the OsdPerPatchVertexGregoryBasis struct.
+
+.. code:: glsl
+
+  void OsdComputePerPatchVertexGregoryBasis(
+      ivec3 patchParam, int ID, vec3 cv, out OsdPerPatchVertexGregoryBasis result)
+
+The tessellation evaluation shader takes an array of OsdPerPatchVertexGregoryBasis structs,
+and then evaluates the patch using the OsdEvalPatchGregory() function.
+
+.. code:: glsl
+
+  void
+  OsdEvalPatchGregory(ivec3 patchParam, vec2 UV, vec3 cv[20],
+                      out vec3 P, out vec3 dPu, out vec3 dPv,
+                      out vec3 N, out vec3 dNu, out vec3 dNv)
+
+
+Legacy Gregory patch (2.x compatibility)
+****************************************
+
+OpenSubdiv 3.0 also supports 2.x style Gregory patch evaluation (see far_overview).
+In order to evaluate a legacy Gregory patch, the client needs to bind extra buffers and
+perform extra steps in the vertex shader as shown in the following diagram:
+
+.. image:: images/osd_shader_legacy_gregory.png
+
+
+
+Tessellation levels
+===================
+
+**Osd** provides both uniform and screen-space adaptive tessellation level computation.
+
+Uniform tessellation
+  OsdGetTessLevelsUniform()
+
+Screen-space adaptive tessellation
+  OsdGetTessLevelsAdaptiveLimitPoints()
+
+Because of the nature of `feature adaptive subdivision <far_overview.html>`__,
+we need to pay extra attention to a patch's outer tessellation levels in the screen-space
+adaptive case so that cracks won't appear.
+
+An edge of the patch marked as a transition edge is split into two segments (Hi and Lo).
+
+.. image:: images/osd_shader_patch.png
+
+The **Osd** shader uses these two segments to ensure the same tessellation along the
+edge between different levels of subdivision. In the following example, suppose the left-hand side
+patch has determined the tessellation level of its right edge to be 5. gl_TessLevelOuter is set to
+5 for the edge, and at the same time we also pass 2 and 3 to the tessellation evaluation shader
+as separate levels for the two segments of the edge split at the middle.
+
+.. image:: images/osd_shader_transition.png
+
+Then the tessellation evaluation shader takes gl_TessCoord and those two values, and remaps
+gl_TessCoord using OsdGetTessParameterization() to ensure the parameters are consistent
+across adjacent patches.
+
+.. image:: images/osd_shader_param_remap.png
+
+.. code:: glsl
+
+  vec2 OsdGetTessParameterization(vec2 uv, vec4 tessOuterLo, vec4 tessOuterHi)
+
+These tessellation levels can be computed by OsdGetTessLevelsAdaptiveLimitPoints()
+in the tessellation control shader. Note that this function requires all 16 Bezier control
+points, so you need to call barrier() to ensure the conversion is done for all invocations.
+See osd/glslPatchBSpline.glsl for more detail.
+
+.. code:: glsl
+
+  void OsdGetTessLevelsAdaptiveLimitPoints(OsdPerPatchVertexBezier cpBezier[16],
+                                           ivec3 patchParam,
+                                           out vec4 tessLevelOuter, out vec2 tessLevelInner,
+                                           out vec4 tessOuterLo, out vec4 tessOuterHi)
+
+.. container:: impnotip
+
+ **Release Notes (3.0.0)**
+
+ * Currently OsdGetTessParameterization doesn't support fractional spacing.
+   This will be fixed in a future release.
+
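The transition-edge remapping described in the new osd_shader_interface.rst text above can be illustrated on the CPU. The following is a minimal C++ sketch, not the shader source: `remapTransitionEdge` is a hypothetical helper written for this note. It assumes the edge parameter is first snapped to the nearest of the lo + hi uniformly spaced tessellated vertices, after which the first lo segments are spread over [0, 0.5] and the remaining hi segments over [0.5, 1], which is one reading of "two segments of the edge split at the middle".

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of OpenSubdiv): remap a parameter t in [0, 1]
// along an edge tessellated into (lo + hi) uniform segments so that the first
// `lo` segments cover [0, 0.5] and the remaining `hi` segments cover [0.5, 1].
double remapTransitionEdge(double t, double lo, double hi) {
    assert(lo > 0.0 && hi > 0.0);
    // Index of the tessellated vertex nearest to t along the edge.
    double ti = std::round(t * (lo + hi));
    if (ti <= lo) {
        return 0.5 * (ti / lo);
    }
    return 0.5 + 0.5 * ((ti - lo) / hi);
}
```

With lo = 2 and hi = 3 (the level-5 edge from the example in the text), the tessellated vertex at t = 0.4 is moved to 0.5, the midpoint where the two finer neighbouring patches would meet under this assumption.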
--- a/documentation/tutorials.rst
+++ b/documentation/tutorials.rst
@@ -114,8 +114,8 @@ or in your local ``<repository root>/turorials``.
   :widths: 50 50

   * - | **Tutorial 0**
-       | This tutorial demonstrates the manipulation of Osd 'Compute' 'Contexts' and
-         'Controllers'.  `[code] <osd_tutorial_0.html>`__
+       | This tutorial demonstrates the manipulation of Osd Evaluator and BufferDescriptor.
+         `[code] <osd_tutorial_0.html>`__
       |
     - |

--- a/tutorials/osd/tutorial_0/osd_tutorial_0.cpp
+++ b/tutorials/osd/tutorial_0/osd_tutorial_0.cpp
@@ -26,8 +26,8 @@
 //------------------------------------------------------------------------------
 // Tutorial description:
 //
-// This tutorial demonstrates the manipulation of Osd 'Compute' 'Contexts' and
-// 'Controllers'.
+// This tutorial demonstrates the manipulation of Osd Evaluator and
+// BufferDescriptor.
 //

 #include <opensubdiv/far/topologyDescriptor.h>
@@ -76,7 +76,7 @@ int main(int, char **) {
    // Setup phase
    //
    Far::StencilTable const * stencilTable = NULL;
-    { // Setup Context
+    { // Setup Far::StencilTable
        Far::TopologyRefiner const * refiner = createTopologyRefiner(maxlevel);

        // Setup a factory to create FarStencilTable (for more details see