【计算机视觉】Gaussian Splatting源码解读补充(三)

2024-03-23 18:36

本文主要是介绍【计算机视觉】Gaussian Splatting源码解读补充(三),希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

  • 第一部分
  • 第二部分

本文是对学习笔记之——3D Gaussian Splatting源码解读的补充,并订正了一些错误。

目录

  • 五、反向传播
    • 1. `rasterizer_impl.cu`: `CudaRasterizer::Rasterizer::backward`
    • 2. `backward.cu`: `renderCUDA`
    • 3. `backward.cu`: `preprocess`

五、反向传播

原文第6节:

During the backward pass, we must therefore recover the full sequence of blended points per-pixel in the forward pass. One solution would be to store arbitrarily long lists of blended points per-pixel in global memory [Kopanas et al. 2021]. To avoid the implied dynamic memory management overhead, we instead choose to traverse the per-tile lists again; we can reuse the sorted array of Gaussians and tile ranges from the forward pass. To facilitate gradient computation, we now traverse them back-to-front.

The traversal starts from the last point that affected any pixel in the tile, and loading of points into shared memory again happens collaboratively. Additionally, each pixel will only start (expensive) overlap testing and processing of points if their depth is lower than or equal to the depth of the last point that contributed to its color during the forward pass. Computation of the gradients described in Sec. 4 requires the accumulated opacity values at each step during the original blending process. Rather than trasversing an explicit list of progressively shrinking opacities in the backward pass, we can recover these intermediate opacities by storing only the total accumulated opacity at the end of the forward pass. Specifically, each point stores the final accumulated opacity 𝛼 in the forward process; we divide this by each point’s 𝛼 in our back-to-front traversal to obtain the required coefficients for gradient computation.

反向传播是根据loss对每个像素颜色的导求出loss对Gaussian的2维中心坐标、3维中心坐标、椭圆二次型矩阵、不透明度、相机看到的Gaussian颜色、球谐系数、三维协方差矩阵、缩放和旋转的导数。

代码中出现的类似于dL_dx的变量的意思是loss对参数x的导数,即 ∂ L ∂ x \frac{\partial L}{\partial x} xL

1. rasterizer_impl.cu: CudaRasterizer::Rasterizer::backward

// Produce necessary gradients for optimization, corresponding
// to forward render pass
void CudaRasterizer::Rasterizer::backward(const int P, // Gaussian个数int D, // 球谐度数int M, // 三通道球谐系数个数int R,// R对应CudaRasterizer::Rasterizer::forward函数中的num_rendered// 即排序数组的个数(等于每个Gaussian覆盖的tile的个数之和)const float* background, // 背景颜色const int width, int height,const float* means3D,const float* shs,const float* colors_precomp,const float* scales,const float scale_modifier,const float* rotations,const float* cov3D_precomp,const float* viewmatrix,const float* projmatrix,const float* campos,const float tan_fovx, float tan_fovy,const int* radii,char* geom_buffer,char* binning_buffer,char* img_buffer,// 上面的参数不解释const float* dL_dpix, // loss对每个像素颜色的导数float* dL_dmean2D, // loss对Gaussian二维中心坐标的导数float* dL_dconic, // loss对椭圆二次型矩阵的导数float* dL_dopacity, // loss对不透明度的导数float* dL_dcolor, // loss对Gaussian颜色的导数(颜色是从相机中心看向Gaussian的颜色)float* dL_dmean3D, // loss对Gaussian三维中心坐标的导数float* dL_dcov3D, // loss对Gaussian三维协方差矩阵的导数float* dL_dsh, // loss对Gaussian的球谐系数的导数float* dL_dscale, // loss对Gaussian的缩放参数的导数float* dL_drot, // loss对Gaussian旋转四元数的导数bool debug)
{GeometryState geomState = GeometryState::fromChunk(geom_buffer, P);BinningState binningState = BinningState::fromChunk(binning_buffer, R);ImageState imgState = ImageState::fromChunk(img_buffer, width * height);// 上面这些缓冲区都是在前向传播的时候存下来的,现在拿出来用if (radii == nullptr){radii = geomState.internal_radii;}const float focal_y = height / (2.0f * tan_fovy);const float focal_x = width / (2.0f * tan_fovx);const dim3 tile_grid((width + BLOCK_X - 1) / BLOCK_X, (height + BLOCK_Y - 1) / BLOCK_Y, 1);const dim3 block(BLOCK_X, BLOCK_Y, 1);// Compute loss gradients w.r.t. 2D mean position, conic matrix,// opacity and RGB of Gaussians from per-pixel loss gradients.// If we were given precomputed colors and not SHs, use them.const float* color_ptr = (colors_precomp != nullptr) ? colors_precomp : geomState.rgb;CHECK_CUDA(BACKWARD::render(tile_grid,block,imgState.ranges,binningState.point_list,width, height,background,geomState.means2D,geomState.conic_opacity,color_ptr,imgState.accum_alpha,imgState.n_contrib,dL_dpix,(float3*)dL_dmean2D,(float4*)dL_dconic,dL_dopacity,dL_dcolor), debug)// Take care of the rest of preprocessing. Was the precomputed covariance// given to us or a scales/rot pair? If precomputed, pass that. If not,// use the one we computed ourselves.const float* cov3D_ptr = (cov3D_precomp != nullptr) ? cov3D_precomp : geomState.cov3D;// 因为是反向传播,所以preprocess放在后面了(❁´◡`❁)CHECK_CUDA(BACKWARD::preprocess(P, D, M,(float3*)means3D,radii,shs,geomState.clamped,(glm::vec3*)scales,(glm::vec4*)rotations,scale_modifier,cov3D_ptr,viewmatrix,projmatrix,focal_x, focal_y,tan_fovx, tan_fovy,(glm::vec3*)campos,(float3*)dL_dmean2D,dL_dconic,(glm::vec3*)dL_dmean3D,dL_dcolor,dL_dcov3D,dL_dsh,(glm::vec3*)dL_dscale,(glm::vec4*)dL_drot), debug)
}

2. backward.cu: renderCUDA

这个函数中每个线程负责一个像素,计算loss对Gaussian的二维中心坐标、椭圆二次型矩阵、不透明度和Gaussian颜色的导。

这里求导利用的公式是颜色的递推公式,与前向传播有所不同。设 b i \boldsymbol{b}_i bi是第 i i i个即其后的Gaussians渲染出来的颜色(三个通道), g i \boldsymbol{g}_i gi是第 i i i个Gaussian的颜色,则有递推公式: b i = α i g i + ( 1 − α i ) b i + 1 \boldsymbol{b}_i=\alpha_i \boldsymbol{g}_i+(1-\alpha_i)\boldsymbol{b}_{i+1} bi=αigi+(1αi)bi+1 T i = ( 1 − α 1 ) ( 1 − α 2 ) ⋯ ( 1 − α i − 1 ) T_i=(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_{i-1}) Ti=(1α1)(1α2)(1αi1)为第 i i i个Gaussian对像素点的透光率, C \boldsymbol{C} C是像素点的颜色,则有 ∂ L ∂ b i = ∂ L ∂ C ⋅ ∂ b 1 ∂ b 2 ⋅ ∂ b 2 ∂ b 3 ⋯ ∂ b i − 1 ∂ b i = ∂ L ∂ C ⋅ ( 1 − α 1 ) I ⋅ ( 1 − α 2 ) I ⋯ ( 1 − α i − 1 ) I = T i ∂ L ∂ C \begin{aligned} \cfrac{\partial L}{\partial\boldsymbol{b}_i}&=\cfrac{\partial L}{\partial\boldsymbol{C}}\cdot\cfrac{\partial \boldsymbol{b}_1}{\partial\boldsymbol{b}_2}\cdot\cfrac{\partial \boldsymbol{b}_2}{\partial\boldsymbol{b}_3}\cdots\cfrac{\partial \boldsymbol{b}_{i-1}}{\partial\boldsymbol{b}_i}\\ &=\cfrac{\partial L}{\partial\boldsymbol{C}}\cdot(1-\alpha_1)I\cdot(1-\alpha_2)I\cdots(1-\alpha_{i-1})I\\ &=T_i\cfrac{\partial L}{\partial\boldsymbol{C}}\\ \end{aligned} biL=CLb2b1b3b2bibi1=CL(1α1)I(1α2)I(1αi1)I=TiCL ∂ L ∂ α i = T i ∂ L ∂ C ( g i − b i + 1 ) ∂ L ∂ g i = T i α i ∂ L ∂ C \cfrac{\partial L}{\partial \alpha_i}=T_i\cfrac{\partial L}{\partial\boldsymbol{C}}(\boldsymbol{g}_i-\boldsymbol{b}_{i+1})\\ \cfrac{\partial L}{\partial \boldsymbol{g}_i}=T_i\alpha_i\cfrac{\partial L}{\partial\boldsymbol{C}} αiL=TiCL(gibi+1)giL=TiαiCL α = min ⁡ ( 0.99 , σ G ) \alpha=\min(0.99,\sigma G) α=min(0.99,σG),其中 σ \sigma σ是“opacity”, G G G是正态分布给出的“exponential falloff”,则 ∂ L ∂ σ = G ∂ L ∂ α ∂ L ∂ G = σ ∂ L ∂ α \frac{\partial L}{\partial \sigma}=G\frac{\partial L}{\partial \alpha}\\\ \\ \frac{\partial L}{\partial G}=\sigma\frac{\partial L}{\partial \alpha} σL=GαL GL=σαL(注:代码中似乎不考虑 α > 0.99 \alpha>0.99 α>0.99的情况;也许是因为想让 σ \sigma σ过大时也有梯度从而变小?)
G = exp ⁡ { − 1 2 d T A d } G=\exp\{-\frac{1}{2}\boldsymbol{d}^T A\boldsymbol{d}\} G=exp{21dTAd} d = p − μ \boldsymbol{d}=\boldsymbol{p}-\boldsymbol\mu d=pμ,其中 p \boldsymbol p p为像素坐标, μ \boldsymbol\mu μ为Gausian中心的2D坐标, d \boldsymbol d d为它们之间的位移向量, A A A为椭圆二次型的矩阵,因此 ∂ L ∂ d = − 1 2 G ∂ L ∂ G [ d T ( A T + A ) ] ∂ L ∂ μ = − ∂ L ∂ d ∂ L ∂ A 00 = − 1 2 G ∂ L ∂ G d x 2 ∂ L ∂ A 11 = − 1 2 G ∂ L ∂ G d y 2 ∂ L ∂ A 01 = − 1 2 G ∂ L ∂ G d x d y \frac{\partial L}{\partial \boldsymbol d}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}[\boldsymbol d^T(A^T+A)]\\ \cfrac{\partial L}{\partial \boldsymbol \mu}=-\cfrac{\partial L}{\partial \boldsymbol d}\\ \frac{\partial L}{\partial \boldsymbol A_{00}}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}d_x^2\\ \frac{\partial L}{\partial \boldsymbol A_{11}}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}d_y^2\\ \frac{\partial L}{\partial \boldsymbol A_{01}}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}d_x d_y dL=21GGL[dT(AT+A)]μL=dLA00L=21GGLdx2A11L=21GGLdy2A01L=21GGLdxdy注意计算 ∂ L ∂ μ \frac{\partial L}{\partial \boldsymbol \mu} μL时要乘以像素坐标对像平面坐标的导。

公式中出现的变量和代码中变量的关系如下表:

公式变量代码变量
b i + 1 \boldsymbol{b}_{i+1} bi+1accum_rec
g i \boldsymbol{g}_i gic
T i T_i TiT
∂ L ∂ C \cfrac{\partial L}{\partial\boldsymbol{C}} CLdL_dpixel
σ i \sigma_i σicon_o.w
d \boldsymbol{d} dd
μ \boldsymbol\mu μxy
p \boldsymbol p ppixf
A 00 , A 01 , A 11 A_{00},A_{01},A_{11} A00,A01,A11con_o.x,con_o.y,con_o.z
// Backward version of the rendering procedure.
template <uint32_t C>
__global__ void __launch_bounds__(BLOCK_X * BLOCK_Y)
renderCUDA(const uint2* __restrict__ ranges,const uint32_t* __restrict__ point_list,int W, int H,const float* __restrict__ bg_color,const float2* __restrict__ points_xy_image,const float4* __restrict__ conic_opacity,const float* __restrict__ colors,const float* __restrict__ final_Ts,const uint32_t* __restrict__ n_contrib,const float* __restrict__ dL_dpixels,float3* __restrict__ dL_dmean2D,float4* __restrict__ dL_dconic2D,float* __restrict__ dL_dopacity,float* __restrict__ dL_dcolors)
{// We rasterize again. Compute necessary block info.auto block = cg::this_thread_block();const uint32_t horizontal_blocks = (W + BLOCK_X - 1) / BLOCK_X;const uint2 pix_min = { block.group_index().x * BLOCK_X, block.group_index().y * BLOCK_Y };const uint2 pix_max = { min(pix_min.x + BLOCK_X, W), min(pix_min.y + BLOCK_Y , H) };const uint2 pix = { pix_min.x + block.thread_index().x, pix_min.y + block.thread_index().y };const uint32_t pix_id = W * pix.y + pix.x;const float2 pixf = { (float)pix.x, (float)pix.y };// 上面这些就是当前线程负责的像素点的信息const bool inside = pix.x < W&& pix.y < H;const uint2 range = ranges[block.group_index().y * horizontal_blocks + block.group_index().x];const int rounds = ((range.y - range.x + BLOCK_SIZE - 1) / BLOCK_SIZE);// 轮数,和前向传播一样bool done = !inside;int toDo = range.y - range.x;__shared__ int collected_id[BLOCK_SIZE];__shared__ float2 collected_xy[BLOCK_SIZE];__shared__ float4 collected_conic_opacity[BLOCK_SIZE];__shared__ float collected_colors[C * BLOCK_SIZE];// In the forward, we stored the final value for T, the// product of all (1 - alpha) factors. const float T_final = inside ? final_Ts[pix_id] : 0;float T = T_final; // 最后的透光度// We start from the back. The ID of the last contributing// Gaussian is known from each pixel from the forward.uint32_t contributor = toDo;const int last_contributor = inside ? n_contrib[pix_id] : 0;float accum_rec[C] = { 0 };float dL_dpixel[C]; // loss对该像素RGB通道的导数if (inside)for (int i = 0; i < C; i++)dL_dpixel[i] = dL_dpixels[i * H * W + pix_id];float last_alpha = 0;float last_color[C] = { 0 };// Gradient of pixel coordinate w.r.t. normalized // screen-space viewport corrdinates (-1 to 1)// 像素坐标对像平面坐标的导(参见ndc2Pix函数)const float ddelx_dx = 0.5 * W;const float ddely_dy = 0.5 * H;// Traverse all Gaussiansfor (int i = 0; i < rounds; i++, toDo -= BLOCK_SIZE){// Load auxiliary data into shared memory, start in the BACK// and load them in revers order.block.sync();const int progress = i * BLOCK_SIZE + block.thread_rank();if (range.x + progress < range.y){const int coll_id = point_list[range.y - progress - 1];collected_id[block.thread_rank()] = coll_id;collected_xy[block.thread_rank()] = points_xy_image[coll_id];collected_conic_opacity[block.thread_rank()] = conic_opacity[coll_id];for (int i = 0; i < C; i++)collected_colors[i * BLOCK_SIZE + block.thread_rank()] = colors[coll_id * C + i];}block.sync();// Iterate over Gaussiansfor (int j = 0; !done && j < min(BLOCK_SIZE, toDo); j++){// Keep track of current Gaussian ID. Skip, if this one// is behind the last contributor for this pixel.contributor--;if (contributor >= last_contributor)continue;// 之所以不直接把contributor设置成last_contributor,// 是因为排在我的last_contributor后面的Gaussian可能// 对于其他像素来说是可见的// Compute blending values, as before.const float2 xy = collected_xy[j];const float2 d = { xy.x - pixf.x, xy.y - pixf.y };const float4 con_o = collected_conic_opacity[j];const float power = -0.5f * (con_o.x * d.x * d.x + con_o.z * d.y * d.y) - con_o.y * d.x * d.y;if (power > 0.0f)continue;const float G = exp(power);const float alpha = min(0.99f, con_o.w * G); // 先把alpha算出来if (alpha < 1.0f / 255.0f)continue;T = T / (1.f - alpha); // 一点一点把T除掉(T原本是累乘)const float dchannel_dcolor = alpha * T;// Propagate gradients to per-Gaussian colors and keep// gradients w.r.t. alpha (blending factor for a Gaussian/pixel// pair).float dL_dalpha = 0.0f;const int global_id = collected_id[j];for (int ch = 0; ch < C; ch++){const float c = collected_colors[ch * BLOCK_SIZE + j];// Gaussian的color// Update last color (to be used in the next iteration)accum_rec[ch] = last_alpha * last_color[ch] + (1.f - last_alpha) * accum_rec[ch];// 更新我以及后面所有Gaussian的颜色贡献last_color[ch] = c; // 我后面的Gaussian的颜色const float dL_dchannel = dL_dpixel[ch];dL_dalpha += (c - accum_rec[ch]) * dL_dchannel;// Update the gradients w.r.t. color of the Gaussian. // Atomic, since this pixel is just one of potentially// many that were affected by this Gaussian.atomicAdd(&(dL_dcolors[global_id * C + ch]), dchannel_dcolor * dL_dchannel);}dL_dalpha *= T;// Update last alpha (to be used in the next iteration)last_alpha = alpha;// Account for fact that alpha also influences how much of// the background color is added if nothing left to blendfloat bg_dot_dpixel = 0;for (int i = 0; i < C; i++)bg_dot_dpixel += bg_color[i] * dL_dpixel[i];dL_dalpha += (-T_final / (1.f - alpha)) * bg_dot_dpixel;// Helpful reusable temporary variablesconst float dL_dG = con_o.w * dL_dalpha;const float gdx = G * d.x;const float gdy = G * d.y;const float dG_ddelx = -gdx * con_o.x - gdy * con_o.y;const float dG_ddely = -gdy * con_o.z - gdx * con_o.y;// 下面累积这些参数的梯度// Update gradients w.r.t. 2D mean position of the GaussianatomicAdd(&dL_dmean2D[global_id].x, dL_dG * dG_ddelx * ddelx_dx);atomicAdd(&dL_dmean2D[global_id].y, dL_dG * dG_ddely * ddely_dy);// Update gradients w.r.t. 2D covariance (2x2 matrix, symmetric)atomicAdd(&dL_dconic2D[global_id].x, -0.5f * gdx * d.x * dL_dG);atomicAdd(&dL_dconic2D[global_id].y, -0.5f * gdx * d.y * dL_dG);atomicAdd(&dL_dconic2D[global_id].w, -0.5f * gdy * d.y * dL_dG);// Update gradients w.r.t. opacity of the GaussianatomicAdd(&(dL_dopacity[global_id]), G * dL_dalpha);}}
}

3. backward.cu: preprocess

这个函数主要干了四件事:

  1. 从Gaussian中心在像平面上的二维坐标的梯度计算其三维坐标的梯度;
  2. 从Gaussian颜色的梯度计算球谐系数的梯度(利用computeColorFromSH函数);
  3. 从Gaussian在像平面上投影的2D椭圆二次型矩阵的梯度计算其3D协方差矩阵梯度;
  4. 从协方差矩阵梯度计算缩放和旋转参数的梯度。

该函数及其调用的函数的代码我就不贴了,都是一些纷繁复杂的数学运算。


至此,我们大体上读完了Gaussian Splatting的代码。

完结撒花!!🎆​🎆​

这篇关于【计算机视觉】Gaussian Splatting源码解读补充(三)的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/839182

相关文章

SpringBoot中配置文件的加载顺序解读

《SpringBoot中配置文件的加载顺序解读》:本文主要介绍SpringBoot中配置文件的加载顺序,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录SpringBoot配置文件的加载顺序1、命令⾏参数2、Java系统属性3、操作系统环境变量5、项目【外部】的ap

Mysql用户授权(GRANT)语法及示例解读

《Mysql用户授权(GRANT)语法及示例解读》:本文主要介绍Mysql用户授权(GRANT)语法及示例,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录mysql用户授权(GRANT)语法授予用户权限语法GRANT语句中的<权限类型>的使用WITH GRANT

Java 正则表达式URL 匹配与源码全解析

《Java正则表达式URL匹配与源码全解析》在Web应用开发中,我们经常需要对URL进行格式验证,今天我们结合Java的Pattern和Matcher类,深入理解正则表达式在实际应用中... 目录1.正则表达式分解:2. 添加域名匹配 (2)3. 添加路径和查询参数匹配 (3) 4. 最终优化版本5.设计思

python3 gunicorn配置文件的用法解读

《python3gunicorn配置文件的用法解读》:本文主要介绍python3gunicorn配置文件的使用,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录python3 gunicorn配置文件配置文件服务启动、重启、关闭启动重启关闭总结python3 gun

关于pandas的read_csv方法使用解读

《关于pandas的read_csv方法使用解读》:本文主要介绍关于pandas的read_csv方法使用,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录pandas的read_csv方法解读read_csv中的参数基本参数通用解析参数空值处理相关参数时间处理相关

Java调用C++动态库超详细步骤讲解(附源码)

《Java调用C++动态库超详细步骤讲解(附源码)》C语言因其高效和接近硬件的特性,时常会被用在性能要求较高或者需要直接操作硬件的场合,:本文主要介绍Java调用C++动态库的相关资料,文中通过代... 目录一、直接调用C++库第一步:动态库生成(vs2017+qt5.12.10)第二步:Java调用C++

java之Objects.nonNull用法代码解读

《java之Objects.nonNull用法代码解读》:本文主要介绍java之Objects.nonNull用法代码,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐... 目录Java之Objects.nonwww.chinasem.cnNull用法代码Objects.nonN

Python实现无痛修改第三方库源码的方法详解

《Python实现无痛修改第三方库源码的方法详解》很多时候,我们下载的第三方库是不会有需求不满足的情况,但也有极少的情况,第三方库没有兼顾到需求,本文将介绍几个修改源码的操作,大家可以根据需求进行选择... 目录需求不符合模拟示例 1. 修改源文件2. 继承修改3. 猴子补丁4. 追踪局部变量需求不符合很

SpringCloud负载均衡spring-cloud-starter-loadbalancer解读

《SpringCloud负载均衡spring-cloud-starter-loadbalancer解读》:本文主要介绍SpringCloud负载均衡spring-cloud-starter-loa... 目录简述主要特点使用负载均衡算法1. 轮询负载均衡策略(Round Robin)2. 随机负载均衡策略(

解读spring.factories文件配置详情

《解读spring.factories文件配置详情》:本文主要介绍解读spring.factories文件配置详情,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录使用场景作用内部原理机制SPI机制Spring Factories 实现原理用法及配置spring.f