A good paper feeds millions of researchers, who can plagiarize its simple idea to publish papers of their own.
CT Scan Reconstruction MRI Reconstruction Course Deep Learning for MRI MRI Basics Part 1 - Image Formation
It uses an analytical method that relies on signal processing.
NeRF Paper Reading: Youtube
NeRF's objective is to reconstruct 3D geometry from an array of images. However, the product of a NeRF is not an explicit geometry, but a neural network that represents both the rendering function and the geometry.
Radiance Fields: a field over \mathbb{R}^3 in which each point holds a spherical function instead of a single value. This essentially represents a 3D volume with 2D view-dependent appearance, i.e. geometry data viewed from different angles. At each point it is a function from view direction to color.
Input: ((\theta, \phi), (x, y, z)). Output: ((r, g, b), \alpha). Loss: the difference between \int_D f(\cdot)_\alpha \cdot f(\cdot)_{(r, g, b)} and the ground truth (r, g, b), where D is the line along the ray.
Procedural:
// QUESTION: I am not sure exactly how to integrate \alpha.
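One common way to answer the question above is the quadrature used for volume rendering: treat the network's second output as a volume density \sigma (the note calls it \alpha), turn it into a per-sample opacity, and accumulate front to back. The sketch below assumes exactly that; the names `RaySample` and `compositeRay` are made up for illustration and are not from any NeRF codebase.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Rgb { double r, g, b; };

// One sample along a ray (illustrative layout): density, emitted color,
// and the distance delta to the next sample.
struct RaySample { double sigma; Rgb color; double delta; };

// Quadrature of the volume rendering integral:
//   C = sum_i T_i * alpha_i * c_i
//   alpha_i = 1 - exp(-sigma_i * delta_i)
//   T_i     = prod_{j < i} (1 - alpha_j)
inline Rgb compositeRay(const std::vector<RaySample>& samples) {
    Rgb out{0.0, 0.0, 0.0};
    double transmittance = 1.0;  // T_i: light that survived all earlier samples
    for (const RaySample& s : samples) {
        assert(s.sigma >= 0.0 && s.delta >= 0.0);
        const double alpha = 1.0 - std::exp(-s.sigma * s.delta);
        const double weight = transmittance * alpha;
        out.r += weight * s.color.r;
        out.g += weight * s.color.g;
        out.b += weight * s.color.b;
        transmittance *= 1.0 - alpha;  // attenuate for the samples behind
    }
    return out;
}
```

The loss is then the squared difference between this composited color and the ground-truth pixel. An empty sample (sigma = 0) contributes nothing, and a fully opaque sample hides everything behind it.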
Sample locations along a ray are not all chosen uniformly. There are two passes: the first uses uniformly spaced sample locations, and the second concentrates samples near the surface of the object.
However, if you do only that, the result will be poor, because networks have a hard time fitting high-frequency detail. Below is an example of a network trained with input (x, y) and output (r, g, b). The result is not great.
So the idea is to encode the input into several frequency bands which, in a sense, amplifies the loss at higher frequencies. The same strategy (positional encoding) is used in transformers.
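The second pass can be sketched as inverse-transform sampling: the weights from the first pass define a piecewise-constant distribution along the ray, and new depths are drawn from it. This is a minimal sketch under that assumption; `sampleFromCoarseWeights` and its parameters are hypothetical names, and `u` stands for a uniform random number in [0, 1).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// edges:   n + 1 increasing depths that bound the n coarse bins
// weights: n non-negative weights from the first (uniform) pass
// u:       uniform random number in [0, 1)
// Returns a depth distributed according to the piecewise-constant PDF.
inline double sampleFromCoarseWeights(const std::vector<double>& edges,
                                      const std::vector<double>& weights,
                                      double u) {
    assert(!weights.empty() && edges.size() == weights.size() + 1);
    const double total = std::accumulate(weights.begin(), weights.end(), 0.0);
    assert(total > 0.0);
    double target = u * total;  // position inside the unnormalized CDF
    for (std::size_t i = 0; i < weights.size(); ++i) {
        const bool last = (i + 1 == weights.size());
        if (target < weights[i] || last) {
            // Fraction of the way through bin i, clamped against rounding.
            const double frac =
                weights[i] > 0.0 ? std::min(1.0, target / weights[i]) : 0.0;
            return edges[i] + frac * (edges[i + 1] - edges[i]);
        }
        target -= weights[i];
    }
    return edges.back();  // not reached: the last bin always returns
}
```

Bins with zero weight are never selected, so samples end up where the first pass found density, i.e. near the surface.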
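The frequency split can be written down directly. The sketch below assumes the sin/cos encoding \gamma(p) = (\sin(2^0 \pi p), \cos(2^0 \pi p), ..., \sin(2^{L-1} \pi p), \cos(2^{L-1} \pi p)) applied to one scalar coordinate; `positionalEncoding` and `numBands` are illustrative names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Encodes one coordinate p into 2 * numBands features whose frequencies
// double from band to band: pi, 2*pi, 4*pi, ...
inline std::vector<double> positionalEncoding(double p, int numBands) {
    assert(numBands >= 0);
    const double pi = std::acos(-1.0);
    std::vector<double> features;
    features.reserve(2 * static_cast<std::size_t>(numBands));
    double frequency = pi;
    for (int band = 0; band < numBands; ++band) {
        features.push_back(std::sin(frequency * p));
        features.push_back(std::cos(frequency * p));
        frequency *= 2.0;
    }
    return features;
}
```

Each of x, y, z (and the view direction) would be encoded this way and concatenated before entering the MLP, so a small change in position produces a large change in the high-frequency features.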
Further Readings: Fourier Features Let Networks Learn High Frequency
NTK (neural tangent kernel): describes an infinitely wide fully connected network, initialized with reasonable weights and trained with infinitesimally small steps. It is a good mathematical tool for gaining insight into fully connected networks.
Big Idea: handle small, real-life images that suffer from aliasing and poor resolution. It also improves on NeRF by not requiring the cameras to be centered on the object.
pixel: becomes an area on the image
ray: becomes a cone
sample point: becomes a weighted sampling region (a conical frustum)
// TODO: what is blurpool
It uses a characteristic (indicator) function to determine whether a point is inside the cone, and then calculates the expectation of the Fourier-transformed (positionally encoded) coordinates under a 3D Gaussian. // TODO: think about the math if you have time
Idea: the MLP works because its layers serve as a prior that assumes smoothness of color (and therefore shape) on the geometry surface.
In the original NeRF design, color c varies more smoothly with respect to direction d than with respect to position x. Therefore, feeding d into later layers helps the accuracy of the model.
For 360 degree captures of unbounded scenes, NeRF’s parameterization of space either models only a portion of the scene, leading to significant artifacts in background elements (a), or models the full scene and suffers from an overall loss of detail due to finite sampling resolution (b).
So NeRF is split into two MLPs: one for the foreground and one for the background.
For the background, the location encoding is reparameterized so that a point outside the unit sphere becomes a 4D coordinate (x', y', z', 1/r), where (x', y', z') is the unit direction of the point and the 4th coordinate 1/r decreases with distance. The idea is that the originally sparse encoding of the background becomes denser, so more images can contribute to the color of far backgrounds, which resolves background ambiguity.
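The 4D background coordinate is simple to compute. This is a minimal sketch, assuming the scene has been normalized so that the foreground sits inside the unit sphere; `invertedSphere` is a made-up name.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Maps a point outside the unit sphere to (x/r, y/r, z/r, 1/r).
// The first three components lie on the unit sphere, and the fourth
// shrinks toward 0 as the point moves toward infinity.
inline std::array<double, 4> invertedSphere(double x, double y, double z) {
    const double r = std::sqrt(x * x + y * y + z * z);
    assert(r >= 1.0);  // only background points use this parameterization
    return {x / r, y / r, z / r, 1.0 / r};
}
```

All four components are bounded, so an unbounded background is squeezed into a finite range that a fixed number of samples can cover.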
Parameterization: kinda Mip-NeRF + another way to do NeRF++
Convergence: split into two networks trained with one gradient. The first predicts density only; the second predicts density and color, and uses the first network's density to reduce training cost.
// QUESTION: I don't quite understand how this would work
Distillation: // TODO: didn't read
The above methods are topology-free and can be rendered in real time, but they are memory intensive (they can't capture fine detail).
NeRFs can be sampled at arbitrary resolution since the function is continuous. However, they are slow to train and test. Methods to accelerate NeRF include:
Spherical harmonics: used to speed up the conversion of a NeRF to a PlenOctree. This works because the view-dependent calculation is moved to evaluation time instead of PlenOctree-conversion time. (The model also looks cleaner.)
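To make "view-dependent calculation at evaluation time" concrete: each leaf stores spherical harmonic coefficients, and the color for a given view direction is a dot product with the basis followed by a sigmoid. The sketch below assumes real spherical harmonics up to degree 1 only, in the order (Y_0^0, Y_1^{-1}, Y_1^0, Y_1^1) without the Condon-Shortley sign; the ordering, the sign convention, and the function names are assumptions, not the paper's code.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Real SH basis up to degree 1 at a unit direction (dx, dy, dz).
inline std::array<double, 4> shBasisDegree1(double dx, double dy, double dz) {
    assert(std::fabs(dx * dx + dy * dy + dz * dz - 1.0) < 1e-6);
    const double pi = std::acos(-1.0);
    const double y0 = 0.5 * std::sqrt(1.0 / pi);     // constant band
    const double y1 = std::sqrt(3.0 / (4.0 * pi));   // linear band
    return {y0, y1 * dy, y1 * dz, y1 * dx};
}

// One color channel for a view direction: sigmoid(sum_i k_i * Y_i(d)).
inline double shColorChannel(const std::array<double, 4>& k,
                             double dx, double dy, double dz) {
    const std::array<double, 4> basis = shBasisDegree1(dx, dy, dz);
    double sum = 0.0;
    for (std::size_t i = 0; i < basis.size(); ++i) sum += k[i] * basis[i];
    return 1.0 / (1.0 + std::exp(-sum));
}
```

Only the coefficients need to be stored per leaf; the direction enters at render time, which is why no network evaluation is needed while rendering.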
Comments:
The article uses a spherical harmonic basis to decompose view-direction-dependent color into spherical functions centered at points in 3D space. Although most objects can be encoded well with spherical harmonics, 3D scenes with pores, or with the camera inside the geometry, can hardly be viewed as spherical functions, although that is not an issue here. Encoding the geometry itself as a spherical function is therefore not possible in general.
I am thinking of throwing away the network entirely and parameterizing color by spherical harmonics, and density by spherical harmonics with continuous functional coefficients (a function that takes in the radius and spits out the actual coefficient).
Also, the PlenOctree isn't spherical, and it is not continuous. It is not a good representation because it isn't rotationally invariant, meaning the object might look bad when rotated to a specific angle. Because it is not continuous, the voxels will look Minecraft-like if the tree isn't deep enough. On the other hand, PNG compression is more popular than "Fourier compression" in practice.
What is the benefit of representing geometry in a neural network first and then in an octree? Why not directly in an octree? Tuning an octree directly is not possible because the tree structure is fixed. You can instead generate the tree structure first and then tune the leaf values. This is the same as the paper's approach, in which a neural network is used to obtain a coarse tree structure.
The paper splits tree leaves by occupancy, but we could do that for colors too, in fact for each channel, to further compress the model. We would then have a total of 4 octrees with different shapes, one for each of the (r, g, b, alpha) values. This might compress better, though it might be more costly to evaluate.
The Octree extraction is too slow: 15 minutes
Normalized device coordinates: ?
multi-sphere images: ?
Trilinear interpolation is crucial: it converts the discrete representation into a continuous one, which lowers the reconstruction loss.
// QUESTION: I don't understand how optimizing the voxel coefficients and the regularization formula work; I haven't read into it.
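For reference, trilinear interpolation inside one voxel is three nested linear interpolations. This is a generic sketch, not the paper's implementation; the corner ordering (index = ix + 2*iy + 4*iz) is an assumption chosen for the example.

```cpp
#include <array>
#include <cassert>

// corner[ix + 2*iy + 4*iz] holds the value at voxel corner (ix, iy, iz),
// with ix, iy, iz in {0, 1}. (tx, ty, tz) is the position inside the voxel,
// each in [0, 1].
inline double trilinear(const std::array<double, 8>& corner,
                        double tx, double ty, double tz) {
    assert(tx >= 0.0 && tx <= 1.0);
    assert(ty >= 0.0 && ty <= 1.0);
    assert(tz >= 0.0 && tz <= 1.0);
    auto lerp = [](double a, double b, double t) { return a + t * (b - a); };
    // Interpolate along x on the four edges parallel to the x axis.
    const double c00 = lerp(corner[0], corner[1], tx);
    const double c10 = lerp(corner[2], corner[3], tx);
    const double c01 = lerp(corner[4], corner[5], tx);
    const double c11 = lerp(corner[6], corner[7], tx);
    // Then along y, then along z.
    const double c0 = lerp(c00, c10, ty);
    const double c1 = lerp(c01, c11, ty);
    return lerp(c0, c1, tz);
}
```

The result is continuous across voxel boundaries, and it is differentiable with respect to the corner values, which is what makes the stored coefficients optimizable by gradient descent.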
Comments:
Procedural:
As in NeRF; however, for each point (x, y, z) we calculate its value by interpolating from its nearby vertex values (coarse to fine, over multiple levels), which are randomly distributed in the hash table.
Predict the RGBD target from the interpolated latent space.
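"Randomly distributed in the hash" can be made concrete with a spatial hash of the integer vertex coordinates. The sketch below assumes the XOR-of-primes form with multipliers 1, 2654435761 and 805459861; `spatialHash` and `tableSize` are illustrative names, and a real implementation would run this once per resolution level.

```cpp
#include <cassert>
#include <cstdint>

// Hashes an integer grid vertex (x, y, z) into a table with tableSize slots.
// Each coordinate is multiplied by a large constant (1 for x), the products
// are XORed together, and the result is reduced modulo the table size.
// Unsigned overflow wraps around, which is intended here.
inline std::uint32_t spatialHash(std::uint32_t x, std::uint32_t y,
                                 std::uint32_t z, std::uint32_t tableSize) {
    assert(tableSize > 0);
    const std::uint32_t h =
        (x * 1u) ^ (y * 2654435761u) ^ (z * 805459861u);
    return h % tableSize;
}
```

The 8 vertices surrounding a query point are hashed this way, their feature vectors are fetched from the table, and the features are trilinearly interpolated before being passed to a small MLP. Collisions are not resolved explicitly in this sketch.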
Morton code is used to map n-dimensional space to linear space. It defines a Z-shaped space-filling curve, which preserves n-dimensional locality.
Assume we have stored a 3D map, where each cell is an int, in a Morton-coded array, and we want to read the value at (x, y, z) = (5, 9, 1) = (0101b, 1001b, 0001b). There is an easy way to find the array position: we need to look up index 010 001 000 111b. This is because we interleave the bits in the order z, y, x, going from the most significant bit of every dimension to the least significant.
Note that to invert a Morton code back to a 3D coordinate, we only need to pass code >> 0 for x, code >> 1 for y, and code >> 2 for z to the same decoding function.
With a for loop, the encoder can be written like this:
#include <stdint.h>
#include <limits.h>
// Interleave the low 21 bits of x, y and z one bit at a time.
inline uint64_t mortonEncode_for(unsigned int x, unsigned int y, unsigned int z) {
    uint64_t answer = 0;
    // 64 / 3 = 21 bits per coordinate fit into a 64-bit code
    for (uint64_t i = 0; i < (sizeof(uint64_t) * CHAR_BIT) / 3; ++i) {
        // bit i of x moves to bit 3i, of y to bit 3i + 1, of z to bit 3i + 2
        answer |= ((x & ((uint64_t)1 << i)) << 2 * i)
                | ((y & ((uint64_t)1 << i)) << (2 * i + 1))
                | ((z & ((uint64_t)1 << i)) << (2 * i + 2));
    }
    return answer;
}
To achieve better performance, we could use magic bits:
#include <stdint.h>
#include <limits.h>
// Spread the low 21 bits of an integer so that they sit 3 positions apart
inline uint64_t splitBy3(unsigned int a) {
    uint64_t x = a & 0x1fffff;              // we only look at the first 21 bits
    x = (x | x << 32) & 0x1f00000000ffff;   // shift left 32 bits, OR with self, then mask
    x = (x | x << 16) & 0x1f0000ff0000ff;   // shift left 16 bits, OR with self, then mask
    x = (x | x << 8) & 0x100f00f00f00f00f;  // shift left 8 bits, OR with self, then mask
    x = (x | x << 4) & 0x10c30c30c30c30c3;  // shift left 4 bits, OR with self, then mask
    x = (x | x << 2) & 0x1249249249249249;  // shift left 2 bits, OR with self, then mask
    return x;
}
inline uint64_t mortonEncode_magicbits(unsigned int x, unsigned int y, unsigned int z) {
    uint64_t answer = 0;
    // x lands on bits 0, 3, 6, ...; y on bits 1, 4, 7, ...; z on bits 2, 5, 8, ...
    answer |= splitBy3(x) | splitBy3(y) << 1 | splitBy3(z) << 2;
    return answer;
}
Or use a giant lookup table to achieve the best performance:
#include <stdint.h>
#include <limits.h>
static const uint32_t morton256_x[256] = {
0x00000000,
0x00000001, 0x00000008, 0x00000009, 0x00000040, 0x00000041, 0x00000048, 0x00000049, 0x00000200,
0x00000201, 0x00000208, 0x00000209, 0x00000240, 0x00000241, 0x00000248, 0x00000249, 0x00001000,
0x00001001, 0x00001008, 0x00001009, 0x00001040, 0x00001041, 0x00001048, 0x00001049, 0x00001200,
0x00001201, 0x00001208, 0x00001209, 0x00001240, 0x00001241, 0x00001248, 0x00001249, 0x00008000,
0x00008001, 0x00008008, 0x00008009, 0x00008040, 0x00008041, 0x00008048, 0x00008049, 0x00008200,
0x00008201, 0x00008208, 0x00008209, 0x00008240, 0x00008241, 0x00008248, 0x00008249, 0x00009000,
0x00009001, 0x00009008, 0x00009009, 0x00009040, 0x00009041, 0x00009048, 0x00009049, 0x00009200,
0x00009201, 0x00009208, 0x00009209, 0x00009240, 0x00009241, 0x00009248, 0x00009249, 0x00040000,
0x00040001, 0x00040008, 0x00040009, 0x00040040, 0x00040041, 0x00040048, 0x00040049, 0x00040200,
0x00040201, 0x00040208, 0x00040209, 0x00040240, 0x00040241, 0x00040248, 0x00040249, 0x00041000,
0x00041001, 0x00041008, 0x00041009, 0x00041040, 0x00041041, 0x00041048, 0x00041049, 0x00041200,
0x00041201, 0x00041208, 0x00041209, 0x00041240, 0x00041241, 0x00041248, 0x00041249, 0x00048000,
0x00048001, 0x00048008, 0x00048009, 0x00048040, 0x00048041, 0x00048048, 0x00048049, 0x00048200,
0x00048201, 0x00048208, 0x00048209, 0x00048240, 0x00048241, 0x00048248, 0x00048249, 0x00049000,
0x00049001, 0x00049008, 0x00049009, 0x00049040, 0x00049041, 0x00049048, 0x00049049, 0x00049200,
0x00049201, 0x00049208, 0x00049209, 0x00049240, 0x00049241, 0x00049248, 0x00049249, 0x00200000,
0x00200001, 0x00200008, 0x00200009, 0x00200040, 0x00200041, 0x00200048, 0x00200049, 0x00200200,
0x00200201, 0x00200208, 0x00200209, 0x00200240, 0x00200241, 0x00200248, 0x00200249, 0x00201000,
0x00201001, 0x00201008, 0x00201009, 0x00201040, 0x00201041, 0x00201048, 0x00201049, 0x00201200,
0x00201201, 0x00201208, 0x00201209, 0x00201240, 0x00201241, 0x00201248, 0x00201249, 0x00208000,
0x00208001, 0x00208008, 0x00208009, 0x00208040, 0x00208041, 0x00208048, 0x00208049, 0x00208200,
0x00208201, 0x00208208, 0x00208209, 0x00208240, 0x00208241, 0x00208248, 0x00208249, 0x00209000,
0x00209001, 0x00209008, 0x00209009, 0x00209040, 0x00209041, 0x00209048, 0x00209049, 0x00209200,
0x00209201, 0x00209208, 0x00209209, 0x00209240, 0x00209241, 0x00209248, 0x00209249, 0x00240000,
0x00240001, 0x00240008, 0x00240009, 0x00240040, 0x00240041, 0x00240048, 0x00240049, 0x00240200,
0x00240201, 0x00240208, 0x00240209, 0x00240240, 0x00240241, 0x00240248, 0x00240249, 0x00241000,
0x00241001, 0x00241008, 0x00241009, 0x00241040, 0x00241041, 0x00241048, 0x00241049, 0x00241200,
0x00241201, 0x00241208, 0x00241209, 0x00241240, 0x00241241, 0x00241248, 0x00241249, 0x00248000,
0x00248001, 0x00248008, 0x00248009, 0x00248040, 0x00248041, 0x00248048, 0x00248049, 0x00248200,
0x00248201, 0x00248208, 0x00248209, 0x00248240, 0x00248241, 0x00248248, 0x00248249, 0x00249000,
0x00249001, 0x00249008, 0x00249009, 0x00249040, 0x00249041, 0x00249048, 0x00249049, 0x00249200,
0x00249201, 0x00249208, 0x00249209, 0x00249240, 0x00249241, 0x00249248, 0x00249249
};
// pre-shifted table for Y coordinates (1 bit to the left)
static const uint32_t morton256_y[256] = {
0x00000000,
0x00000002, 0x00000010, 0x00000012, 0x00000080, 0x00000082, 0x00000090, 0x00000092, 0x00000400,
0x00000402, 0x00000410, 0x00000412, 0x00000480, 0x00000482, 0x00000490, 0x00000492, 0x00002000,
0x00002002, 0x00002010, 0x00002012, 0x00002080, 0x00002082, 0x00002090, 0x00002092, 0x00002400,
0x00002402, 0x00002410, 0x00002412, 0x00002480, 0x00002482, 0x00002490, 0x00002492, 0x00010000,
0x00010002, 0x00010010, 0x00010012, 0x00010080, 0x00010082, 0x00010090, 0x00010092, 0x00010400,
0x00010402, 0x00010410, 0x00010412, 0x00010480, 0x00010482, 0x00010490, 0x00010492, 0x00012000,
0x00012002, 0x00012010, 0x00012012, 0x00012080, 0x00012082, 0x00012090, 0x00012092, 0x00012400,
0x00012402, 0x00012410, 0x00012412, 0x00012480, 0x00012482, 0x00012490, 0x00012492, 0x00080000,
0x00080002, 0x00080010, 0x00080012, 0x00080080, 0x00080082, 0x00080090, 0x00080092, 0x00080400,
0x00080402, 0x00080410, 0x00080412, 0x00080480, 0x00080482, 0x00080490, 0x00080492, 0x00082000,
0x00082002, 0x00082010, 0x00082012, 0x00082080, 0x00082082, 0x00082090, 0x00082092, 0x00082400,
0x00082402, 0x00082410, 0x00082412, 0x00082480, 0x00082482, 0x00082490, 0x00082492, 0x00090000,
0x00090002, 0x00090010, 0x00090012, 0x00090080, 0x00090082, 0x00090090, 0x00090092, 0x00090400,
0x00090402, 0x00090410, 0x00090412, 0x00090480, 0x00090482, 0x00090490, 0x00090492, 0x00092000,
0x00092002, 0x00092010, 0x00092012, 0x00092080, 0x00092082, 0x00092090, 0x00092092, 0x00092400,
0x00092402, 0x00092410, 0x00092412, 0x00092480, 0x00092482, 0x00092490, 0x00092492, 0x00400000,
0x00400002, 0x00400010, 0x00400012, 0x00400080, 0x00400082, 0x00400090, 0x00400092, 0x00400400,
0x00400402, 0x00400410, 0x00400412, 0x00400480, 0x00400482, 0x00400490, 0x00400492, 0x00402000,
0x00402002, 0x00402010, 0x00402012, 0x00402080, 0x00402082, 0x00402090, 0x00402092, 0x00402400,
0x00402402, 0x00402410, 0x00402412, 0x00402480, 0x00402482, 0x00402490, 0x00402492, 0x00410000,
0x00410002, 0x00410010, 0x00410012, 0x00410080, 0x00410082, 0x00410090, 0x00410092, 0x00410400,
0x00410402, 0x00410410, 0x00410412, 0x00410480, 0x00410482, 0x00410490, 0x00410492, 0x00412000,
0x00412002, 0x00412010, 0x00412012, 0x00412080, 0x00412082, 0x00412090, 0x00412092, 0x00412400,
0x00412402, 0x00412410, 0x00412412, 0x00412480, 0x00412482, 0x00412490, 0x00412492, 0x00480000,
0x00480002, 0x00480010, 0x00480012, 0x00480080, 0x00480082, 0x00480090, 0x00480092, 0x00480400,
0x00480402, 0x00480410, 0x00480412, 0x00480480, 0x00480482, 0x00480490, 0x00480492, 0x00482000,
0x00482002, 0x00482010, 0x00482012, 0x00482080, 0x00482082, 0x00482090, 0x00482092, 0x00482400,
0x00482402, 0x00482410, 0x00482412, 0x00482480, 0x00482482, 0x00482490, 0x00482492, 0x00490000,
0x00490002, 0x00490010, 0x00490012, 0x00490080, 0x00490082, 0x00490090, 0x00490092, 0x00490400,
0x00490402, 0x00490410, 0x00490412, 0x00490480, 0x00490482, 0x00490490, 0x00490492, 0x00492000,
0x00492002, 0x00492010, 0x00492012, 0x00492080, 0x00492082, 0x00492090, 0x00492092, 0x00492400,
0x00492402, 0x00492410, 0x00492412, 0x00492480, 0x00492482, 0x00492490, 0x00492492
};
// Pre-shifted table for z (2 bits to the left)
static const uint32_t morton256_z[256] = {
0x00000000,
0x00000004, 0x00000020, 0x00000024, 0x00000100, 0x00000104, 0x00000120, 0x00000124, 0x00000800,
0x00000804, 0x00000820, 0x00000824, 0x00000900, 0x00000904, 0x00000920, 0x00000924, 0x00004000,
0x00004004, 0x00004020, 0x00004024, 0x00004100, 0x00004104, 0x00004120, 0x00004124, 0x00004800,
0x00004804, 0x00004820, 0x00004824, 0x00004900, 0x00004904, 0x00004920, 0x00004924, 0x00020000,
0x00020004, 0x00020020, 0x00020024, 0x00020100, 0x00020104, 0x00020120, 0x00020124, 0x00020800,
0x00020804, 0x00020820, 0x00020824, 0x00020900, 0x00020904, 0x00020920, 0x00020924, 0x00024000,
0x00024004, 0x00024020, 0x00024024, 0x00024100, 0x00024104, 0x00024120, 0x00024124, 0x00024800,
0x00024804, 0x00024820, 0x00024824, 0x00024900, 0x00024904, 0x00024920, 0x00024924, 0x00100000,
0x00100004, 0x00100020, 0x00100024, 0x00100100, 0x00100104, 0x00100120, 0x00100124, 0x00100800,
0x00100804, 0x00100820, 0x00100824, 0x00100900, 0x00100904, 0x00100920, 0x00100924, 0x00104000,
0x00104004, 0x00104020, 0x00104024, 0x00104100, 0x00104104, 0x00104120, 0x00104124, 0x00104800,
0x00104804, 0x00104820, 0x00104824, 0x00104900, 0x00104904, 0x00104920, 0x00104924, 0x00120000,
0x00120004, 0x00120020, 0x00120024, 0x00120100, 0x00120104, 0x00120120, 0x00120124, 0x00120800,
0x00120804, 0x00120820, 0x00120824, 0x00120900, 0x00120904, 0x00120920, 0x00120924, 0x00124000,
0x00124004, 0x00124020, 0x00124024, 0x00124100, 0x00124104, 0x00124120, 0x00124124, 0x00124800,
0x00124804, 0x00124820, 0x00124824, 0x00124900, 0x00124904, 0x00124920, 0x00124924, 0x00800000,
0x00800004, 0x00800020, 0x00800024, 0x00800100, 0x00800104, 0x00800120, 0x00800124, 0x00800800,
0x00800804, 0x00800820, 0x00800824, 0x00800900, 0x00800904, 0x00800920, 0x00800924, 0x00804000,
0x00804004, 0x00804020, 0x00804024, 0x00804100, 0x00804104, 0x00804120, 0x00804124, 0x00804800,
0x00804804, 0x00804820, 0x00804824, 0x00804900, 0x00804904, 0x00804920, 0x00804924, 0x00820000,
0x00820004, 0x00820020, 0x00820024, 0x00820100, 0x00820104, 0x00820120, 0x00820124, 0x00820800,
0x00820804, 0x00820820, 0x00820824, 0x00820900, 0x00820904, 0x00820920, 0x00820924, 0x00824000,
0x00824004, 0x00824020, 0x00824024, 0x00824100, 0x00824104, 0x00824120, 0x00824124, 0x00824800,
0x00824804, 0x00824820, 0x00824824, 0x00824900, 0x00824904, 0x00824920, 0x00824924, 0x00900000,
0x00900004, 0x00900020, 0x00900024, 0x00900100, 0x00900104, 0x00900120, 0x00900124, 0x00900800,
0x00900804, 0x00900820, 0x00900824, 0x00900900, 0x00900904, 0x00900920, 0x00900924, 0x00904000,
0x00904004, 0x00904020, 0x00904024, 0x00904100, 0x00904104, 0x00904120, 0x00904124, 0x00904800,
0x00904804, 0x00904820, 0x00904824, 0x00904900, 0x00904904, 0x00904920, 0x00904924, 0x00920000,
0x00920004, 0x00920020, 0x00920024, 0x00920100, 0x00920104, 0x00920120, 0x00920124, 0x00920800,
0x00920804, 0x00920820, 0x00920824, 0x00920900, 0x00920904, 0x00920920, 0x00920924, 0x00924000,
0x00924004, 0x00924020, 0x00924024, 0x00924100, 0x00924104, 0x00924120, 0x00924124, 0x00924800,
0x00924804, 0x00924820, 0x00924824, 0x00924900, 0x00924904, 0x00924920, 0x00924924
};
inline uint64_t mortonEncode_LUT(unsigned int x, unsigned int y, unsigned int z) {
    uint64_t answer = 0;
    answer = morton256_z[(z >> 16) & 0xFF] | // we start with the third byte, since we only look at the first 21 bits
             morton256_y[(y >> 16) & 0xFF] |
             morton256_x[(x >> 16) & 0xFF];
    answer = answer << 24 | // each table entry covers 24 interleaved bits, so make room for the second byte
             morton256_z[(z >> 8) & 0xFF] |
             morton256_y[(y >> 8) & 0xFF] |
             morton256_x[(x >> 8) & 0xFF];
    answer = answer << 24 | // make room for the first byte
             morton256_z[(z) & 0xFF] |
             morton256_y[(y) & 0xFF] |
             morton256_x[(x) & 0xFF];
    return answer;
}
Here is the source code for octree:
// Expands a 10-bit integer into 30 bits
// by inserting 2 zeros between successive bits.
__host__ __device__ inline uint32_t expand_bits(uint32_t v) {
    v = (v * 0x00010001u) & 0xFF0000FFu;
    v = (v * 0x00000101u) & 0x0F00F00Fu;
    v = (v * 0x00000011u) & 0xC30C30C3u;
    v = (v * 0x00000005u) & 0x49249249u;
    return v;
}
// Calculates a 30-bit Morton code from three
// 10-bit integer coordinates, each in [0, 1023].
__host__ __device__ inline uint32_t morton3D(uint32_t x, uint32_t y, uint32_t z) {
    uint32_t xx = expand_bits(x);
    uint32_t yy = expand_bits(y);
    uint32_t zz = expand_bits(z);
    return xx | (yy << 1) | (zz << 2);
}
// Inverse of expand_bits: keeps every third bit and packs them back
// into a 10-bit integer. Pass code >> 0, code >> 1, code >> 2 for x, y, z.
__host__ __device__ inline uint32_t morton3D_invert(uint32_t x) {
    x = x & 0x49249249;
    x = (x | (x >> 2)) & 0xc30c30c3;
    x = (x | (x >> 4)) & 0x0f00f00f;
    x = (x | (x >> 8)) & 0xff0000ff;
    x = (x | (x >> 16)) & 0x0000ffff;
    return x;
}
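As a sanity check of the (5, 9, 1) example, here is a host-only round trip that uses the same bit tricks without the CUDA qualifiers. The names `expandBits10`, `compactBits10`, `encode3` and `decode3` are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Spreads the 10 bits of v so that bit i moves to bit 3i.
inline std::uint32_t expandBits10(std::uint32_t v) {
    assert(v < 1024u);
    v = (v * 0x00010001u) & 0xFF0000FFu;
    v = (v * 0x00000101u) & 0x0F00F00Fu;
    v = (v * 0x00000011u) & 0xC30C30C3u;
    v = (v * 0x00000005u) & 0x49249249u;
    return v;
}

// Inverse: collects bits 0, 3, 6, ... of v into a 10-bit integer.
inline std::uint32_t compactBits10(std::uint32_t v) {
    v = v & 0x49249249u;
    v = (v | (v >> 2)) & 0xC30C30C3u;
    v = (v | (v >> 4)) & 0x0F00F00Fu;
    v = (v | (v >> 8)) & 0xFF0000FFu;
    v = (v | (v >> 16)) & 0x0000FFFFu;
    return v;
}

struct Cell { std::uint32_t x, y, z; };

// x occupies bits 0, 3, 6, ...; y bits 1, 4, 7, ...; z bits 2, 5, 8, ...
inline std::uint32_t encode3(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return expandBits10(x) | (expandBits10(y) << 1) | (expandBits10(z) << 2);
}

// Shifting by 0, 1, 2 aligns each coordinate's bits with the x pattern.
inline Cell decode3(std::uint32_t code) {
    return Cell{compactBits10(code >> 0), compactBits10(code >> 1),
                compactBits10(code >> 2)};
}
```

For (5, 9, 1) the three spread values are 65, 513 and 1, so the code is 65 | (513 << 1) | (1 << 2) = 1095 = 010 001 000 111b, which matches the index given earlier.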
For details, read Out-of-Core Construction of Sparse Voxel Octrees and Morton encoding/decoding through bit interleaving: Implementations
idx = ((i+step*n_elements) * 56924617 + j * 19349663 + 96925573) % (NERF_GRIDSIZE()*NERF_GRIDSIZE()*NERF_GRIDSIZE());
A linear congruential generator (LCG) is an algorithm that yields a sequence of pseudo-randomized numbers calculated with a discontinuous piecewise linear equation. The method represents one of the oldest and best-known pseudorandom number generator algorithms. The theory behind them is relatively easy to understand, and they are easily implemented and fast, especially on computer hardware which can provide modular arithmetic by storage-bit truncation. However, the statistical properties are bad.
Here, we don't actually care much about its statistical properties. Rather, we care about its property of producing a permutation: this use case distributes the density grid update samples more-or-less uniformly over space (due to the pseudo-random nature), but ensures good coverage by never visiting a grid cell twice without having visited all other cells (due to the permutation property).
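The permutation property is easy to check on a small grid. The sketch below keeps only the part of the index formula that depends on the counter (the `j` term is a constant offset for a fixed `j` and is dropped here), and assumes the number of cells is a power of two, as it is for a cubic grid with a power-of-two side. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Affine scramble of a counter: idx = (i * a + c) mod numCells.
// When numCells is a power of two and a is odd, this map is a bijection
// on [0, numCells), even with 32-bit wrap-around in the multiplication.
inline std::uint32_t scrambledCellIndex(std::uint32_t i, std::uint32_t numCells) {
    assert(numCells != 0 && (numCells & (numCells - 1)) == 0);
    return (i * 56924617u + 96925573u) % numCells;
}

// Returns true if counters 0 .. numCells - 1 hit every cell exactly once.
inline bool visitsEveryCellOnce(std::uint32_t numCells) {
    std::vector<bool> seen(numCells, false);
    for (std::uint32_t i = 0; i < numCells; ++i) {
        const std::uint32_t idx = scrambledCellIndex(i, numCells);
        if (seen[idx]) return false;  // a repeat would break the permutation
        seen[idx] = true;
    }
    return true;
}
```

Because every cell is hit exactly once per sweep, a partial sweep already spreads samples over the whole grid, and a full sweep guarantees complete coverage.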
Background: can general-purpose models (transformers, MLP-Mixer, Perceiver) beat domain-specific architectures?
NeRF: data distillation by overfitting, cross-view interpolation
photorealistic result
view-consistent geometry with view-dependent lighting
network optimization cost is high
limited material and optical effects
interpolation is restricting our representation
over-simplified rendering equation
PixelNeRF & IBRNet: obtain the RGBA of each point as a weighted sum of the image features at its 2D projections
MVSNeRF: predict RGBA of each point from a cost volume induced by MVSNet using 2 cameras
Side note: the Transformer ("Attention Is All You Need") replaced LSTMs and earlier RNNs.
Generalizable NeRF Transformer (GNT):
// QUESTION: yes, the
// TODO: at 20min https://mit.zoom.us/rec/play/O-E4BZQZLc4km4Xd9EFXrMleMBPVoxK73HzZwo7iEmndSZb--QJXH
Tianshu Huang 10:36 PM Specifically last 5 minutes
There are 9 additional articles in my reading list.
// QUESTION: Benchmark for Nvidia - how many images, and what is the training time?
Paper:
how many cameras, how many GPUs
Report
Hands on Experience without Code
Short Video: Youtube
COLMAP with NGP with real camera data and detailed procedural: Youtube
From Video Automatic Tool: Youtube
Questions