In 2020, a sensational new view synthesis technique called NeRF was announced. This method is based on volume rendering and represents the distribution of radiance in space (the radiance field) using a neural network, demonstrating new possibilities for deep learning. Since its announcement, numerous derivative methods have been proposed in the field of computer vision, attracting considerable attention. This paper focuses on explaining the original paper by Mildenhall et al.
This short article discusses the differences between photogrammetry and NeRF. First, the theoretical difference between the two methods is described. We then introduce studies that evaluate the accuracy of NeRF in comparison with SfM/MVS methods. Finally, we show some intriguing aspects of NeRF through an actual analysis of a real object.
NeRF (Neural Radiance Fields) has been attracting attention as a recent advancement in 3D scene reconstruction technology. This method uses a fully connected deep network to generate images from arbitrary viewpoints, using photographs captured from multiple perspectives. Specifically, it takes as input the 3D position (x, y, z) and viewing direction (θ, φ) of points sampled along camera rays and estimates the density and color at each point in the 3D space. This enables high-quality reconstruction of complex scenes, a task with which traditional techniques have struggled. NeRF shows promise for applications in diverse fields such as surveying, autonomous driving, robotics, medical imaging, entertainment, and more. This article introduces specific applications of NeRF and explains in detail how it is used in various fields.
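To make the input-output relationship described above concrete, the following is a minimal sketch of a NeRF-style network, assuming PyTorch; the layer widths, positional-encoding settings, and class names are illustrative and do not reproduce the original implementation.

```python
# Minimal sketch of a NeRF-style MLP (PyTorch assumed; sizes are illustrative).
# It maps a 3D position (x, y, z) and a viewing direction (3-vector) to a
# volume density sigma and an RGB color, as described in the abstract above.
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs):
    # Encode each coordinate with sines and cosines at increasing frequencies.
    out = [x]
    for i in range(num_freqs):
        out.append(torch.sin((2.0 ** i) * x))
        out.append(torch.cos((2.0 ** i) * x))
    return torch.cat(out, dim=-1)

class TinyNeRF(nn.Module):
    def __init__(self, pos_freqs=10, dir_freqs=4, width=256):
        super().__init__()
        pos_dim = 3 * (1 + 2 * pos_freqs)
        dir_dim = 3 * (1 + 2 * dir_freqs)
        self.pos_freqs, self.dir_freqs = pos_freqs, dir_freqs
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(width, 1)          # density depends on position only
        self.color_head = nn.Sequential(               # color also depends on view direction
            nn.Linear(width + dir_dim, width // 2), nn.ReLU(),
            nn.Linear(width // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        h = self.trunk(positional_encoding(xyz, self.pos_freqs))
        sigma = torch.relu(self.sigma_head(h))
        rgb = self.color_head(
            torch.cat([h, positional_encoding(view_dir, self.dir_freqs)], dim=-1))
        return sigma, rgb

# Usage: query density and color for a batch of sample points along camera rays.
model = TinyNeRF()
sigma, rgb = model(torch.rand(1024, 3), torch.rand(1024, 3))
```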
Neural Radiance Fields (NeRF) is a groundbreaking technique at the intersection of classical computer graphics and deep learning. It facilitates the generation of 3D objects from 2D images, employing an interpolation approach to produce novel reconstructed views of intricate scenes. In contrast to traditional methods that directly reconstruct the entire 3D scene geometry, NeRF uses a volumetric representation known as a “radiance field,” which yields a color and density for every point within the relevant 3D space. As NeRF is a relatively recent technique, ongoing efforts are focused on exploring and refining its capabilities and limitations. This paper reviews the deficiencies of the original NeRF and introduces methods aimed at addressing these shortcomings.
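For reference, the volume-rendering relation that turns such a radiance field (per-point color c and density σ) into a pixel color is sketched below in the notation commonly used for NeRF; the continuous integral and its discrete approximation follow the standard formulation.

```latex
% Expected color of a camera ray r(t) = o + t d between the near/far bounds t_n, t_f:
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt,
\qquad T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)

% Discrete approximation over samples i with spacing \delta_i, used in practice:
\hat{C}(\mathbf{r}) = \sum_{i} T_i \left(1 - e^{-\sigma_i \delta_i}\right)\mathbf{c}_i,
\qquad T_i = \exp\!\left(-\sum_{j<i} \sigma_j \delta_j\right)
```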
Neural Radiance Fields (NeRF) has revolutionized novel-view synthesis of scenes captured through multiple photographs. Despite its remarkable visual quality, NeRF requires intensive computational resources for training and rendering because of the ray-casting operations intertwined with neural network modules. In addition, defining the 3D model as an implicit function, as NeRF does, poses challenges for editability. To overcome these limitations, 3D Gaussian Splatting (3DGS) has been proposed, enabling real-time rendering at Full-HD resolution using an explicit 3D representation and efficient rasterization. This paper reviews the framework of 3DGS in comparison with NeRF and presents its results and applications.
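As a brief reference for the explicit representation mentioned above, the core quantities behind 3DGS rendering are sketched below in the notation commonly used in the 3DGS literature; view-dependent spherical-harmonic colors and the tile-based rasterizer are omitted for brevity.

```latex
% Each primitive is a 3D Gaussian with mean \mu and covariance \Sigma,
% plus an opacity and a color. Evaluated at a point x:
G(\mathbf{x}) = \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf T}\,\Sigma^{-1}\,(\mathbf{x}-\boldsymbol{\mu})\right)

% After projecting the Gaussians onto the image plane and sorting them by depth,
% a pixel color is obtained by front-to-back alpha blending over the overlapping
% Gaussians \mathcal{N}, with per-Gaussian color c_i and effective opacity \alpha_i:
C = \sum_{i \in \mathcal{N}} \mathbf{c}_i\,\alpha_i \prod_{j<i} (1 - \alpha_j)
```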
In Japan, new ways of providing information to enhance the enjoyment of sports spectating were considered for the Tokyo Olympics. In particular, in the popular marathon events, RFID tags are attached to athletes and receivers are installed at specific points to measure their passing times; this information is used in marathon broadcasts to provide viewers with lap times and predicted finishing times, in addition to commentary on the race development. However, because it is difficult to measure performance information such as the speed, pitch, and stride of each athlete, this information has not yet been provided. This research therefore aims to develop a technology that detects and tracks athletes in video images using deep learning and estimates their speed, pitch, and stride.
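As an illustration of how such metrics could be derived once athletes have been detected and tracked, the following is a hypothetical sketch; it assumes ground-plane positions per frame and footstrike frames are already available from the tracker, and the function name, data layout, and the stride = speed / pitch relation are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch: speed, pitch, and stride from a tracked athlete, assuming the
# detector/tracker already yields ground-plane positions per video frame and that
# footstrike (step) frames have been identified. Data layout is illustrative.
import numpy as np

def running_metrics(positions_m, step_frames, fps):
    """positions_m: (N, 2) ground-plane positions in meters, one per frame.
    step_frames: indices of frames at which a footstrike occurs.
    fps: video frame rate. Returns mean speed [m/s], pitch [steps/s], stride [m]."""
    # Speed: total path length divided by elapsed time.
    distance = np.sum(np.linalg.norm(np.diff(positions_m, axis=0), axis=1))
    duration = (len(positions_m) - 1) / fps
    speed = distance / duration

    # Pitch: number of steps per second over the interval spanned by the footstrikes.
    step_frames = np.asarray(step_frames)
    steps = len(step_frames) - 1
    pitch = steps / ((step_frames[-1] - step_frames[0]) / fps)

    # Stride: distance covered per step, approximated as speed divided by pitch.
    stride = speed / pitch
    return speed, pitch, stride

# Example with synthetic data: 10 s at 30 fps, about 5 m/s, one footstrike every 10 frames.
fps = 30
t = np.arange(0, 300)
positions = np.stack([5.0 * t / fps, np.zeros_like(t, dtype=float)], axis=1)
print(running_metrics(positions, list(range(0, 300, 10)), fps))
```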
The Japan Aerospace Exploration Agency (JAXA) launched the Advanced Land Observing Satellite-4 (ALOS-4) on July 1st, 2024. The first light images were taken by the Phased Array type L-band Synthetic Aperture Radar-3 (PALSAR-3) onboard the ALOS-4 from July 15th to 17th, 2024. PALSAR-3, featuring Digital Beam Forming (DBF) technology, increases the observation swath to 200 km while maintaining high resolution. This enables both wide-area disaster monitoring and short-term basic observation. The first light image of the Kanto region in Japan, successfully acquired with a 200 km observation swath at 3 m resolution, clearly demonstrates the advancement over the ALOS-2. The Amazon rainforest in Brazil was also observed, enabling the monitoring of deforestation by comparing the data with observations from the ALOS and the ALOS-2. This long-term time series of L-band SAR data is expected to contribute to global environmental monitoring. The ALOS-4 also supports various observation modes, such as a Spotlight mode for detailed monitoring and a ScanSAR mode for large-scale monitoring.