Transactions of the Japan Society for Computational Engineering and Science

Memory saving technique using linear list to search neighboring particles in meshfree particle methods

Hisayoshi HIRABAYASHI, Masahiro SATO

2010 Volume 2010 Pages 20100001
Published: February 10, 2010
Released on J-STAGE: February 10, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100001

JOURNAL FREE ACCESS

In particle methods, a uniform grid is often used to search for neighboring particles. However, this technique has a serious drawback: memory usage becomes very high as the simulated system grows. We propose a uniform grid combined with a linear list, which effectively decreases memory usage. In a dam-break simulation, the linear list reduced memory usage by about 90%. The method should therefore be useful for particle methods.
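
The idea of pairing a uniform grid with a list can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "linear list" means a per-cell chain stored in two flat arrays, that positions are 2D and lie inside the grid, and that the search radius does not exceed the cell size. The names `build_cell_list`, `neighbors`, `head`, and `next_particle` are hypothetical.

```python
import math

def build_cell_list(positions, cell_size, grid_dims):
    """Bin 2D particles into a uniform grid using chained lists.

    head[c] is the first particle in cell c (-1 if the cell is empty);
    next_particle[i] is the next particle in the same cell as i (-1 at the end).
    Storage is one integer per cell plus one per particle, instead of a
    fixed-capacity particle array for every cell.
    """
    nx, ny = grid_dims
    head = [-1] * (nx * ny)
    next_particle = [-1] * len(positions)
    for i, (x, y) in enumerate(positions):
        cx = min(int(x // cell_size), nx - 1)
        cy = min(int(y // cell_size), ny - 1)
        c = cy * nx + cx
        next_particle[i] = head[c]  # push particle i onto the front of the chain
        head[c] = i
    return head, next_particle

def neighbors(i, positions, cell_size, grid_dims, head, next_particle, radius):
    """Return the sorted indices of particles within `radius` of particle i."""
    nx, ny = grid_dims
    xi, yi = positions[i]
    cx = min(int(xi // cell_size), nx - 1)
    cy = min(int(yi // cell_size), ny - 1)
    found = []
    for ay in range(max(cy - 1, 0), min(cy + 2, ny)):
        for ax in range(max(cx - 1, 0), min(cx + 2, nx)):
            j = head[ay * nx + ax]
            while j != -1:  # walk the chain of this cell
                xj, yj = positions[j]
                if j != i and math.hypot(xi - xj, yi - yj) <= radius:
                    found.append(j)
                j = next_particle[j]
    return sorted(found)

pos = [(0.1, 0.1), (0.2, 0.15), (0.9, 0.9), (0.3, 0.1)]
head, nxt = build_cell_list(pos, 0.25, (4, 4))
print(neighbors(0, pos, 0.25, (4, 4), head, nxt, 0.25))  # prints [1, 3]
```

With a fixed-capacity array per cell, memory scales with the number of cells times the largest expected occupancy; with the chained layout it scales with the number of cells plus the number of particles, which is one plausible reading of where the reported saving comes from.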

Optimization of finite element analysis code for software controlled local memory

Noriyuki Kushida, Hiroshi Takemiya

2010 Volume 2010 Pages 20100002
Published: March 02, 2010
Released on J-STAGE: March 02, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100002

JOURNAL FREE ACCESS

In this study, we introduce a novel implementation of the finite element method (FEM) targeting the Cell processor, which has a software-controlled local memory. Developers of scientific and engineering simulation codes currently suffer from the large latency of data transfer between the processor and main memory, and from power consumption. To overcome these problems, several researchers have proposed new computer architectures that combine software-controlled local memory with SIMD processing units. FEM is one of the best-known numerical simulation methods, especially in engineering; however, such architectures are less effective for FEM than for other applications. The Cell, developed by Sony, Toshiba, and IBM, is well known as the accelerator of Roadrunner, the fastest computer in the Top500 supercomputer list, and is the only existing processor with software-controlled local memory. We developed a new FEM implementation that computes faster than the conventional method by reducing the frequency of main memory accesses. As a result, we achieved a 10-fold speedup over an ordinary finite element code running on the PowerPC processing unit (PPU) of the Cell.

Parallelization of Intelligent Multi-Agent based Traffic and Environment Simulator MATES

Toshihiro KOHASHI, Shintaro BUNYA, Hideki FUJII, Shinobu YOSHIMURA

2010 Volume 2010 Pages 20100003
Published: March 19, 2010
Released on J-STAGE: March 19, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100003

JOURNAL FREE ACCESS

Practical traffic simulators are now required to solve large-area problems while simulating microscopic vehicle behavior; however, few simulators can do so. The Multi-Agent based Traffic and Environment Simulator (MATES), a microscopic traffic simulator developed by the present authors, elucidates complex traffic systems, solves various problems, and reproduces the microscopic features of traffic phenomena. In this paper, we parallelize MATES to enable large-area simulations. First, we introduce a parallelization method based on domain decomposition with adaptive load balancing. Next, we evaluate its performance, including parallel speedup. In the final experiment, we simulated 1.7 million vehicles and 1.2 million intersections (an area as large as the Kanto region) in 3.1 hours on a 15-node PC cluster in which each node has a quad-core CPU.

A stress estimation method with current configuration for arbitrary curvilinear surfaces and its application to risk evaluation of aortic diseases

Eiji NUNOBIKI, Takumi WASHIO, Toshiaki HISADA

2010 Volume 2010 Pages 20100004
Published: May 21, 2010
Released on J-STAGE: May 21, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100004

JOURNAL FREE ACCESS

Stress analysis of the aortic wall can be used as a diagnostic tool to evaluate the risk of aortic diseases such as aortic aneurysm. However, it is difficult to obtain accurate material parameters in vivo, as well as the geometry of the aortic wall in the unloaded condition. In this study, we propose a simple technique for evaluating the stress distribution along the aortic wall that requires only the current shape of the inner wall and the corresponding blood pressure as input data. We investigated the validity of this approach by comparing the stress distributions it gives for torus structures and aortic models with those obtained by three-dimensional hyperelastic stress analysis, and satisfactory results were obtained.

A formulation of elastoplasticity with tensorial internal variable at finite strain

Ikumu WATANABE, Noritoshi IWATA, Kokichi NAKANISHI

2010 Volume 2010 Pages 20100005
Published: May 25, 2010
Released on J-STAGE: May 25, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100005

JOURNAL FREE ACCESS

This paper addresses the formulation of a set of constitutive equations that include a tensorial internal variable for finite strain elastoplasticity with anisotropic plastic hardening. The associative flow rules for the tensorial and scalar internal variables are derived from the principle of maximum dissipation and a standard optimization method under the stress constraint of a yield function. The implicit stress update algorithm and the consistent tangent modulus are also derived by linearizing the constitutive model for finite element analyses. As an example of anisotropic hardening plasticity, the proposed framework is applied to the elastoplastic constitutive model of a metallic material with nonlinear kinematic hardening and the subloading surface. In this constitutive model, the back stress and the subloading variable are defined as functions of the corresponding tensorial and scalar internal variables. For validation, axial cyclic stress-strain curves are simulated with the present constitutive model.

Coupling of physical processes and its performance evaluation in climate model MIROC

Takashi ARAKAWA, Hiromasa YOSHIMURA

2010 Volume 2010 Pages 20100006
Published: June 01, 2010
Released on J-STAGE: June 01, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100006

JOURNAL FREE ACCESS

A coupler is software that couples the component models of a large, complex simulation model such as a climate model. In this study, an atmospheric model and an aerosol model, which are component models of the climate model MIROC, are coupled by the model coupler Jcup to evaluate its performance in three-dimensional data exchange. Comparisons of various data exchange strategies revealed that data buffering is the main bottleneck of the coupling process. Our results suggest that direct data exchange is superior to other coupling strategies for exchanging three-dimensional data between atmospheric and aerosol models.

High-Efficiency Algorithm for DEM Simulation on GPU

Daisuke NISHIURA, Hide SAKAGUCHI

2010 Volume 2010 Pages 20100007
Published: June 01, 2010
Released on J-STAGE: June 01, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100007

JOURNAL FREE ACCESS

New algorithms are proposed for implementing the high-performance discrete element method (DEM) on a graphics processing unit (GPU), and a GPU-DEM code is developed on the basis of the proposed algorithms. This newly developed GPU-DEM code considers the contact logic of the Voigt model in the normal and tangential directions, a Coulomb-type frictional slider, and rolling friction. The following two methods are designed to prevent conflicts between GPU memory accesses: (1) generation of contact candidate pairs from neighboring particles and (2) force summation for each particle. In the first method, each particle is assigned a cell index in the domain where the particles exist. Then, a list of contact candidate pairs is prepared by pairing the particle labels that are sorted according to the cell index. In the summation process, a table is constructed to store the list of indexes of the particles in contact with contact candidate pairs. The contact forces acting on a particle for all the contacts are summed by referencing this table in order to apply the action-reaction law to the contact force. Using the two new algorithms, global memory access conflicts are avoided without additional redundant procedures. Our GPU-DEM code using the proposed algorithms runs approximately 50 times faster than the conventional DEM implemented on a CPU.
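
The first method (cell index assignment, sorting of particle labels, and pairing) can be sketched in serial Python. This is only an illustration of the general sort-based approach under stated assumptions: 2D positions inside the grid, a cell size at least as large as the contact distance, and hypothetical names (`candidate_pairs`, `cell_of`, `order`, `start`). The authors' CUDA code and its force summation table are not reproduced here.

```python
def candidate_pairs(positions, cell_size, nx, ny):
    """List contact candidate pairs (i, j), i < j, from same or adjacent cells."""
    def cell_coords(p):
        return (min(int(p[0] // cell_size), nx - 1),
                min(int(p[1] // cell_size), ny - 1))

    # 1) Assign each particle a cell index.
    cell_of = [cy * nx + cx for cx, cy in map(cell_coords, positions)]

    # 2) Sort the particle labels by cell index.
    order = sorted(range(len(positions)), key=cell_of.__getitem__)

    # 3) start[c]:start[c + 1] is the slice of `order` holding cell c's particles.
    start = [0] * (nx * ny + 1)
    for c in cell_of:
        start[c + 1] += 1
    for c in range(nx * ny):
        start[c + 1] += start[c]

    # 4) Each particle scans its own and adjacent cells. On a GPU this loop
    #    body would be one thread per particle, each writing only its own pairs.
    pairs = []
    for i in range(len(positions)):
        cx, cy = cell_coords(positions[i])
        for ay in range(max(cy - 1, 0), min(cy + 2, ny)):
            for ax in range(max(cx - 1, 0), min(cx + 2, nx)):
                c = ay * nx + ax
                for j in order[start[c]:start[c + 1]]:
                    if i < j:  # store each pair once
                        pairs.append((i, j))
    return sorted(pairs)

pos = [(0.1, 0.1), (0.3, 0.1), (0.9, 0.9), (0.2, 0.15)]
print(candidate_pairs(pos, 0.25, 4, 4))  # prints [(0, 1), (0, 3), (1, 3)]
```

Because every particle reads shared, already-sorted arrays and emits only pairs it owns, no two workers write to the same location, which is the kind of conflict-free access the abstract describes.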

GPU Acceleration Techniques for DEM Simulations of Granular Materials with Broad Particle Size Distribution

Daisuke NISHIURA, Hide SAKAGUCHI

2010 Volume 2010 Pages 20100008
Published: June 01, 2010
Released on J-STAGE: June 01, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100008

JOURNAL FREE ACCESS

New algorithms were developed to implement the high-performance discrete element method (DEM) on a graphics processing unit (GPU) for simulating particle systems with a broad particle size distribution. First, we focused on minimizing the GPU memory required to construct the table that allows reference to the interparticle contact force; this memory usage had previously increased drastically with the width of the particle size distribution. As a result, the memory usage during GPU computing remains almost constant irrespective of the width of the particle size distribution. Second, we improved our previous method for generating a list of contact candidate pairs: the particle labels are sorted according to not only the cell label but also the particle size. The search over adjacent cells was improved by using the reordered particle labels, which reduces the number of memory accesses by skipping neighboring particles that cannot come into contact. Using the developed algorithms, we investigated the effects of cell size, particle size ratio, and number of particle size components on the computational efficiency. The total computational speed of the DEM increased with the number of components in the particle size distribution; however, the computational efficiency deteriorated as the particle size ratio increased. In addition, we confirmed that the optimum cell size changes with the particle size distribution. Thus, we showed that the new algorithms improve the computational efficiency of the DEM on a GPU for particles with a broad size distribution while avoiding wasted GPU memory.

Multi-Phase-Field Simulation using a GPU

Akinori YAMANAKA, Satoi OGAWA, Takayuki AOKI, Tomohiro TAKAKI

2010 Volume 2010 Pages 20100009
Published: June 07, 2010
Released on J-STAGE: June 07, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100009

JOURNAL FREE ACCESS

The phase-field method has been actively studied as a powerful tool for predicting microstructure evolution in various materials. In particular, the multi-phase-field method has been proposed to simulate microstructure evolution during solidification and phase transformation in polycrystalline and multiphase materials. In this study, to demonstrate the improvement in calculation performance that a GPU brings to multi-phase-field simulation, a new program code is developed with CUDA. Furthermore, since the video memory capacity of present GPUs is not large, we employ the active parameter tracking method to reduce memory usage. With the developed code, we conduct two-dimensional multi-phase-field simulations of grain growth on a single GPU and evaluate the calculation performance. The results show that, by using the shared memory on the GPU as a software-managed cache, the multi-phase-field simulation runs about 5 times faster than on a single CPU.

Improved HSMAC method: An improvement based on Helmholtz-Hodge theorem

Junya IMAMURA, Takahiko TANAHASHI

2010 Volume 2010 Pages 20100010
Published: July 02, 2010
Released on J-STAGE: July 02, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100010

JOURNAL FREE ACCESS

In this paper we propose an improved HSMAC (Highly Simplified MAC) scheme. The proposed scheme is constructed on the basis of the Helmholtz-Hodge theorem, i.e. U = gradφ + curlψ {ψ : divψ = 0}, and is numerically verified on a basic, comparable problem: the velocity profiles on the symmetric section of a 3D driven cavity with proportions l_x×l_y×l_z = 1×1×6, compared with the 2D cavity result of Ghia et al. The velocity profiles obtained by the conventional HSMAC method for these proportions at Reynolds number 1000 do not perfectly agree with the 2D result, although the 2D computation does. This observation was the starting point of the study. The substantial difference between the 3D and 2D Navier-Stokes equations is whether the vorticity stretching term exists. As long as the symmetry of the whole system is kept, the stretching term influences the symmetric section only through the term ω₃∂U₃/∂x₃. The above 3D numerical result still kept its symmetry. The objective of this paper is to improve the HSMAC scheme using Helmholtz's expression; i.e. the velocity expression in the HSMAC scheme is further modified by scalar potential elements φ: curlψ = U - gradφ {φ : div(gradφ) ≠ 0}.
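
The remark about the stretching term can be made concrete with the standard vorticity transport equation. The lines below are a sketch under assumptions not stated in the abstract: nondimensional variables, x₃ taken as the long direction of the cavity, and the symmetric section taken as the mid-plane x₃ = 0.

```latex
% Vorticity transport for incompressible flow (nondimensional form):
\frac{D\boldsymbol{\omega}}{Dt}
  = (\boldsymbol{\omega}\cdot\nabla)\,\mathbf{U}
  + \frac{1}{Re}\,\nabla^{2}\boldsymbol{\omega},
\qquad \boldsymbol{\omega} = \operatorname{curl}\mathbf{U}.

% 2D flow: \boldsymbol{\omega} = (0, 0, \omega_3) and \partial/\partial x_3 = 0, so
(\boldsymbol{\omega}\cdot\nabla)\,\mathbf{U}
  = \omega_3\,\frac{\partial \mathbf{U}}{\partial x_3} = \mathbf{0}.

% 3D flow symmetric about x_3 = 0: on that plane U_3 = 0 and U_1, U_2 are even
% in x_3, hence \omega_1 = \omega_2 = 0 there, and the stretching term in the
% \omega_3 equation reduces to
\omega_1\frac{\partial U_3}{\partial x_1}
  + \omega_2\frac{\partial U_3}{\partial x_2}
  + \omega_3\frac{\partial U_3}{\partial x_3}
  = \omega_3\,\frac{\partial U_3}{\partial x_3}.
```

Under these assumptions the single surviving term matches the one named in the abstract, and it is absent in a strictly 2D computation.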

Crash Analysis of Composite Materials by Homogenization Method

Part 1; Applications for Large Displacement Elastic Problems

Gaku NAKAMURA, Kohei YUGE

2010 Volume 2010 Pages 20100011
Published: August 12, 2010
Released on J-STAGE: August 12, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100011

JOURNAL FREE ACCESS

The present paper reports an algorithm for crash analyses of composite materials by the homogenization method. In accordance with the geometry of composite materials such as honeycomb, shell and solid elements are used to discretize the micro- and macrostructures, respectively. The updated Lagrangian formulation is employed for both the micro- and macrostructures to deal with large deformations. Microstructural bifurcations are efficiently handled by branch-switching with approximated bifurcation modes. In our algorithm, homogenized material constitutive equations are used to update macro stresses as a reasonable alternative to the microstructural analyses, which reduces the inherent cost of the multiscale computations. Numerical examples are presented to discuss the cost-effectiveness of the algorithm compared with the direct method, which uses very fine finite elements. As a result, our algorithm dramatically reduced the cost in particular situations.

Nonlinear Voxel Finite Element Procedure for Compressive Strength Evaluation of Hardened Cement Paste and Its Applicability

Gakuji NAGAI, Shota IKEDA, Kiyofumi KURUMISAWA

2010 Volume 2010 Pages 20100012
Published: August 09, 2010
Released on J-STAGE: August 09, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100012

JOURNAL FREE ACCESS

To predict the macroscopic compressive strength of hardened cement paste, a digital-image-based finite element procedure for damage evolution due to local tension is developed, and its practical applicability is studied through numerical experiments. In the procedure, the microscopic three-dimensional geometries of hardened cement paste are assumed to be periodic, and each phase is randomly generated using an autocorrelation function evaluated from a two-dimensional SEM image of a specimen. A nonlocal isotropic damage model is employed to represent crack evolution in the geometries. The predicted macroscopic uniaxial compressive strengths are qualitatively consistent with experimental results in terms of water-cement ratio and material age.

Explicit MPS Algorithm for Free Surface Flow Analysis

Masatoshi OOCHI, Seiichi KOSHIZUKA, Mikio SAKAI

2010 Volume 2010 Pages 20100013
Published: September 07, 2010
Released on J-STAGE: September 07, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100013

JOURNAL FREE ACCESS

An explicit algorithm for a particle method is proposed to analyze incompressible flow with a free surface. The calculation is stable when the Courant numbers with respect to the sound speed and the flow velocity are less than 1.0 and 0.2, respectively. Thus, a Mach number of 0.2 is optimal for speeding up the calculation. The leading-edge position of a collapsing water column shows good agreement with that obtained using the semi-implicit algorithm. The explicit algorithm becomes more effective as the number of particles n increases because the calculation time is O(n^1.0). The calculation is accelerated using multi-core CPUs and a GPU.
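
Why the two Courant limits point to a Mach number of 0.2 can be checked with a few lines of arithmetic. The sketch below assumes the Courant numbers are defined as speed times time step divided by particle spacing; the function name `stable_time_step` and the sample values are hypothetical.

```python
def stable_time_step(spacing, sound_speed, max_flow_speed,
                     c_sound=1.0, c_flow=0.2):
    """Upper bound on the time step implied by the two Courant limits.

    Assumed definitions: sound_speed * dt / spacing < c_sound and
    max_flow_speed * dt / spacing < c_flow. The returned value is the
    bound itself; a real run would stay strictly below it.
    """
    dt_sound = c_sound * spacing / sound_speed
    dt_flow = c_flow * spacing / max_flow_speed
    return min(dt_sound, dt_flow)

spacing, flow_speed = 0.01, 2.0
for sound_speed in (20.0, 10.0, 5.0):  # Mach numbers 0.1, 0.2, 0.4
    mach = flow_speed / sound_speed
    print(mach, stable_time_step(spacing, sound_speed, flow_speed))
```

Below Mach 0.2 the sound-speed limit controls the step, so raising the Mach number (lowering the artificial sound speed) permits a larger step. Above Mach 0.2 the flow-velocity limit takes over and the step no longer grows, so the two bounds coincide at a flow-to-sound speed ratio of 0.2 / 1.0 = 0.2.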

Implementation of Immersed Boundary Method to CIP Multi-Moment Finite Volume Formulation

Keita MATSUMOTO, Koji WAKASHIMA, Feng XIAO

2010 Volume 2010 Pages 20100014
Published: October 06, 2010
Released on J-STAGE: October 06, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100014

JOURNAL FREE ACCESS

This paper presents the implementation of the immersed boundary method in the CIP multi-moment finite volume method to deal with solid bodies in fluid flows. The presence of the solid body and its effects upon the surrounding fluid are formulated as additional forcing in the momentum equations, and the divergence is computed from the effective flux, in which the fluid fraction on the edges of each mesh cell is taken into account. Furthermore, we have extended the method to moving boundary and thermal boundary simulations. The proposed method has been extensively validated by a wide spectrum of numerical tests. Given the competitive numerical results, we believe that the present method can be an efficient and practical tool for simulating fluid/solid interactive flows in various engineering applications.

Imposition of interface continuity and discontinuity on flows by eXtended finite element method with Lagrange-multiplier

Tomohiro SAWADA, Akira TEZUKA

2010 Volume 2010 Pages 20100015
Published: October 19, 2010
Released on J-STAGE: October 19, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100015

JOURNAL FREE ACCESS

Imposing proper continuity and discontinuity at interfaces between fluid and structures, and at internal Dirichlet boundaries in a fluid domain, is one of the major topics in the development of high-accuracy fluid computation methods with non-boundary-fitted meshes. This paper proposes an extended finite element method (X-FEM) with the Lagrange-multiplier method as an advanced solution technique for this problem. The X-FEM with the Lagrange-multiplier method, however, has two practical implementation issues. The first is the domain integration of the enriched fluid elements that reproduce the discontinuity. The second is the construction of the Lagrange-multiplier mesh for imposing the Dirichlet condition at the boundary. This paper examines high-order Gaussian quadrature of the enriched elements and the non-intersection point method for the Lagrange multiplier. Detailed numerical tests on a fixed boundary problem provide computational guidelines for the proposed method. Application to a moving boundary problem demonstrates that the proposed method is applicable to fluid-thin structure interfaces.

Development of Thin Plate Model using Hamiltonian Particle Method

Masahiro KONDO, Seiichi KOSHIZUKA

2010 Volume 2010 Pages 20100016
Published: November 11, 2010
Released on J-STAGE: November 11, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100016

JOURNAL FREE ACCESS

A thin plate calculation model is developed in the framework of Hamiltonian particle dynamics. An objective bending potential is formulated, and the calculation model is derived from this potential so that deformation with rigid rotation can be analyzed. The model conserves the linear momentum, angular momentum, and total energy of the system. The oscillation period of a square plate agrees well with the theoretical value.

Consistent interface modeling of Eulerian fluid and Lagrangian structures by eXtended finite element method with Lagrange-multiplier

Tomohiro SAWADA, Akira TEZUKA

2010 Volume 2010 Pages 20100017
Published: November 12, 2010
Released on J-STAGE: November 12, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100017

JOURNAL FREE ACCESS

This paper proposes a non-interface-fitted mesh method for simulating complex fluid-structure interaction (FSI) problems that is consistent with two physical conditions at the FSI interface. The first is the continuity of velocities and surface forces at the interface, and the second is the discontinuity of fluid velocity gradients and pressures across the interface. An extended finite element method (X-FEM) combined with sharp interface enrichment functions is introduced to reproduce the two discontinuities without the inconsistencies that result from naive use of non-interface-fitted meshes. A Lagrangian Lagrange-multiplier technique is employed to impose the FSI coupling conditions on the Eulerian fluid and Lagrangian structural meshes. This method enables us to handle general FSI problems with reliability comparable to that of interface-fitted methods. The fundamental formulation and a performance assessment of the proposed method are presented in this paper.

Acceleration of large scale high accurate advection calculation for multiple GPUs and its strong scalability

[in Japanese], [in Japanese]

2010 Volume 2010 Pages 20100018
Published: December 03, 2010
Released on J-STAGE: December 03, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100018

JOURNAL FREE ACCESS

Recent GPUs (graphics processing units) offer great advantages in performance and memory bandwidth for general-purpose computing. The CUDA programming environment makes it easy to use the GPU as a SIMT (single-instruction, multiple-thread) accelerator. High-order finite difference methods (FDMs) have been applied to CFD (computational fluid dynamics), and the advection equation has been examined as a typical benchmark. We study how computational performance depends on arithmetic intensity for several high-accuracy FDMs. The GPU implementation of the 5th-order WENO scheme is described in detail with respect to the usage of shared memory and registers. Multiple-GPU computing is required for further speedups and for large-scale computing beyond the memory size limit of a single graphics card. The computational domain is decomposed three-dimensionally, and the overall performance depends not only on the computation but also on the GPU-to-GPU communication. Computation and communication are overlapped by changing the order of the GPU kernels. Strong scalability is shown on the TSUBAME grid cluster, and a performance of 7.8 TFlops is achieved using 60 GPUs when computing the advection equation with the 5th-order WENO scheme.
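
For readers unfamiliar with the scheme, the core of a 5th-order WENO reconstruction fits in a few lines. The sketch below uses the widely cited Jiang-Shu form with linear weights 0.1, 0.6, and 0.3; whether the paper uses exactly this variant is an assumption, and the function name `weno5_left` is hypothetical. It is serial Python for illustration only, not the authors' CUDA kernel.

```python
def weno5_left(v0, v1, v2, v3, v4, eps=1e-6):
    """Left-biased 5th-order WENO value at interface i+1/2.

    v0..v4 are the values at cells i-2, i-1, i, i+1, i+2.
    """
    # Third-order candidate reconstructions on the three sub-stencils.
    p0 = (2 * v0 - 7 * v1 + 11 * v2) / 6
    p1 = (-v1 + 5 * v2 + 2 * v3) / 6
    p2 = (2 * v2 + 5 * v3 - v4) / 6

    # Smoothness indicators: large where a sub-stencil crosses a steep gradient.
    b0 = 13 / 12 * (v0 - 2 * v1 + v2) ** 2 + 0.25 * (v0 - 4 * v1 + 3 * v2) ** 2
    b1 = 13 / 12 * (v1 - 2 * v2 + v3) ** 2 + 0.25 * (v1 - v3) ** 2
    b2 = 13 / 12 * (v2 - 2 * v3 + v4) ** 2 + 0.25 * (3 * v2 - 4 * v3 + v4) ** 2

    # Nonlinear weights: linear weights scaled down on non-smooth sub-stencils.
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    return (a0 * p0 + a1 * p1 + a2 * p2) / (a0 + a1 + a2)

print(weno5_left(1.0, 2.0, 3.0, 4.0, 5.0))  # smooth (linear) data
print(weno5_left(0.0, 0.0, 0.0, 1.0, 1.0))  # step just right of cell i
```

Each interface value needs five neighboring cells and a few dozen floating-point operations, which gives a sense of the arithmetic intensity the abstract mentions and of why staging the stencil in shared memory or registers matters on a GPU.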

High-order Multi-Moment Compact Difference Scheme

Naoyuki ONODERA, Takayuki AOKI, Kenta SUGIHARA

2010 Volume 2010 Pages 20100019
Published: December 17, 2010
Released on J-STAGE: December 17, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100019

JOURNAL FREE ACCESS

High accuracy of numerical schemes is one of the most important requirements for simulating complex flow phenomena. The conservative form of the interpolated differential operator (IDO-CF) scheme is a multi-moment Eulerian scheme that explicitly solves all the moments and has higher spectral resolution than the conventional finite difference method (FDM). The compact difference (CD) scheme is a high-order FDM that implicitly solves additional spatial derivatives as non-time-integrated variables. We propose multi-moment compact schemes that combine the concepts of the multi-moment schemes and the CD scheme. The proposed schemes have very high-order accuracy and spectral-like resolution in Fourier space. In addition, the spatial derivative matrix is reduced to half size, so the computational cost is much smaller than that of a CD scheme of the same order with the same number of independent variables. Linear wave propagation is examined, and the results are dramatically improved compared with the conventional multi-moment scheme. As a nonlinear problem, we carry out a direct numerical simulation (DNS) of two-dimensional homogeneous isotropic turbulence and find that the high-order multi-moment compact scheme improves the energy spectrum in the high-wavenumber region.

Development of a method to surpress the numerical diffusion in advection prediction of VOF function

Susumu FUJIOKA, Satoru USHIJIMA

2010 Volume 2010 Pages 20100020
Published: December 16, 2010
Released on J-STAGE: December 16, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100020

JOURNAL FREE ACCESS

In simulations of multiphase fluid dynamics by the volume of fluid (VOF) method, numerical diffusion in advection is one of the major obstacles to accurate prediction of the interface between the fluids. This paper proposes a new method, an anti-numerical diffusion (AND) filter, to correct the diffused scalar function after advection. Given that the volume of fluid function should lie between 0 and 1, the AND filter transports the scalar value along the local gradient of the scalar function to correct the numerically diffused scalar distribution. One- and two-dimensional advection benchmark tests are carried out to validate the proposed method. The results show the effectiveness of the AND filter.

Model Parameter Estimation by Using the Ensemble Kalman Filter

—Application to Nonlinear Complex Structure System with Large Deformation—

Takeshi AKITA, Ryoji TAKAKI, Eiji SHIMA

2010 Volume 2010 Pages 20100021
Published: December 24, 2010
Released on J-STAGE: December 24, 2010

DOI: https://doi.org/10.11421/jsces.2010.20100021

JOURNAL FREE ACCESS

An effective parameter estimation method for nonlinear structural systems is presented. In the method, uncertain structural parameters contained in numerical models are estimated using the ensemble Kalman filter, which automatically provides optimal estimates of the system state variables while assimilating experimental data into the numerical model. A numerical example of a highly nonlinear deployable structure system is provided to verify the effectiveness of the presented method.
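
How an ensemble Kalman filter pulls an uncertain parameter toward measured data can be shown with a toy problem. The sketch below is an assumption-laden illustration, not the paper's formulation: it estimates a single scalar parameter with one perturbed-observation analysis step, and the spring model, the function name `enkf_parameter_update`, and all numbers are hypothetical.

```python
import random
from statistics import mean

def enkf_parameter_update(ensemble, observe, y_obs, obs_var, rng):
    """One EnKF analysis step for a scalar parameter (perturbed observations).

    ensemble: list of parameter samples
    observe:  model mapping a parameter to the predicted measurement
    y_obs:    measured value; obs_var: measurement noise variance
    """
    predicted = [observe(theta) for theta in ensemble]
    n = len(ensemble)
    m_theta, m_pred = mean(ensemble), mean(predicted)

    # Sample covariance between parameter and prediction, and prediction variance.
    cov_tp = sum((t - m_theta) * (p - m_pred)
                 for t, p in zip(ensemble, predicted)) / (n - 1)
    var_p = sum((p - m_pred) ** 2 for p in predicted) / (n - 1)

    gain = cov_tp / (var_p + obs_var)  # scalar Kalman gain
    # Each member is nudged toward its own noisy copy of the observation.
    return [t + gain * (y_obs + rng.gauss(0.0, obs_var ** 0.5) - p)
            for t, p in zip(ensemble, predicted)]

# Hypothetical example: unknown spring stiffness k, measured displacement F / k.
rng = random.Random(0)
force, true_k = 4.0, 2.0
prior = [rng.gauss(1.5, 0.2) for _ in range(200)]  # initial guess around 1.5
posterior = enkf_parameter_update(prior, lambda k: force / k,
                                  force / true_k, 0.01, rng)
print(mean(prior), mean(posterior))  # posterior mean moves toward true_k
```

The model is only evaluated, never differentiated, which is what makes this family of filters attractive for highly nonlinear structural models. In practice the update would be repeated as new measurements arrive, and parameters are often appended to the state vector rather than estimated alone.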
