
CUDA Technology Reduces FDTD Simulation Time

Oct. 8, 2013
CUDA technology can significantly reduce the computational time of finite-difference time-domain (FDTD) electromagnetic simulations.

The finite-difference time-domain (FDTD) method is widely used for electromagnetic (EM) simulations due to its accuracy, flexibility, and simplicity. Yet these benefits come at the cost of long computation times. Using NVIDIA’s Compute Unified Device Architecture (CUDA) technology, computational time can reportedly be reduced by more than two orders of magnitude compared to conventional computing. A paper from Remcom titled “Accelerating the Finite Difference Time Domain (FDTD) Method with CUDA” discusses the migration of a three-dimensional FDTD method from a traditional C implementation to NVIDIA’s CUDA architecture.

The six-page paper covers the challenges and techniques involved in recasting the FDTD algorithm in a form that can exploit modern graphics processing units (GPUs) through NVIDIA’s CUDA framework. With the GPU approach, thousands of threads run simultaneously, and special design considerations are needed to achieve maximum speed. According to the paper, a proper understanding of CUDA can yield speedups of more than two orders of magnitude over traditional central processing units (CPUs).
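To make the thread-level parallelism concrete, here is a minimal sketch in plain C of one FDTD time step. It is not taken from the Remcom paper: it assumes a one-dimensional grid in normalized units rather than the paper's three-dimensional method, and the function name `fdtd1d_step`, the field arrays `ez` and `hy`, and the Courant number `c` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One leapfrog time step of a 1D FDTD (Yee) grid in normalized units.
 * Illustrative sketch only; all names here are assumptions.
 * ez holds n electric-field samples; hy holds the n - 1 magnetic-field
 * samples located between them. c is the Courant number (c <= 1 in 1D). */
void fdtd1d_step(double *ez, double *hy, size_t n, double c)
{
    assert(ez != NULL && hy != NULL && n >= 2);

    /* Update H from the spatial difference of E. No iteration reads a
     * value written by another iteration of this loop, so on a GPU each
     * cell could be handled by its own thread. */
    for (size_t i = 0; i + 1 < n; ++i)
        hy[i] += c * (ez[i + 1] - ez[i]);

    /* Update the interior E samples from the spatial difference of H.
     * The two end points are never written; if they start at zero they
     * stay at zero and act as perfectly conducting walls. */
    for (size_t i = 1; i + 1 < n; ++i)
        ez[i] += c * (hy[i] - hy[i - 1]);
}
```

Because every cell in a loop depends only on neighboring values from the previous half step, each loop body is a natural candidate for a GPU kernel in which the loop index is replaced by a thread index. The same loops also perform very little arithmetic per memory access, which is consistent with the emphasis on memory behavior in GPU ports of FDTD.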

Although GPUs were originally used solely to drive graphical displays, they have evolved into powerful computational devices. The Tesla C1060 GPU, for example, can yield significant performance gains over a 2.66-GHz Intel Core 2 Quad processor. The paper provides additional details on the CUDA GPU, including its architecture and the several types of memory that CUDA devices offer. Functions targeted for the GPU are implemented in CUDA as kernels, which are written in a C-like syntax. The paper also describes the optimizations that were implemented, which reduced memory operations to less than 14% of the original total. These optimizations, along with some others, were applied to the FDTD algorithm, which was integrated into Remcom’s XFdtd software.

A modern cellular phone design that included all major device components was chosen for the final test simulation. Tests were performed using a single thread for the CPU baseline and four Tesla C1060 GPUs for the GPU implementation. In the test results, the GPU implementation consistently ran more than two orders of magnitude faster than the CPU implementation.

Remcom, Inc., 315 S. Allen St., Ste. 416, State College, PA 16801; (814) 861-1299.
