Vol. 135

2012-12-24

Implementation of FDTD-Compatible Green's Function on Heterogeneous CPU-GPU Parallel Processing System

By Tomasz P. Stefanski
Progress In Electromagnetics Research, Vol. 135, 297-316, 2013
doi:10.2528/PIER12111702

Abstract

This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The implementation uses the central processing unit (CPU) and the graphics processing unit (GPU) simultaneously, assigning to each the computational tasks best suited to its architecture. A closed-form expression for this discrete Green's function (DGF) was recently derived, which facilitates its application in FDTD simulations of radiation and scattering problems. Unfortunately, implementing the new DGF formula in software requires multiple-precision arithmetic and may lead to long runtimes. Therefore, the DGF computations were accelerated on a heterogeneous CPU-GPU parallel processing system using multiple-precision arithmetic together with the OpenMP and CUDA parallel programming interfaces. The method avoids the drawbacks of CPU-only and GPU-only accelerated implementations of the DGF, namely the long runtime on the CPU for long DGF waveforms and the significant GPU initialization overhead for short ones. As a result, a sevenfold speedup was obtained relative to the reference DGF implementation on a multicore CPU, significantly improving the applicability of the DGF in FDTD simulations.
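
The trade-off described in the abstract (a fixed GPU initialization overhead that dominates for short DGF waveforms, and a slower CPU that dominates for long ones) can be illustrated with a toy cost model. The sketch below is not the paper's method: the class `CostModel`, the functions `choose_device` and `crossover_length`, the linear dependence of runtime on waveform length, and all timing values are hypothetical assumptions chosen only to show where such a crossover would arise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Hypothetical timing parameters in seconds; none come from the paper."""
    cpu_per_sample: float = 2.0e-3   # assumed CPU cost per DGF time sample
    gpu_per_sample: float = 2.5e-4   # assumed GPU cost per DGF time sample
    gpu_init: float = 0.5            # assumed fixed GPU initialization overhead

    def cpu_time(self, n_samples: int) -> float:
        # Assumption: CPU runtime grows linearly with waveform length.
        return self.cpu_per_sample * n_samples

    def gpu_time(self, n_samples: int) -> float:
        # Assumption: GPU runtime is a fixed overhead plus a linear term.
        return self.gpu_init + self.gpu_per_sample * n_samples


def choose_device(n_samples: int, model: CostModel = CostModel()) -> str:
    """Return "cpu" or "gpu", whichever the model predicts to be faster."""
    if model.cpu_time(n_samples) <= model.gpu_time(n_samples):
        return "cpu"
    return "gpu"


def crossover_length(model: CostModel = CostModel()) -> float:
    """Waveform length at which both devices are predicted to take equal time."""
    return model.gpu_init / (model.cpu_per_sample - model.gpu_per_sample)


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, choose_device(n))
    print("crossover near", round(crossover_length()), "samples")
```

With these assumed numbers, short waveforms are routed to the CPU and long ones to the GPU. A real heterogeneous implementation would measure both costs on the target hardware rather than rely on fixed constants.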

Citation


Tomasz P. Stefanski, "Implementation of FDTD-Compatible Green's Function on Heterogeneous CPU-GPU Parallel Processing System," Progress In Electromagnetics Research, Vol. 135, 297-316, 2013.
doi:10.2528/PIER12111702
http://jpier.org/PIER/pier.php?paper=12111702
