Time-dependent Density Functional-based Tight-bind Method Efficiently Implemented with OpenMP Parallel and GPU Acceleration
Guo-hong Fan, Ke-li Han*, Guo-zhong He
State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
Abstract:
The time-dependent density functional-based tight-binding (TD-DFTB) method is implemented on multi-core and graphical processing unit (GPU) systems for excited-state calculations of large systems with hundreds or thousands of atoms. Sparse matrices and OpenMP multithreading are used to build the Hamiltonian matrix. The diagonalization step of the ground-state eigenvalue problem is implemented on GPUs in double precision. The GPU-based acceleration fully preserves all computed properties, and a considerable total speedup of 8.73 times can be achieved. A Krylov-space-based algorithm with OpenMP parallelization and GPU acceleration is used to find the lowest eigenvalue and eigenvector of the large TDDFT matrix, which greatly reduces both the number of iterations and the time spent on the excited-state eigenvalue problem. With the matrix-vector product accelerated on the GPU, the Krylov solver converges quickly to the final result, and a notable speedup of 206 times is observed for a system of 812 atoms. Calculations on a series of small and large systems show that the fast TD-DFTB code obtains reasonable results at a much lower computational cost than first-principles CIS and full TDDFT calculations.
Key words:  Density-functional theory, Tight-binding method, Time-dependent density functional theory, Excited state, Graphical processing unit, Krylov iterative algorithm, Sparse matrix, OpenMP
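As a rough illustration of the two ingredients named in the abstract, a sparse matrix-vector product and a Krylov-space search for the lowest eigenvalue, the sketch below uses the Lanczos iteration on a matrix stored in compressed sparse row (CSR) form. It is an assumption-laden toy, not the authors' code: the abstract does not say which Krylov variant is used, the function names (`csr_matvec`, `lanczos_lowest`, `laplacian_csr`) are hypothetical, the test matrix is a simple tridiagonal stand-in rather than a TDDFT matrix, and everything runs serially in pure Python with no OpenMP or GPU offloading. Only the eigenvalue is returned, not the eigenvector.

```python
import math
import random


def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR-format sparse matrix by a dense vector x."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def lowest_tridiag_eigenvalue(alpha, beta, tol=1e-12):
    """Lowest eigenvalue of a symmetric tridiagonal matrix (Sturm bisection).

    alpha holds the diagonal, beta the off-diagonal (one entry shorter).
    """
    n = len(alpha)
    radius = [(abs(beta[i - 1]) if i > 0 else 0.0) +
              (abs(beta[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(a - r for a, r in zip(alpha, radius))  # Gershgorin bounds
    hi = max(a + r for a, r in zip(alpha, radius))

    def count_below(x):
        # Number of eigenvalues smaller than x = number of negative pivots.
        count, d = 0, 1.0
        for i in range(n):
            d = alpha[i] - x - (beta[i - 1] ** 2 / d if i > 0 else 0.0)
            if d == 0.0:
                d = 1e-300
            if d < 0.0:
                count += 1
        return count

    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def lanczos_lowest(matvec, n, steps=30, seed=0):
    """Estimate the lowest eigenvalue of a symmetric operator via Lanczos."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(v, v))
    basis = [[c / norm for c in v]]
    alpha, beta = [], []
    m = min(steps, n)
    for j in range(m):
        w = matvec(basis[j])  # the only place the matrix is touched
        alpha.append(dot(w, basis[j]))
        # Full reorthogonalization against every stored Krylov vector.
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        b = math.sqrt(dot(w, w))
        if b < 1e-10 or j + 1 == m:
            break
        beta.append(b)
        basis.append([wi / b for wi in w])
    return lowest_tridiag_eigenvalue(alpha, beta)


def laplacian_csr(n):
    """CSR arrays for the tridiagonal test matrix with 2 on the diagonal, -1 beside it."""
    data, indices, indptr = [], [], [0]
    for i in range(n):
        for j, val in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                data.append(val)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr


n = 20
data, indices, indptr = laplacian_csr(n)
lam = lanczos_lowest(lambda x: csr_matvec(data, indices, indptr, x), n, steps=n)
exact = 2.0 - 2.0 * math.cos(math.pi / (n + 1))  # known closed form for this matrix
print(lam, exact)
```

The solver reaches the matrix only through the `matvec` callback, which is why accelerating that single product, as the abstract describes for the GPU, can dominate the overall speedup of a Krylov method.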
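Efficient OpenMP-Parallelized and GPU-Accelerated Implementation of Tight-Binding Time-Dependent Density Functional Theory
范果红, 韩克利*, 何国钟
State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023
Abstract (translated from the Chinese):
The tight-binding time-dependent density functional theory method is implemented with efficient acceleration on multi-core and GPU systems and applied to excited-state electronic structure calculations for systems with hundreds to thousands of atoms. The program uses sparse matrices and OpenMP parallelization to accelerate construction of the Hamiltonian matrix, while the most time-consuming part, the ground-state diagonalization, is accelerated on the GPU in double precision. GPU acceleration of the ground state achieves a speedup of 8.73 times while preserving computational accuracy. For the excited states, the large TDDFT matrix is solved with a Krylov-subspace iterative algorithm combined with OpenMP parallelization and GPU acceleration to obtain the eigenvalues and eigenvectors, which greatly reduces the number of iterations and the final solution time. With the matrix-vector multiplication accelerated on the GPU, the Krylov algorithm converges quickly, giving a 206-fold speedup relative to the program that uses a conventional algorithm and CPU parallelization. Calculations on a series of small and large molecular systems show that, compared with the first-principles CIS method and time-dependent density functional methods, the program obtains reasonable and accurate results at very low computational cost.
Key words:  Density-functional theory, Tight-binding approximation, Time-dependent density functional theory, Excited state, GPU computing, Krylov subspace iterative algorithm, Sparse matrix, OpenMP parallelization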
DOI:10.1063/1674-0068/26/06/635-645