Volume 34 Issue 5
Oct.  2021
Zhen-lin Zhang, Shi-zhe Jiao, Jie-lan Li, Wen-tiao Wu, Ling-yun Wan, Xin-ming Qin, Wei Hu, Jin-long Yang. KSSOLV-GPU: an Efficient GPU-Enabled MATLAB Toolbox for Solving the Kohn-Sham Equations within Density Functional Theory in Plane-Wave Basis Set†[J]. Chinese Journal of Chemical Physics, 2021, 34(5): 552-564. doi: 10.1063/1674-0068/cjcp2108139

KSSOLV-GPU: an Efficient GPU-Enabled MATLAB Toolbox for Solving the Kohn-Sham Equations within Density Functional Theory in Plane-Wave Basis Set

doi: 10.1063/1674-0068/cjcp2108139
  • Corresponding author: Wei Hu, E-mail: whuustc@ustc.edu.cn
  • Received Date: 2021-08-16
  • Accepted Date: 2021-10-08
  • Publish Date: 2021-10-27
  • KSSOLV (Kohn-Sham Solver) is a MATLAB (Matrix Laboratory) toolbox for solving Kohn-Sham density functional theory (KS-DFT) problems in a plane-wave basis set. In KS-DFT calculations, the most expensive step is usually the diagonalization of the Kohn-Sham Hamiltonian within the self-consistent field (SCF) iteration. To enable a personal computer to perform medium-sized KS-DFT calculations on systems containing hundreds of atoms, we present a hybrid CPU-GPU implementation that accelerates the iterative diagonalization algorithms implemented in KSSOLV by using the MATLAB built-in Parallel Computing Toolbox. We compare the performance of KSSOLV-GPU on three types of GPU (RTX3090, V100, and A100) with that of the conventional CPU implementation of KSSOLV. Numerical results demonstrate that the hybrid CPU-GPU implementation achieves a speedup of about 10 times over sequential CPU calculations for bulk silicon systems containing up to 128 atoms.
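
To illustrate why iterative diagonalization dominates the cost and lends itself to GPU offloading, the following is a minimal sketch in pure Python, not code from KSSOLV. It assumes a toy one-dimensional finite-difference Hamiltonian in place of a plane-wave Kohn-Sham Hamiltonian, and it swaps in a plain shifted power iteration for the production eigensolvers used in KSSOLV. The names `apply_h`, `dot`, and `lowest_eigenpair` are hypothetical. The point of the sketch is that the solver only ever touches the Hamiltonian through repeated operator-vector products, which is the kind of kernel a hybrid CPU-GPU code would move to the device.

```python
import math


def apply_h(x):
    """Apply a toy Hamiltonian (1D finite-difference -d^2/dx^2, zero boundaries)
    to the vector x without ever forming the matrix (assumption: toy model)."""
    n = len(x)
    y = []
    for j in range(n):
        left = x[j - 1] if j > 0 else 0.0
        right = x[j + 1] if j < n - 1 else 0.0
        y.append(2.0 * x[j] - left - right)
    return y


def dot(a, b):
    """Euclidean inner product of two real vectors."""
    return sum(p * q for p, q in zip(a, b))


def lowest_eigenpair(n, shift=4.0, tol=1e-12, max_iter=10000):
    """Shifted power iteration on (shift*I - H).

    The spectrum of the toy H lies in [0, 4] (Gershgorin bound), so with
    shift = 4 the dominant eigenvector of (shift*I - H) is the lowest
    eigenvector of H. Returns (eigenvalue estimate, normalized vector).
    """
    x = [1.0 / math.sqrt(n)] * n        # normalized starting guess
    eig = float("inf")
    for _ in range(max_iter):
        hx = apply_h(x)                 # dominant cost: operator-vector product
        new_eig = dot(x, hx)            # Rayleigh quotient (x has unit norm)
        if abs(new_eig - eig) < tol:
            return new_eig, x
        eig = new_eig
        y = [shift * xi - hxi for xi, hxi in zip(x, hx)]
        norm = math.sqrt(dot(y, y))
        x = [v / norm for v in y]
    return eig, x


if __name__ == "__main__":
    n = 8
    eig, _ = lowest_eigenpair(n)
    exact = 2.0 - 2.0 * math.cos(math.pi / (n + 1))  # analytic lowest eigenvalue
    print(f"iterative: {eig:.10f}  analytic: {exact:.10f}")
```

For this toy operator the lowest eigenvalue is known in closed form, which makes the iteration easy to check. In a real plane-wave code the operator application involves FFTs and dense linear algebra on blocks of wavefunctions, so the per-iteration cost is far higher than in this sketch.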

     

  • Part of special topic of “the Young Scientist Forum on Chemical Physics: Theoretical and Computational Chemistry Workshop 2020”.


    Figures(6)  / Tables(5)
