Abstract: With the rapid growth of deep learning, a variety of models have been widely used in quantum chemistry calculations to design new molecules and explore molecular properties. However, few studies focus on multi-task molecular property prediction, which can learn different but related properties simultaneously and more efficiently by exploiting inter-task relationships. In this work, we apply the hard parameter sharing framework and advanced loss weighting methods to multi-task molecular property prediction. By comparing single-task baselines with multi-task models on several task sets, we find that prediction accuracy depends largely on the inter-task relationship, and that hard parameter sharing improves performance when the correlation between tasks becomes complex. In addition, we show that proper loss weighting methods help achieve more balanced multi-task optimization and improve prediction accuracy. Additional experiments with varying amounts of training data further validate the advantages of multi-task learning and show that multi-task models with proper loss weighting can predict molecular properties more accurately at much lower computational cost.
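The loss weighting methods named in the tables (Uniform, Uncertainty, Revised uncertainty, DWA) can be illustrated with a minimal sketch. This excerpt does not give the exact formulas used in the work, so the functions below assume the commonly cited forms: uncertainty weighting following Kendall, Gal, and Cipolla, the revised regularizer following Liebel and Körner, and dynamic weight averaging following Liu, Johns, and Davison. All function and parameter names are hypothetical, and in practice the `log_vars` values would be learnable parameters trained jointly with the network.

```python
import math


def uniform_loss(task_losses):
    """Equal weighting (assumed form): every task loss has weight 1."""
    return sum(task_losses)


def uncertainty_loss(task_losses, log_vars):
    """Uncertainty weighting, assumed form.

    log_vars[i] stands for log(sigma_i^2), one value per task.
    Total = sum_i [ L_i / (2 * sigma_i^2) + log(sigma_i) ].
    """
    return sum(0.5 * math.exp(-s) * loss + 0.5 * s
               for loss, s in zip(task_losses, log_vars))


def revised_uncertainty_loss(task_losses, log_vars):
    """Revised uncertainty weighting, assumed form.

    Replaces log(sigma_i) with log(1 + sigma_i^2), which is never negative.
    """
    return sum(0.5 * math.exp(-s) * loss + math.log1p(math.exp(s))
               for loss, s in zip(task_losses, log_vars))


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic weight averaging (DWA), assumed form.

    A task whose loss fell more slowly between the last two epochs gets a
    larger weight. The weights sum to the number of tasks.
    """
    ratios = [a / b for a, b in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    norm = sum(exps)
    return [len(ratios) * e / norm for e in exps]


if __name__ == "__main__":
    losses = [1.0, 2.0]          # hypothetical per-task losses
    log_vars = [0.0, 0.0]        # sigma_i = 1 for both tasks
    print(uniform_loss(losses))
    print(uncertainty_loss(losses, log_vars))
    print(revised_uncertainty_loss(losses, log_vars))
    print(dwa_weights([0.9, 0.5], [1.0, 1.0]))
```

With DWA, the weighted total for an epoch would be the sum of each task loss multiplied by its weight; the temperature of 2 is a common default and is an assumption here, not a value taken from this work.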
Table I. Performance comparison of different models on learning the properties of atomization energies (in meV). Task-specific MAE and std. MAE (in %) on the test dataset are both shown in the table. The best results are shown in bold.

| Model | $U_0$ | $U$ | $G$ | $H$ | std. MAE |
|---|---|---|---|---|---|
| STL | 31.8 | 32.9 | 33.7 | 33.4 | 0.324 |
| Adapted STL | **31.0** | **31.3** | **31.0** | **31.3** | **0.306** |
| Uniform | 34.3 | 34.6 | 34.1 | 34.6 | 0.338 |
| Uncertainty | 34.1 | 34.3 | 34.0 | 34.3 | 0.336 |
| Revised uncertainty | 34.1 | 34.5 | 34.0 | 34.4 | 0.337 |
| DWA | 34.1 | 34.3 | 34.1 | 34.4 | 0.337 |

Table II. Performance comparison on learning 4 electronic properties ($\langle R^2\rangle$ in $a_0^2$, $\varepsilon$ in meV, std. MAE in %). The best results are shown in bold.

| Model | $\langle R^2\rangle$ | $\varepsilon_{\rm{HOMO}}$ | $\varepsilon_{\rm{LUMO}}$ | $\Delta\varepsilon$ | std. MAE |
|---|---|---|---|---|---|
| STL | 0.486 | 95.2 | **67.6** | 119 | 7.66 |
| Adapted STL | 0.511 | 104 | 98.2 | 142 | 9.10 |
| Uniform | 0.470 | 97.9 | 83.7 | 130 | 8.29 |
| Uncertainty | 0.686 | **89.8** | 71.6 | **117** | **7.50** |
| Revised uncertainty | 0.581 | 90.2 | 73.9 | 118 | 7.56 |
| DWA | **0.468** | 97.7 | 84.3 | 130 | 8.30 |

Table III. Performance comparison on learning all 12 properties from QM9 ($U_0$, $U$, $G$, $H$, ZPVE, and $\varepsilon$ in meV, $C_{\rm{v}}$ in cal·mol⁻¹·K⁻¹, $\alpha$ in $a_0^3$, $\mu$ in D, $\langle R^2\rangle$ in $a_0^2$, std. MAE in %). The best results are shown in bold.

| Model | $U_0$ | $U$ | $G$ | $H$ | $C_{\rm{v}}$ | ZPVE | $\alpha$ | $\mu$ | $\langle R^2\rangle$ | $\varepsilon_{\rm{HOMO}}$ | $\varepsilon_{\rm{LUMO}}$ | $\Delta\varepsilon$ | std. MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| STL | **31.8** | 32.9 | 33.7 | 33.4 | 0.0662 | 2.94 | 0.160 | **0.0600** | 0.486 | 95.2 | **67.6** | 119 | 3.32 |
| Adapted STL | 39.6 | 39.7 | 38.9 | 40.0 | 0.0673 | 11.1 | 0.161 | 0.0807 | 0.484 | 96.4 | 88.2 | 130 | 3.76 |
| Uniform | 35.0 | 34.4 | 34.2 | 34.3 | 0.0648 | 3.23 | 0.157 | 0.0743 | **0.424** | 87.9 | 79.0 | 120 | 3.38 |
| Uncertainty | 32.2 | **31.4** | **31.2** | **31.8** | **0.0611** | **2.58** | 0.162 | 0.0694 | 0.524 | **85.0** | 73.2 | **114** | **3.22** |
| Revised uncertainty | 33.0 | 32.6 | 32.5 | 34.6 | 0.0618 | 2.97 | **0.155** | 0.0700 | 0.482 | 85.8 | 74.0 | **114** | 3.25 |
| DWA | 36.0 | 34.6 | 35.6 | 34.8 | 0.0638 | 2.92 | 0.157 | 0.0757 | 0.429 | 88.5 | 78.2 | 119 | 3.38 |

Table IV. Efficiency comparison in terms of the number of parameters and the training time per epoch.

| Model | Number of parameters | Time/s |
|---|---|---|
| STL | $2.28\times 10^7$ | 660 |
| Adapted STL | $1.91\times 10^6$ | 64 |
| Uniform | $1.34\times 10^7$ | 123 |
| Uncertainty | $1.34\times 10^7$ | 122 |
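The std. MAE column in Tables I to III is not defined in this excerpt. A minimal sketch, assuming it follows the convention common in QM9 benchmarks (each task's MAE divided by the standard deviation of that property over the dataset, averaged over tasks and expressed in percent), is given below. The function names and the numbers in the usage example are hypothetical and are not taken from the tables.

```python
def mae(predictions, targets):
    """Mean absolute error of one task."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def std_mae_percent(task_maes, task_stds):
    """Standardized MAE in percent (assumed definition).

    task_maes[i] is the MAE of task i, and task_stds[i] is the standard
    deviation of property i over the dataset, in the same unit as the MAE.
    """
    ratios = [m / s for m, s in zip(task_maes, task_stds)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical two-task example: MAEs of 1.0 and 2.0 against property
    # standard deviations of 10.0 and 40.0.
    print(std_mae_percent([1.0, 2.0], [10.0, 40.0]))
```

Dividing by the property's standard deviation makes errors on tasks with very different units and scales comparable, which is why a single summary column can sit next to per-task MAEs in meV, D, and $a_0^2$.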