High-Performance Chemical Information Database towards Accelerating Discovery of Metal-Organic Frameworks for Gas Adsorption with Machine Learning
Abstract: Database-driven chemical structure search combined with machine learning has recently attracted great attention for the rapid screening of materials with target functionalities. To this end, we established a high-performance chemical structure database built on MySQL, named MYDB. More than 160,000 metal-organic frameworks (MOFs) have been collected and stored using new retrieval algorithms that enable efficient searching and recommendation. Evaluation results show that MYDB achieves fast and efficient keyword searching across millions of records and provides real-time recommendations of similar structures. Combining machine learning methods with the materials database, we trained an adsorption model to determine the adsorption capacity of MOFs toward argon and hydrogen under given thermodynamic conditions. We expect that MYDB, together with the trained machine learning models, can support large-scale, low-cost, and convenient structural screening, thereby accelerating the discovery of materials with target functionalities in computational materials research.