A High-Performance and Flexible Chemical Structure & Data Search Engine Built on CouchDB & ElasticSearch

Abstract: Computer-assisted chemical structure searching plays a critical role in efficient structure screening in cheminformatics. We designed a high-performance chemical structure and data search engine called DCAIKU, built on the schema-less CouchDB document database and the ElasticSearch infrastructure. DCAIKU converts the chemical structure similarity search problem into a general text search problem so that it can use off-the-shelf full-text search engines. The schema-less document database also lets DCAIKU support flexible document structures and heterogeneous datasets. Our evaluations show that DCAIKU can handle both keyword search and structural search against millions of records with both high accuracy and low latency. The system also scales well and can be parallelized and deployed on clusters at large scale. We expect that DCAIKU will lay the foundation for large-scale and cost-effective structural search in materials science and chemistry research.
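The idea of turning structure similarity search into text search can be illustrated with a minimal sketch. It is not DCAIKU's implementation, which the abstract does not describe in detail. As a loudly labeled assumption, character n-grams of a SMILES string stand in for a real chemical fingerprint, and a small in-memory inverted index stands in for the full-text engine. The names `fingerprint_tokens`, `tanimoto`, and `TokenIndex` are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def fingerprint_tokens(smiles: str, n: int = 3) -> set[str]:
    """Toy stand-in for a chemical fingerprint (assumption, not DCAIKU's method):
    the set of character n-grams of a SMILES string, used as text tokens."""
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a: set[str], b: set[str]) -> float:
    """Tanimoto (Jaccard) similarity between two token sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


class TokenIndex:
    """In-memory inverted index playing the role of a full-text search engine."""

    def __init__(self) -> None:
        self.docs: dict[str, tuple[dict, set[str]]] = {}      # doc_id -> (document, tokens)
        self.postings: dict[str, set[str]] = defaultdict(set)  # token -> doc_ids

    def add(self, doc_id: str, document: dict) -> None:
        # Documents are schema-less dicts; only the "smiles" field is required here.
        tokens = fingerprint_tokens(document["smiles"])
        self.docs[doc_id] = (document, tokens)
        for token in tokens:
            self.postings[token].add(doc_id)

    def search(self, smiles: str, top_k: int = 3) -> list[tuple[str, float]]:
        query = fingerprint_tokens(smiles)
        # Candidate retrieval is an ordinary term lookup, as in text search.
        candidates: set[str] = set()
        for token in query:
            candidates |= self.postings.get(token, set())
        # Rank candidates by Tanimoto similarity, highest first; ties break by id.
        scored = [(tanimoto(query, self.docs[d][1]), d) for d in candidates]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(doc_id, round(score, 3)) for score, doc_id in scored[:top_k]]


index = TokenIndex()
# Heterogeneous records: each document carries a different set of fields.
index.add("ethanol", {"smiles": "CCO", "name": "ethanol"})
index.add("propanol", {"smiles": "CCCO", "name": "1-propanol", "formula": "C3H8O"})
index.add("benzene", {"smiles": "c1ccccc1"})

hits = index.search("CCCCO")
print(hits)
```

In this toy setup the query shares tokens only with the two alcohols, so benzene is never scored. A production system would presumably use a chemically meaningful fingerprint and let the search engine do both retrieval and ranking; n-grams of SMILES text are used here only to keep the sketch dependency-free.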

     
