Reading notes on《Artificial Intelligence in Drug Design》


Table of Contents

  • 1.Introduction
  • 2.MMP Algorithms
  • 3.BioDig: The GSK Transform Database
  • 4.Large Scale Molecule Ideation Using MMPs
  • 5.Quantifying the Value of an MMP-Based Knowledge Base
  • 6.The Ever-Growing Tail of New Transforms
  • 7.The Subset of Useful MedChem Transforms
  • 8.Assessing MMPs as a Molecule Generation Tool
  • 9.First Test - Human Inclusion
  • 10.Second Test - Human Imitation
  • 11.Third Test - Legacy Projects
  • 12.Conclusion

1.Introduction

  • Matched Molecular Pair (MMP) analysis is one of the many ways medicinal chemists can understand SAR data. The attraction of MMP analysis lies in its ability to intuitively relate structural changes to changes in a relevant property.

2.MMP Algorithms

  • There are several implementations of the MMP algorithm in the literature. One of the most widely used MMP generation algorithms, which has been adapted by many institutions, was originally published by Hussain and Rea.
  • The common core fragment is termed the context (typically >50% of the molecule by heavy atom count). Two molecules with the same context are termed an MMP. The variable part between the molecule pair is termed the transform and encodes a change from fragment X to fragment Y. The transform is typically represented as a SMIRKS reaction.
  • A similar procedure has been extended for MMPs with a chemical core change. In this case multiple cuts or fragmentation operations are applied to the molecules. Where the terminal groups are all the same, but the core is different, an MMP is defined with a core or scaffold change encoded. Figure 1 shows a pictorial demonstration of the MMP algorithm.
  • Deriving MMPs across a large set of molecules with associated physicochemical properties or assay readouts allows for generalization of the transforms across the dataset. If two or more compound pairs share the same transform, the data can be aggregated. For each transform, statistics are derived to express the change for a chosen endpoint as a mean change with associated standard deviation or related statistics.
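The indexing and aggregation steps described above can be sketched in plain Python. This is only an illustration, not the Hussain and Rea implementation: it assumes the molecules have already been fragmented by a single cut into a context and a variable fragment (a cheminformatics toolkit would normally do this), it does not enforce the >50% heavy-atom rule for the context, and the records and property values are invented placeholders. The names `find_pairs` and `aggregate` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, stdev

# Hypothetical pre-fragmented records: (molecule id, context, variable fragment, property value).
# The values are made-up placeholders, not data from the chapter.
RECORDS = [
    ("m1", "[*:1]c1ccccc1", "[*:1]C(N)=O", 1.2),
    ("m2", "[*:1]c1ccccc1", "[*:1]C(=O)NC", 1.9),
    ("m3", "[*:1]c1ccncc1", "[*:1]C(N)=O", 0.4),
    ("m4", "[*:1]c1ccncc1", "[*:1]C(=O)NC", 1.0),
    ("m5", "[*:1]c1ccncc1", "[*:1]Cl", 1.5),
]


def find_pairs(records):
    """Index fragments by context; two molecules sharing a context form an MMP."""
    by_context = defaultdict(list)
    for mol_id, context, fragment, value in records:
        by_context[context].append((mol_id, fragment, value))

    pairs = []
    for context, members in by_context.items():
        for a, b in combinations(members, 2):
            if a[1] == b[1]:
                continue  # identical fragments: no structural change
            # Sort by fragment string so a given X>>Y is always written in one direction.
            a, b = sorted((a, b), key=lambda m: m[1])
            transform = f"{a[1]}>>{b[1]}"
            pairs.append((transform, context, a[0], b[0], b[2] - a[2]))
    return pairs


def aggregate(pairs):
    """Group property changes by transform and summarise them."""
    deltas_by_transform = defaultdict(list)
    for transform, _context, _mol_a, _mol_b, delta in pairs:
        deltas_by_transform[transform].append(delta)

    stats = {}
    for transform, deltas in deltas_by_transform.items():
        stats[transform] = {
            "n": len(deltas),
            "mean": mean(deltas),
            # A standard deviation needs at least two observations.
            "sd": stdev(deltas) if len(deltas) > 1 else None,
        }
    return stats


if __name__ == "__main__":
    for transform, summary in aggregate(find_pairs(RECORDS)).items():
        print(transform, summary)
```

Writing each transform in one canonical direction means the same structural change seen in different contexts lands in the same bucket, which is what allows the statistics to be aggregated.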

3.BioDig: The GSK Transform Database

  • For a dataset of 300K compounds approximately 2.3 million MMPs can be extracted. This necessitates a solution for bulk storage and fast query reporting. These requirements along with the process of indexing transforms lend themselves to a relational database. This database is named BioDig at GSK.
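To make the storage idea concrete, here is a minimal relational sketch using Python's built-in `sqlite3` module. The BioDig schema is not described in these notes, so every table name, column name, and helper function below (`open_db`, `add_pair`, `transform_summary`) is an assumption chosen for illustration only.

```python
import sqlite3

# Hypothetical schema: one row per unique transform, one row per observed pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transform (
    transform_id INTEGER PRIMARY KEY,
    smirks       TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS pair (
    pair_id      INTEGER PRIMARY KEY,
    transform_id INTEGER NOT NULL REFERENCES transform(transform_id),
    context      TEXT NOT NULL,
    mol_a        TEXT NOT NULL,
    mol_b        TEXT NOT NULL,
    endpoint     TEXT NOT NULL,
    delta        REAL NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_pair_transform ON pair(transform_id, endpoint);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_pair(conn, smirks, context, mol_a, mol_b, endpoint, delta):
    """Store one matched pair, registering its transform on first sight."""
    conn.execute("INSERT OR IGNORE INTO transform (smirks) VALUES (?)", (smirks,))
    (transform_id,) = conn.execute(
        "SELECT transform_id FROM transform WHERE smirks = ?", (smirks,)
    ).fetchone()
    conn.execute(
        "INSERT INTO pair (transform_id, context, mol_a, mol_b, endpoint, delta) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (transform_id, context, mol_a, mol_b, endpoint, delta),
    )


def transform_summary(conn, endpoint, min_pairs=1):
    """Return (smirks, pair count, mean change) per transform for one endpoint."""
    return conn.execute(
        """
        SELECT t.smirks, COUNT(*) AS n, AVG(p.delta) AS mean_delta
        FROM pair AS p
        JOIN transform AS t ON p.transform_id = t.transform_id
        WHERE p.endpoint = ?
        GROUP BY t.smirks
        HAVING COUNT(*) >= ?
        ORDER BY n DESC
        """,
        (endpoint, min_pairs),
    ).fetchall()
```

Keeping transforms in their own table with a unique index is one way to get the "indexing transforms" behaviour mentioned above: each pair row points at a transform, so per-transform statistics reduce to a grouped query.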

4.Large Scale Molecule Ideation Using MMPs

  • MMPs have been historically used to interrogate the effect of a chemical transform on physicochemical properties such as LogD, clearance, and membrane permeability.
  • At GSK we have extended their applicability to molecule library generation.
  • For example, the effect on solubility when a primary amide is replaced by a secondary amide is different for an aliphatic and an aromatic context (see Fig. 3).
  • SMARTS patterns can be generalized with aliphatic and aromatic flags as opposed to full atom type information. This extends a single transform into 6 related forms as shown in Fig. 4.
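A rough sketch of context-aware ideation follows. It does not reproduce the six generalized forms of Fig. 4. It only shows how a lookup could prefer statistics for the matching environment (aliphatic or aromatic) and fall back to a general entry. The transform table, its numbers, and the functions `attachment_environment` and `ideate` are all hypothetical, and the aromaticity check is a crude string heuristic rather than real SMARTS matching.

```python
ATTACH = "[*:1]"

# Hypothetical table: (from-fragment, environment) -> [(to-fragment, mean change, pair count)].
# All numbers are invented placeholders, not results from the chapter.
TRANSFORMS = {
    ("[*:1]C(N)=O", "aromatic"): [("[*:1]C(=O)NC", 0.5, 12)],
    ("[*:1]C(N)=O", "aliphatic"): [("[*:1]C(=O)NC", -0.1, 8)],
    ("[*:1]C(N)=O", "any"): [("[*:1]C#N", 0.2, 30)],
}


def attachment_environment(context):
    """Crude heuristic, assuming the context SMILES starts with the attachment
    point: a lowercase atom symbol right after it is treated as aromatic."""
    if not context.startswith(ATTACH):
        raise ValueError("context must start with the attachment point")
    for ch in context[len(ATTACH):]:
        if ch.isalpha():
            return "aromatic" if ch.islower() else "aliphatic"
    raise ValueError("no atom found after the attachment point")


def ideate(context, fragment, min_pairs=5):
    """Suggest replacement fragments, environment-specific entries first."""
    environment = attachment_environment(context)
    ideas = []
    for key_env in (environment, "any"):
        for new_fragment, mean_delta, n_pairs in TRANSFORMS.get((fragment, key_env), []):
            if n_pairs >= min_pairs:  # skip transforms backed by too few pairs
                ideas.append((context, new_fragment, mean_delta, key_env))
    return ideas


if __name__ == "__main__":
    for idea in ideate("[*:1]c1ccccc1", "[*:1]C(N)=O"):
        print(idea)
```

The suggestions are returned as context and fragment pairs. Joining them back into a full molecule would need a cheminformatics toolkit and is left out here.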

5.Quantifying the Value of an MMP-Based Knowledge Base

  • A key aspect in the application of an MMP-based knowledge base is quantifying its usefulness in a medicinal chemistry design scenario. Ideally, the database must be comprehensive enough to cover the full range of transforms that could be used. Each transform in the database must also be derived from enough data to make it statistically valid.
  • To help answer these questions, transforms in the Eli Lilly ADME/Tox knowledge database were compared with those in a larger 2.1 million compound diversity set. A second comparison set the same knowledge database against a subset of transforms seen in historical small molecule discovery projects.

6.The Ever-Growing Tail of New Transforms

  • A linear relationship was seen between the number of molecules in the dataset and the final number of derived matched pairs and transforms. This is seen in Table 1 and Fig. 5.

7.The Subset of Useful MedChem Transforms

  • The knowledge database was analyzed to assess how many of the Top 100, 500, 1000, 2500, 5K, 10K, 25K, 50K, and 100K MedChem project transforms were contained in the database. The results are given in Table 2.
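The coverage analysis described here amounts to asking, for each Top-N cutoff, what fraction of the most frequent project transforms also appear in the knowledge database. A minimal sketch, assuming the project transforms arrive as a list already ranked by frequency and that both sides use the same canonical transform strings. The function name `coverage_at_cutoffs` and the toy inputs are hypothetical, and the Table 2 values are not reproduced.

```python
def coverage_at_cutoffs(ranked_transforms, database_transforms, cutoffs):
    """Fraction of the top-N ranked transforms that are present in the database.

    ranked_transforms: project transforms, most frequent first.
    database_transforms: iterable of transforms held in the knowledge database.
    cutoffs: Top-N sizes to evaluate, e.g. [100, 500, 1000].
    """
    in_database = set(database_transforms)
    coverage = {}
    for n in cutoffs:
        top = ranked_transforms[:n]
        if not top:
            continue  # nothing to evaluate for this cutoff
        hits = sum(1 for transform in top if transform in in_database)
        coverage[n] = hits / len(top)
    return coverage


if __name__ == "__main__":
    # Toy placeholders standing in for real transform strings.
    ranked = ["t1", "t2", "t3", "t4"]
    database = {"t1", "t3"}
    print(coverage_at_cutoffs(ranked, database, [1, 2, 4]))
```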

8.Assessing MMPs as a Molecule Generation Tool

  • Three tests were used to assess the performance of molecule generators used at GSK including an MMP-based molecule generator.

    • BioDig—a matched molecular pair-based algorithm described earlier in this chapter.

    • BRICS—a fragment replacement-based algorithm.

    • RG2Smi—a language processing machine learning algorithm that translates a reduced graph input to a SMILES output.

    • The first explored the ability of the algorithms to reproduce ideas generated by a team of medicinal chemists.

    • The second test explored whether the additional ~10^3 molecules generated by the algorithms were considered good ideas by the medicinal chemists.

    • Finally, the algorithms were assessed for their ability to generate molecules in legacy drug discovery programs from a single starting molecule in the series.

  • The tests compared three in-house molecule generators (Fig. 6).

9.First Test - Human Inclusion

10.Second Test - Human Imitation

11.Third Test - Legacy Projects

12.Conclusion

  • MMP analysis has emerged as a key method in the medicinal chemistry toolbox and there are many examples of publicly available algorithms and applications. Many companies have worked to summarize MMPs into databases of transforms.

Source: Chapter 23, Molecule Ideation Using Matched Molecular Pairs
