Professor Val Gillet
MA (Cambridge); MSc (Sheffield); PhD (Sheffield)
Information School
Professor of Chemoinformatics
+44 114 222 2652
Full contact details
Information School
Room C221
The Wave
2 Whitham Road
Sheffield
S10 2AH
- Profile
After completing a degree in Natural Sciences at Cambridge, I took a short-term research post in the Department of Information Studies (as the Information School was then called) and contributed to the building of a database of generic chemical structures as found in the patent literature. This post fuelled my interest in computing and I took the MSc in Information Science, returning to the generic structures project for my master’s dissertation and subsequent PhD, this time developing novel search algorithms to retrieve structural information from the database. I then moved to the Chemistry Department at Leeds University, where I led a team working on de novo design (that is, the in-silico design of chemical compounds to fit a set of constraints such as a protein binding site), before returning to Sheffield in the mid-nineties and taking up an academic post.
University Responsibilities
- Head of School 2012-2016 & 2019-2023
- REF Coordinator 2017-
- Deputy Director of Research 2017-
- Director of Research 2010-2012
- Member of School’s Strategy Group 2009-
- Member of School Promotions Panel
- Workload Allocation Model lead 2010-2016.
- Programme Coordinator for the MSc (Res) in Chemoinformatics (formerly the MSc in Chemoinformatics) 2000-2010.
- Departmental EPSRC DTG Coordinator.
- Staff Review and Development Scheme reviewer.
- School Research Ethics Coordinator 2008-2010.
- Chair of Staff/Student Committee from 2004 to 2007.
- School Representative on Teaching Affairs Committee, Pure Science, 1999-2004.
- Editor of School Newsletter, 1999-2004.
- Research interests
My research interests are focussed on the development and application of chemoinformatics techniques that are used primarily in the design of novel bioactive compounds. I have expertise in data mining and machine learning methods including multiobjective evolutionary algorithms, emerging pattern mining and graph theory. Particular application areas include virtual screening, the identification of structure-activity relationships, toxicity prediction, 3D similarity methods and the de novo design of novel compounds. I also have expertise in developing novel representation methods for chemical structures with recent areas including reduced graphs, spectral geometry, wavelet analysis and reaction vectors.
Much of my research has been carried out in collaboration with pharmaceutical and software companies including: AstraZeneca, Cambridge Crystallographic Data Centre, Eli Lilly, GlaxoSmithKline, Janssen Pharmaceuticals, Lhasa Limited, Novartis Pharmaceuticals, Pfizer Central Research (UK and US), Sanofi-Aventis and Unilever. I also have collaborators across the University including Dr Beining Chen in the Department of Chemistry, Prof Rob Harrison in the Department of Automatic Control and Systems Engineering, Prof Tanya Whitfield in Biomedical Sciences, Prof Jon Sayers in the Medical School and Dr Heather Mortiboys in SITraN (Sheffield Institute for Translational Neuroscience).
I am a member of the Chemoinformatics Research Group.
- Publications
Books
- Preface.
- Reviews in Computational Chemistry: Preface.
- Reviews in Computational Chemistry. John Wiley & Sons, Inc.
Journal articles
- Interpreting neural network models for toxicity prediction by extracting learned chemical features. Journal of Chemical Information and Modeling, 64(9), 3670-3688. View this article in WRRO
- Synthetically accessible de novo design using reaction vectors: application to PARP1 inhibitors. Molecular Informatics. View this article in WRRO
- Analysis of the benefits of imputation models over traditional QSAR models for toxicity prediction. Journal of Cheminformatics, 14(1).
- RENATE: a pseudo-retrosynthetic tool for synthetically accessible de novo design. Molecular Informatics. View this article in WRRO
- Amyloid binding and beyond: a new approach for Alzheimer's disease drug discovery targeting Aβo–PrPC binding and downstream pathways. Chemical Science. View this article in WRRO
- Enhancing reaction-based de novo design using a multi-label reaction class recommender. Journal of Computer-Aided Molecular Design. View this article in WRRO
- Development and Application of a Data-Driven Reaction Classification Model: Comparison of an Electronic Lab Notebook and Medicinal Chemistry Literature. Journal of Chemical Information and Modeling, 59(10), 4167-4187. View this article in WRRO
- Alignment-Free Molecular Shape Comparison Using Spectral Geometry: The Framework. Journal of Chemical Information and Modeling, 59(1), 98-116. View this article in WRRO
- Effect of missing data on multitask prediction methods. Journal of Cheminformatics, 10. View this article in WRRO
- Bioisosteric Replacements Extracted from High‐Quality Structures in the Protein Databank. ChemMedChem, 13(6), 607-613. View this article in WRRO
- Glossary of terms used in computational drug design, part II (IUPAC Recommendations 2015). Pure and Applied Chemistry, 88(3), 239-264.
- Chemoinformatics at the University of Sheffield 2002–2014. Molecular Informatics, 34(9), 598-607. View this article in WRRO
- Perspectives on Knowledge Discovery Algorithms Recently Introduced in Chemoinformatics: Rough Set Theory, Association Rule Mining, Emerging Patterns, and Formal Concept Analysis. Journal of Chemical Information and Modeling, 55(9), 1781-1803.
- Investigation of the Use of Spectral Clustering for the Analysis of Molecular Data. Journal of Chemical Information and Modeling, 54(12), 3302-3319. View this article in WRRO
- New structural alerts for Ames mutagenicity discovered using emerging pattern mining techniques. Toxicology Research, 4(1), 46-56.
- Emerging pattern mining to aid toxicological knowledge discovery. Journal of Chemical Information and Modeling, 54(7), 1864-1879. View this article in WRRO
- Toxicological knowledge discovery by mining emerging patterns from toxicity data. Journal of Cheminformatics, 5(S1).
- Deconvolution of molecular targets for small molecule anti-prion compounds using proteomic and microarray techniques. Prion, 7, 66-66.
- Automating knowledge discovery for toxicity prediction using jumping emerging pattern mining. Journal of Chemical Information and Modeling, 52(11), 3074-3087. View this article in WRRO
- Compression of molecular interaction fields using wavelet thumbnails: Application to molecular alignment. Journal of Chemical Information and Modeling, 52(3), 757-769.
- Development and validation of an improved algorithm for overlaying flexible molecules. Journal of Computer-Aided Molecular Design, 1-22. View this article in WRRO
- Deconvoluting molecular targets for small molecule anti-prion compounds using proteomic and microarray techniques. Prion, 6, 98-98.
- Diversity selection algorithms. Wiley Interdisciplinary Reviews: Computational Molecular Science, 1(4), 580-589.
- Validation of Reaction Vectors for de Novo Design, 29-43.
- Reduced graphs and their applications in chemoinformatics. Methods in Molecular Biology (Clifton, N.J.), 672, 197-212. View this article in WRRO
- Lead optimization using matched molecular pairs: Inclusion of contextual information for enhanced prediction of hERG inhibition, solubility, and lipophilicity. Journal of Chemical Information and Modeling, 50(10), 1872-1886. View this article in WRRO
- Wavelet approximation of GRID fields: Application to quantitative structure-activity relationships. Molecular Informatics, 29(8-9), 603-620.
- ChemInform Abstract: Multiobjective Optimization of Pharmacophore Hypotheses: Bias Toward Low-Energy Conformations. ChemInform, 41(13).
- Three-dimensional pharmacophore methods in drug discovery. Journal of Medicinal Chemistry, 53(2), 539-558.
- A simulation study of the use of similarity fusion for virtual screening, 46-59.
- Multiobjective optimization of pharmacophore hypotheses: Bias toward low-energy conformations. Journal of Chemical Information and Modeling, 49(12), 2761-2773.
- Use of reduced graphs to encode bioisosterism for similarity-based virtual screening. Journal of Chemical Information and Modeling, 49(6), 1330-1346. View this article in WRRO
- Knowledge-based approach to de Novo design using reaction vectors. Journal of Chemical Information and Modeling, 49(5), 1163-1184.
- Analysis of neighborhood behavior in lead optimization and array design. Journal of Chemical Information and Modeling, 49(2), 195-208. View this article in WRRO
- Turbo similarity searching: Effect of fingerprint and dataset on virtual-screening performance. Statistical Analysis and Data Mining, 2(2), 103-114. View this article in WRRO
- Assessment of additive/nonadditive effects in structure-activity relationships: Implications for iterative drug design. Journal of Medicinal Chemistry, 51(23), 7552-7562. View this article in WRRO
- ChemInform Abstract: Evolving Interpretable Structure-Activity Relationships. Part 1. Reduced Graph Queries. ChemInform, 39(48).
- ChemInform Abstract: Evolving Interpretable Structure-Activity Relationship Models. Part 2. Using Multiobjective Optimization to Derive Multiple Models. ChemInform, 39(48).
- Evolving interpretable structure-activity relationships. 1. Reduced graph queries. Journal of Chemical Information and Modeling, 48(8), 1543-1557.
- Evolving interpretable structure-activity relationship models. 2. Using multiobjective optimization to derive multiple models. Journal of Chemical Information and Modeling, 48(8), 1558-1570.
- ChemInform Abstract: A Comparison of Field-Based Similarity Searching Methods: CatShape, FBSS, and ROCS. ChemInform, 39(29).
- New directions in library design and analysis. Current Opinion in Chemical Biology, 12(3), 372-378. View this article in WRRO
- A comparison of field-based similarity searching methods: CatShape, FBSS, and ROCS. Journal of Chemical Information and Modeling, 48(4), 719-729.
- Data mining of search engine logs. Journal of the American Society for Information Science and Technology, 58(14), 2382-2400.
- Representing Clusters Using a Maximum Common Edge Substructure Algorithm Applied to Reduced Graphs and Molecular Graphs. ChemInform, 38(24).
- Representing clusters using a maximum common edge substructure algorithm applied to reduced graphs and molecular graphs. Journal of Chemical Information and Modeling, 47(2), 354-366.
- An introduction to chemoinformatics. An Introduction To Chemoinformatics, 1-255.
- Incorporating partial matches within multiobjective pharmacophore identification. Journal of Computer-Aided Molecular Design, 20(12), 735-749. View this article in WRRO
- Analysis of data fusion methods in virtual screening: Similarity and group fusion. Journal of Chemical Information and Modeling, 46(6), 2206-2219.
- Analysis of data fusion methods in virtual screening: Theoretical model. Journal of Chemical Information and Modeling, 46(6), 2193-2205.
- Query transformations and their role in Web searching by the general public. Information Research, 12(1). View this article in WRRO
- Using multiobjective optimization to study the strengths of different interaction energies in protein-ligand complexes. Abstracts of Papers of the American Chemical Society, 232, 411-411.
- CINF 22-Evolving reduced graph queries for extracting structure-activity relationships. Abstracts of Papers of the American Chemical Society, 232.
- Cluster representation using reduced graphs. Abstracts of Papers of the American Chemical Society, 232, 181-181.
- CINF 68-Studying the effects of individual interaction energies in a variety of protein-ligand complexes. Abstracts of Papers of the American Chemical Society, 232.
- Introducing the consensus modeling concept in genetic algorithms: Application to interpretable discriminant analysis. Journal of Chemical Information and Modeling, 46(5), 2110-2124.
- Chemoinformatics Techniques for Processing Chemical Structure Databases, 187-208.
- Training similarity measures for specific activities: Application to reduced graphs. Journal of Chemical Information and Modeling, 46(2), 577-586.
- Scaffold hopping using clique detection applied to reduced graphs. Journal of Chemical Information and Modeling, 46(2), 503-511.
- Library design, synthesis, and screening: Pyridine dicarbonitriles as potential prion disease therapeutics. Journal of Medicinal Chemistry, 49(2), 607-615.
- Comparison of Conformational Analysis Techniques to Generate Pharmacophore Hypotheses Using Catalyst. ChemInform, 36(22).
- Comparison of conformational analysis techniques to generate pharmacophore hypotheses using Catalyst. Journal of Chemical Information and Modeling, 45(2), 461-476.
- Generation of multiple pharmacophore hypotheses using multiobjective optimisation techniques. Journal of Computer-Aided Molecular Design, 18(11), 665-682. View this article in WRRO
- Enhancing the effectiveness of virtual screening by fusing nearest neighbor lists: A comparison of similarity coefficients. Journal of Chemical Information and Computer Sciences, 44(5), 1840-1848.
- Application of Evolutionary Algorithms to Combinatorial Library Design. ChemInform, 35(18).
- Applications of evolutionary computation in drug design. Structure and Bonding, 110, 133-152. View this article in WRRO
- Designing combinatorial libraries optimized on multiple objectives. Methods in Molecular Biology (Clifton, N.J.), 275, 335-354. View this article in WRRO
- Similarity Searching Using Reduced Graphs. ChemInform, 34(21).
- Optimizing the Size and Configuration of Combinatorial Libraries. ChemInform, 34(21).
- Further development of reduced graphs for identifying bioactive compounds. Journal of Chemical Information and Computer Sciences, 43(2), 346-356.
- Optimizing the size and configuration of combinatorial libraries. Journal of Chemical Information and Computer Sciences, 43(2), 381-390.
- Similarity searching using reduced graphs. Journal of Chemical Information and Computer Sciences, 43(2), 338-345. View this article in WRRO
- Chemoinformatics research at the University of Sheffield: A history and citation analysis. Journal of Information Science, 29(4), 249-267. View this article in WRRO
- Multiobjective optimization in quantitative structure-activity relationships: Deriving accurate and interpretable QSARs. Journal of Medicinal Chemistry, 45(23), 5069-5080.
- A comparison of the pharmacophore identification programs: Catalyst, DISCO and GASP. Journal of Computer-Aided Molecular Design, 16(8-9), 653-681. View this article in WRRO
- Designing focused libraries using MoSELECT. Journal of Molecular Graphics and Modelling, 20(6), 491-498. View this article in WRRO
- Reactant- and product-based approaches to the design of combinatorial libraries. Journal of Computer-Aided Molecular Design, 16(5-6), 371-380. View this article in WRRO
- Combinatorial library design using a multiobjective genetic algorithm. Journal of Chemical Information and Computer Sciences, 42(2), 375-385. View this article in WRRO
- Computational methods for the analysis of molecular diversity. Pharmacochemistry Library, 32(C), 125-133. View this article in WRRO
- Calculating the knowledge-based similarity of functional groups using crystallographic data. Journal of Computer-Aided Molecular Design, 15(9), 835-857.
- SuperStar: Improved knowledge-based interaction fields for protein binding sites. Journal of Molecular Biology, 307(3), 841-859.
- Reactant- and product-based approaches to the design of combinatorial libraries. Molecular Diversity, 5(4), 245-254.
- Similarity Searching in Files of Three-Dimensional Chemical Structures: Analysis of the BIOSTER Database Using Two-Dimensional Fingerprints and Molecular Field Descriptors. J. Chem. Inf. Comput. Sci., 40, 295-307.
- Evaluation of reactant-based and product-based approaches to the design of combinatorial libraries. Perspectives in Drug Discovery and Design, 20, 265-287.
- Selecting combinatorial libraries to optimize diversity and physical properties. Journal of Chemical Information and Computer Sciences, 39(1), 169-177.
- Identification of biological activity profiles using substructural analysis and genetic algorithms. Journal of Chemical Information and Computer Sciences, 38(2), 165-179.
- Similarity and dissimilarity methods for processing chemical structure databases. Computer Journal, 41(8), 555-558.
- The effectiveness of reactant pools for generating structurally-diverse combinatorial libraries. Journal of Chemical Information and Computer Sciences, 37(4), 731-740.
- SPROUT: 3D Structure Generation Using Templates. J. Chem. Inf. Comput. Sci., 35, 479-493.
- SPROUT, HIPPO and CAESA: Tools for de novo structure generation and estimation of synthetic accessibility. Perspectives in Drug Discovery and Design, 3(1), 34-50.
- Evaluation of the Screening Stages of the Sheffield Research Project on Computer Storage and Retrieval of Generic Chemical Structures in Patents. Journal of Chemical Information and Computer Sciences, 34(1), 39-46. View this article in WRRO
- SPROUT: Recent developments in the de novo design of molecules. J. Chem. Inf. Comput. Sci., 34, 207-217.
- Computer Storage and Retrieval of Generic Chemical Structures in Patents. 15. Generation of Topological Fragment Descriptors from Nontopological Representations of Generic Structure Components. Journal of Chemical Information and Computer Sciences, 33(3), 369-377. View this article in WRRO
- SPROUT: A program for structure generation. J. Comput. Aided Mol. Des., 7, 127-153.
- Computer Storage and Retrieval of Generic Chemical Structures in Patents. 14. Fragment Generation from Generic Structures. Journal of Chemical Information and Computer Sciences, 32(5), 453-462. View this article in WRRO
- Computer Storage and Retrieval of Generic Chemical Structures in Patents. 11. Theoretical Aspects of the Use of Structure Languages in a Retrieval System. Journal of Chemical Information and Computer Sciences, 31(2), 233-253. View this article in WRRO
- Computer Storage and Retrieval of Generic Chemical Structures in Patents. 12. Principles of Search Operations Involving Parameter Lists: Matching-Relations, User-Defined Match Levels, and Transition from the Reduced Graph Search to the Refined Search. Journal of Chemical Information and Computer Sciences, 31(2), 253-260. View this article in WRRO
- Computer Storage and Retrieval of Generic Chemical Structures in Patents. 13. Reduced Graph Generation. Journal of Chemical Information and Computer Sciences, 31(2), 260-270. View this article in WRRO
- Automated structure design in 3D. Tetrahedron Computer Methodology, 3(6), 681-696.
- ChemInform Abstract: Computer Storage and Retrieval of Generic Chemical Structures in Patents. Part 10. Assignment and Logical Bubble-Up of Ring Screens for Structurally Explicit Generics. ChemInform, 20(50).
- ChemInform Abstract: Computer Storage and Retrieval of Generic Chemical Structures in Patents. Part 9. An Algorithm to Find the Extended Set of Smallest Rings in Structurally Explicit Generics. ChemInform, 20(50).
- Computer storage and retrieval of generic chemical structures in patents. 9. An algorithm to find the extended set of smallest rings in structurally explicit generics. Journal of Chemical Information and Computer Sciences, 29(3), 207-214. View this article in WRRO
- Computer storage and retrieval of generic chemical structures in patents. 10. Assignment and logical bubble-up of ring screens for structurally explicit generics. Journal of Chemical Information and Computer Sciences, 29(3), 215-224. View this article in WRRO
- Review of Ring Perception Algorithms for Chemical Graphs. Journal of Chemical Information and Computer Sciences, 29(3), 172-187. View this article in WRRO
- Theoretical aspects of ring perception and development of the extended set of smallest rings concept. Journal of Chemical Information and Computer Sciences, 29(3), 187-206. View this article in WRRO
- Chemical graph matching using transputer networks. Parallel Computing, 8(1-3), 295-300.
- Computer Storage and Retrieval of Generic Chemical Structures in Patents. 8. Reduced Chemical Graphs and Their Applications in Generic Chemical Structure Retrieval. Journal of Chemical Information and Computer Sciences, 27(3), 126-137.
- Computer storage and retrieval of generic chemical structures in patents. 7. Parallel simulation of a relaxation algorithm for chemical substructure search. Journal of Chemical Information and Computer Sciences, 26, 118-126.
Chapters
- An analysis of classification approaches for hit song prediction using engineered metadata features with lyrics and audio features In Sserwanga I, Goulding A, Moulaison-Sandy H, Du JT, Soares AL, Hessami V & Frank RD (Ed.), Lecture Notes in Computer Science (pp. 303-311). Springer Nature Switzerland View this article in WRRO
- Virtual Screening Based on Electrostatic Similarity and Flexible Ligands, Computational Science and Its Applications – ICCSA 2022 Workshops (pp. 127-139). Springer International Publishing
- Compound Selection Using Measures of Similarity and Dissimilarity, Comprehensive Medicinal Chemistry II (pp. 167-192). Elsevier
- Chemoinformatics, Comprehensive Medicinal Chemistry II (pp. 235-264). Elsevier
- Compound selection using measures of similarity and dissimilarity, Comprehensive Medicinal Chemistry II (pp. 167-191).
- Application of Evolutionary Algorithms to Combinatorial Library Design, Soft Computing Approaches in Chemistry (pp. 1-30). Springer Berlin Heidelberg
- De Novo Molecular Design (pp. 49-69). Wiley
- Generic chemical structures in patents — an evaluation of the Sheffield University research work, Chemical Information (pp. 161-173). Springer Berlin Heidelberg
- The Sheffield University Generic Chemical Structures Project — A Review of Progress and of Outstanding Problems, Chemical Structures (pp. 151-167). Springer Berlin Heidelberg
- Applications of Chemoinformatics in Drug Discovery, Biomolecular and Bioanalytical Techniques (pp. 17-36). John Wiley & Sons, Ltd
- Using Chemoinformatics Tools to Analyze Chemical Arrays in Lead Optimization, Chemoinformatics for Drug Discovery (pp. 179-204). John Wiley & Sons, Inc
- Multiobjective De Novo Design of Synthetically Accessible Compounds, De novo Molecular Design (pp. 267-285). Wiley-VCH Verlag GmbH & Co. KGaA
- Mining for Context-Sensitive Bioisosteric Replacements in Large Chemical Databases, Bioisosteres in Medicinal Chemistry (pp. 103-127). Wiley-VCH Verlag GmbH & Co. KGaA
Conference proceedings papers
- Phenotypic screening aided by multitask prediction methods. Abstracts of Papers of the American Chemical Society, Vol. 256
- Using deep neural networks with heterogeneous chemical data to support phenotypic assay campaigns. Abstracts of Papers of the American Chemical Society, Vol. 254
- Application of spectral and diffusion geometry descriptors to shape-based virtual screening. Abstracts of Papers of the American Chemical Society, Vol. 253
- Spectral and diffusion geometry descriptions of molecular shape. Abstracts of Papers of the American Chemical Society, Vol. 253
- Multiobjective transformation based de novo design: A case study of surfactants. Abstracts of Papers of the American Chemical Society, Vol. 250
- Searching putative targets in silico for anti-prion compounds. Abstracts of Papers of the American Chemical Society, Vol. 243
- Spectral clustering of chemical data: A Lanczos-based approach. Abstracts of Papers of the American Chemical Society, Vol. 243
- De novo design of synthetically accessible compounds: Application to fragment-based drug design. Abstracts of Papers of the American Chemical Society, Vol. 243
- Pharmacophore methods: Past, present, and future. Abstracts of Papers of the American Chemical Society, Vol. 242
- Applications of wavelets in virtual screening. Abstracts of Papers of the American Chemical Society, Vol. 240
- De novo design using reaction vectors: Application to library design. Abstracts of Papers of the American Chemical Society, Vol. 237
- Multiobjective approach to optimizing scoring functions for docking. Abstracts of Papers of the American Chemical Society, Vol. 237
- Development of test systems for pharmacophore elucidation. Abstracts of Papers of the American Chemical Society, Vol. 237
- Wavelet compression of GRID fields for similarity searching and virtual screening. Abstracts of Papers of the American Chemical Society, Vol. 237
- CINF 72-Graduate training in chemoinformatics at the University of Sheffield. Abstracts of Papers of the American Chemical Society, Vol. 235
- COMP 97-Combining clique-detection, MOGUL and MOGA for pharmacophore generation. Abstracts of Papers of the American Chemical Society, Vol. 235
- COMP 270-Assessment of additive/nonadditive effects in SAR: Implications in the drug discovery iterative process. Abstracts of Papers of the American Chemical Society, Vol. 235
- CINF 32-Academic-industrial collaboration in chemoinformatics: Experiences from the UK. Abstracts of Papers of the American Chemical Society, Vol. 235
- CINF 53-Structure generation using reaction vectors. Abstracts of Papers of the American Chemical Society, Vol. 235
- Generation of multiple pharmacophore hypotheses using a multiobjective optimization algorithm. Abstracts of Papers of the American Chemical Society, Vol. 231
- Virtual screening using reduced graphs. Abstracts of Papers of the American Chemical Society, Vol. 228 (pp U362-U363)
- Computational analysis of molecular diversity for drug discovery. Proceedings of the Annual International Conference on Computational Molecular Biology, RECOMB (pp 321-330)
- Searching a Full Generics Database. Chemical Structures 2 (pp 87-103)
- Research group
Current Researchers
- Emma Armstrong: De Novo Design using Evolutionary Strategies
- Toby King: Investigations of AI techniques in Drug Discovery
Current PhD students
- Terence Egbelo: Use of Knowledge Graphs in Drug Discovery
- Hanz Tantiango: Data Augmentation in De Novo Design
- James Middleton: Alignment-free Molecular Design Methods
Completed PhD students
- Christina Founti: New machine learning methods for the analysis and modelling of pharmaceutical data
- Gian Marco Ghiandoni: Reaction-based molecule design
- Jess Stacey: Data Mining for Lead Optimisation
- Moritz Walter: Understanding artificial intelligence models for toxicity prediction
- James Webster: Decision-theoretic methods for de novo design
- Matthew Seddon: Development of novel techniques for assessing bio-isosteric similarity of chemical fragments.
- Sonny Gan: The application of spectral clustering in drug discovery.
- Jorge Valencia Delgadillo: Multiobjective design of novel antiprion compounds.
- James Wallace: Reaction network for De Novo design
- Georgios Papadatos: Data Mining for Lead Optimisation
- Yogendra Patel: The Prediction of Molecular Properties Using Similarity Searching and Free-Wilson Analysis
- P Watson: Calculating the Knowledge-Based Similarity and Complementarity of Functional Groups based on their Non-Bonded Interactions
- Edward Barker: Chemical Similarity Searching Using Reduced Graphs
- Kristian Birchall: Reduced Graph Approaches to Analysing High-Throughput Screening Data
- Simon Cottrell: Generation of Multiple Pharmacophore Hypotheses Using Multiobjective Optimisation Techniques
- Sally Mardikian: The Application of Multiobjective Optimisation to Protein-Ligand Docking
- Richard Martin: Wavelet Approximation of GRID Fields for Virtual Screening
- Kirstin Moffat: Development Of Computational Methods For 3D Similarity And Structure-Based Design Techniques In Lead Optimisation
- Hina Patel: Patient and Practitioner Understanding of Traditional and Western Medical Acupuncture in Practice: A Qualitative Study
- Richard Sherhod: Development of a Data Mining Tool for the Identification of Toxicophores
- Trudi Wright: Multiobjective Optimisation of Combinatorial Libraries
- Tummala Reddy: Design, Synthesis and SAR Studies Of Pyridine Dicarbonitriles As Potential Prion Therapeutics
- Grants
Research Projects
EPSRC CASE studentship with GSK
Engineering and Physical Sciences Research Council; Principal Investigator; £81,792; 1 October 2015; 48 months
BBSRC CASE studentship with Evotec (UK) Ltd.
Biotechnology and Biological Sciences Research Council; Principal Investigator; £98,212; 1 October 2017; 48 months
BBSRC CASE studentship
Biotechnology and Biological Sciences Research Council; Principal Investigator; £70,000; 1 October 2015; 36 months
Bio-renewable Formulation - 6 month extension
Unilever; Principal Investigator; £49,171; 1 February 2015; 12 months
Diagnostic and Drug Discovery Initiative for Alzheimer's Disease
European Commission; Investigator; £739,714; 1 October 2014; 48 months
Alzheimer's disease (AD) is the major cause of dementia, for which there is currently no cure. The overall aim of the project is to create a long-term strategic partnership between Sheffield University (UK), Lisbon University (Portugal), Eli Lilly (UK) and Biofordrug (Italy) in order to develop chemical biology tools for better understanding the role of PrPC in AD and harnessing this understanding to develop novel chemical entities for diagnostic and therapeutic applications. Our proposed programme will lead to major increases in the knowledge and capacity of all consortium members, achieved through significant intersectoral exchange of personnel between the partners over the duration of the project (total 134 person months) and through temporary recruitment of 5 new experienced researchers (total 114 person months). The project will thus underpin a substantial programme of intersectoral knowledge transfer and research training and lead to significant innovation and advances in several areas of basic research as well as diagnostics and therapeutics development. These activities will strongly enhance EU standing and international competitiveness in this extremely challenging and increasingly important technological area.
Bio-renewable Formulation Information and Knowledge Management System
Technology Strategy Board; Investigator; £24,992; 1 April 2014; 24 months
Innovative ICT can play a crucial role in many innovation processes, but its potential is not always exploited in many industries. A route to innovation in formulated product industries is the exploitation of materials that would otherwise be lost to waste streams from current manufacturing processes. This is exciting both for realising additional value from manufacturing and for reducing the use of unsustainable material sources and exploiting novel feedstocks for novel functional materials with new application benefits. This project will develop an information system based on highly innovative information technologies with the capability to rapidly identify the feedstock and functional material opportunities for formulated products, and demonstrate its value in rapid bio-derived surfactant discovery. It aims to support chemical-using industries where environmental impact, sustainability and materials security are increasingly significant drivers of innovation alongside improved performance in formulated products.
Project partners are Unilever, British Sugar, Croda, Cybula, University of Manchester and University of Liverpool.
N8 Biohub Information and Knowledge Management System
Technology Strategy Board; Investigator; £131,128; 1 October 2013; 28 months
The overall aim of this project is to build, and demonstrate the value of, an information system (IS) to support the creation of a "Bio-Hub" centred on the N8 university group. The IS will demonstrate how functional ingredients from simple transformations of sustainable plant & waste feedstocks can be identified more quickly and recommend the best feedstocks for a particular function. It will address two big data problems using clever algorithms: semantic extraction of the available domain literature (terabytes) and optimised global search to explore the combinatorially large number of transformation products (up to petabytes). The innovations are in the creation of algorithms robust enough to run semi-autonomously in an information system and in bringing these together with all the other components. The value will be demonstrated for specific feedstocks and applications, but the ICTs will be selected for simple extension to, and maintenance of, the overall information domain.
Project partners are Unilever, British Sugar, Croda, Cybula and University of Manchester.
Targeted dual drug screening in patient tissue to identify new treatments for Parkinson's
Parkinson's UK Investigator £249,795 2 September 2013 36 months The number of patients with Parkinson's disease (PD) will double by 2030. Almost all current treatments focus on replacing the brain chemical, dopamine, which is lost in the disease. These treatments are effective at masking the motor symptoms of PD but they lose effectiveness over time and cause troublesome side effects. There is a lack of treatments which slow or stop disease progression: so-called disease-modifying therapies, which alter the underlying disease-causing pathways. To date drug screens have not proven effective at getting new drugs through discovery and safety testing into the clinic. Therefore alternative drug screening strategies should be explored. Drug re-positioning, which identifies new activities/uses of currently licensed drugs, has been successful in other diseases. The aim of this study is to design and carry out a re-positioning drug screen in order to identify lead drugs which (i) show beneficial effects on the two most important underlying pathways leading to cell death in PD patient tissue and (ii) are already clinically licensed. The project is being led by Dr. Heather Mortiboys in the Division of Neuroscience. My role is to provide Chemoinformatics support to enable the in silico design of a library of drug compounds for screening.
AstraZeneca Collaboration - Belief Theory
AstraZeneca Investigator £70,000 22 November 2010 14 months This project is focused on developing probabilistic similarity searching methods for searching databases of chemical structures. The aim is to develop computational tools that will support the application of similarity searching methods within AstraZeneca workflows, including effective ways of combining different search methods.
Knowledge Transfer Project - Lhasa Ltd
Technology Strategy Board Principal Investigator £132,024 1 October 2010 24 months Lhasa Ltd. is a not-for-profit software development company whose core business is the development of knowledge about the relationship between chemical structure and toxicity. The aim of the project is to develop data mining methods to automate the discovery of knowledge about chemical structure-toxicity relationships. Knowledge discovery is traditionally done by domain experts who manually examine data sets of chemical structures and their related toxicology, in a very time-consuming process. New methods will be developed based on emerging pattern mining which can lead to significant reductions in the time required to augment the knowledge base.
De Novo Design
Cambridge Crystallographic Data Centre Principal Investigator £120,576 1 April 2009 24 months The project involved the development of an evolutionary algorithm for the de novo design of novel chemical compounds to fit a variety of constraints. The project was a collaboration with the Cambridge Crystallographic Data Centre and Eli Lilly that addressed a key challenge in de novo design, which is ensuring that the compounds that are designed are synthetically accessible. This was achieved by deriving the molecular transformations used to generate novel structures from a knowledge base of reactions, so the approach is not limited by reaction type or complexity.
CCDC Pharmacophore Collaboration
Cambridge Crystallographic Data Centre Principal Investigator £66,192 1 July 2008 11 months The project was a continuation of the "AstraZeneca Collaboration - Pharmacophores" project and involved the development of a new multiobjective optimisation method for pharmacophore identification from sets of active compounds. A pharmacophore describes the three-dimensional arrangement of chemical features required for a small molecule to bind to a receptor and the aim of this project was to deduce the pharmacophore from a series of active compounds in the absence of the structure of the receptor itself. This involves superposing the compounds so that their common features are overlaid.
AstraZeneca Collaboration - Pharmacophores
AstraZeneca Principal Investigator £74,512 1 January 2007 12 months The project involved the development of a new multiobjective optimisation method for pharmacophore identification from sets of active compounds. A pharmacophore describes the three-dimensional arrangement of chemical features required for a small molecule to bind to a receptor and the aim of this project was to deduce the pharmacophore from a series of active compounds in the absence of the structure of the receptor itself. This involves superposing the compounds so that their common features are overlaid.
Sanofi-Sheffield collaboration
Sanofi-Aventis Investigator £92,421 1 January 2007 12 months
Array design for lead optimisation in pharmaceutical research
GlaxoSmithKline Principal Investigator £252,000 23 October 2006 48 months This EPSRC-funded project focused on the development of tools to assist medicinal chemists in the design of compound arrays during the lead optimisation stage of drug discovery. Lead optimisation is a complex, time-consuming task, in which chemists seek to obtain a promising balance among potency, off-target interactions, toxicity, and pharmacokinetic behaviour, to identify a candidate molecule to progress to clinical trials. The focus has been on inverse QSAR, that is, determining the structural change necessary to achieve a desired change in property. This has been approached through retrospective studies of lead optimisation projects within the GSK archive and the development of computational tools that can be applied in prospective array design to inform decision making by chemists. These included a novel context-sensitive approach to matched molecular-pairs analysis.
Novartis KTP
Knowledge Transfer Partnership Principal Investigator £143,754 1 May 2006 36 months The project involved the development of multiobjective optimisation methods to improve the accuracy with which predictions can be made about the properties of molecules in the early stages of the drug discovery process. The project focused on protein-ligand docking, a technique that is widely used to prioritise compounds for biological testing but which is limited in its effectiveness by inadequacies in the scoring functions that are used to drive the procedure. Several of the known limitations of scoring functions were investigated, including: the prediction of the binding pose of a known ligand; the docking of ligands to non-native proteins; and the re-ranking of a dataset of ligands to correlate more closely with known binding affinities.
Janssen MSc Summer Placement
Janssen-Cilag Spain Principal Investigator £1,500 5 May 2005 4 months
Analysis of ADMET models
Janssen-Cilag Spain Principal Investigator £1,500 15 June 2004 15 months
Support tools for automatic pharmacophore generation
Pfizer Principal Investigator £71,467 1 March 2004 22 months
Johnson and Johnson PhD
Janssen Pharmaceuticals N.V. Investigator £76,305 1 January 2004 36 months
Tuning Parameters for Multiobjective Library Design
GlaxoSmithKline Principal Investigator £4,277 1 December 2003 2 months
Novel methods for the association of chemical similarity with biological response
GlaxoSmithKline Principal Investigator £21,000 1 October 2003 36 months
Mining molecular bioassay data
GlaxoSmithKline Principal Investigator £21,000 1 October 2002 36 months
The development of computational methods for 3D similarity and structure-based design techniques in lead optimisation
Pfizer Investigator £125,016 1 January 2003 24 months
Novel methods for the alignment of flexible molecules
GlaxoSmithKline Principal Investigator £21,000 1 October 2002 36 months
Multiple flexible alignment of molecules
Cambridge Crystallographic Data Centre Principal Investigator £19,500 1 May 2002 36 months
- Teaching interests
-
My teaching interests are in applied computational techniques including data mining and machine learning. I played a central role in the curriculum design of the MSc Data Science which was launched in the School in September 2014 and currently I coordinate the module INF6028: Data Mining and Data Visualisation as well as supervising dissertations in data science.
I have a particular interest in applying data science techniques to the processing of chemical structures for applications such as drug design: a field which has become known as Chemoinformatics. I teach INF105: Introduction to Chemoinformatics and INF215: Introduction to Computer Aided Drug Design to first and second year chemistry undergraduates, respectively. I organised and ran an annual short course on a Practical Introduction to Chemoinformatics for many years which attracted PhD students and industrial delegates from across the globe.
- Teaching activities
-
INF325 - Building AI Applications
INF6027 - Introduction to Data Science
INF6028 - Data Mining
INF6060 - Information Retrieval: Search Engines and Digital Libraries
- Professional activities and memberships
-
Membership of professional bodies
- Trustee. Molecular Graphics and Modelling Society, 2005 -
- Committee member of UK QSAR and Chemoinformatics Group, 2005 -
Editorial board membership
- Member of Editorial Board of Journal of Cheminformatics, 2008 -
- Member of Editorial Advisory Board of Journal of Chemical Information and Modeling, 2005 -
- Editor of Reviews in Computational Chemistry, 2005-2006
Committee and advisory group membership
- Member of the Advisory Board of AI3SD: AI for Chemical Discovery EPSRC Network
- Trustee of Lhasa Limited, 2014-to date
- Governor of the Cambridge Crystallographic Data Centre, Dec 2006-May 2013. Vice-Chair of board of Governors May 2013-2014
- Member of Scientific Advisory Committee of the triennial International Conference on Chemical Structures at Noordwijkerhout held in the Netherlands, 2005-to-date
- Member of the American Chemical Society, 2001 to date
- Member of Scientific Advisory Board of EBI Chemistry Services, 2016-2017. Chair of board 2017-to date
- Member of Strategy Advisory Board of EPSRC Prosperity Partnership grant to the Universities of Strathclyde and Nottingham and GlaxoSmithKline (GSK) on “Accelerated Discovery and Development of New Medicines” 2019-
- Member of Management Advisory Panel of EPSRC funded Physical Sciences Data Science service, 2019-
Invited presentations
- “Applications of Machine Learning in Molecular Design: Learning from Bioactivity and Reaction Databases” Summer School on Machine Learning, Leuven, Belgium. Aug. 2018
- “Molecular Shape Matching using Spectral Geometry Techniques.” GlaxoSmithKline. July 5, 2017
- “From Spectral Clustering to Spectral Geometry: Applications in Chemoinformatics.” Computational Chemistry Kitchen, Oxford, Mar 16, 2017
- “De Novo Design: from Drugs to the Design of Sustainable Cleaning Agents,” University of Leeds, Jan. 2017
- “De Novo Design: from Drug Discovery to the Design of Novel Chemicals from Biomasses” German Chemoinformatics Conference, Nov, 2016
- “Perspectives on De Novo Design”, Keynote at Third Oxford Meeting on Drug Design, University of Oxford, 2014.
- “Multiobjective Design of Drug-like Molecules” at Multi-disciplinary Integration and Optimisation in Science and Engineering. UCL, London, 2012
- “Applications of Wavelets in Virtual Screening” at Journal of Chemical Information and Modeling 50th Anniversary Symposium, American Chemical Society, Boston, 2010
Other roles
- Reviewer of research funding proposals for the Austrian Science Fund, BBSRC, EPSRC, Leverhulme Visiting Professorship Scheme, Medical Research Council, Royal Society, Science Foundation Ireland and the Wellcome Trust
- External examiner for PhD theses at the universities of Cambridge, Leeds, Imperial College, Manchester, Newcastle, Portsmouth, York and Cyprus
- Organiser and Chair of the triennial International Conferences in Chemoinformatics held in Sheffield since 1998. Chair of the 2018 iConference in Sheffield.